Machine Learning Production Systems: Engineering Machine Learning Models and Pipelines


Authors: Emily Caveness, Robert Crowe, Hannes Hapke, Di Zhu

Release date: 2025

Publisher: O’Reilly Media, Inc.

Pages: 475

File size: 3.6 MB

File type: PDF


Table of Contents....4

Foreword....16

Preface....20

Who Should Read This Book....20

Why We Wrote This Book....21

Navigating This Book....21

Conventions Used in This Book....22

Using Code Examples....22

O’Reilly Online Learning....23

How to Contact Us....23

Acknowledgments....24

Robert....25

Hannes....25

Emily....25

Di....26

Chapter 1. Introduction to Machine Learning Production Systems....28

What Is Production Machine Learning?....28

Benefits of Machine Learning Pipelines....30

Focus on Developing New Models, Not on Maintaining Existing Models....30

Prevention of Bugs....30

Creation of Records for Debugging and Reproducing Results....31

Standardization....31

The Business Case for ML Pipelines....31

When to Use Machine Learning Pipelines....32

Steps in a Machine Learning Pipeline....32

Data Ingestion and Data Versioning....33

Data Validation....33

Feature Engineering....33

Model Training and Model Tuning....34

Model Analysis....34

Model Deployment....35

Looking Ahead....35

Chapter 2. Collecting, Labeling, and Validating Data....36

Important Considerations in Data Collection....36

Responsible Data Collection....37

Labeling Data: Data Changes and Drift in Production ML....38

Labeling Data: Direct Labeling and Human Labeling....40

Validating Data: Detecting Data Issues....41

Validating Data: TensorFlow Data Validation....41

Skew Detection with TFDV....42

Types of Skew....43

Example: Spotting Imbalanced Datasets with TensorFlow Data Validation....44

Conclusion....46

Chapter 3. Feature Engineering and Feature Selection....48

Introduction to Feature Engineering....48

Preprocessing Operations....50

Feature Engineering Techniques....51

Normalizing and Standardizing....51

Bucketizing....52

Feature Crosses....53

Dimensionality and Embeddings....53

Visualization....53

Feature Transformation at Scale....54

Choose a Framework That Scales Well....54

Avoid Training–Serving Skew....55

Consider Instance-Level Versus Full-Pass Transformations....55

Using TensorFlow Transform....56

Analyzers....58

Code Example....59

Feature Selection....59

Feature Spaces....60

Feature Selection Overview....60

Filter Methods....61

Wrapper Methods....62

Embedded Methods....64

Feature and Example Selection for LLMs and GenAI....65

Example: Using TF Transform to Tokenize Text....65

Benefits of Using TF Transform....68

Alternatives to TF Transform....69

Conclusion....69

Chapter 4. Data Journey and Data Storage....70

Data Journey....70

ML Metadata....71

Using a Schema....72

Schema Development....73

Schema Environments....73

Changes Across Datasets....74

Enterprise Data Storage....75

Feature Stores....75

Data Warehouses....77

Data Lakes....78

Conclusion....78

Chapter 5. Advanced Labeling, Augmentation, and Data Preprocessing....80

Advanced Labeling....81

Semi-Supervised Labeling....81

Active Learning....83

Weak Supervision....86

Advanced Labeling Review....87

Data Augmentation....88

Example: CIFAR-10....89

Other Augmentation Techniques....89

Data Augmentation Review....89

Preprocessing Time Series Data: An Example....90

Windowing....91

Sampling....92

Conclusion....93

Chapter 6. Model Resource Management Techniques....94

Dimensionality Reduction: Dimensionality Effect on Performance....94

Example: Word Embedding Using Keras....95

Curse of Dimensionality....99

Adding Dimensions Increases Feature Space Volume....100

Dimensionality Reduction....101

Quantization and Pruning....105

Mobile, IoT, Edge, and Similar Use Cases....105

Quantization....105

Optimizing Your TensorFlow Model with TF Lite....111

Optimization Options....112

Pruning....113

Knowledge Distillation....116

Teacher and Student Networks....116

Knowledge Distillation Techniques....117

TMKD: Distilling Knowledge for a Q&A Task....120

Increasing Robustness by Distilling EfficientNets....122

Conclusion....123

Chapter 7. High-Performance Modeling....124

Distributed Training....124

Data Parallelism....125

Efficient Input Pipelines....128

Input Pipeline Basics....128

Input Pipeline Patterns: Improving Efficiency....129

Optimizing Your Input Pipeline with TensorFlow Data....130

Training Large Models: The Rise of Giant Neural Nets and Parallelism....132

Potential Solutions and Their Shortcomings....133

Pipeline Parallelism to the Rescue?....134

Conclusion....136

Chapter 8. Model Analysis....138

Analyzing Model Performance....138

Black-Box Evaluation....139

Performance Metrics and Optimization Objectives....139

Advanced Model Analysis....140

TensorFlow Model Analysis....140

The Learning Interpretability Tool....146

Advanced Model Debugging....147

Benchmark Models....148

Sensitivity Analysis....148

Residual Analysis....152

Model Remediation....153

Discrimination Remediation....154

Fairness....154

Fairness Evaluation....155

Fairness Considerations....157

Continuous Evaluation and Monitoring....157

Conclusion....158

Chapter 9. Interpretability....160

Explainable AI....160

Model Interpretation Methods....163

Method Categories....163

Intrinsically Interpretable Models....166

Model-Agnostic Methods....171

Local Interpretable Model-Agnostic Explanations....175

Shapley Values....176

The SHAP Library....178

Testing Concept Activation Vectors....180

AI Explanations....181

Example: Exploring Model Sensitivity with SHAP....183

Regression Models....183

Natural Language Processing Models....185

Conclusion....186

Chapter 10. Neural Architecture Search....188

Hyperparameter Tuning....188

Introduction to AutoML....190

Key Components of NAS....190

Search Spaces....191

Search Strategies....193

Performance Estimation Strategies....195

AutoML in the Cloud....196

Amazon SageMaker Autopilot....196

Microsoft Azure Automated Machine Learning....197

Google Cloud AutoML....198

Using AutoML....199

Generative AI and AutoML....199

Conclusion....199

Chapter 11. Introduction to Model Serving....200

Model Training....200

Model Prediction....201

Latency....201

Throughput....201

Cost....202

Resources and Requirements for Serving Models....202

Cost and Complexity....202

Accelerators....203

Feeding the Beast....204

Model Deployments....204

Data Center Deployments....205

Mobile and Distributed Deployments....205

Model Servers....206

Managed Services....207

Conclusion....208

Chapter 12. Model Serving Patterns....210

Batch Inference....210

Batch Throughput....211

Batch Inference Use Cases....212

ETL for Distributed Batch and Stream Processing Systems....213

Introduction to Real-Time Inference....213

Synchronous Delivery of Real-Time Predictions....215

Asynchronous Delivery of Real-Time Predictions....215

Optimizing Real-Time Inference....215

Real-Time Inference Use Cases....216

Serving Model Ensembles....217

Ensemble Topologies....217

Example Ensemble....217

Ensemble Serving Considerations....217

Model Routers: Ensembles in GenAI....218

Data Preprocessing and Postprocessing in Real Time....218

Training Transformations Versus Serving Transformations....220

Windowing....220

Options for Preprocessing....221

Enter TensorFlow Transform....223

Postprocessing....224

Inference at the Edge and at the Browser....225

Challenges....226

Model Deployments via Containers....227

Training on the Device....227

Federated Learning....228

Runtime Interoperability....228

Inference in Web Browsers....229

Conclusion....229

Chapter 13. Model Serving Infrastructure....230

Model Servers....231

TensorFlow Serving....231

NVIDIA Triton Inference Server....233

TorchServe....234

Building Scalable Infrastructure....235

Containerization....237

Traditional Deployment Era....237

Virtualized Deployment Era....238

Container Deployment Era....238

The Docker Containerization Framework....238

Container Orchestration....240

Reliability and Availability Through Redundancy....243

Observability....244

High Availability....245

Automated Deployments....246

Hardware Accelerators....246

GPUs....247

TPUs....247

Conclusion....248

Chapter 14. Model Serving Examples....250

Example: Deploying TensorFlow Models with TensorFlow Serving....250

Exporting Keras Models for TF Serving....250

Setting Up TF Serving with Docker....251

Basic Configuration of TF Serving....251

Making Model Prediction Requests with REST....252

Making Model Prediction Requests with gRPC....254

Getting Predictions from Classification and Regression Models....255

Using Payloads....256

Getting Model Metadata from TF Serving....256

Making Batch Inference Requests....257

Example: Profiling TF Serving Inferences with TF Profiler....259

Prerequisites....259

TensorBoard Setup....260

Model Profile....261

Example: Basic TorchServe Setup....265

Installing the TorchServe Dependencies....265

Exporting Your Model for TorchServe....265

Setting Up TorchServe....266

Making Model Prediction Requests....269

Making Batch Inference Requests....269

Conclusion....270

Chapter 15. Model Management and Delivery....272

Experiment Tracking....272

Experimenting in Notebooks....273

Experimenting Overall....274

Tools for Experiment Tracking and Versioning....275

Introduction to MLOps....279

Data Scientists Versus Software Engineers....279

ML Engineers....279

ML in Products and Services....280

MLOps....280

MLOps Methodology....282

MLOps Level 0....282

MLOps Level 1....284

MLOps Level 2....287

Components of an Orchestrated Workflow....290

Three Types of Custom Components....292

Python Function–Based Components....292

Container-Based Components....293

Fully Custom Components....294

TFX Deep Dive....297

TFX SDK....297

Intermediate Representation....298

Runtime....298

Implementing an ML Pipeline Using TFX Components....298

Advanced Features of TFX....300

Managing Model Versions....302

Approaches to Versioning Models....302

Model Lineage....304

Model Registries....304

Continuous Integration and Continuous Deployment....305

Continuous Integration....305

Continuous Delivery....307

Progressive Delivery....307

Blue/Green Deployment....308

Canary Deployment....308

Live Experimentation....309

Conclusion....311

Chapter 16. Model Monitoring and Logging....312

The Importance of Monitoring....313

Observability in Machine Learning....314

What Should You Monitor?....315

Custom Alerting in TFX....316

Logging....317

Distributed Tracing....319

Monitoring for Model Decay....320

Data Drift and Concept Drift....321

Model Decay Detection....322

Supervised Monitoring Techniques....323

Unsupervised Monitoring Techniques....324

Mitigating Model Decay....325

Retraining Your Model....326

When to Retrain....326

Automated Retraining....327

Conclusion....327

Chapter 17. Privacy and Legal Requirements....328

Why Is Data Privacy Important?....329

What Data Needs to Be Kept Private?....329

Harms....330

Only Collect What You Need....330

GenAI Data Scraped from the Web and Other Sources....331

Legal Requirements....331

The GDPR and the CCPA....331

The GDPR’s Right to Be Forgotten....332

Pseudonymization and Anonymization....333

Differential Privacy....334

Local and Global DP....335

Epsilon-Delta DP....335

Applying Differential Privacy to ML....336

TensorFlow Privacy Example....337

Federated Learning....339

Encrypted ML....340

Conclusion....341

Chapter 18. Orchestrating Machine Learning Pipelines....342

An Introduction to Pipeline Orchestration....342

Why Pipeline Orchestration?....342

Directed Acyclic Graphs....343

Pipeline Orchestration with TFX....344

Interactive TFX Pipelines....344

Converting Your Interactive Pipeline for Production....346

Orchestrating TFX Pipelines with Apache Beam....346

Orchestrating TFX Pipelines with Kubeflow Pipelines....348

Introduction to Kubeflow Pipelines....348

Installation and Initial Setup....350

Accessing Kubeflow Pipelines....351

The Workflow from TFX to Kubeflow....352

OpFunc Functions....355

Orchestrating Kubeflow Pipelines....357

Google Cloud Vertex Pipelines....360

Setting Up Google Cloud and Vertex Pipelines....360

Setting Up a Google Cloud Service Account....364

Orchestrating Pipelines with Vertex Pipelines....367

Executing Vertex Pipelines....369

Choosing Your Orchestrator....371

Interactive TFX....371

Apache Beam....371

Kubeflow Pipelines....371

Google Cloud Vertex Pipelines....372

Alternatives to TFX....372

Conclusion....372

Chapter 19. Advanced TFX....374

Advanced Pipeline Practices....374

Configure Your Components....374

Import Artifacts....375

Use Resolver Node....376

Execute a Conditional Pipeline....377

Export TF Lite Models....378

Warm-Starting Model Training....379

Use Exit Handlers....380

Trigger Messages from TFX....381

Custom TFX Components: Architecture and Use Cases....383

Architecture of TFX Components....383

Use Cases of Custom Components....384

Using Function-Based Custom Components....384

Writing a Custom Component from Scratch....385

Defining Component Specifications....387

Defining Component Channels....388

Writing the Custom Executor....388

Writing the Custom Driver....391

Assembling the Custom Component....392

Using Our Basic Custom Component....393

Implementation Review....394

Reusing Existing Components....394

Creating Container-Based Custom Components....397

Which Custom Component Is Right for You?....399

TFX-Addons....400

Conclusion....401

Chapter 20. ML Pipelines for Computer Vision Problems....402

Our Data....403

Our Model....403

Custom Ingestion Component....404

Data Preprocessing....405

Exporting the Model....406

Our Pipeline....407

Data Ingestion....407

Data Preprocessing....408

Model Training....409

Model Evaluation....409

Model Export....411

Putting It All Together....411

Executing on Apache Beam....412

Executing on Vertex Pipelines....413

Model Deployment with TensorFlow Serving....414

Conclusion....416

Chapter 21. ML Pipelines for Natural Language Processing....418

Our Data....419

Our Model....419

Ingestion Component....420

Data Preprocessing....421

Putting the Pipeline Together....424

Executing the Pipeline....424

Model Deployment with Google Cloud Vertex....425

Registering Your ML Model....425

Creating a New Model Endpoint....427

Deploying Your ML Model....427

Requesting Predictions from the Deployed Model....429

Cleaning Up Your Deployed Model....430

Conclusion....431

Chapter 22. Generative AI....432

Generative Models....433

GenAI Model Types....433

Agents and Copilots....434

Pretraining....434

Pretraining Datasets....435

Embeddings....435

Self-Supervised Training with Masks....436

Fine-Tuning....437

Fine-Tuning Versus Transfer Learning....437

Fine-Tuning Datasets....438

Fine-Tuning Considerations for Production....438

Fine-Tuning Versus Model APIs....439

Parameter-Efficient Fine-Tuning....439

LoRA....439

S-LoRA....440

Human Alignment....440

Reinforcement Learning from Human Feedback....440

Reinforcement Learning from AI Feedback....441

Direct Preference Optimization....441

Prompting....442

Chaining....443

Retrieval Augmented Generation....443

ReAct....444

Evaluation....444

Evaluation Techniques....444

Benchmarking Across Models....445

LMOps....445

GenAI Attacks....446

Jailbreaks....446

Prompt Injection....447

Responsible GenAI....447

Design for Responsibility....447

Conduct Adversarial Testing....448

Constitutional AI....448

Conclusion....449

Chapter 23. The Future of Machine Learning Production Systems and Next Steps....450

Let’s Think in Terms of ML Systems, Not ML Models....450

Bringing ML Systems Closer to Domain Experts....451

Privacy Has Never Been More Important....451

Conclusion....451

Index....454

About the Authors....473

Colophon....473

Using machine learning for products, services, and critical business processes is quite different from using ML in an academic or research setting—especially for recent ML graduates and those moving from research to a commercial environment. Whether you currently work to create products and services that use ML, or would like to in the future, this practical book gives you a broad view of the entire field.

Authors Robert Crowe, Hannes Hapke, Emily Caveness, and Di Zhu help you identify topics that you can dive into deeper, along with reference materials and tutorials that teach you the details. You'll learn the state of the art of machine learning engineering, including a wide range of topics such as modeling, deployment, and MLOps. You'll learn both the basics and the advanced aspects you need to understand the production ML lifecycle.

This book provides four in-depth sections that cover all aspects of machine learning engineering:

Data: collecting, labeling, validating, automation, and data preprocessing; data feature engineering and selection; data journey and storage
Modeling: high-performance modeling; model resource management techniques; model analysis and interpretability; neural architecture search
Deployment: model serving patterns and infrastructure for ML models and LLMs; management and delivery; monitoring and logging
Productionalizing: ML pipelines; classifying unstructured texts and images; GenAI model pipelines
