Copyright
Table of Contents
Foreword
Preface
Who Should Read This Book
Why We Wrote This Book
Navigating This Book
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
Robert
Hannes
Emily
Di
Chapter 1. Introduction to Machine Learning Production Systems
What Is Production Machine Learning?
Benefits of Machine Learning Pipelines
Focus on Developing New Models, Not on Maintaining Existing Models
Prevention of Bugs
Creation of Records for Debugging and Reproducing Results
Standardization
The Business Case for ML Pipelines
When to Use Machine Learning Pipelines
Steps in a Machine Learning Pipeline
Data Ingestion and Data Versioning
Data Validation
Feature Engineering
Model Training and Model Tuning
Model Analysis
Model Deployment
Looking Ahead
Chapter 2. Collecting, Labeling, and Validating Data
Important Considerations in Data Collection
Responsible Data Collection
Labeling Data: Data Changes and Drift in Production ML
Labeling Data: Direct Labeling and Human Labeling
Validating Data: Detecting Data Issues
Validating Data: TensorFlow Data Validation
Skew Detection with TFDV
Types of Skew
Example: Spotting Imbalanced Datasets with TensorFlow Data Validation
Conclusion
Chapter 3. Feature Engineering and Feature Selection
Introduction to Feature Engineering
Preprocessing Operations
Feature Engineering Techniques
Normalizing and Standardizing
Bucketizing
Feature Crosses
Dimensionality and Embeddings
Visualization
Feature Transformation at Scale
Choose a Framework That Scales Well
Avoid Training–Serving Skew
Consider Instance-Level Versus Full-Pass Transformations
Using TensorFlow Transform
Analyzers
Code Example
Feature Selection
Feature Spaces
Feature Selection Overview
Filter Methods
Wrapper Methods
Embedded Methods
Feature and Example Selection for LLMs and GenAI
Example: Using TF Transform to Tokenize Text
Benefits of Using TF Transform
Alternatives to TF Transform
Conclusion
Chapter 4. Data Journey and Data Storage
Data Journey
ML Metadata
Using a Schema
Schema Development
Schema Environments
Changes Across Datasets
Enterprise Data Storage
Feature Stores
Data Warehouses
Data Lakes
Conclusion
Chapter 5. Advanced Labeling, Augmentation, and Data Preprocessing
Advanced Labeling
Semi-Supervised Labeling
Active Learning
Weak Supervision
Advanced Labeling Review
Data Augmentation
Example: CIFAR-10
Other Augmentation Techniques
Data Augmentation Review
Preprocessing Time Series Data: An Example
Windowing
Sampling
Conclusion
Chapter 6. Model Resource Management Techniques
Dimensionality Reduction: Dimensionality Effect on Performance
Example: Word Embedding Using Keras
Curse of Dimensionality
Adding Dimensions Increases Feature Space Volume
Dimensionality Reduction
Quantization and Pruning
Mobile, IoT, Edge, and Similar Use Cases
Quantization
Optimizing Your TensorFlow Model with TF Lite
Optimization Options
Pruning
Knowledge Distillation
Teacher and Student Networks
Knowledge Distillation Techniques
TMKD: Distilling Knowledge for a Q&A Task
Increasing Robustness by Distilling EfficientNets
Conclusion
Chapter 7. High-Performance Modeling
Distributed Training
Data Parallelism
Efficient Input Pipelines
Input Pipeline Basics
Input Pipeline Patterns: Improving Efficiency
Optimizing Your Input Pipeline with TensorFlow Data
Training Large Models: The Rise of Giant Neural Nets and Parallelism
Potential Solutions and Their Shortcomings
Pipeline Parallelism to the Rescue?
Conclusion
Chapter 8. Model Analysis
Analyzing Model Performance
Black-Box Evaluation
Performance Metrics and Optimization Objectives
Advanced Model Analysis
TensorFlow Model Analysis
The Learning Interpretability Tool
Advanced Model Debugging
Benchmark Models
Sensitivity Analysis
Residual Analysis
Model Remediation
Discrimination Remediation
Fairness
Fairness Evaluation
Fairness Considerations
Continuous Evaluation and Monitoring
Conclusion
Chapter 9. Interpretability
Explainable AI
Model Interpretation Methods
Method Categories
Intrinsically Interpretable Models
Model-Agnostic Methods
Local Interpretable Model-Agnostic Explanations
Shapley Values
The SHAP Library
Testing Concept Activation Vectors
AI Explanations
Example: Exploring Model Sensitivity with SHAP
Regression Models
Natural Language Processing Models
Conclusion
Chapter 10. Neural Architecture Search
Hyperparameter Tuning
Introduction to AutoML
Key Components of NAS
Search Spaces
Search Strategies
Performance Estimation Strategies
AutoML in the Cloud
Amazon SageMaker Autopilot
Microsoft Azure Automated Machine Learning
Google Cloud AutoML
Using AutoML
Generative AI and AutoML
Conclusion
Chapter 11. Introduction to Model Serving
Model Training
Model Prediction
Latency
Throughput
Cost
Resources and Requirements for Serving Models
Cost and Complexity
Accelerators
Feeding the Beast
Model Deployments
Data Center Deployments
Mobile and Distributed Deployments
Model Servers
Managed Services
Conclusion
Chapter 12. Model Serving Patterns
Batch Inference
Batch Throughput
Batch Inference Use Cases
ETL for Distributed Batch and Stream Processing Systems
Introduction to Real-Time Inference
Synchronous Delivery of Real-Time Predictions
Asynchronous Delivery of Real-Time Predictions
Optimizing Real-Time Inference
Real-Time Inference Use Cases
Serving Model Ensembles
Ensemble Topologies
Example Ensemble
Ensemble Serving Considerations
Model Routers: Ensembles in GenAI
Data Preprocessing and Postprocessing in Real Time
Training Transformations Versus Serving Transformations
Windowing
Options for Preprocessing
Enter TensorFlow Transform
Postprocessing
Inference at the Edge and at the Browser
Challenges
Model Deployments via Containers
Training on the Device
Federated Learning
Runtime Interoperability
Inference in Web Browsers
Conclusion
Chapter 13. Model Serving Infrastructure
Model Servers
TensorFlow Serving
NVIDIA Triton Inference Server
TorchServe
Building Scalable Infrastructure
Containerization
Traditional Deployment Era
Virtualized Deployment Era
Container Deployment Era
The Docker Containerization Framework
Container Orchestration
Reliability and Availability Through Redundancy
Observability
High Availability
Automated Deployments
Hardware Accelerators
GPUs
TPUs
Conclusion
Chapter 14. Model Serving Examples
Example: Deploying TensorFlow Models with TensorFlow Serving
Exporting Keras Models for TF Serving
Setting Up TF Serving with Docker
Basic Configuration of TF Serving
Making Model Prediction Requests with REST
Making Model Prediction Requests with gRPC
Getting Predictions from Classification and Regression Models
Using Payloads
Getting Model Metadata from TF Serving
Making Batch Inference Requests
Example: Profiling TF Serving Inferences with TF Profiler
Prerequisites
TensorBoard Setup
Model Profile
Example: Basic TorchServe Setup
Installing the TorchServe Dependencies
Exporting Your Model for TorchServe
Setting Up TorchServe
Making Model Prediction Requests
Making Batch Inference Requests
Conclusion
Chapter 15. Model Management and Delivery
Experiment Tracking
Experimenting in Notebooks
Experimenting Overall
Tools for Experiment Tracking and Versioning
Introduction to MLOps
Data Scientists Versus Software Engineers
ML Engineers
ML in Products and Services
MLOps
MLOps Methodology
MLOps Level 0
MLOps Level 1
MLOps Level 2
Components of an Orchestrated Workflow
Three Types of Custom Components
Python Function–Based Components
Container-Based Components
Fully Custom Components
TFX Deep Dive
TFX SDK
Intermediate Representation
Runtime
Implementing an ML Pipeline Using TFX Components
Advanced Features of TFX
Managing Model Versions
Approaches to Versioning Models
Model Lineage
Model Registries
Continuous Integration and Continuous Deployment
Continuous Integration
Continuous Delivery
Progressive Delivery
Blue/Green Deployment
Canary Deployment
Live Experimentation
Conclusion
Chapter 16. Model Monitoring and Logging
The Importance of Monitoring
Observability in Machine Learning
What Should You Monitor?
Custom Alerting in TFX
Logging
Distributed Tracing
Monitoring for Model Decay
Data Drift and Concept Drift
Model Decay Detection
Supervised Monitoring Techniques
Unsupervised Monitoring Techniques
Mitigating Model Decay
Retraining Your Model
When to Retrain
Automated Retraining
Conclusion
Chapter 17. Privacy and Legal Requirements
Why Is Data Privacy Important?
What Data Needs to Be Kept Private?
Harms
Only Collect What You Need
GenAI Data Scraped from the Web and Other Sources
Legal Requirements
The GDPR and the CCPA
The GDPR’s Right to Be Forgotten
Pseudonymization and Anonymization
Differential Privacy
Local and Global DP
Epsilon-Delta DP
Applying Differential Privacy to ML
TensorFlow Privacy Example
Federated Learning
Encrypted ML
Conclusion
Chapter 18. Orchestrating Machine Learning Pipelines
An Introduction to Pipeline Orchestration
Why Pipeline Orchestration?
Directed Acyclic Graphs
Pipeline Orchestration with TFX
Interactive TFX Pipelines
Converting Your Interactive Pipeline for Production
Orchestrating TFX Pipelines with Apache Beam
Orchestrating TFX Pipelines with Kubeflow Pipelines
Introduction to Kubeflow Pipelines
Installation and Initial Setup
Accessing Kubeflow Pipelines
The Workflow from TFX to Kubeflow
OpFunc Functions
Orchestrating Kubeflow Pipelines
Google Cloud Vertex Pipelines
Setting Up Google Cloud and Vertex Pipelines
Setting Up a Google Cloud Service Account
Orchestrating Pipelines with Vertex Pipelines
Executing Vertex Pipelines
Choosing Your Orchestrator
Interactive TFX
Apache Beam
Kubeflow Pipelines
Google Cloud Vertex Pipelines
Alternatives to TFX
Conclusion
Chapter 19. Advanced TFX
Advanced Pipeline Practices
Configure Your Components
Import Artifacts
Use Resolver Node
Execute a Conditional Pipeline
Export TF Lite Models
Warm-Starting Model Training
Use Exit Handlers
Trigger Messages from TFX
Custom TFX Components: Architecture and Use Cases
Architecture of TFX Components
Use Cases of Custom Components
Using Function-Based Custom Components
Writing a Custom Component from Scratch
Defining Component Specifications
Defining Component Channels
Writing the Custom Executor
Writing the Custom Driver
Assembling the Custom Component
Using Our Basic Custom Component
Implementation Review
Reusing Existing Components
Creating Container-Based Custom Components
Which Custom Component Is Right for You?
TFX-Addons
Conclusion
Chapter 20. ML Pipelines for Computer Vision Problems
Our Data
Our Model
Custom Ingestion Component
Data Preprocessing
Exporting the Model
Our Pipeline
Data Ingestion
Data Preprocessing
Model Training
Model Evaluation
Model Export
Putting It All Together
Executing on Apache Beam
Executing on Vertex Pipelines
Model Deployment with TensorFlow Serving
Conclusion
Chapter 21. ML Pipelines for Natural Language Processing
Our Data
Our Model
Ingestion Component
Data Preprocessing
Putting the Pipeline Together
Executing the Pipeline
Model Deployment with Google Cloud Vertex
Registering Your ML Model
Creating a New Model Endpoint
Deploying Your ML Model
Requesting Predictions from the Deployed Model
Cleaning Up Your Deployed Model
Conclusion
Chapter 22. Generative AI
Generative Models
GenAI Model Types
Agents and Copilots
Pretraining
Pretraining Datasets
Embeddings
Self-Supervised Training with Masks
Fine-Tuning
Fine-Tuning Versus Transfer Learning
Fine-Tuning Datasets
Fine-Tuning Considerations for Production
Fine-Tuning Versus Model APIs
Parameter-Efficient Fine-Tuning
LoRA
S-LoRA
Human Alignment
Reinforcement Learning from Human Feedback
Reinforcement Learning from AI Feedback
Direct Preference Optimization
Prompting
Chaining
Retrieval Augmented Generation
ReAct
Evaluation
Evaluation Techniques
Benchmarking Across Models
LMOps
GenAI Attacks
Jailbreaks
Prompt Injection
Responsible GenAI
Design for Responsibility
Conduct Adversarial Testing
Constitutional AI
Conclusion
Chapter 23. The Future of Machine Learning Production Systems and Next Steps
Let’s Think in Terms of ML Systems, Not ML Models
Bringing ML Systems Closer to Domain Experts
Privacy Has Never Been More Important
Conclusion
Index
About the Authors
Colophon
Using machine learning for products, services, and critical business processes is quite different from using ML in an academic or research setting, and the difference is felt most by recent ML graduates and those moving from research to a commercial environment. Whether you currently work to create products and services that use ML or would like to in the future, this practical book gives you a broad view of the entire field.
Authors Robert Crowe, Hannes Hapke, Emily Caveness, and Di Zhu help you identify topics that you can dive into deeper, along with reference materials and tutorials that teach you the details. You'll learn the state of the art of machine learning engineering across a wide range of topics, including modeling, deployment, and MLOps, and you'll cover both the basic and advanced aspects of the production ML lifecycle.