Machine Learning System Design: With end-to-end examples


Authors: Babushkin Valerii, Kravchenko Arseny

Publication date: 2025

Publisher: Manning Publications Co.

Number of pages: 375

File size: 4.9 MB

File type: PDF

Added by: codelibs


Machine Learning System Design

brief contents

contents

preface

acknowledgments

about this book

Who should read this book?

How this book is organized: A roadmap

liveBook discussion forum

about the authors

about the cover illustration

Part 1 Preparations

1 Essentials of machine learning system design

1.1 ML system design: What are you?

1.1.1 Why ML system design is so important

1.1.2 Roots of ML system design

1.2 How this book is structured

1.3 When principles of ML system design can be helpful

Summary

2 Is there a problem?

2.1 Problem space vs. solution space

2.2 Finding the problem

2.2.1 How we can approximate a solution through an ML system

2.3 Risks, limitations, and possible consequences

2.4 Costs of a mistake

Summary

3 Preliminary research

3.1 What problems can inspire you?

3.2 Build or buy: Open source-based or proprietary tech

3.2.1 Build or buy

3.2.2 Open source-based or proprietary tech

3.3 Problem decompositioning

3.4 Choosing the right degree of innovation

3.4.1 What solutions can be useful?

3.4.2 Working on the solution space: Practical example

Summary

4 Design document

4.1 Common myths surrounding the design document

4.1.1 Myth #1. Design documents work only for big companies but not startups

4.1.2 Myth #2. Design documents are efficient only for complex projects

4.1.3 Myth #3. Every design document should be based on a template

4.1.4 Myth #4. Every design document should lead to a deployed system

4.2 Goals and antigoals

4.3 Design document structure

4.4 Reviewing a design document

4.4.1 Design document review example

4.5 A design doc is a living thing

Summary

Part 2 Early stage

5 Loss functions and metrics

5.1 Losses

5.1.1 Loss tricks for deep learning models

5.2 Metrics

5.2.1 Consistency metrics

5.2.2 Offline and online metrics, proxy metrics, and hierarchy of metrics

5.3 Design document: Adding losses and metrics

5.3.1 Metrics and loss functions for Supermegaretail

5.3.2 Metrics and loss functions for PhotoStock Inc.

5.3.3 Wrap up

Summary

6 Gathering datasets

6.1 Data sources

6.2 Cooking the dataset

6.2.1 ETL

6.2.2 Filtering

6.2.3 Feature engineering

6.2.4 Labeling

6.3 Data and metadata

6.4 How much data is enough?

6.5 Chicken-or-egg problem

6.6 Properties of a healthy data pipeline

6.7 Design document: Dataset

6.7.1 Dataset for Supermegaretail

6.7.2 Dataset for PhotoStock Inc.

Summary

7 Validation schemas

7.1 Reliable evaluation

7.2 Standard schemas

7.2.1 Holdout sets

7.2.2 Cross-validation

7.2.3 The choice of K

7.2.4 Time-series validation

7.3 Nontrivial schemas

7.3.1 Nested validation

7.3.2 Adversarial validation

7.3.3 Quantifying dataset leakage exploitation

7.4 Split updating procedure

7.5 Design document: Choosing validation schemas

7.5.1 Validation schemas for Supermegaretail

7.5.2 Validation schemas for PhotoStock Inc.

Summary

8 Baseline solution

8.1 Baseline: What are you?

8.2 Constant baselines

8.2.1 Why do we need constant baselines?

8.3 Model baselines and feature baselines

8.4 Variety of deep learning baselines

8.5 Baseline comparison

8.6 Design document: Baselines

8.6.1 Baselines for Supermegaretail

8.6.2 Baselines for PhotoStock Inc.

Summary

Part 3 Intermediate steps

9 Error analysis

9.1 Learning curve analysis

9.1.1 Overfitting and underfitting

9.1.2 Loss curve

9.1.3 Interpreting loss curves

9.1.4 Model-wise learning curve

9.1.5 Sample-wise learning curve

9.1.6 Double descent

9.2 Residual analysis

9.2.1 Goals of residual analysis

9.2.2 Model assumptions

9.2.3 Residual distribution

9.2.4 Fairness of residuals

9.2.5 Underprediction and overprediction

9.2.6 Elasticity curves

9.3 Finding commonalities in residuals

9.3.1 Worst/best-case analysis

9.3.2 Adversarial validation

9.3.3 Variety of group analysis

9.3.4 Corner-case analysis

9.4 Design document: Error analysis

9.4.1 Error analysis for Supermegaretail

9.4.2 Error analysis for PhotoStock Inc.

Summary

10 Training pipelines

10.1 Training pipeline: What are you?

10.1.1 Training pipeline vs. inference pipeline

10.2 Tools and platforms

10.3 Scalability

10.4 Configurability

10.5 Testing

10.5.1 Property-based testing

10.6 Design document: Training pipelines

10.6.1 Training pipeline for Supermegaretail

10.6.2 Training pipeline for PhotoStock Inc.

Summary

11 Features and feature engineering

11.1 Feature engineering: What are you?

11.1.1 Criteria of good and bad features

11.1.2 Feature generation 101

11.1.3 Model predictions as a feature

11.2 Feature importance analysis

11.2.1 Classification of methods

11.2.2 Accuracy–interpretability tradeoff

11.2.3 Feature importance in deep learning

11.3 Feature selection

11.3.1 Feature generation vs. feature selection

11.3.2 Goals and possible drawbacks

11.3.3 Feature selection method overview

11.4 Feature store

11.4.1 Feature store: Pros and cons

11.4.2 Desired properties of a feature store

11.4.3 Feature catalog

11.5 Design document: Feature engineering

11.5.1 Features for Supermegaretail

11.5.2 Features for PhotoStock Inc.

Summary

12 Measuring and reporting results

12.1 Measuring results

12.1.1 Model performance

12.1.2 Transition to business metrics

12.1.3 Simulated environment

12.1.4 Human evaluation

12.2 A/B testing

12.2.1 Experiment design

12.2.2 Splitting strategy

12.2.3 Selecting metrics

12.2.4 Statistical criteria

12.2.5 Simulated experiments

12.2.6 When A/B testing is not possible

12.3 Reporting results

12.3.1 Control and auxiliary metrics

12.3.2 Uplift monitoring

12.3.3 When to finish the experiment

12.3.4 What to report

12.3.5 Debrief document

12.4 Design document: Measuring and reporting

12.4.1 Measuring and reporting for Supermegaretail

12.4.2 Measuring and reporting for PhotoStock Inc.

Summary

Part 4 Integration and growth

13 Integration

13.1 API design

13.1.1 API practices

13.2 Release cycle

13.3 Operating the system

13.3.1 Tech-related connections

13.3.2 Non-tech-related connections

13.4 Overrides and fallbacks

13.5 Design document: Integration

13.5.1 Integration for Supermegaretail

13.5.2 Integration for PhotoStock Inc.

Summary

14 Monitoring and reliability

14.1 Why monitoring is important

14.1.1 Incoming data

14.1.2 Model

14.1.3 Model output

14.1.4 Postprocessing/decision-making

14.2 Software system health

14.3 Data quality and integrity

14.3.1 Processing problems

14.3.2 Data source corruption

14.3.3 Cascade/upstream models

14.3.4 Schema change

14.3.5 Training-serving skew

14.3.6 How to monitor and react

14.4 Model quality and relevance

14.4.1 Data drift

14.4.2 Concept drift

14.4.3 How to monitor

14.4.4 How to react

14.5 Design document: Monitoring

14.5.1 Monitoring for Supermegaretail

14.5.2 Monitoring for PhotoStock Inc.

Summary

15 Serving and inference optimization

15.1 Serving and inference: Challenges

15.2 Tradeoffs and patterns

15.2.1 Tradeoffs

15.2.2 Patterns

15.3 Tools and frameworks

15.3.1 Choosing a framework

15.3.2 Serverless inference

15.4 Optimizing inference pipelines

15.4.1 Starting with profiling

15.4.2 The best optimizing is minimum optimizing

15.5 Design document: Serving and inference

15.5.1 Serving and inference for Supermegaretail

15.5.2 Serving and inference for PhotoStock Inc.

Summary

16 Ownership and maintenance

16.1 Accountability

16.2 Bus factor

16.2.1 Why is being too efficient not beneficial?

16.2.2 Why is being too redundant not beneficial?

16.2.3 When and how to use the bus factor

16.3 Documentation

16.4 Complexity

16.5 Maintenance and ownership: Supermegaretail and PhotoStock Inc.

Summary

index

Machine Learning System Design - back

From information gathering to release and maintenance, Machine Learning System Design guides you step-by-step through every stage of the machine learning process. Inside, you’ll find a reliable framework for building, maintaining, and improving machine learning systems at any scale or complexity.

In Machine Learning System Design: With end-to-end examples you will learn:

The big picture of machine learning system design
Analyzing a problem space to identify the optimal ML solution
Acing ML system design interviews
Selecting appropriate metrics and evaluation criteria
Prioritizing tasks at different stages of ML system design
Solving dataset-related problems with data gathering, error analysis, and feature engineering
Recognizing common pitfalls in ML system development
Designing ML systems to be lean, maintainable, and extensible over time

Authors Valerii Babushkin and Arseny Kravchenko have filled this unique handbook with campfire stories and personal tips from their own extensive careers. You’ll learn directly from their experience as you consider every facet of a machine learning system, from requirements gathering and data sourcing to deployment and management of the finished system.

About the technology

Designing and delivering a machine learning system is an intricate multistep process that requires many skills and roles. Whether you’re an engineer adding machine learning to an existing application or designing an ML system from the ground up, you need to navigate massive datasets and streams, lock down testing and deployment requirements, and master the unique complexities of putting ML models into production. That’s where this book comes in.

About the book

Machine Learning System Design shows you how to design and deploy a machine learning project from start to finish. You’ll follow a step-by-step framework for designing, implementing, releasing, and maintaining ML systems. As you go, requirement checklists and real-world examples help you prepare to deliver and optimize your own ML systems. You’ll especially love the campfire stories, personal tips, and ML system design interview tips.

What's inside

Metrics and evaluation criteria
Solving common dataset problems
Common pitfalls in ML system development
ML system design interview tips

About the reader

For readers who know the basics of software engineering and machine learning. Examples in Python.
