Deep Learning with Python, Third Edition
Praise for the Second Edition
brief contents
contents
preface
acknowledgments
about this book
Who should read this book
How this book is organized: A road map
About the code
liveBook discussion forum
about the authors
about the cover illustration
1 What is deep learning?
1.1 Artificial intelligence, machine learning, and deep learning
1.2 Artificial intelligence
1.3 Machine learning
1.4 Learning rules and representations from data
1.5 The “deep” in “deep learning”
1.6 Understanding how deep learning works, in three figures
1.7 What makes deep learning different
1.8 The age of generative AI
1.9 What deep learning has achieved so far
1.10 Beware of the short-term hype
1.11 Summer can turn to winter
1.12 The promise of AI
2 The mathematical building blocks of neural networks
2.1 A first look at a neural network
2.2 Data representations for neural networks
2.2.1 Scalars (rank-0 tensors)
2.2.2 Vectors (rank-1 tensors)
2.2.3 Matrices (rank-2 tensors)
2.2.4 Rank-3 tensors and higher-rank tensors
2.2.5 Key attributes
2.2.6 Manipulating tensors in NumPy
2.2.7 The notion of data batches
2.2.8 Real-world examples of data tensors
2.3 The gears of neural networks: Tensor operations
2.3.1 Element-wise operations
2.3.2 Broadcasting
2.3.3 Tensor product
2.3.4 Tensor reshaping
2.3.5 Geometric interpretation of tensor operations
2.3.6 A geometric interpretation of deep learning
2.4 The engine of neural networks: Gradient-based optimization
2.4.1 What’s a derivative?
2.4.2 Derivative of a tensor operation: The gradient
2.4.3 Stochastic gradient descent
2.4.4 Chaining derivatives: The Backpropagation algorithm
2.5 Looking back at our first example
2.5.1 Reimplementing our first example from scratch
2.5.2 Running one training step
2.5.3 The full training loop
2.5.4 Evaluating the model
3 Introduction to TensorFlow, PyTorch, JAX, and Keras
3.1 A brief history of deep learning frameworks
3.2 How these frameworks relate to each other
3.3 Introduction to TensorFlow
3.3.1 First steps with TensorFlow
3.3.2 An end-to-end example: A linear classifier in pure TensorFlow
3.3.3 What makes the TensorFlow approach unique
3.4 Introduction to PyTorch
3.4.1 First steps with PyTorch
3.4.2 An end-to-end example: A linear classifier in pure PyTorch
3.4.3 What makes the PyTorch approach unique
3.5 Introduction to JAX
3.5.1 First steps with JAX
3.5.2 Tensors in JAX
3.5.3 Random number generation in JAX
3.5.4 An end-to-end example: A linear classifier in pure JAX
3.5.5 What makes the JAX approach unique
3.6 Introduction to Keras
3.6.1 First steps with Keras
3.6.2 Layers: The building blocks of deep learning
3.6.3 From layers to models
3.6.4 The “compile” step: Configuring the learning process
3.6.5 Picking a loss function
3.6.6 Understanding the fit method
3.6.7 Monitoring loss and metrics on validation data
3.6.8 Inference: Using a model after training
4 Classification and regression
4.1 Classifying movie reviews: A binary classification example
4.1.1 The IMDb dataset
4.1.2 Preparing the data
4.1.3 Building your model
4.1.4 Validating your approach
4.1.5 Using a trained model to generate predictions on new data
4.1.6 Further experiments
4.1.7 Wrapping up
4.2 Classifying newswires: A multiclass classification example
4.2.1 The Reuters dataset
4.2.2 Preparing the data
4.2.3 Building your model
4.2.4 Validating your approach
4.2.5 Generating predictions on new data
4.2.6 A different way to handle the labels and the loss
4.2.7 The importance of having sufficiently large intermediate layers
4.2.8 Further experiments
4.2.9 Wrapping up
4.3 Predicting house prices: A regression example
4.3.1 The California Housing Price dataset
4.3.2 Preparing the data
4.3.3 Building your model
4.3.4 Validating your approach using K-fold validation
4.3.5 Generating predictions on new data
4.3.6 Wrapping up
5 Fundamentals of machine learning
5.1 Generalization: The goal of machine learning
5.1.1 Underfitting and overfitting
5.1.2 The nature of generalization in deep learning
5.2 Evaluating machine-learning models
5.2.1 Training, validation, and test sets
5.2.2 Beating a common-sense baseline
5.2.3 Things to keep in mind about model evaluation
5.3 Improving model fit
5.3.1 Tuning key gradient descent parameters
5.3.2 Using better architecture priors
5.3.3 Increasing model capacity
5.4 Improving generalization
5.4.1 Dataset curation
5.4.2 Feature engineering
5.4.3 Using early stopping
5.4.4 Regularizing your model
6 The universal workflow of machine learning
6.1 Defining the task
6.1.1 Framing the problem
6.1.2 Collecting a dataset
6.1.3 Understanding your data
6.1.4 Choosing a measure of success
6.2 Developing a model
6.2.1 Preparing the data
6.2.2 Choosing an evaluation protocol
6.2.3 Beating a baseline
6.2.4 Scaling up: Developing a model that overfits
6.2.5 Regularizing and tuning your model
6.3 Deploying your model
6.3.1 Explaining your work to stakeholders and setting expectations
6.3.2 Shipping an inference model
6.3.3 Monitoring your model in the wild
6.3.4 Maintaining your model
7 A deep dive on Keras
7.1 A spectrum of workflows
7.2 Different ways to build Keras models
7.2.1 The Sequential model
7.2.2 The Functional API
7.2.3 Subclassing the Model class
7.2.4 Mixing and matching different components
7.2.5 Remember: Use the right tool for the job
7.3 Using built-in training and evaluation loops
7.3.1 Writing your own metrics
7.3.2 Using callbacks
7.3.3 Writing your own callbacks
7.3.4 Monitoring and visualization with TensorBoard
7.4 Writing your own training and evaluation loops
7.4.1 Training vs. inference
7.4.2 Writing custom training step functions
7.4.3 Low-level usage of metrics
7.4.4 Using fit() with a custom training loop
7.4.5 Handling metrics in a custom train_step()
8 Image classification
8.1 Introduction to convnets
8.1.1 The convolution operation
8.1.2 The max-pooling operation
8.2 Training a convnet from scratch on a small dataset
8.2.1 The relevance of deep learning for small-data problems
8.2.2 Downloading the data
8.2.3 Building your model
8.2.4 Data preprocessing
8.2.5 Using data augmentation
8.3 Using a pretrained model
8.3.1 Feature extraction with a pretrained model
8.3.2 Fine-tuning a pretrained model
9 ConvNet architecture patterns
9.1 Modularity, hierarchy, and reuse
9.2 Residual connections
9.3 Batch normalization
9.4 Depthwise separable convolutions
9.5 Putting it together: A mini Xception-like model
9.6 Beyond convolution: Vision Transformers
10 Interpreting what ConvNets learn
10.1 Visualizing intermediate activations
10.2 Visualizing ConvNet filters
10.2.1 Gradient ascent in TensorFlow
10.2.2 Gradient ascent in PyTorch
10.2.3 Gradient ascent in JAX
10.2.4 The filter visualization loop
10.3 Visualizing heatmaps of class activation
10.3.1 Getting the gradient of the top class: TensorFlow version
10.3.2 Getting the gradient of the top class: PyTorch version
10.3.3 Getting the gradient of the top class: JAX version
10.3.4 Displaying the class activation heatmap
10.4 Visualizing the latent space of a ConvNet
11 Image segmentation
11.1 Computer vision tasks
11.1.1 Types of image segmentation
11.2 Training a segmentation model from scratch
11.2.1 Downloading a segmentation dataset
11.2.2 Building and training the segmentation model
11.3 Using a pretrained segmentation model
11.3.1 Downloading the Segment Anything Model
11.3.2 How Segment Anything works
11.3.3 Preparing a test image
11.3.4 Prompting the model with a target point
11.3.5 Prompting the model with a target box
12 Object detection
12.1 Single-stage vs. two-stage object detectors
12.1.1 Two-stage R-CNN detectors
12.1.2 Single-stage detectors
12.2 Training a YOLO model from scratch
12.2.1 Downloading the COCO dataset
12.2.2 Creating a YOLO model
12.2.3 Readying the COCO data for the YOLO model
12.2.4 Training the YOLO model
12.3 Using a pretrained RetinaNet detector
13 Timeseries forecasting
13.1 Different kinds of timeseries tasks
13.2 A temperature forecasting example
13.2.1 Preparing the data
13.2.2 A commonsense, non-machine-learning baseline
13.2.3 Let’s try a basic machine learning model
13.2.4 Let’s try a 1D convolutional model
13.3 Recurrent neural networks
13.3.1 Understanding recurrent neural networks
13.3.2 A recurrent layer in Keras
13.3.3 Getting the most out of recurrent neural networks
13.3.4 Using recurrent dropout to fight overfitting
13.3.5 Stacking recurrent layers
13.3.6 Using bidirectional RNNs
13.4 Going even further
14 Text classification
14.1 A brief history of natural language processing
14.2 Preparing text data
14.2.1 Character and word tokenization
14.2.2 Subword tokenization
14.3 Sets vs. sequences
14.3.1 Loading the IMDb classification dataset
14.4 Set models
14.4.1 Training a bag-of-words model
14.4.2 Training a bigram model
14.5 Sequence models
14.5.1 Training a recurrent model
14.5.2 Understanding word embeddings
14.5.3 Using a word embedding
14.5.4 Pretraining a word embedding
14.5.5 Using the pretrained embedding for classification
15 Language models and the Transformer
15.1 The language model
15.1.1 Training a Shakespeare language model
15.1.2 Generating Shakespeare
15.2 Sequence-to-sequence learning
15.2.1 English-to-Spanish translation
15.2.2 Sequence-to-sequence learning with RNNs
15.3 The Transformer architecture
15.3.1 Dot-product attention
15.3.2 Transformer encoder block
15.3.3 Transformer decoder block
15.3.4 Sequence-to-sequence learning with a Transformer
15.3.5 Embedding positional information
15.4 Classification with a pretrained Transformer
15.4.1 Pretraining a Transformer encoder
15.4.2 Loading a pretrained Transformer
15.4.3 Preprocessing IMDb movie reviews
15.4.4 Fine-tuning a pretrained Transformer
15.5 What makes the Transformer effective?
16 Text generation
16.1 A brief history of sequence generation
16.2 Training a mini-GPT
16.2.1 Building the model
16.2.2 Pretraining the model
16.2.3 Generative decoding
16.2.4 Sampling strategies
16.3 Using a pretrained LLM
16.3.1 Text generation with the Gemma model
16.3.2 Instruction fine-tuning
16.3.3 Low-Rank Adaptation (LoRA)
16.4 Going further with LLMs
16.4.1 Reinforcement Learning with Human Feedback (RLHF)
16.4.2 Multimodal LLMs
16.4.3 Retrieval Augmented Generation (RAG)
16.4.4 “Reasoning” models
16.5 Where are LLMs heading next?
17 Image generation
17.1 Deep learning for image generation
17.1.1 Sampling from latent spaces of images
17.1.2 Variational autoencoders
17.1.3 Implementing a VAE with Keras
17.2 Diffusion models
17.2.1 The Oxford Flowers dataset
17.2.2 A U-Net denoising autoencoder
17.2.3 The concepts of diffusion time and diffusion schedule
17.2.4 The training process
17.2.5 The generation process
17.2.6 Visualizing results with a custom callback
17.2.7 It’s go time!
17.3 Text-to-image models
17.3.1 Exploring the latent space of a text-to-image model
18 Best practices for the real world
18.1 Getting the most out of your models
18.1.1 Hyperparameter optimization
18.1.2 Model ensembling
18.2 Scaling up model training with multiple devices
18.2.1 Multi-GPU training
18.2.2 Distributed training in practice
18.2.3 TPU training
18.3 Speeding up training and inference with lower-precision computation
18.3.1 Understanding floating-point precision
18.3.2 Float16 inference
18.3.3 Mixed-precision training
18.3.4 Using loss scaling with mixed precision
18.3.5 Beyond mixed precision: float8 training
18.3.6 Faster inference with quantization
19 The future of AI
19.1 The limitations of deep learning
19.1.1 Deep learning models struggle to adapt to novelty
19.1.2 Deep learning models are highly sensitive to phrasing and other distractors
19.1.3 Deep learning models struggle to learn generalizable programs
19.1.4 The risk of anthropomorphizing machine-learning models
19.2 Scale isn’t all you need
19.2.1 Automatons vs. intelligent agents
19.2.2 Local generalization vs. extreme generalization
19.2.3 The purpose of intelligence
19.2.4 Climbing the spectrum of generalization
19.3 How to build intelligence
19.3.1 The kaleidoscope hypothesis
19.3.2 The essence of intelligence: Abstraction acquisition and recombination
19.3.3 The importance of setting the right target
19.3.4 A new target: On-the-fly adaptation
19.3.5 ARC Prize
19.3.6 The test-time adaptation era
19.3.7 ARC-AGI 2
19.4 The missing ingredients: Search and symbols
19.4.1 The two poles of abstraction
19.4.2 Cognition as a combination of both kinds of abstraction
19.4.3 Why deep learning isn’t a complete answer to abstraction generation
19.4.4 An alternative approach to AI: Program synthesis
19.4.5 Blending deep learning and program synthesis
19.4.6 Modular component recombination and lifelong learning
19.4.7 The long-term vision
20 Conclusions
20.1 Key concepts in review
20.1.1 Various approaches to artificial intelligence
20.1.2 What makes deep learning special within the field of machine learning
20.1.3 How to think about deep learning
20.1.4 Key enabling technologies
20.1.5 The universal machine learning workflow
20.1.6 Key network architectures
20.2 Limitations of deep learning
20.3 What might lie ahead
20.4 Staying up to date in a fast-moving field
20.4.1 Practice on real-world problems using Kaggle
20.4.2 Read about the latest developments on arXiv
20.4.3 Explore the Keras ecosystem
20.5 Final words
index
Deep Learning with Python, Third Edition puts the power of deep learning in your hands. This new edition includes the latest Keras and TensorFlow features, generative AI models, and added coverage of PyTorch and JAX. Learn directly from the creator of Keras and step confidently into the world of deep learning with Python.
With over 100,000 copies sold, Deep Learning with Python makes it possible for developers, data scientists, and machine learning enthusiasts to put deep learning into action. In this expanded and updated third edition, Keras creator François Chollet offers insights for both novice and experienced machine learning practitioners. You'll master state-of-the-art deep learning tools and techniques, from the latest features of Keras 3 to building AI models that can generate text and images.
In less than a decade, deep learning has changed the world—twice. First, Python-based libraries like Keras, TensorFlow, and PyTorch elevated neural networks from lab experiments to high-performance production systems deployed at scale. And now, through Large Language Models and other generative AI tools, deep learning is again transforming business and society. In this new edition, Keras creator François Chollet invites you into this amazing subject in the fluid, mentoring style of a true insider.
Deep Learning with Python, Third Edition makes the concepts behind deep learning and generative AI understandable and approachable. This complete rewrite of the bestselling original includes fresh chapters on transformers, building your own GPT-like LLM, and generating images with diffusion models. Each chapter introduces practical projects and code examples that build your understanding of deep learning, layer by layer.
For readers with intermediate Python skills. No previous experience with machine learning or linear algebra required.