Preface....28
Who this book is for....30
What this book covers....31
To get the most out of this book....35
Get in touch....38
Share your thoughts....39
Giving Computers the Ability to Learn from Data....40
Building intelligent machines to transform data into knowledge....41
The three different types of machine learning....42
Making predictions about the future with supervised learning....43
Classification for predicting class labels....44
Regression for predicting continuous outcomes....46
Solving interactive problems with reinforcement learning....48
Discovering hidden structures with unsupervised learning....50
Finding subgroups with clustering....50
Dimensionality reduction for data compression....51
Introduction to the basic terminology and notations....52
Notation and conventions used in this book....53
Machine learning terminology....55
A roadmap for building machine learning systems....56
Preprocessing – getting data into shape....57
Training and selecting a predictive model....58
Evaluating models and predicting unseen data instances....59
Using Python for machine learning....60
Installing Python and packages from the Python Package Index....61
Using the Anaconda Python distribution and package manager....62
Packages for scientific computing, data science, and machine learning....64
Summary....66
Training Simple Machine Learning Algorithms for Classification....68
Artificial neurons – a brief glimpse into the early history of machine learning....68
The formal definition of an artificial neuron....70
The perceptron learning rule....72
Implementing a perceptron learning algorithm in Python....75
An object-oriented perceptron API....76
Training a perceptron model on the Iris dataset....80
Adaptive linear neurons and the convergence of learning....87
Minimizing loss functions with gradient descent....88
Implementing Adaline in Python....91
Improving gradient descent through feature scaling....96
Large-scale machine learning and stochastic gradient descent....98
Summary....104
A Tour of Machine Learning Classifiers Using Scikit-Learn....106
Choosing a classification algorithm....106
First steps with scikit-learn – training a perceptron....108
Modeling class probabilities via logistic regression....115
Logistic regression and conditional probabilities....116
Learning the model weights via the logistic loss function....121
Converting an Adaline implementation into an algorithm for logistic regression....125
Training a logistic regression model with scikit-learn....129
Tackling overfitting via regularization....133
Maximum margin classification with support vector machines....138
Maximum margin intuition....139
Dealing with a nonlinearly separable case using slack variables....140
Alternative implementations in scikit-learn....142
Solving nonlinear problems using a kernel SVM....143
Kernel methods for linearly inseparable data....143
Using the kernel trick to find separating hyperplanes in a high-dimensional space....145
Decision tree learning....149
Maximizing IG – getting the most bang for your buck....151
Building a decision tree....156
Combining multiple decision trees via random forests....159
K-nearest neighbors – a lazy learning algorithm....164
Summary....170
Building Good Training Datasets – Data Preprocessing....172
Dealing with missing data....172
Identifying missing values in tabular data....173
Eliminating training examples or features with missing values....175
Imputing missing values....177
Understanding the scikit-learn estimator API....179
Handling categorical data....181
Categorical data encoding with pandas....182
Mapping ordinal features....182
Encoding class labels....184
Performing one-hot encoding on nominal features....185
Optional: encoding ordinal features....190
Partitioning a dataset into separate training and test datasets....191
Bringing features onto the same scale....196
Selecting meaningful features....200
L1 and L2 regularization as penalties against model complexity....201
A geometric interpretation of L2 regularization....202
Sparse solutions with L1 regularization....205
Sequential feature selection algorithms....210
Assessing feature importance with random forests....219
Summary....222
Compressing Data via Dimensionality Reduction....224
Unsupervised dimensionality reduction via principal component analysis....225
The main steps in principal component analysis....225
Extracting the principal components step by step....229
Total and explained variance....233
Feature transformation....235
Principal component analysis in scikit-learn....240
Assessing feature contributions....244
Supervised data compression via linear discriminant analysis....247
Principal component analysis versus linear discriminant analysis....247
The inner workings of linear discriminant analysis....249
Computing the scatter matrices....250
Selecting linear discriminants for the new feature subspace....254
Projecting examples onto the new feature space....257
LDA via scikit-learn....258
Nonlinear dimensionality reduction and visualization....260
Why consider nonlinear dimensionality reduction?....261
Visualizing data via t-distributed stochastic neighbor embedding....263
Summary....268
Learning Best Practices for Model Evaluation and Hyperparameter Tuning....270
Streamlining workflows with pipelines....270
Loading the Breast Cancer Wisconsin dataset....271
Combining transformers and estimators in a pipeline....274
Using k-fold cross-validation to assess model performance....277
The holdout method....277
K-fold cross-validation....279
Debugging algorithms with learning and validation curves....286
Diagnosing bias and variance problems with learning curves....286
Addressing over- and underfitting with validation curves....290
Fine-tuning machine learning models via grid search....293
Tuning hyperparameters via grid search....293
Exploring hyperparameter configurations more widely with randomized search....296
More resource-efficient hyperparameter search with successive halving....299
Algorithm selection with nested cross-validation....303
Looking at different performance evaluation metrics....306
Reading a confusion matrix....306
Optimizing the precision and recall of a classification model....309
Plotting a receiver operating characteristic....313
Scoring metrics for multiclass classification....317
Dealing with class imbalance....318
Summary....323
Combining Different Models for Ensemble Learning....325
Learning with ensembles....325
Combining classifiers via majority vote....331
Implementing a simple majority vote classifier....331
Using the majority voting principle to make predictions....338
Evaluating and tuning the ensemble classifier....342
Bagging – building an ensemble of classifiers from bootstrap samples....350
Bagging in a nutshell....351
Applying bagging to classify examples in the Wine dataset....353
Leveraging weak learners via adaptive boosting....358
How adaptive boosting works....359
Applying AdaBoost using scikit-learn....366
Gradient boosting – training an ensemble based on loss gradients....370
Comparing AdaBoost with gradient boosting....371
Outlining the general gradient boosting algorithm....372
Explaining the gradient boosting algorithm for classification....375
Illustrating gradient boosting for classification....378
Using XGBoost....381
Summary....384
Applying Machine Learning to Sentiment Analysis....386
Preparing the IMDb movie review data for text processing....386
Obtaining the movie review dataset....387
Preprocessing the movie dataset into a more convenient format....388
Introducing the bag-of-words model....390
Transforming words into feature vectors....391
Assessing word relevancy via term frequency-inverse document frequency....394
Cleaning text data....397
Processing documents into tokens....400
Training a logistic regression model for document classification....403
Working with bigger data – online algorithms and out-of-core learning....407
Topic modeling with latent Dirichlet allocation....412
Decomposing text documents with LDA....413
LDA with scikit-learn....414
Summary....418
Predicting Continuous Target Variables with Regression Analysis....420
Introducing linear regression....421
Simple linear regression....421
Multiple linear regression....422
Exploring the Ames Housing dataset....424
Loading the Ames Housing dataset into a DataFrame....424
Visualizing the important characteristics of a dataset....428
Looking at relationships using a correlation matrix....430
Implementing an ordinary least squares linear regression model....433
Solving regression for regression parameters with gradient descent....434
Estimating the coefficient of a regression model via scikit-learn....439
Fitting a robust regression model using RANSAC....443
Evaluating the performance of linear regression models....447
Using regularized methods for regression....454
Turning a linear regression model into a curve – polynomial regression....457
Adding polynomial terms using scikit-learn....457
Modeling nonlinear relationships in the Ames Housing dataset....460
Dealing with nonlinear relationships using random forests....463
Decision tree regression....464
Random forest regression....467
Summary....470
Working with Unlabeled Data – Clustering Analysis....472
Grouping objects by similarity using k-means....472
k-means clustering using scikit-learn....473
A smarter way of placing the initial cluster centroids using k-means++....479
Hard versus soft clustering....481
Using the elbow method to find the optimal number of clusters....483
Quantifying the quality of clustering via silhouette plots....485
Organizing clusters as a hierarchical tree....490
Grouping clusters in a bottom-up fashion....491
Performing hierarchical clustering on a distance matrix....493
Attaching dendrograms to a heat map....498
Applying agglomerative clustering via scikit-learn....500
Locating regions of high density via DBSCAN....501
Summary....508
Implementing a Multilayer Artificial Neural Network from Scratch....511
Modeling complex functions with artificial neural networks....511
Single-layer neural network recap....513
Introducing the multilayer neural network architecture....516
Activating a neural network via forward propagation....519
Classifying handwritten digits....522
Obtaining and preparing the MNIST dataset....523
Implementing a multilayer perceptron....527
Coding the neural network training loop....533
Evaluating the neural network performance....539
Training an artificial neural network....544
Computing the loss function....545
Developing your understanding of backpropagation....547
Training neural networks via backpropagation....549
About convergence in neural networks....554
A few last words about the neural network implementation....556
Summary....557
Parallelizing Neural Network Training with PyTorch....559
PyTorch and training performance....560
Performance challenges....560
What is PyTorch?....562
How we will learn PyTorch....564
First steps with PyTorch....565
Installing PyTorch....565
Creating tensors in PyTorch....567
Manipulating the data type and shape of a tensor....568
Applying mathematical operations to tensors....570
Split, stack, and concatenate tensors....572
Building input pipelines in PyTorch....575
Creating a PyTorch DataLoader from existing tensors....576
Combining two tensors into a joint dataset....577
Shuffle, batch, and repeat....579
Creating a dataset from files on your local storage disk....581
Fetching available datasets from the torchvision.datasets library....586
Building an NN model in PyTorch....592
The PyTorch neural network module (torch.nn)....593
Building a linear regression model....594
Model training via the torch.nn and torch.optim modules....598
Building a multilayer perceptron for classifying flowers in the Iris dataset....600
Evaluating the trained model on the test dataset....604
Saving and reloading the trained model....605
Choosing activation functions for multilayer neural networks....606
Logistic function recap....608
Estimating class probabilities in multiclass classification via the softmax function....610
Broadening the output spectrum using a hyperbolic tangent....612
Rectified linear unit activation....615
Summary....617
Going Deeper – The Mechanics of PyTorch....619
The key features of PyTorch....620
PyTorch’s computation graphs....621
Understanding computation graphs....622
Creating a graph in PyTorch....623
PyTorch tensor objects for storing and updating model parameters....624
Computing gradients via automatic differentiation....628
Computing the gradients of the loss with respect to trainable variables....628
Understanding automatic differentiation....631
Adversarial examples....631
Simplifying implementations of common architectures via the torch.nn module....632
Implementing models based on nn.Sequential....632
Choosing a loss function....634
Solving an XOR classification problem....636
Making model building more flexible with nn.Module....642
Writing custom layers in PyTorch....645
Project one – predicting the fuel efficiency of a car....650
Working with feature columns....651
Training a DNN regression model....656
Project two – classifying MNIST handwritten digits....659
Higher-level PyTorch APIs: a short introduction to PyTorch-Lightning....663
Setting up the PyTorch Lightning model....665
Setting up the data loaders for Lightning....668
Training the model using the PyTorch Lightning Trainer class....670
Evaluating the model using TensorBoard....671
Summary....677
Classifying Images with Deep Convolutional Neural Networks....679
The building blocks of CNNs....679
Understanding CNNs and feature hierarchies....680
Performing discrete convolutions....683
Discrete convolutions in one dimension....683
Padding inputs to control the size of the output feature maps....686
Determining the size of the convolution output....688
Performing a discrete convolution in 2D....689
Subsampling layers....693
Putting everything together – implementing a CNN....695
Working with multiple input or color channels....696
Regularizing an NN with L2 regularization and dropout....700
Loss functions for classification....704
Implementing a deep CNN using PyTorch....707
The multilayer CNN architecture....707
Loading and preprocessing the data....708
Implementing a CNN using the torch.nn module....709
Configuring CNN layers in PyTorch....710
Constructing a CNN in PyTorch....711
Smile classification from face images using a CNN....717
Loading the CelebA dataset....717
Image transformation and data augmentation....719
Training a CNN smile classifier....726
Summary....732
Modeling Sequential Data Using Recurrent Neural Networks....734
Introducing sequential data....735
Modeling sequential data – order matters....735
Sequential data versus time series data....736
Representing sequences....737
The different categories of sequence modeling....738
RNNs for modeling sequences....740
Understanding the dataflow in RNNs....740
Computing activations in an RNN....744
Hidden recurrence versus output recurrence....747
The challenges of learning long-range interactions....751
Long short-term memory cells....753
Implementing RNNs for sequence modeling in PyTorch....756
Project one – predicting the sentiment of IMDb movie reviews....757
Preparing the movie review data....757
Embedding layers for sentence encoding....764
Building an RNN model....767
Building an RNN model for the sentiment analysis task....769
Project two – character-level language modeling in PyTorch....775
Preprocessing the dataset....776
Building a character-level RNN model....782
Evaluation phase – generating new text passages....785
Summary....791
Transformers – Improving Natural Language Processing with Attention Mechanisms....793
Adding an attention mechanism to RNNs....794
Attention helps RNNs with accessing information....795
The original attention mechanism for RNNs....796
Processing the inputs using a bidirectional RNN....798
Generating outputs from context vectors....799
Computing the attention weights....800
Introducing the self-attention mechanism....802
Starting with a basic form of self-attention....803
Parameterizing the self-attention mechanism: scaled dot-product attention....809
Attention is all we need: introducing the original transformer architecture....813
Encoding context embeddings via multi-head attention....815
Learning a language model: decoder and masked multi-head attention....823
Implementation details: positional encodings and layer normalization....825
Building large-scale language models by leveraging unlabeled data....828
Pre-training and fine-tuning transformer models....828
Leveraging unlabeled data with GPT....832
Using GPT-2 to generate new text....839
Bidirectional pre-training with BERT....843
The best of both worlds: BART....849
Fine-tuning a BERT model in PyTorch....853
Loading the IMDb movie review dataset....854
Tokenizing the dataset....857
Loading and fine-tuning a pre-trained BERT model....859
Fine-tuning a transformer more conveniently using the Trainer API....864
Summary....869
Generative Adversarial Networks for Synthesizing New Data....872
Introducing generative adversarial networks....872
Starting with autoencoders....874
Generative models for synthesizing new data....877
Generating new samples with GANs....879
Understanding the loss functions of the generator and discriminator networks in a GAN model....881
Implementing a GAN from scratch....884
Training GAN models on Google Colab....884
Implementing the generator and the discriminator networks....888
Defining the training dataset....892
Training the GAN model....895
Improving the quality of synthesized images using a convolutional and Wasserstein GAN....902
Transposed convolution....903
Batch normalization....905
Implementing the generator and discriminator....908
Dissimilarity measures between two distributions....916
Using EM distance in practice for GANs....921
Gradient penalty....922
Implementing WGAN-GP to train the DCGAN model....923
Mode collapse....928
Other GAN applications....930
Summary....931
Graph Neural Networks for Capturing Dependencies in Graph Structured Data....933
Introduction to graph data....934
Undirected graphs....935
Directed graphs....936
Labeled graphs....937
Representing molecules as graphs....938
Understanding graph convolutions....939
The motivation behind using graph convolutions....939
Implementing a basic graph convolution....943
Implementing a GNN in PyTorch from scratch....948
Defining the NodeNetwork model....949
Coding the NodeNetwork’s graph convolution layer....951
Adding a global pooling layer to deal with varying graph sizes....952
Preparing the DataLoader....956
Using the NodeNetwork to make predictions....959
Implementing a GNN using the PyTorch Geometric library....961
Other GNN layers and recent developments....969
Spectral graph convolutions....970
Pooling....973
Normalization....975
Pointers to advanced graph neural network literature....978
Summary....980
Reinforcement Learning for Decision Making in Complex Environments....983
Introduction – learning from experience....984
Understanding reinforcement learning....984
Defining the agent-environment interface of a reinforcement learning system....987
The theoretical foundations of RL....989
Markov decision processes....989
The mathematical formulation of Markov decision processes....991
Visualization of a Markov process....993
Episodic versus continuing tasks....994
RL terminology: return, policy, and value function....995
The return....995
Policy....998
Value function....998
Dynamic programming using the Bellman equation....1001
Reinforcement learning algorithms....1002
Dynamic programming....1003
Policy evaluation – predicting the value function with dynamic programming....1004
Improving the policy using the estimated value function....1005
Policy iteration....1006
Value iteration....1007
Reinforcement learning with Monte Carlo....1008
State-value function estimation using MC....1009
Action-value function estimation using MC....1009
Finding an optimal policy using MC control....1010
Policy improvement – computing the greedy policy from the action-value function....1010
Temporal difference learning....1011
TD prediction....1011
On-policy TD control (SARSA)....1013
Off-policy TD control (Q-learning)....1014
Implementing our first RL algorithm....1015
Introducing the OpenAI Gym toolkit....1015
Working with the existing environments in OpenAI Gym....1016
A grid world example....1018
Implementing the grid world environment in OpenAI Gym....1019
Solving the grid world problem with Q-learning....1027
A glance at deep Q-learning....1031
Training a DQN model according to the Q-learning algorithm....1033
Replay memory....1033
Determining the target values for computing the loss....1035
Implementing a deep Q-learning algorithm....1036
Chapter and book summary....1041
Other Books You May Enjoy....1047
Index....1052
Machine Learning with PyTorch and Scikit-Learn is a comprehensive guide to machine learning and deep learning with PyTorch. It acts as both a step-by-step tutorial and a reference you'll keep coming back to as you build your machine learning systems.
Packed with clear explanations, visualizations, and examples, the book covers all the essential machine learning techniques in depth. While some books teach you only to follow instructions, this book teaches the principles that allow you to build models and applications for yourself.
PyTorch brings a Pythonic approach to machine learning, making models easier to understand and simpler to code. This book explains the essential parts of PyTorch and shows how to create models using popular libraries, such as PyTorch Lightning and PyTorch Geometric.
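To give a sense of what "Pythonic" means here, the following is a minimal sketch of defining and running a model in PyTorch: models are ordinary Python classes that subclass `torch.nn.Module`. The `TinyClassifier` name and layer sizes are illustrative choices, not an example taken from the book.

```python
import torch
import torch.nn as nn

# A PyTorch model is a plain Python class; layers are composed
# like regular objects, and forward() defines the computation.
class TinyClassifier(nn.Module):
    def __init__(self, n_features=4, n_classes=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 16),  # input layer -> 16 hidden units
            nn.ReLU(),                  # nonlinearity
            nn.Linear(16, n_classes),   # hidden -> class logits
        )

    def forward(self, x):
        return self.net(x)

model = TinyClassifier()
logits = model(torch.randn(2, 4))  # a batch of 2 examples, 4 features each
print(tuple(logits.shape))         # one logit vector per example
```

Because the model is just Python, the usual debugging and introspection tools (print statements, debuggers, type checks) work directly on it.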
You will also learn about generative adversarial networks (GANs) for generating new data and training intelligent agents with reinforcement learning. Finally, this new edition is expanded to cover the latest trends in deep learning, including graph neural networks and large-scale transformers used for natural language processing (NLP).
This PyTorch book is your companion to machine learning with Python, whether you're a Python developer new to machine learning or you want to deepen your knowledge of the latest developments.
If you have a good grasp of Python basics and want to start learning about machine learning and deep learning, then this is the book for you. It is an essential resource written for developers and data scientists who want to create practical machine learning and deep learning applications using scikit-learn and PyTorch.
Before you get started with this book, you'll also need a good understanding of calculus and linear algebra.