Cover
Title Page
Copyright Page
Dedication Page
Contents
Foreword
Acknowledgments
General Introduction
Chapter 1 Concepts, Libraries, and Essential Tools in Machine Learning and Deep Learning
1.1 Learning Styles for Machine Learning
1.1.1 Supervised Learning
1.1.1.1 Overfitting and Underfitting
1.1.1.2 K-Fold Cross-Validation
1.1.1.3 Train/Test Split
1.1.1.4 Confusion Matrix
1.1.1.5 Loss Functions
1.1.2 Unsupervised Learning
1.1.3 Semi-Supervised Learning
1.1.4 Reinforcement Learning
1.2 Essential Python Tools for Machine Learning
1.2.1 Data Manipulation with Python
1.2.2 Python Machine Learning Libraries
1.2.2.1 Scikit-learn
1.2.2.2 TensorFlow
1.2.2.3 Keras
1.2.2.4 PyTorch
1.2.3 Jupyter Notebook and JupyterLab
1.3 HephAIstos for Running Machine Learning on CPUs, GPUs, and QPUs
1.3.1 Installation
1.3.2 HephAIstos Function
1.4 Where to Find the Datasets and Code Examples
Further Reading
Chapter 2 Feature Engineering Techniques in Machine Learning
2.1 Feature Rescaling: Structured Continuous Numeric Data
2.1.1 Data Transformation
2.1.1.1 StandardScaler
2.1.1.2 MinMaxScaler
2.1.1.3 MaxAbsScaler
2.1.1.4 RobustScaler
2.1.1.5 Normalizer: Unit Vector Normalization
2.1.1.6 Other Options
2.1.1.7 Transformation to Improve Normal Distribution
2.1.1.8 Quantile Transformation
2.1.2 Example: Rescaling Applied to an SVM Model
2.2 Strategies to Work with Categorical (Discrete) Data
2.2.1 Ordinal Encoding
2.2.2 One-Hot Encoding
2.2.3 Label Encoding
2.2.4 Helmert Encoding
2.2.5 Binary Encoding
2.2.6 Frequency Encoding
2.2.7 Mean Encoding
2.2.8 Sum Encoding
2.2.9 Weight of Evidence Encoding
2.2.10 Probability Ratio Encoding
2.2.11 Hashing Encoding
2.2.12 Backward Difference Encoding
2.2.13 Leave-One-Out Encoding
2.2.14 James-Stein Encoding
2.2.15 M-Estimator Encoding
2.2.16 Using HephAIstos to Encode Categorical Data
2.3 Time-Related Feature Engineering
2.3.1 Date-Related Features
2.3.2 Lag Variables
2.3.3 Rolling Window Feature
2.3.4 Expanding Window Feature
2.3.5 Understanding Time Series Data in Context
2.4 Handling Missing Values in Machine Learning
2.4.1 Row or Column Removal
2.4.2 Statistical Imputation: Mean, Median, and Mode
2.4.3 Linear Interpolation
2.4.4 Multivariate Imputation by Chained Equations
2.4.5 KNN Imputation
2.5 Feature Extraction and Selection
2.5.1 Feature Extraction
2.5.1.1 Principal Component Analysis
2.5.1.2 Independent Component Analysis
2.5.1.3 Linear Discriminant Analysis
2.5.1.4 Locally Linear Embedding
2.5.1.5 The t-Distributed Stochastic Neighbor Embedding Technique
2.5.1.6 More Manifold Learning Techniques
2.5.1.7 Feature Extraction with HephAIstos
2.5.2 Feature Selection
2.5.2.1 Filter Methods
2.5.2.2 Wrapper Methods
2.5.2.3 Embedded Methods
2.5.2.4 Feature Importance Using Graphics Processing Units (GPUs)
2.5.2.5 Feature Selection Using HephAIstos
Further Reading
Chapter 3 Machine Learning Algorithms
3.1 Linear Regression
3.1.1 The Math
3.1.2 Gradient Descent to Optimize the Cost Function
3.1.3 Implementation of Linear Regression
3.1.3.1 Univariate Linear Regression
3.1.3.2 Multiple Linear Regression: Predicting Water Temperature
3.2 Logistic Regression
3.2.1 Binary Logistic Regression
3.2.1.1 Cost Function
3.2.1.2 Gradient Descent
3.2.2 Multinomial Logistic Regression
3.2.3 Multinomial Logistic Regression Applied to Fashion MNIST
3.2.3.1 Logistic Regression with scikit-learn
3.2.3.2 Logistic Regression with Keras on TensorFlow
3.2.4 Binary Logistic Regression with Keras on TensorFlow
3.3 Support Vector Machine
3.3.1 Linearly Separable Data
3.3.2 Not Fully Linearly Separable Data
3.3.3 Nonlinear SVMs
3.3.4 SVMs for Regression
3.3.5 Application of SVMs
3.3.5.1 SVM Using scikit-learn for Classification
3.3.5.2 SVM Using scikit-learn for Regression
3.4 Artificial Neural Networks
3.4.1 Multilayer Perceptron
3.4.2 Estimation of the Parameters
3.4.2.1 Loss Functions
3.4.2.2 Backpropagation: Binary Classification
3.4.2.3 Backpropagation: Multi-class Classification
3.4.3 Convolutional Neural Networks
3.4.4 Recurrent Neural Network
3.4.5 Application of MLP Neural Networks
3.4.6 Application of RNNs: Long Short-Term Memory
3.4.7 Building a CNN
3.5 Many More Algorithms to Explore
3.6 Unsupervised Machine Learning Algorithms
3.6.1 Clustering
3.6.1.1 K-means
3.6.1.2 Mini-batch K-means
3.6.1.3 Mean Shift
3.6.1.4 Affinity Propagation
3.6.1.5 Density-Based Spatial Clustering of Applications with Noise
3.7 Machine Learning Algorithms with HephAIstos
References
Further Reading
Chapter 4 Natural Language Processing
4.1 Classifying Messages as Spam or Ham
4.2 Sentiment Analysis
4.3 Bidirectional Encoder Representations from Transformers
4.4 BERT’s Functionality
4.5 Installing and Training BERT for Binary Text Classification Using TensorFlow
4.6 Utilizing BERT for Text Summarization
4.7 Utilizing BERT for Question Answering
Further Reading
Chapter 5 Machine Learning Algorithms in Quantum Computing
5.1 Quantum Machine Learning
5.2 Quantum Kernel Machine Learning
5.3 Quantum Kernel Training
5.4 Pegasos QSVC: Binary Classification
5.5 Quantum Neural Networks
5.5.1 Binary Classification with EstimatorQNN
5.5.2 Classification with a SamplerQNN
5.5.3 Classification with Variational Quantum Classifier
5.5.4 Regression
5.6 Quantum Generative Adversarial Network
5.7 Quantum Algorithms with HephAIstos
References
Further Reading
Chapter 6 Machine Learning in Production
6.1 Why Use Docker Containers for Machine Learning?
6.1.1 First Things First: The Microservices
6.1.2 Containerization
6.1.3 Docker and Machine Learning: Resolving the “It Works on My Machine” Problem
6.1.4 Quick Install and First Use of Docker
6.1.4.1 Install Docker
6.1.4.2 Using Docker from the Command Line
6.1.5 Dockerfile
6.1.6 Build and Run a Docker Container for Your Machine Learning Model
6.2 Machine Learning Prediction in Real Time Using Docker and Python REST APIs with Flask
6.2.1 Flask-RESTful APIs
6.2.2 Machine Learning Models
6.2.3 Docker Image for the Online Inference
6.2.4 Running Docker Online Inference
6.3 From DevOps to MLOps: Integrate Machine Learning Models Using Jenkins and Docker
6.3.1 Jenkins Installation
6.3.2 Scenario Implementation
6.4 Machine Learning with Docker and Kubernetes: Install a Cluster from Scratch
6.4.1 Kubernetes Vocabulary
6.4.2 Kubernetes Quick Install
6.4.3 Install a Kubernetes Cluster
6.4.4 Kubernetes: Initialization and Internal Network
6.5 Machine Learning with Docker and Kubernetes: Training Models
6.5.1 Kubernetes Jobs: Model Training and Batch Inference
6.5.2 Create and Prepare the Virtual Machines
6.5.3 Kubeadm Installation
6.5.4 Create a Kubernetes Cluster
6.5.5 Containerize our Python Application that Trains Models
6.5.6 Create Configuration Files for Kubernetes
6.5.7 Commands to Delete the Cluster
6.6 Machine Learning with Docker and Kubernetes: Batch Inference
6.6.1 Create Configuration Files for Kubernetes
6.7 Machine Learning Prediction in Real Time Using Docker, Python REST APIs with Flask, and Kubernetes: Online Inference
6.7.1 Flask-RESTful APIs
6.7.2 Machine Learning Models
6.7.3 Docker Image for Online Inference
6.7.4 Running Docker Online Inference
6.7.5 Create and Prepare the Virtual Machines
6.7.6 Kubeadm Installation
6.7.7 Create a Kubernetes Cluster
6.7.8 Deploying the Containerized Machine Learning Model to Kubernetes
6.8 A Machine Learning Application that Deploys to the IBM Cloud Kubernetes Service: Python, Docker, Kubernetes
6.8.1 Create Kubernetes Service on IBM Cloud
6.8.2 Containerization of a Machine Learning Application
6.8.3 Push the Image to the IBM Cloud Registry
6.8.4 Deploy the Application to Kubernetes
6.9 Red Hat OpenShift to Develop and Deploy Enterprise ML/DL Applications
6.9.1 What Is OpenShift?
6.9.2 What Is the Difference Between OpenShift and Kubernetes?
6.9.3 Why Red Hat OpenShift for ML/DL? To Build a Production-Ready ML/DL Environment
6.10 Deploying a Machine Learning Model as an API on the Red Hat OpenShift Container Platform: From Source Code in a GitHub Repository with Flask, scikit-learn, and Docker
6.10.1 Create an OpenShift Cluster Instance
6.10.1.1 Deploying an Application from Source Code in a GitHub Repository
Further Reading
Conclusion: The Future of Computing for Data Science?
Index
EULA
Machine Learning Theory and Applications explores machine learning and deep learning, pairing the underlying mathematical concepts with practical implementations in Python and widely used open-source libraries. The book covers data preparation, feature engineering techniques, commonly used machine learning algorithms such as support vector machines and neural networks, and generative AI and foundation models. To simplify the creation of machine learning pipelines, a dedicated open-source framework named hephAIstos was developed specifically for this book. The text also covers quantum machine learning and shows how to run machine learning applications across hardware technologies such as CPUs, GPUs, and QPUs. Finally, the book explains how to deploy trained models as containerized applications using Kubernetes and OpenShift, and how to integrate them through machine learning operations (MLOps).
Machine Learning Theory and Applications is an essential resource for data scientists, engineers, IT specialists, and architects, as well as students in computer science, mathematics, and bioinformatics. The reader is expected to be familiar with basic Python programming, libraries such as NumPy or Pandas, and basic mathematical concepts, especially linear algebra.