About the Author....6
Contents....8
1 Introduction....24
1.1 Naturally Learned Ability for Problem Solving....24
1.2 Physics-Law-based Models....24
1.3 Data-based Machine Learning Models....26
1.4 General Steps for Training Machine Learning Models....27
1.5 Some Mathematical Concepts, Variables, and Spaces....28
1.5.1 Toy examples....28
1.5.2 Feature space....29
1.5.3 Affine space....30
1.5.4 Label space....31
1.5.5 Hypothesis space....32
1.5.6 Definition of a typical machine learning model, a mathematical view....33
1.6 Requirements for Creating Machine Learning Models....34
1.7 Types of Data....34
1.8 Relation Between Physics-Law-based and Data-based Models....35
1.9 This Book....35
1.10 Who May Read This Book....37
1.11 Codes Used in This Book....37
References....39
2 Basics of Python....42
2.1 An Exercise....44
2.2 Briefing on Python....46
2.3 Variable Types....48
2.3.1 Numbers....48
2.3.2 Underscore placeholder....51
2.3.3 Strings....51
2.3.4 Conversion between types of variables....59
2.3.5 Variable formatting....61
2.4 Arithmetic Operators....62
2.4.1 Addition, subtraction, multiplication, division, and power....62
2.4.2 Built-in functions....63
2.5 Boolean Values and Operators....64
2.6 Lists: A diversified variable type container....65
2.6.1 List creation, appending, concatenation, and updating....65
2.6.2 Element-wise addition of lists....67
2.6.3 Slicing strings and lists....69
2.6.4 Underscore placeholders for lists....72
2.6.5 Nested list (lists in lists in lists)....72
2.7 Tuples: Value preserved....73
2.8 Dictionaries: Indexable via keys....74
2.8.1 Assigning data to a dictionary....74
2.8.2 Iterating over a dictionary....75
2.8.3 Removing a value....76
2.8.4 Merging two dictionaries....77
2.9 NumPy Arrays: Handy for scientific computation....78
2.9.1 Lists vs. NumPy arrays....78
2.9.2 Structure of a NumPy array....78
2.9.3 Axis of a NumPy array....83
2.9.4 Element-wise computations....84
2.9.5 Handy ways to generate multi-dimensional arrays....85
2.9.6 Use of external package: MXNet....86
2.9.7 In-place operations....89
2.9.8 Slicing from a multi-dimensional array....90
2.9.9 Broadcasting....90
2.9.10 Converting between MXNet NDArray and NumPy....93
2.9.11 Subsetting in NumPy....94
2.9.12 NumPy and universal functions (ufunc)....94
2.9.13 NumPy array and vector/matrix....95
2.10 Sets: No Duplication....98
2.10.1 Intersection of two sets....98
2.10.2 Difference of two sets....98
2.11 List Comprehensions....99
2.12 Conditions, “if” Statements, “for” and “while” Loops....100
2.12.1 Comparison operators....100
2.12.2 The “in” operator....101
2.12.3 The “is” operator....101
2.12.4 The “not” operator....103
2.12.5 The “if” statements....103
2.12.6 The “for” loops....104
2.12.7 The “while” loops....105
2.12.8 Ternary conditionals....107
2.13 Functions (Methods)....107
2.13.1 Block structure for function definition....107
2.13.2 Function with arguments....107
2.13.3 Lambda functions (Anonymous functions)....109
2.14 Classes and Objects....109
2.14.1 The simplest class....109
2.14.2 A class for scientific computation....112
2.14.3 Subclass (class inheritance)....113
2.15 Modules....114
2.16 Generation of Plots....115
2.17 Code Performance Assessment....116
2.18 Summary....117
Reference....117
3 Basic Mathematical Computations....118
3.1 Linear Algebra....118
3.1.1 Scalar numbers....119
3.1.2 Vectors....119
3.1.3 Matrices....121
3.1.4 Tensors....123
3.1.5 Sum and mean of a tensor....124
3.1.6 Dot-product of two vectors....125
3.1.7 Outer product of two vectors....128
3.1.8 Matrix-vector product....129
3.1.9 Matrix-matrix multiplication....129
3.1.10 Norms....131
3.1.11 Solving systems of algebraic equations....132
3.1.12 Matrix inversion....134
3.1.13 Eigenvalue decomposition of a matrix....136
3.1.14 Condition number of a matrix....139
3.1.15 Rank of a matrix....141
3.2 Rotation Matrix....142
3.3 Interpolation....143
3.3.1 1-D piecewise linear interpolation using numpy.interp....144
3.3.2 1-D least-square solution approximation....145
3.3.3 1-D interpolation using interp1d....147
3.3.4 2-D spline representation using bisplrep....147
3.3.5 Radial basis functions for smoothing and interpolation....149
3.4 Singular Value Decomposition....152
3.4.1 SVD formulation....152
3.4.2 Algorithms for SVD....153
3.4.3 Numerical examples....154
3.4.4 SVD for data compression....156
3.5 Principal Component Analysis....158
3.5.1 PCA formulation....158
3.5.2 Numerical examples....160
3.5.2.1 Example 1: PCA using a three-line code....160
3.5.2.2 Example 2: Truncated PCA....162
3.6 Numerical Root Finding....166
3.7 Numerical Integration....168
3.7.1 Trapezoid rule....168
3.7.2 Gauss integration....170
3.8 Initial Data Treatment....171
3.8.1 Min-max scaling....172
3.8.2 “One-hot” encoding....175
3.8.3 Standard scaling....176
References....178
4 Statistics and Probability-based Learning Model....180
4.1 Analysis of Probability of an Event....181
4.1.1 Random sampling, controlled random sampling....181
4.1.2 Probability....183
4.2 Random Distributions....187
4.2.1 Uniform distribution....188
4.2.2 Normal distribution (Gaussian distribution)....188
4.3 Entropy of Probability....190
4.3.1 Example 1: Probability and its entropy....192
4.3.2 Example 2: Variation of entropy....193
4.3.3 Example 3: Entropy for events with a variable that takes different numbers of values of uniform distribution....195
4.4 Cross-Entropy: Predicted and True Probability....196
4.4.1 Example 1: Cross-entropy of a quality prediction....197
4.4.2 Example 2: Cross-entropy of a poor prediction....198
4.5 KL-Divergence....198
4.5.1 Example 1: KL-divergence of a distribution of quality prediction....199
4.5.2 Example 2: KL-divergence of a poorly predicted distribution....199
4.6 Binary Cross-Entropy....200
4.6.1 Example 1: Binary cross-entropy for a distribution of quality prediction....201
4.6.2 Example 2: Binary cross-entropy for a poorly predicted distribution....201
4.6.3 Example 3: Binary cross-entropy for a more uniform true distribution: A quality prediction....202
4.6.4 Example 4: Binary cross-entropy for a more uniform true distribution: A poor prediction....203
4.7 Bayesian Statistics....203
4.8 Naive Bayes Classification: Statistics-based Learning....204
4.8.1 Formulation....204
4.8.2 Case study: Handwritten digits recognition....204
4.8.3 Algorithm for the Naive Bayes classification....205
4.8.4 Testing the Naive Bayes model....208
4.8.5 Discussion....210
5 Prediction Function and Universal Prediction Theory....212
5.1 Linear Prediction Function and Affine Transformation....213
5.1.1 Linear prediction function: A basic hypothesis....214
5.1.2 Predictability for constants: The role of the bias....215
5.1.3 Predictability for linear functions: The role of the weights....215
5.1.4 Prediction of linear functions: A machine learning procedure....216
5.1.5 Affine transformation....217
5.2 Affine Transformation Unit (ATU): The Simplest Network....220
5.3 Typical Data Structures....221
5.4 Demonstration Examples of Affine Transformation....222
5.4.1 An edge, a rectangle under affine transformation....225
5.4.2 A circle under affine transformation....227
5.4.3 A spiral under affine transformation....228
5.4.4 Fern leaf under affine transformation....228
5.4.5 On linear prediction function with affine transformation....229
5.4.6 Affine transformation wrapped with activation function....229
5.5 Parameter Encoding and the Essential Mechanism of Learning....233
5.5.1 The x to ŵ encoding, a data-parameter converter unit....233
5.5.2 Uniqueness of the encoding....234
5.5.3 Uniqueness of the encoding: Not affected by activation function....235
5.6 The Gradient of the Prediction Function....236
5.7 Affine Transformation Array (ATA)....236
5.8 Predictability of High-Order Functions of a Deepnet....237
5.8.1 A role of activation functions....237
5.8.2 Formation of a deepnet by chaining ATA....238
5.8.3 Example: A 1 → 1 → 1 network....240
5.9 Universal Prediction Theory....241
5.10 Nonlinear Affine Transformations....242
5.11 Feature Functions in Physics-Law-based Models....243
References....244
6 The Perceptron and SVM....246
6.1 Linearly Separable Classification Problems....247
6.2 A Python Code for the Perceptron....249
6.3 The Perceptron Convergence Theorem....256
6.4 Support Vector Machine....260
6.4.1 Problem statement....260
6.4.2 Formulation of objective function and constraints....261
6.4.3 Modified objective function with constraints: Multipliers method....265
6.4.4 Converting to a standard quadratic programming problem....268
6.4.5 Prediction in SVM....272
6.4.6 Example: A Python code for SVM....273
6.4.7 Confusion matrix....277
6.4.8 Example: A Scikit-learn class for SVM....277
6.4.9 SVM for datasets not separable with hyperplanes....279
6.4.10 Kernel trick....280
6.4.11 Example: SVM classification with curves....281
6.4.12 Multiclass classification via SVM....283
6.4.13 Example: Use of SVM classifiers for iris dataset....283
References....286
7 Activation Functions and Universal Approximation Theory....288
7.1 Sigmoid Function (σ(z))....289
7.2 Sigmoid Function of an Affine Transformation Function....291
7.3 Neural-Pulse-Unit (NPU)....292
7.4 Universal Approximation Theorem....297
7.4.1 Function approximation using NPUs....297
7.4.2 Function approximations using neuron basis functions....298
7.4.3 Remarks....304
7.5 Hyperbolic Tangent Function (tanh)....305
7.6 ReLU Functions....306
7.7 Softplus Function....309
7.8 Conditions for Activation Functions....311
7.9 Novel Activation Functions....311
7.9.1 Rational activation function....311
7.9.2 Power function....315
7.9.3 Power-linear function....317
7.9.4 Power-quadratic function....320
References....324
8 Automatic Differentiation and Autograd....326
8.1 General Issues on Optimization and Minimization....326
8.2 Analytic Differentiation....327
8.3 Numerical Differentiation....328
8.4 Automatic Differentiation....328
8.4.1 The concept of automatic or algorithmic differentiation....328
8.4.2 Differentiation of a function with respect to a vector and matrix....329
8.5 Autograd Implemented in NumPy....331
8.6 Autograd Implemented in the MXNet....333
8.6.1 Gradients of scalar functions with simple variable....334
8.6.2 Gradients of scalar functions in high dimensions....336
8.6.3 Gradients of scalar functions with quadratic variables in high dimensions....341
8.6.4 Gradient of scalar function with a matrix of variables in high dimensions....342
8.7 Gradients for Functions with Conditions....345
8.8 Example: Gradients of an L2 Loss Function for a Single Neuron....346
8.9 Examples: Differences Between Analytical, Autograd, and Numerical Differentiation....350
8.10 Discussion....352
References....352
9 Solution Existence Theory and Optimization Techniques....354
9.1 Introduction....354
9.2 Analytic Optimization Methods: Ideal Cases....355
9.2.1 Least square formulation....355
9.2.2 L2 loss function....356
9.2.3 Normal equation....357
9.2.4 Solution existence analysis....357
9.2.5 Solution existence theory....359
9.2.6 Effects of parallel data-points....360
9.2.7 Predictability of the solution against the label....360
9.3 Considerations in Optimization for Complex Problems....361
9.3.1 Local minima....362
9.3.2 Saddle points....363
9.3.3 Convex functions....366
9.4 Gradient Descent (GD) Method for Optimization....367
9.4.1 Gradient descent in one dimension....368
9.4.2 Remarks....369
9.4.3 Gradient descent in hyper-dimensions....370
9.4.4 Property of a convex function....371
9.4.5 The convergence theorem for the Gradient Descent algorithm....372
9.4.6 Setting of the learning rates....374
9.5 Stochastic Gradient Descent....376
9.5.1 Numerical experiment....377
9.6 Gradient Descent with Momentum....386
9.6.1 The most critical problem with GD methods....386
9.6.2 Formulation....388
9.6.3 Numerical experiment....391
9.7 Nesterov Accelerated Gradient....393
9.7.1 Formulation....393
9.8 AdaGrad Gradient Algorithm....394
9.8.1 Formulation....394
9.8.2 Numerical experiment....395
9.9 RMSProp Gradient Algorithm....397
9.9.1 Formulation....398
9.9.2 Numerical experiment....398
9.10 AdaDelta Gradient Algorithm....401
9.10.1 The idea....401
9.10.2 Numerical experiment....401
9.11 Adam Gradient Algorithm....404
9.11.1 Formulation....404
9.11.2 Numerical experiment....405
9.12 A Case Study: Comparing Minimization Techniques Used in MLPClassifier....408
9.13 Other Algorithms....409
References....410
10 Loss Functions for Regression....412
10.1 Formulations for Linear Regression....413
10.1.1 Mathematical model....413
10.1.2 Neural network configuration....413
10.1.3 The xw formulation....414
10.2 Loss Functions for Linear Regression....414
10.2.1 Mean squared error loss or L2 loss function....415
10.2.2 Absolute error loss or L1 loss function....416
10.2.3 Huber loss function....417
10.2.4 Log-cosh loss function....417
10.2.5 Comparison between these loss functions....418
10.2.6 Python codes for these loss functions....419
10.3 Python Codes for Regression....421
10.3.1 Linear regression using high-order polynomial and other feature functions....424
10.3.2 Linear regression using Gaussian basis functions....427
10.4 Neural Network Model for Linear Regressions with Big Datasets....429
10.4.1 Setting up neural network models....429
10.4.2 Create data iterators....432
10.4.3 Training parameters....434
10.4.4 Define the neural network....435
10.4.5 Define the loss function....435
10.4.6 Use of optimizer....435
10.4.7 Execute the training....435
10.4.8 Examining training progress....436
10.5 Neural Network Model for Nonlinear Regression....438
10.5.1 Train models on the Boston housing price dataset....439
10.5.2 Plotting partial dependence for two features....439
10.5.3 Plot curves on top of each other....441
10.6 On Nonlinear Regressions....441
10.7 Conclusion....442
References....442
11 Loss Functions and Models for Classification....444
11.1 Prediction Functions....444
11.1.1 Linear function....445
11.1.2 Logistic prediction function....445
11.1.3 The tanh prediction function....446
11.2 Loss Functions for Classification Problems....446
11.2.1 The margin concept....446
11.2.2 0–1 loss....447
11.2.3 Hinge loss....448
11.2.4 Logistic loss....449
11.2.5 Exponential loss....450
11.2.6 Square loss....450
11.2.7 Binary cross-entropy loss....452
11.2.8 Remarks....455
11.3 A Simple Neural Network for Classification....455
11.4 Example of Binary Classification Using Neural Network with MXNet....456
11.4.1 Dataset for binary classification....456
11.4.2 Define loss functions....458
11.4.3 Plot the convergence curve of the loss function....460
11.4.4 Computing the accuracy of the trained model....460
11.5 Example of Binary Classification Using Sklearn....461
11.6 Regression with Decision Tree, AdaBoost, and Gradient Boosting....466
References....466
12 Multiclass Classification....468
12.1 Softmax Activation Neural Networks for k-Classifications....468
12.2 Cross-Entropy Loss Function for k-Classifications....470
12.3 Case Study 1: Handwritten Digit Classification with 1-Layer NN....471
12.3.1 Set contexts according to computer hardware....471
12.3.2 Loading the MNIST dataset....471
12.3.3 Set model parameters....474
12.3.4 Multiclass logistic regression....474
12.3.5 Defining a neural network model....475
12.3.6 Defining the cross-entropy loss function....475
12.3.7 Optimization method....476
12.3.8 Accuracy evaluation....476
12.3.9 Initialization of the model and training execution....476
12.3.10 Prediction with the trained model....478
12.4 Case Study 2: Handwritten Digit Classification with Sklearn Random Forest Multi-Classifier....479
12.5 Case Study 3: Comparison of Random Forest, Extra-Forest, and Gradient Boosting for Multi-Classifier....483
12.6 Multi-Classification via TensorFlow....487
12.7 Remarks....488
Reference....488
13 Multilayer Perceptron (MLP) for Regression and Classification....490
13.1 The General Architecture and Formulations of MLP....490
13.1.1 The general architecture....490
13.1.2 The xw+b formulation....492
13.1.3 The xw formulation, use of affine transformation weight matrix....494
13.1.4 MLP configuration with affine transformation weight matrix....496
13.1.5 Space evolution process in MLP....497
13.2 Neurons-Samples Theory....497
13.2.1 Affine spaces and the training parameters used in an MLP....498
13.2.2 Neurons-Samples Theory for MLPs....499
13.3 Nonlinear Activation Functions for the Hidden Layers....501
13.4 General Rule for Estimating Learning Parameters in an MLP....501
13.5 Key Techniques for MLP and Its Capability....502
13.6 A Case Study on Handwritten Digits Using MXNet....504
13.6.1 Import necessary libraries and load data....504
13.6.2 Set neural network model parameters....505
13.6.3 Softmax cross entropy loss function....505
13.6.4 Define a neural network model....506
13.6.5 Optimization method....507
13.6.6 Model accuracy evaluation....507
13.6.7 Training the neural network and timing the training....507
13.6.8 Prediction with the trained model....509
13.7 Visualization of MLP Weights Using Sklearn....511
13.7.1 Import necessary Sklearn module....511
13.7.2 Load MNIST dataset....511
13.7.3 Set an MLP model....512
13.7.4 Training the MLP model and timing the training....512
13.7.5 Performance analysis....512
13.7.6 Viewing the weight matrix as images....513
13.8 MLP for Nonlinear Regression....513
13.8.1 California housing data and preprocessing....515
13.8.2 Configure, train, and test the MLP....516
13.8.3 Compute and plot the partial dependence....517
13.8.4 Comparison studies on different regressors....518
13.8.5 Gradient boosting regressor....518
13.8.6 Decision tree regressor....521
References....522
14 Overfitting and Regularization....524
14.1 Why Regularization....524
14.2 Tikhonov Regularization....527
14.2.1 Demonstration examples: One data-point....531
14.2.2 Demonstration examples: Two data-points....540
14.2.3 Demonstration examples: Three data-points....544
14.2.4 Summary of the case studies....548
14.3 A Case Study on Regularization Effects using MXNet....549
14.3.1 Load the MNIST dataset....550
14.3.2 Define a neural network model....550
14.3.3 Define loss function and optimizer....550
14.3.4 Define a function to evaluate the accuracy....551
14.3.5 Define a utility function for plotting the convergence curve....551
14.3.6 Train the neural network model....552
14.3.7 Evaluation of the trained model: A typical case of overfitting....554
14.3.8 Application of L2 regularization....554
14.3.9 Re-initializing the parameters....554
14.3.10 Training the L2-regularized neural network model....554
14.3.11 Effect of the L2 regularization....556
14.4 A Case Study on Regularization Parameters Using Sklearn....557
References....561
15 Convolutional Neural Network (CNN) for Classification and Object Detection....562
15.1 Filter and Convolution....562
15.2 Affine Transformation Unit in CNNs....565
15.3 Pooling....567
15.4 Upsampling....568
15.5 Configuration of a Typical CNN....568
15.6 Some Landmark CNNs....569
15.6.1 LeNet-5....570
15.6.2 AlexNet....571
15.6.3 VGG-16....572
15.6.4 ResNet....572
15.6.5 Inception....574
15.6.6 YOLO: A CONV net for object detection....574
15.7 An Example of Convolutional Neural Network....575
15.7.1 Import TensorFlow....576
15.7.2 Download and preparation of the CIFAR10 dataset....576
15.7.3 Verification of the data....576
15.7.4 Creation of Conv2D layers....577
15.7.5 Add Dense layers to the Conv2D layers....579
15.7.6 Compile and train the CNN model....580
15.7.7 Evaluation of the trained CNN model....580
15.8 Applications of YOLO for Object Detection....581
References....585
16 Recurrent Neural Network (RNN) and Sequence Feature Models....586
16.1 A Typical Structure of LSTMs....587
16.2 Formulation of LSTMs....588
16.2.1 General formulation....588
16.2.2 LSTM layer and standard neural layer....589
16.2.3 Reduced LSTM....589
16.3 Peephole LSTM....590
16.4 Gated Recurrent Units (GRUs)....591
16.5 Examples....592
16.5.1 A simple reduced LSTM with a standard NN layer for regression....592
16.5.2 LSTM class in tensorflow.keras....597
16.5.3 Using LSTM for handwritten digit recognition....598
16.5.4 Using LSTM for predicting dynamics of moving vectors....601
16.6 Examples of LSTM for Speech Recognition....607
References....607
17 Unsupervised Learning Techniques....608
17.1 Background....608
17.2 K-means for Clustering....608
17.2.1 Initialization of means....609
17.2.2 Assignment of data-points to clusters....610
17.2.3 Update of means....611
17.2.4 Example 1: Case studies comparing initialization methods for K-means clustering....613
17.2.4.1 Define a function for benchmarking study....614
17.2.4.2 Generation of synthetic data-points....617
17.2.4.3 Examination of different initialization methods....619
17.2.4.4 Visualize the clustering results....621
17.2.5 Example 2: K-means clustering on the handwritten digit dataset....624
17.2.5.1 Load handwritten digit dataset....624
17.2.5.2 Examination of different initialization methods....625
17.2.5.3 Visualize the results for handwritten digit clustering using PCA....627
17.3 Mean-Shift for Clustering Without Pre-Specifying k....628
17.4 Autoencoders....632
17.4.1 Basic structure of autoencoders....633
17.4.2 Example 1: Image compression and denoising....634
17.4.3 Example 2: Image segmentation....634
17.5 Autoencoder vs. PCA....638
17.6 Variational Autoencoder (VAE)....640
17.6.1 Key ideas in VAE....641
17.6.2 KL-divergence for two single-variable normal distributions....642
17.6.3 KL-divergence for two multi-variable normal distributions....643
References....646
18 Reinforcement Learning (RL)....648
18.1 Basic Underlying Concept....648
18.1.1 Problem statement....648
18.1.2 Applications in sciences, engineering, and business....649
18.1.3 Reinforcement learning approach....650
18.1.4 Actions in discrete time: Solution strategy....651
18.2 Markov Decision Process....652
18.3 Policy....653
18.4 Value Functions....653
18.5 Bellman Equation....654
18.6 Q-learning Algorithm....656
18.6.1 Example 1: A robot explores a room with unknown obstacles using the Q-learning algorithm....656
18.6.2 OpenAI Gym....658
18.6.3 Define utility functions....659
18.6.4 A simple Q-learning algorithm....659
18.6.5 Hyper-parameters and convergence....663
18.7 Q-Network Learning....664
18.7.1 Example 2: A robot explores a room with unknown obstacles using a Q-Network....664
18.7.2 Building TensorFlow graph....665
18.7.3 Results from the Q-Network....667
18.8 Policy Gradient Methods....669
18.8.1 PPO with NN policy....669
18.8.2 Strategy used in policy gradient methods and PPO....670
18.8.2.1 Build an NN model for policy....670
18.8.2.2 P and R formulation....670
18.8.3 Ratio policy....672
18.8.4 PPO: Controlling a pole to stay upright....673
18.8.5 Save and reload the learned model....677
18.8.6 Evaluate and view the trained model....677
18.8.7 PPO: Self-driving car....680
18.8.8 View samples of the racing car before training....681
18.8.9 Train the racing car using the CNN policy....682
18.8.10 Evaluate and view the learned model....683
18.9 Remarks....685
References....685
Index....686
Machine Learning (ML) has become a very important area of research and is widely used across industries. This compendium introduces the basic concepts, fundamental theories, and essential computational techniques related to ML models. With these essentials and a strong foundation, one can comfortably learn related topics, methods, and algorithms; most importantly, readers with strong fundamentals can even develop innovative and more effective machine learning models for their own problems. The book is written to achieve this goal. It covers most of the widely used algorithms (linear and logistic regression, decision trees, support vector machines, Naive Bayes, etc.), but the focus is on neural-network-based models, for which rigorous theory and predictive models can be established. Machine Learning is a very active area of research and development, and new models, including the so-called cognitive machine learning models, are being studied. Different types of effective artificial Neural Networks (NNs) with various configurations have been developed and widely used for practical problems in sciences and engineering, including multilayer perceptrons (MLPs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). TrumpetNets and TubeNets were also recently proposed by the author for creating two-way deepnets using physics-law-based models, such as the FEM and S-FEM, as trainers. Machine Learning essentially mimics the natural learning process that occurs in biological brains, which can have a huge number of neurons. In terms of the usage of data, there are three major categories: supervised, unsupervised, and reinforcement learning.
This useful reference text benefits professionals, academics, researchers, and graduate and undergraduate students in AI, ML, and neural networks.