Cover Page
Table of Contents
Preface
Part 1: Introducing High-Performance Computing
Chapter 1: High-Performance Computing Fundamentals
Why do we need HPC?
Limitations of on-premises HPC
Benefits of doing HPC on the cloud
Driving innovation across industries with HPC
Summary
Further reading
Chapter 2: Data Management and Transfer
Importance of data management
Challenges of moving data into the cloud
How to securely transfer large amounts of data into the cloud
AWS online data transfer services
AWS offline data transfer services
Summary
Further reading
Chapter 3: Compute and Networking
Introducing the AWS compute ecosystem
Networking on AWS
Selecting the right compute for HPC workloads
Best practices for HPC workloads
Summary
References
Chapter 4: Data Storage
Technical requirements
AWS services for storing data
Data security and governance
Tiered storage for cost optimization
Choosing the right storage option for HPC workloads
Summary
Further reading
Part 2: Applied Modeling
Chapter 5: Data Analysis
Technical requirements
Exploring data analysis methods
Reviewing the AWS services for data analysis
Analyzing large amounts of structured and unstructured data
Processing data at scale on AWS
Cleaning up
Summary
Chapter 6: Distributed Training of Machine Learning Models
Technical requirements
Building ML systems using AWS
Introducing the fundamentals of distributed training
Executing a distributed training workload on AWS
Summary
Chapter 7: Deploying Machine Learning Models at Scale
Managed deployment on AWS
Choosing the right deployment option
Batch inference
Real-time inference
Asynchronous inference
The high availability of model endpoints
Blue/green deployments
Summary
References
Chapter 8: Optimizing and Managing Machine Learning Models for Edge Deployment
Technical requirements
Understanding edge computing
Reviewing the key considerations for optimal edge deployments
Designing an architecture for optimal edge deployments
Summary
Chapter 9: Performance Optimization for Real-Time Inference
Technical requirements
Reducing the memory footprint of DL models
Key metrics for optimizing models
Choosing the instance type, load testing, and performance tuning for models
Observing the results
Summary
Chapter 10: Data Visualization
Data visualization using Amazon SageMaker Data Wrangler
Amazon’s graphics-optimized instances
Summary
Further reading
Part 3: Driving Innovation Across Industries
Chapter 11: Computational Fluid Dynamics
Technical requirements
Introducing CFD
Reviewing best practices for running CFD on AWS
Discussing how ML can be applied to CFD
Summary
References
Chapter 12: Genomics
Technical requirements
Managing large genomics data on AWS
Designing architecture for genomics
Applying ML to genomics
Summary
Chapter 13: Autonomous Vehicles
Technical requirements
Introducing AV systems
AWS services supporting AV systems
Designing an architecture for AV systems
ML applied to AV systems
Summary
References
Chapter 14: Numerical Optimization
Introduction to optimization
Common numerical optimization algorithms
Example use cases of large-scale numerical optimization problems
Numerical optimization using high-performance compute on AWS
Machine learning and numerical optimization
Summary
Further reading
Index
Why subscribe?
Other Books You May Enjoy
Packt is searching for authors like you
Share Your Thoughts
Download a free PDF copy of this book
Machine learning (ML) and high-performance computing (HPC) on AWS run compute-intensive workloads across industries and emerging applications. Their use cases span verticals such as computational fluid dynamics (CFD), genomics, and autonomous vehicles.
This book provides end-to-end guidance, starting with HPC concepts for storage and networking. It then progresses to working examples of how to process large datasets using SageMaker Studio and EMR. Next, you'll learn how to build, train, and deploy large models using distributed training. Later chapters guide you through deploying models to edge devices using SageMaker and IoT Greengrass, and through optimizing the performance of ML models for low-latency use cases.
By the end of this book, you'll be able to build, train, and deploy your own large-scale ML application using HPC on AWS, following industry best practices and addressing the key pain points encountered in the application life cycle.
The book begins with HPC concepts; however, it expects you to have prior machine learning knowledge. This book is for ML engineers and data scientists interested in advanced topics: training large models on large datasets using distributed training on AWS, deploying models at scale, and optimizing performance for low-latency use cases. Practitioners in fields such as numerical optimization, computational fluid dynamics, autonomous vehicles, and genomics who require HPC to apply ML models at scale will also find the book useful.