Applied Machine Learning and High-Performance Computing on AWS: Accelerate the development of machine learning applications following architectural best practices

Authors: Khanuja Mani, Potgieter Trenton, Sabir Farooq, Subramanian Shreyas
Release date: 2022
Publisher: Packt Publishing Limited
Number of pages: 510
File size: 6.8 MB
File type: PDF
Added by: codelibs

Cover Page

Table of Contents

Preface

Part 1: Introducing High-Performance Computing

Chapter 1: High-Performance Computing Fundamentals

Why do we need HPC?

Limitations of on-premises HPC

Benefits of doing HPC on the cloud

Driving innovation across industries with HPC

Summary

Further reading

Chapter 2: Data Management and Transfer

Importance of data management

Challenges of moving data into the cloud

How to securely transfer large amounts of data into the cloud

AWS online data transfer services

AWS offline data transfer services

Summary

Further reading

Chapter 3: Compute and Networking

Introducing the AWS compute ecosystem

Networking on AWS

Selecting the right compute for HPC workloads

Best practices for HPC workloads

Summary

References

Chapter 4: Data Storage

Technical requirements

AWS services for storing data

Data security and governance

Tiered storage for cost optimization

Choosing the right storage option for HPC workloads

Summary

Further reading

Part 2: Applied Modeling

Chapter 5: Data Analysis

Technical requirements

Exploring data analysis methods

Reviewing the AWS services for data analysis

Analyzing large amounts of structured and unstructured data

Processing data at scale on AWS

Cleaning up

Summary

Chapter 6: Distributed Training of Machine Learning Models

Technical requirements

Building ML systems using AWS

Introducing the fundamentals of distributed training

Executing a distributed training workload on AWS

Summary

Chapter 7: Deploying Machine Learning Models at Scale

Managed deployment on AWS

Choosing the right deployment option

Batch inference

Real-time inference

Asynchronous inference

The high availability of model endpoints

Blue/green deployments

Summary

References

Chapter 8: Optimizing and Managing Machine Learning Models for Edge Deployment

Technical requirements

Understanding edge computing

Reviewing the key considerations for optimal edge deployments

Designing an architecture for optimal edge deployments

Summary

Chapter 9: Performance Optimization for Real-Time Inference

Technical requirements

Reducing the memory footprint of DL models

Key metrics for optimizing models

Choosing the instance type, load testing, and performance tuning for models

Observing the results

Summary

Chapter 10: Data Visualization

Data visualization using Amazon SageMaker Data Wrangler

Amazon’s graphics-optimized instances

Summary

Further reading

Part 3: Driving Innovation Across Industries

Chapter 11: Computational Fluid Dynamics

Technical requirements

Introducing CFD

Reviewing best practices for running CFD on AWS

Discussing how ML can be applied to CFD

Summary

References

Chapter 12: Genomics

Technical requirements

Managing large genomics data on AWS

Designing architecture for genomics

Applying ML to genomics

Summary

Chapter 13: Autonomous Vehicles

Technical requirements

Introducing AV systems

AWS services supporting AV systems

Designing an architecture for AV systems

ML applied to AV systems

Summary

References

Chapter 14: Numerical Optimization

Introduction to optimization

Common numerical optimization algorithms

Example use cases of large-scale numerical optimization problems

Numerical optimization using high-performance compute on AWS

Machine learning and numerical optimization

Summary

Further reading

Index

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share Your Thoughts

Download a free PDF copy of this book

Machine learning (ML) and high-performance computing (HPC) on AWS run compute-intensive workloads across industries and emerging applications. Their use cases span verticals such as computational fluid dynamics (CFD), genomics, and autonomous vehicles.

This book provides end-to-end guidance, starting with HPC concepts for storage and networking. It then progresses to working examples of how to process large datasets using SageMaker Studio and EMR. Next, you'll learn how to build, train, and deploy large models using distributed training. Later chapters also guide you through deploying models to edge devices using SageMaker and IoT Greengrass, and through performance optimization of ML models for low-latency use cases.

By the end of this book, you'll be able to build, train, and deploy your own large-scale ML application, using HPC on AWS, following industry best practices and addressing the key pain points encountered in the application life cycle.

What You Will Learn:

  • Explore data management, storage, and fast networking for HPC applications
  • Focus on the analysis and visualization of a large volume of data using Spark
  • Train visual transformer models using SageMaker distributed training
  • Deploy and manage ML models at scale on the cloud and at the edge
  • Get to grips with performance optimization of ML models for low-latency workloads
  • Apply HPC to industry domains such as CFD, genomics, AV, and optimization

Who this book is for:

The book begins with HPC concepts; however, it expects you to have prior machine learning knowledge. This book is for ML engineers and data scientists interested in advanced topics: using large datasets to train large models with distributed training on AWS, deploying models at scale, and optimizing performance for low-latency use cases. Practitioners in fields such as numerical optimization, computational fluid dynamics, autonomous vehicles, and genomics who require HPC to apply ML models at scale will also find the book useful.

