Generative AI on AWS: Building Context-Aware Multimodal Reasoning Applications

Authors: Antje Barth, Shelbee Eigenbrode, Chris Fregly

Release date: 2024

Publisher: O’Reilly Media, Inc.

Number of pages: 312

File size: 2.8 MB

File type: PDF

Added by: codelibs

Cover....1

Table of Contents....9

Preface....15

Conventions Used in This Book....16

Using Code Examples....17

O’Reilly Online Learning....18

How to Contact Us....18

Acknowledgments....19

Chris....19

Antje....19

Shelbee....19

Chapter 1. Generative AI Use Cases, Fundamentals, and Project Life Cycle....21

Use Cases and Tasks....21

Foundation Models and Model Hubs....24

Generative AI Project Life Cycle....25

Generative AI on AWS....28

Why Generative AI on AWS?....31

Building Generative AI Applications on AWS....32

Summary....33

Chapter 2. Prompt Engineering and In-Context Learning....35

Prompts and Completions....35

Tokens....36

Prompt Engineering....36

Prompt Structure....38

Instruction....38

Context....38

In-Context Learning with Few-Shot Inference....40

Zero-Shot Inference....41

One-Shot Inference....41

Few-Shot Inference....42

In-Context Learning Gone Wrong....43

In-Context Learning Best Practices....43

Prompt-Engineering Best Practices....44

Inference Configuration Parameters....49

Summary....54

Chapter 3. Large-Language Foundation Models....55

Large-Language Foundation Models....56

Tokenizers....57

Embedding Vectors....58

Transformer Architecture....60

Inputs and Context Window....62

Embedding Layer....62

Encoder....62

Self-Attention....62

Decoder....64

Softmax Output....64

Types of Transformer-Based Foundation Models....66

Pretraining Datasets....68

Scaling Laws....69

Compute-Optimal Models....71

Summary....72

Chapter 4. Memory and Compute Optimizations....75

Memory Challenges....75

Data Types and Numerical Precision....78

Quantization....79

fp16....80

bfloat16....82

fp8....84

int8....84

Optimizing the Self-Attention Layers....86

FlashAttention....87

Grouped-Query Attention....87

Distributed Computing....88

Distributed Data Parallel....89

Fully Sharded Data Parallel....90

Performance Comparison of FSDP over DDP....92

Distributed Computing on AWS....94

Fully Sharded Data Parallel with Amazon SageMaker....95

AWS Neuron SDK and AWS Trainium....97

Summary....97

Chapter 5. Fine-Tuning and Evaluation....99

Instruction Fine-Tuning....100

Llama 2-Chat....100

Falcon-Chat....100

FLAN-T5....100

Instruction Dataset....101

Multitask Instruction Dataset....101

FLAN: Example Multitask Instruction Dataset....102

Prompt Template....103

Convert a Custom Dataset into an Instruction Dataset....104

Instruction Fine-Tuning....106

Amazon SageMaker Studio....107

Amazon SageMaker JumpStart....108

Amazon SageMaker Estimator for Hugging Face....109

Evaluation....110

Evaluation Metrics....111

Benchmarks and Datasets....112

Summary....114

Chapter 6. Parameter-Efficient Fine-Tuning....115

Full Fine-Tuning Versus PEFT....116

LoRA and QLoRA....118

LoRA Fundamentals....119

Rank....120

Target Modules and Layers....120

Applying LoRA....121

Merging LoRA Adapter with Original Model....123

Maintaining Separate LoRA Adapters....124

Full Fine-Tuning Versus LoRA Performance....124

QLoRA....125

Prompt Tuning and Soft Prompts....126

Summary....129

Chapter 7. Fine-Tuning with Reinforcement Learning from Human Feedback....131

Human Alignment: Helpful, Honest, and Harmless....132

Reinforcement Learning Overview....132

Train a Custom Reward Model....135

Collect Training Dataset with Human-in-the-Loop....135

Sample Instructions for Human Labelers....136

Using Amazon SageMaker Ground Truth for Human Annotations....136

Prepare Ranking Data to Train a Reward Model....138

Train the Reward Model....141

Existing Reward Model: Toxicity Detector by Meta....143

Fine-Tune with Reinforcement Learning from Human Feedback....144

Using the Reward Model with RLHF....145

Proximal Policy Optimization RL Algorithm....146

Perform RLHF Fine-Tuning with PPO....146

Mitigate Reward Hacking....148

Using Parameter-Efficient Fine-Tuning with RLHF....150

Evaluate RLHF Fine-Tuned Model....151

Qualitative Evaluation....151

Quantitative Evaluation....152

Load Evaluation Model....153

Define Evaluation-Metric Aggregation Function....153

Compare Evaluation Metrics Before and After....154

Summary....155

Chapter 8. Model Deployment Optimizations....157

Model Optimizations for Inference....157

Pruning....159

Post-Training Quantization with GPTQ....160

Distillation....162

Large Model Inference Container....164

AWS Inferentia: Purpose-Built Hardware for Inference....165

Model Update and Deployment Strategies....167

A/B Testing....168

Shadow Deployment....169

Metrics and Monitoring....171

Autoscaling....172

Autoscaling Policies....172

Define an Autoscaling Policy....173

Summary....174

Chapter 9. Context-Aware Reasoning Applications Using RAG and Agents....175

Large Language Model Limitations....176

Hallucination....177

Knowledge Cutoff....177

Retrieval-Augmented Generation....178

External Sources of Knowledge....179

RAG Workflow....180

Document Loading....181

Chunking....182

Document Retrieval and Reranking....183

Prompt Augmentation....184

RAG Orchestration and Implementation....185

Document Loading and Chunking....186

Embedding Vector Store and Retrieval....188

Retrieval Chains....191

Reranking with Maximum Marginal Relevance....193

Agents....194

ReAct Framework....196

Program-Aided Language Framework....198

Generative AI Applications....201

FMOps: Operationalizing the Generative AI Project Life Cycle....207

Experimentation Considerations....208

Development Considerations....210

Production Deployment Considerations....212

Summary....213

Chapter 10. Multimodal Foundation Models....215

Use Cases....216

Multimodal Prompt Engineering Best Practices....217

Image Generation and Enhancement....218

Image Generation....218

Image Editing and Enhancement....219

Inpainting, Outpainting, Depth-to-Image....224

Inpainting....224

Outpainting....226

Depth-to-Image....227

Image Captioning and Visual Question Answering....229

Image Captioning....231

Content Moderation....231

Visual Question Answering....231

Model Evaluation....236

Text-to-Image Generative Tasks....236

Forward Diffusion....239

Nonverbal Reasoning....239

Diffusion Architecture Fundamentals....241

Forward Diffusion....241

Reverse Diffusion....242

U-Net....243

Stable Diffusion 2 Architecture....244

Text Encoder....245

U-Net and Diffusion Process....246

Text Conditioning....248

Cross-Attention....248

Scheduler....249

Image Decoder....249

Stable Diffusion XL Architecture....250

U-Net and Cross-Attention....250

Refiner....250

Conditioning....251

Summary....253

Chapter 11. Controlled Generation and Fine-Tuning with Stable Diffusion....255

ControlNet....255

Fine-Tuning....260

DreamBooth....261

DreamBooth and PEFT-LoRA....263

Textual Inversion....265

Human Alignment with Reinforcement Learning from Human Feedback....269

Summary....272

Chapter 12. Amazon Bedrock: Managed Service for Generative AI....273

Bedrock Foundation Models....273

Amazon Titan Foundation Models....274

Stable Diffusion Foundation Models from Stability AI....274

Bedrock Inference APIs....274

Large Language Models....276

Generate SQL Code....277

Summarize Text....277

Embeddings....278

Fine-Tuning....281

Agents....284

Multimodal Models....287

Create Images from Text....287

Create Images from Images....289

Data Privacy and Network Security....290

Governance and Monitoring....292

Summary....292

Index....293

About the Authors....310

Colophon....310

Companies today are moving rapidly to integrate generative AI into their products and services. But there's a great deal of hype (and misunderstanding) about the impact and promise of this technology. With this book, Chris Fregly, Antje Barth, and Shelbee Eigenbrode from AWS help CTOs, ML practitioners, application developers, business analysts, data engineers, and data scientists find practical ways to use this exciting new technology.

You'll learn the generative AI project life cycle, including use case definition, model selection, model fine-tuning, retrieval-augmented generation, reinforcement learning from human feedback, and model quantization, optimization, and deployment. And you'll explore different types of models, including large language models (LLMs) and multimodal models such as Stable Diffusion for generating images and Flamingo/IDEFICS for answering questions about images.

Apply generative AI to your business use cases
Determine which generative AI models are best suited to your task
Perform prompt engineering and in-context learning
Fine-tune generative AI models on your datasets with low-rank adaptation (LoRA)
Align generative AI models to human values with reinforcement learning from human feedback (RLHF)
Augment your model with retrieval-augmented generation (RAG)
Explore libraries such as LangChain and ReAct to develop agents and actions
Build generative AI applications with Amazon Bedrock