AI Engineering: Building Applications with Foundation Models

Author: Chip Huyen
Release date: 2025
Publisher: O’Reilly Media, Inc.
Number of pages: 535
File size: 31.9 MB
File type: PDF
Added by: codelibs

Cover

Copyright

Table of Contents

Preface

What This Book Is About

What This Book Is Not

Who This Book Is For

Navigating This Book

Conventions Used in This Book

Using Code Examples

O’Reilly Online Learning

How to Contact Us

Acknowledgments

Chapter 1. Introduction to Building AI Applications with Foundation Models

The Rise of AI Engineering

From Language Models to Large Language Models

From Large Language Models to Foundation Models

From Foundation Models to AI Engineering

Foundation Model Use Cases

Coding

Image and Video Production

Writing

Education

Conversational Bots

Information Aggregation

Data Organization

Workflow Automation

Planning AI Applications

Use Case Evaluation

Setting Expectations

Milestone Planning

Maintenance

The AI Engineering Stack

Three Layers of the AI Stack

AI Engineering Versus ML Engineering

AI Engineering Versus Full-Stack Engineering

Summary

Chapter 2. Understanding Foundation Models

Training Data

Multilingual Models

Domain-Specific Models

Modeling

Model Architecture

Model Size

Post-Training

Supervised Finetuning

Preference Finetuning

Sampling

Sampling Fundamentals

Sampling Strategies

Test Time Compute

Structured Outputs

The Probabilistic Nature of AI

Summary

Chapter 3. Evaluation Methodology

Challenges of Evaluating Foundation Models

Understanding Language Modeling Metrics

Entropy

Cross Entropy

Bits-per-Character and Bits-per-Byte

Perplexity

Perplexity Interpretation and Use Cases

Exact Evaluation

Functional Correctness

Similarity Measurements Against Reference Data

Introduction to Embedding

AI as a Judge

Why AI as a Judge?

How to Use AI as a Judge

Limitations of AI as a Judge

What Models Can Act as Judges?

Ranking Models with Comparative Evaluation

Challenges of Comparative Evaluation

The Future of Comparative Evaluation

Summary

Chapter 4. Evaluate AI Systems

Evaluation Criteria

Domain-Specific Capability

Generation Capability

Instruction-Following Capability

Cost and Latency

Model Selection

Model Selection Workflow

Model Build Versus Buy

Navigate Public Benchmarks

Design Your Evaluation Pipeline

Step 1. Evaluate All Components in a System

Step 2. Create an Evaluation Guideline

Step 3. Define Evaluation Methods and Data

Summary

Chapter 5. Prompt Engineering

Introduction to Prompting

In-Context Learning: Zero-Shot and Few-Shot

System Prompt and User Prompt

Context Length and Context Efficiency

Prompt Engineering Best Practices

Write Clear and Explicit Instructions

Provide Sufficient Context

Break Complex Tasks into Simpler Subtasks

Give the Model Time to Think

Iterate on Your Prompts

Evaluate Prompt Engineering Tools

Organize and Version Prompts

Defensive Prompt Engineering

Proprietary Prompts and Reverse Prompt Engineering

Jailbreaking and Prompt Injection

Information Extraction

Defenses Against Prompt Attacks

Summary

Chapter 6. RAG and Agents

RAG

RAG Architecture

Retrieval Algorithms

Retrieval Optimization

RAG Beyond Texts

Agents

Agent Overview

Tools

Planning

Agent Failure Modes and Evaluation

Memory

Summary

Chapter 7. Finetuning

Finetuning Overview

When to Finetune

Reasons to Finetune

Reasons Not to Finetune

Finetuning and RAG

Memory Bottlenecks

Backpropagation and Trainable Parameters

Memory Math

Numerical Representations

Quantization

Finetuning Techniques

Parameter-Efficient Finetuning

Model Merging and Multi-Task Finetuning

Finetuning Tactics

Summary

Chapter 8. Dataset Engineering

Data Curation

Data Quality

Data Coverage

Data Quantity

Data Acquisition and Annotation

Data Augmentation and Synthesis

Why Data Synthesis

Traditional Data Synthesis Techniques

AI-Powered Data Synthesis

Model Distillation

Data Processing

Inspect Data

Deduplicate Data

Clean and Filter Data

Format Data

Summary

Chapter 9. Inference Optimization

Understanding Inference Optimization

Inference Overview

Inference Performance Metrics

AI Accelerators

Inference Optimization

Model Optimization

Inference Service Optimization

Summary

Chapter 10. AI Engineering Architecture and User Feedback

AI Engineering Architecture

Step 1. Enhance Context

Step 2. Put in Guardrails

Step 3. Add Model Router and Gateway

Step 4. Reduce Latency with Caches

Step 5. Add Agent Patterns

Monitoring and Observability

AI Pipeline Orchestration

User Feedback

Extracting Conversational Feedback

Feedback Design

Feedback Limitations

Summary

Epilogue

Index

About the Author

Colophon

Recent breakthroughs in AI have not only increased demand for AI products, they've also lowered the barriers to entry for those who want to build AI products. The model-as-a-service approach has transformed AI from an esoteric discipline into a powerful development tool that anyone can use. Everyone, including those with minimal or no prior AI experience, can now leverage AI models to build applications. In this book, author Chip Huyen discusses AI engineering: the process of building applications with readily available foundation models.

The book starts with an overview of AI engineering, explaining how it differs from traditional ML engineering and discussing the new AI stack. The more AI is used, the more opportunities there are for catastrophic failures, and therefore, the more important evaluation becomes. This book discusses different approaches to evaluating open-ended models, including the rapidly growing AI-as-a-judge approach.

AI application developers will discover how to navigate the AI landscape, including models, datasets, evaluation benchmarks, and the seemingly infinite number of use cases and application patterns. You'll learn a framework for developing an AI application, starting with simple techniques and progressing toward more sophisticated methods, and discover how to efficiently deploy these applications.


  • Understand what AI engineering is and how it differs from traditional machine learning engineering
  • Learn the process for developing an AI application, the challenges at each step, and approaches to address them
  • Explore various model adaptation techniques, including prompt engineering, RAG, fine-tuning, agents, and dataset engineering, and understand how and why they work
  • Examine the bottlenecks for latency and cost when serving foundation models and learn how to overcome them
  • Choose the right model, dataset, evaluation benchmarks, and metrics for your needs

Chip Huyen works to accelerate data analytics on GPUs at Voltron Data. Previously, she was with Snorkel AI and NVIDIA, founded an AI infrastructure startup, and taught Machine Learning Systems Design at Stanford. She's the author of the book Designing Machine Learning Systems, an Amazon bestseller in AI.

AI Engineering builds upon and is complementary to Designing Machine Learning Systems (O'Reilly).
