Building AI Intensive Python Applications: Create intelligent apps with LLMs and vector databases


Authors: Alake, Richmond; Gangadhar, Ashwin; Larew, Nicholas; Narváez, Sigfrido; Palmer, Rachelle; Perlmutter, Ben; Ranjan, Shubham; Rueckstiess, Thomas; Weller, Henry

Release date: 2024

Publisher: Packt Publishing Limited

Number of pages: 299

File size: 2.3 MB

File type: PDF

Added by: codelibs


Cover

FM

Table of Contents

Preface

Chapter 1: Getting Started with Generative AI

Technical requirements

Defining the terminology

The generative AI stack

Python and GenAI

OpenAI API

MongoDB with Vector Search

Important features of generative AI

Why use generative AI?

The ethics and risks of GenAI

Summary

Chapter 2: Building Blocks of Intelligent Applications

Technical requirements

Defining intelligent applications

The building blocks of intelligent applications

LLMs – reasoning engines for intelligent apps

Use cases for LLM reasoning engines

Diverse capabilities of LLMs

Multi-modal language models

A paradigm shift in AI development

Embedding models and vector databases – semantic long-term memory

Embedding models

Vector databases

Model hosting

Your (soon-to-be) intelligent app

Sample application – RAG chatbot

Implications of intelligent applications for software engineering

Summary

Part 1

Foundations of AI: LLMs, Embedding Models, Vector Databases, and Application Design

Chapter 3: Large Language Models

Technical requirements

Probabilistic framework

n-gram language models

Machine learning for language modelling

Artificial neural networks

Training an artificial neural network

ANNs for natural language processing

Tokenization

Embedding

Predicting probability distributions

Dealing with sequential data

Recurrent neural networks

Transformer architecture

LLMs in practice

The evolving field of LLMs

Prompting, fine-tuning, and RAG

Summary

Chapter 4: Embedding Models

Technical requirements

What is an embedding model?

How do embedding models differ from LLMs?

When to use embedding models versus LLMs

Types of embedding models

Choosing embedding models

Task requirements

Dataset characteristics

Computational resources

Vector representations

Embedding model leaderboards

Embedding models overview

Do you always need an embedding model?

Executing code from LangChain

Best practices

Summary

Chapter 5: Vector Databases

Technical requirements

What is a vector embedding?

Vector similarity

Exact versus approximate search

Measuring search

Graph connectivity

Navigable small worlds

How to search a navigable small world

Hierarchical navigable small worlds

The need for vector databases

How vector search enhances AI models

Case studies and real-world applications

Okta – natural language access request (semantic search)

One AI – language-based AI (RAG over business data)

Novo Nordisk – automatic clinical study generation (advanced RAG/RPA)

Vector search best practices

Data modeling

Deployment

Summary

Chapter 6: AI/ML Application Design

Technical requirements

Data modeling

Enriching data with embeddings

Considering search use cases

Data storage

Determining the type of database cluster

Determining IOPS

Determining RAM

Final cluster configuration

Performance and availability versus cost

Data flow

Handling static data sources

Storing operational data enriched with vector embeddings

Freshness and retention

Real-time updates

Data lifecycle

Adopting new embedding models

Security and RBAC

Best practices for AI/ML application design

Summary

Part 2

Building Your Python Application: Frameworks, Libraries, APIs, and Vector Search

Chapter 7: Useful Frameworks, Libraries, and APIs

Technical requirements

Python for AI/ML

AI/ML frameworks

LangChain

LangChain semantic search with score

Semantic search with pre-filtering

Implementing a basic RAG solution with LangChain

LangChain prompt templates and chains

Key Python libraries

pandas

PyMongoArrow

PyTorch

AI/ML APIs

OpenAI API

Hugging Face

Summary

Chapter 8: Implementing Vector Search in AI Applications

Technical requirements

Information retrieval with MongoDB Atlas Vector Search

Vector search tutorial in Python

Vector Search tutorial with LangChain

Building RAG architecture systems

Chunking or document-splitting strategies

Simple RAG

Advanced RAG

Summary

Part 3

Optimizing AI Applications: Scaling, Fine-Tuning, Troubleshooting, Monitoring, and Analytics

Chapter 9: LLM Output Evaluation

Technical requirements

What is LLM evaluation?

Component and end-to-end evaluations

Model benchmarking

Evaluation datasets

Defining a baseline

User feedback

Synthetic data

Evaluation metrics

Assertion-based metrics

Statistical metrics

LLM-as-a-judge evaluations

RAG metrics

Human review

Evaluations as guardrails

Summary

Chapter 10: Refining the Semantic Data Model to Improve Accuracy

Technical requirements

Embeddings

Experimenting with different embedding models

Fine-tuning embedding models

Embedding metadata

Formatting metadata

Including static metadata

Extracting metadata programmatically

Generating metadata with LLMs

Including metadata with query embedding and ingested content embeddings

Optimizing retrieval-augmented generation

Query mutation

Extracting query metadata for pre-filtering

Formatting ingested data

Advanced retrieval systems

Summary

Chapter 11: Common Failures of Generative AI

Technical requirements

Hallucinations

Causes of hallucinations

Implications of hallucinations

Sycophancy

Causes of sycophancy

Implications of sycophancy

Data leakage

Causes of data leakage

Implications of data leakage

Cost

Types of costs

Tokens

Performance issues in generative AI applications

Computational load

Model serving strategies

High I/O operations

Summary

Chapter 12: Correcting and Optimizing Your Generative AI Application

Technical requirements

Baselining

Training and evaluation datasets

Few-shot prompting

Retrieval and reranking

Late interaction strategies

Query rewriting

Testing and red teaming

Testing

Red teaming

Information post-processing

Other remedies

Summary

Appendix: Further Reading

Index

Other Books You May Enjoy

Master retrieval-augmented generation architecture, fine-tune your AI stack, and discover real-world use cases and best practices for creating powerful AI apps.

Key Features

Get to grips with the fundamentals of LLMs, vector databases, and Python frameworks
Implement effective retrieval-augmented generation strategies with MongoDB Atlas
Optimize AI models for performance and accuracy with model compression and deployment optimization
Purchase of the print or Kindle book includes a free PDF eBook

Book Description

The era of generative AI is upon us, and this book serves as a roadmap to harness its full potential. With its help, you’ll learn the core components of the AI stack: large language models (LLMs), vector databases, and Python frameworks, and see how these technologies work together to create intelligent applications.

The chapters will help you discover best practices for data preparation, model selection, and fine-tuning, and teach you advanced techniques such as retrieval-augmented generation (RAG) to overcome common challenges, such as hallucinations and data leakage. You’ll get a solid understanding of vector databases, implement effective vector search strategies, refine models for accuracy, and optimize performance to achieve impactful results. You’ll also identify and address AI failures to ensure your applications deliver reliable and valuable results. By evaluating and improving the output of LLMs, you’ll be able to enhance their performance and relevance.

By the end of this book, you’ll be well-equipped to build sophisticated AI applications that deliver real-world value.

What you will learn

Understand the architecture and components of the generative AI stack
Explore the role of vector databases in enhancing AI applications
Master Python frameworks for AI development
Implement Vector Search in AI applications
Find out how to effectively evaluate LLM output
Overcome common failures and challenges in AI development

Who this book is for

This book is for software engineers and developers looking to build intelligent applications using generative AI. While the book is suitable for beginners, a basic understanding of Python programming is required to make the most of it.
