Cover....1
Copyright....3
Table of Contents....4
Preface....12
What's in This Book....13
Who This Book Is For....16
How to Use This Book....16
Software, Environment, and Resource Requirements....17
Conventions Used in This Book....17
Using Code Examples....18
O'Reilly Online Learning....18
How to Contact Us....19
Acknowledgments....19
Chapter 1. Introduction to Vector Databases....20
Why Do You Need Vector Databases?....20
A New Data Type: Vector....21
Similarity Search....22
What's Different About the Vector Type?....25
Where Do You Use Vector Databases?....27
SQL Versus Vector Databases....28
The Foundation of Business Math: Accounting Arithmetic....28
Vector Representation in a Relational Database Management System....29
The Need for Vector-Specific Capabilities....30
NoSQL Versus Vector Databases....30
NoSQL Databases and Vector Storage....30
Limitations of Vector Extensions in NoSQL Databases....31
When to Choose NoSQL with Vector Extensions....31
Hybrid Approaches: Combining Structured and Vector Data....32
The Need for Both Vector Data and Metadata....32
Limitations of Pure Vector Storage....32
Hybrid Database Architecture....33
Example of a Hybrid Query....33
Benefits of the Hybrid Approach....34
Conclusion....34
Chapter 2. Embeddings....36
Understanding Vector Embeddings: Why We Need Them....36
Word2Vec: The Breakthrough That Changed Everything....38
Doc2Vec: From Words to Documents....39
From Embeddings to Modern Language Models: The Transformer Connection....42
Encoder-Only Transformers (BERT and Its Variants)....43
Decoder-Only Transformers (GPT Family)....43
Encoder-Decoder Transformers (T5, BART)....44
Embedding Models: The Specialized Vector Generators....45
Distinction from Traditional Models....45
Role in Modern LLM Applications....46
Practical Applications and Use Cases....47
Simple RAG Pipeline....47
The sentence-transformers Library: The Swiss Army Knife of Text Embeddings....50
Best Practices for Using SentenceTransformers: A Detailed Guide....54
The Embedding Layer: The Gateway to Zero-Shot Learning....59
Anatomy of Transformer Embeddings....59
Connection to Zero-Shot Learning....61
Key Characteristics That Enable Zero-Shot Learning....62
Limitations and Considerations....64
Latest Developments and Trends....65
Vector Arithmetic with Word2Vec: A Hands-On Guide....65
Step 1: Setup and Installation....65
Step 2: Load Pretrained Word2Vec Model....65
Step 3: Implement Vector Arithmetic Functions....66
Step 4: Classic King–Queen Analogy....67
Step 5: More Interesting Analogies....68
Step 6: Interactive Exploration Tool....68
Final Words on Vector Arithmetic....69
Conclusion....70
Chapter 3. Similarity Search with FAISS....72
Foundations....72
Vector Representations....74
Distance Metrics....75
Selection Heuristics....77
FAISS Indexes....77
Flat Indexes (Brute Force)....77
IVF-Based Indexes....78
LSH-Based Indexes....79
HNSW-Based Indexes....80
Other Specialized Indexes....80
Composite and Transformative Indexes....81
Choosing the Right Index....81
Quantization....84
SQ....84
PQ....86
The ANN Problem....90
The Problem....91
Avoid Computational Cost....91
Key ANN Techniques in FAISS....92
Choosing an Index in FAISS....94
Code Example....94
Understanding HNSW Indexes....95
What Is HNSW?....96
How HNSW Works....97
Key Parameters Explained....98
Practical Example: Building a Similarity Search System....99
Performance Characteristics....100
Best Practices....101
FAISS Architecture and Components....102
Foundation....102
Core Concepts....104
Key Components....104
Common Workflow....106
Illustrative Example....106
Key Takeaways....107
Further Exploration....107
Conclusion....108
Chapter 4. Semantic Search with SQLite3....110
Understanding the SQLite Vector Similarity Search Extension....110
Core Capabilities....111
Architecture Overview....112
Limitations....112
Setting Up the Development Environment....113
Installing Dependencies....113
Verifying the Installation....114
Operational Pragmas....115
Designing the Database Schema....115
Schema Requirements....115
Table Definitions....115
Schema Design Decisions....117
Connecting to Reddit with the Python Reddit API Wrapper....118
Creating Reddit API Credentials....118
PRAW Client Implementation....118
Usage Example....121
Content Extraction and Preprocessing....121
Text Cleaning Pipeline....121
Quality Filtering....124
Generating and Storing Embeddings....124
Embedding Generator....125
Database Storage....127
Batch Processing Pipeline....131
Building the Vector Index....132
Understanding VSS Indexing....132
Index Management....133
Implementing Semantic Search....136
Search Result Container....136
Search Engine....136
Putting It All Together....142
Workflow Example....143
Example Output....145
Extension: Incremental Indexing....146
Conclusion....148
Chapter 5. Building an ArXiv Paper Search System with PostgreSQL pgvector....150
The Challenge of Searching Scientific Literature....150
Why ArXiv Makes an Ideal Data Source....150
Real-World Use Cases....151
Technology Stack Rationale....151
Architecture Overview....152
System Components....152
Data Flow....153
Design Philosophy....154
Environment Setup and Dependencies....155
PostgreSQL and pgvector Installation....155
Python Environment Configuration....156
Directory Structure and Configuration....156
Verification and Testing....157
Database Design for Scientific Papers....158
Schema Design Principles....158
Core Tables Structure....159
Vector Storage Strategy....163
Indexing Strategy....163
ArXiv Integration and PDF Management....164
ArXiv API Client Implementation....165
PDF Download Pipeline....166
Batch Processing System....167
PDF Text Extraction and Processing....169
PDF Extraction Challenges....169
Intelligent Text Chunking....171
Embedding Generation and Storage....172
Embedding Model Strategy....173
Batch Processing Pipeline....174
Similarity Search Implementation....175
Interactive Application and UI....177
Docker Packaging for Local Deployment....179
Container Architecture....180
Docker Compose Configuration....180
Database Initialization Scripts....182
Development Workflow....183
Cloud-Ready Design....183
Basic Performance Tuning....184
Index Configuration....184
Query Performance....184
Resource Management....184
Next Steps....184
Current Limitations....184
Enhancement Ideas....185
What We Did....185
System Achievements....185
Technical Skills Gained....186
Practical Research Tool....186
Foundation for Advanced Systems....186
Future Potential....186
Conclusion....186
Chapter 6. Building a Retrieval-Augmented Generation System with SQLite VSS and Ollama....188
System Architecture Overview....189
Database Foundation with Vector Support....190
Setting Up the Vector-Enabled Database....190
Schema Design for RAG....191
Creating Search Indexes....192
Text Processing and Embedding Generation....193
Embedding Model Management....193
Intelligent Text Chunking....194
Storing Content with Embeddings....195
Hybrid Search Implementation....196
Hybrid Search Algorithm....196
Semantic Search Component....197
Keyword Search Component....198
Score Fusion and Ranking....199
LLM Integration with Ollama....200
Ollama API Client....200
Health Check Function....201
The RAG Pipeline....201
Context Formatting....201
Question-Answering Pipeline....202
Demonstration and Testing....204
Sample Data Loading....204
Main Demonstration Function....205
Interactive Q&A Interface....206
Quick Testing Utility....207
Next Steps: Extending the System....207
Missing Reddit Data Features....207
Performance Optimizations....209
Production Considerations....209
Advanced RAG Patterns....210
Conclusion....210
Chapter 7. Building a Scientific RAG System with PostgreSQL and pgvector....212
System Goals and Capabilities....213
Architecture Overview....213
Database Foundation with pgvector....215
Database Configuration and Setup....216
Schema Design for Scientific Papers....216
High-Performance Vector Indexes....218
Embedding Generation Strategy....218
ArXiv Integration and PDF Processing....219
Paper Discovery with ArXiv API....219
Intelligent PDF Text Extraction....220
Advanced Text Chunking....222
Storage Pipeline with Embeddings....223
Multilevel Semantic Search....225
Abstract-Level Search....225
Section-Level Search....226
The RAG Pipeline: Deep Dive....227
Local LLM Integration with Ollama....228
Health Check and Model Discovery....229
Intelligent Context Retrieval....229
Scientific Prompt Engineering....230
Complete RAG Execution Pipeline....231
Demonstration and Interactive Interface....232
Main Demonstration Flow....232
Search Demonstrations....233
RAG Demonstration....234
Interactive Search Interface....235
Entry Point with Mode Selection....236
Technical Note on HNSW....236
How to Evaluate Your Results....238
Next Steps: Extending the Scientific RAG System....239
Conclusion....242
Chapter 8. Building a Complete Conversation Search and RAG System....244
System Goals and Capabilities....245
System Architecture Overview....246
What We'll Build Together....248
Database Foundation for Conversation Storage....249
Designing the Conversation Schema....249
Three-Table Architecture for Optimal Performance....250
High-Performance Vector Indexing....251
Conversation Import and Data Processing Pipeline....252
Robust JSON Import with Error Handling....252
Atomic Transaction Processing....253
Timestamp Handling and Data Validation....253
Error Recovery and Logging....254
Efficient Embedding Generation and Batch Processing....255
Singleton Pattern for Model Management....255
Incremental Processing Strategy....256
Batch Processing for Optimal Performance....256
Database Insertion with Conflict Handling....257
Contextual Search with Conversational Understanding....258
Semantic Similarity Search....258
Multitable Joins for Rich Context....259
Result Formatting and Structure....259
Conversation Context Retrieval....260
Context Window Calculation....260
RAG Integration for Conversation History....261
Structured Context Management....261
Local LLM Integration with Ollama....262
Health Monitoring and Model Discovery....262
Context Retrieval and Assembly....263
Conversational Prompt Engineering....263
Complete RAG Pipeline with Performance Monitoring....264
Complete Web API with FastAPI....266
FastAPI Application Structure....266
Request Models with Validation....266
Search Endpoint Implementation....267
RAG Question-Answering Endpoint....267
System Statistics and Monitoring....267
Server Startup and Configuration....268
Demonstration and Sample Data....269
Realistic Sample Data Generation....269
Multitopic Sample Coverage....270
Sample Data Processing Pipeline....271
Comprehensive System Demonstration....271
Progressive Feature Demonstration....272
RAG Demonstration with Conditional Execution....273
Production Import Functionality....273
Application Entry Points....274
Conclusion: A Complete Personal Knowledge System....274
Chapter 9. Vector Query Language....276
Core Concepts....277
Data Model....277
Basic Syntax Structure....278
Vector Operations....279
Similarity Search....279
Hybrid Search....280
Range Search....280
Batch Operations....280
Vector Functions and Aggregations....281
Vector Functions....281
Vector Aggregations....282
Index....284
About the Author....291
Colophon....291
The AI revolution is here, and at its core lies a game-changing technology that most developers haven’t fully explored: vector databases. From powering semantic search to enabling large language models (LLMs) and generative AI, vector databases are reshaping how we build applications with unstructured data like text, images, and audio. But how do you go from curious to capable with this vital technology? That’s where this book comes in.
In this hands-on guide, author Nitin Borwankar takes you through the “why, what, and how” of vector databases, starting with the basic theory behind vector embeddings and progressing to building applications with real-world tools. You’ll learn about Word2Vec, how to convert open source SQL databases like SQLite3 and PostgreSQL into vector databases, and how to integrate them into retrieval-augmented generation (RAG) applications. Whether you’re a Python developer, data engineer, or ML practitioner, this book gives you the foundation to leverage vector databases confidently in your AI projects.