Table of Contents....4
About the Authors....13
About the Technical Reviewer....15
Acknowledgments....16
Introduction....18
Chapter 1: Why Optimize?....25
What Do We Mean by Optimization?....25
Why It Is Difficult: Imperative and Declarative....26
Optimization Goals....30
Optimizing Processes....31
Optimizing OLTP and OLAP....32
Database Design and Performance....33
Application Development and Performance....35
Other Stages of the Lifecycle....35
PostgreSQL Specifics....36
Summary....37
Chapter 2: Theory: Yes, We Need It!....38
Query Processing Overview....38
Compilation....38
Optimization and Execution....39
Relational, Logical, and Physical Operations....40
Relational Operations....40
Logical Operations....44
Queries as Expressions: Thinking in Sets....45
Operations and Algorithms....46
Summary....46
Chapter 3: Even More Theory: Algorithms....47
Algorithm Cost Models....47
Data Access Algorithms....49
Storage Structures....49
Full Scan....50
Index-Based Table Access....51
Index-Only Scan....53
Comparing Data Access Algorithms....54
Index Structures....56
What Is an Index?....56
B-Tree Indexes....58
Why Are B-Trees Used So Often?....60
Other Kinds of Indexes....61
Combining Relations....61
Nested Loops....62
Hash-Based Algorithms....64
Sort-Merge Algorithm....65
Comparing Algorithms....67
Summary....67
Chapter 4: Understanding Execution Plans....68
Putting Everything Together: How an Optimizer Builds an Execution Plan....68
Reading Execution Plans....69
Understanding Execution Plans....74
What Is Going On During Optimization?....75
Why Are There So Many Execution Plans to Choose From?....76
How Are Execution Costs Calculated?....77
How Can the Optimizer Be Led Astray?....80
Summary....81
Chapter 5: Short Queries and Indexes....82
What Makes a Query “Short”?....83
Choosing Selection Criteria....85
Index Selectivity....85
Unique Indexes and Constraints....87
Indexes and Non-equal Conditions....91
Indexes and Column Transformations....91
Indexes and the like Operator....96
Using Multiple Indexes....98
Compound Indexes....99
How Do Compound Indexes Work?....99
Lower Selectivity....101
Using Indexes for Data Retrieval....102
Covering Indexes....103
Excessive Selection Criteria....104
Partial Indexes....110
Indexes and Join Order....111
When Are Indexes Not Used....115
Avoiding Index Usage....115
Why Does PostgreSQL Ignore My Index?....115
Let PostgreSQL Do Its Job!....118
How to Build the Right Index(es)....122
To Build or Not to Build....122
Which Indexes Are Needed?....123
Which Indexes Are Not Needed?....123
Indexes and Short Query Scalability....124
Summary....124
Chapter 6: Long Queries and Full Scans....125
Which Queries Are Considered Long?....125
Long Queries and Full Scans....127
Long Queries and Hash Joins....128
Long Queries and the Order of Joins....129
What Is a Semi-join?....129
Semi-joins and Join Order....131
More on Join Order....133
What Is an Anti-join?....136
Semi- and Anti-joins Using the JOIN Operator....138
When Is It Necessary to Specify Join Order?....140
Grouping: Filter First, Group Last....143
Grouping: Group First, Select Last....151
Using SET Operations....154
Avoiding Multiple Scans....158
Conclusion....165
Chapter 7: Long Queries: Additional Techniques....166
Structuring Queries....166
Temporary Tables and CTEs....167
Temporary Tables....167
Common Table Expressions (CTEs)....169
Views: To Use or Not to Use....174
Why Use Views?....181
Materialized Views....181
Creating and Using Materialized Views....181
Refreshing Materialized Views....183
Should I Create a Materialized View?....183
Do Materialized Views Need to Be Optimized?....186
Dependencies....186
Partitioning....187
Does Partitioning Improve Performance?....189
Why Create a Partitioned Table?....192
Parallelism....192
Summary....193
Chapter 8: Optimizing Data Modification....194
What Is DML?....194
Two Ways to Optimize Data Modification....194
How Does DML Work?....195
Low-Level Input/Output....195
The Impact of Concurrency Control....196
Data Modification and Indexes....199
DML and Vacuum....200
Mass UPDATE/DELETE....200
Frequent Updates....201
Referential Integrity and Triggers....201
Summary....202
Chapter 9: Design Matters....203
Design Matters....203
Why Use a Relational Model?....207
Types of Databases....208
Entity-Attribute-Value Model....208
Key-Value Model....209
Hierarchical Model....210
Combining the Best of Different Worlds....211
Flexibility vs. Efficiency and Correctness....211
Must We Normalize?....213
Use and Misuse of Surrogate Keys....215
Summary....221
Chapter 10: What About Configuration Parameters?....222
PostgreSQL Configuration Parameters Overview....222
Memory Allocation....223
Connections and Sessions....223
Tuning Parameters for Better Performance....225
Are There Better Ways?....230
Other Limitations of Parameter Tuning....233
Conclusion....233
Chapter 11: Application Development and Performance....234
Response Time Matters....234
World Wide Wait....235
Performance Metrics....236
Impedance Mismatch....236
A Road Paved with Good Intentions....237
Application Development Patterns....237
“Shopping List Problem”....239
Interfaces....241
Welcome to the World of ORM....242
In Search of a Better Solution....244
Summary....247
Chapter 12: Functions....248
Function Creation....248
Internal Functions....249
User-Defined Functions....249
Introducing Procedural Language....250
Dollar Quoting....251
Function Parameters and Function Output: Void Functions....252
Function Overloading....252
Function Execution....254
Function Execution Internals....257
Functions and Performance....260
How Using Functions Can Worsen Performance....260
Any Chance Functions Can Improve Performance?....262
Functions and User-Defined Types....263
User-Defined Data Types....263
Functions Returning Composite Types....264
Using Composite Types with Nested Structure....267
Functions and Type Dependencies....272
Data Manipulation with Functions....272
Functions and Security....274
What About Business Logic?....275
Functions in OLAP Systems....276
Parameterizing....277
No Explicit Dependency on Tables and Views....278
Ability to Execute Dynamic SQL....278
Stored Procedures....278
Functions with No Results....278
Functions and Stored Procedures....279
Transaction Management....279
Exception Processing....280
Summary....281
Chapter 13: Dynamic SQL....282
What Is Dynamic SQL....282
Why Dynamic SQL Works Better in Postgres....282
What About SQL Injection?....283
How to Use Dynamic SQL for an Optimal Execution Plan....283
How to Use Dynamic SQL in OLAP Systems....291
Using Dynamic SQL for Flexibility....294
Using Dynamic SQL to Aid the Optimizer....303
FDWs and Dynamic SQL....306
Summary....307
Chapter 14: Avoiding the Pitfalls of Object-Relational Mapping....308
Why Application Developers Like NORM....309
ORM vs. NORM....309
NORM Explained....311
NORM in the Application Perspective....318
NORM from a Database Perspective....321
Mapping JSON to the Database....322
Generating Database Code....323
Getting Data from the Database....325
Modifying Data in the Database....328
Why Not Store JSON?!....331
Performance Gains....331
Working Together with Application Developers....332
Summary....332
Chapter 15: More Complex Filtering and Search....333
Full Text Search....333
Multidimensional and Spatial Search....335
Generalized Index Types in PostgreSQL....336
GIST Indexes....336
Indexes for Full Text Search....336
Indexing Very Large Tables....338
Indexing JSON and JSONB....339
Summary....342
Chapter 16: Ultimate Optimization Algorithm....343
Major Steps....343
Step-by-Step Guide....344
Step 1: Short or Long?....344
Step 2: Short....344
Step 2.1: The Most Restrictive Criteria....345
Step 2.2: Check the Indexes....345
Step 2.3: Add an Excessive Selection Criterion, If Applicable....345
Step 2.4: Building (or Rebuilding) the Query....345
Step 3: Long....346
Step 4: Incremental Updates....346
Step 5: Non-incremental Long Query....346
But Wait—There Is More!....347
Summary....348
Chapter 17: Conclusion....349
Index....351
Write optimized queries. This book helps you write queries that perform fast and deliver results on time. You will learn that query optimization is not a dark art practiced by a small, secretive cabal of sorcerers. Any motivated professional can learn to write efficient queries from the get-go and capably optimize existing queries. You will learn to look at the process of writing a query from the database engine’s point of view, and know how to think like the database optimizer.
The book begins with a discussion of what a performant system is and progresses to measuring performance and setting performance goals. It introduces different classes of queries and optimization techniques suitable to each, such as the use of indexes and specific join algorithms. You will learn to read and understand query execution plans along with techniques for influencing those plans for better performance. The book also covers advanced topics such as the use of functions and procedures, dynamic SQL, and generated queries. All of these techniques are then used together to produce performant applications, avoiding the pitfalls of object-relational mappers.
This second edition includes new examples using Postgres 15 and the newest version of the PostgresAir database. It includes additional details and clarifications about advanced topics, and covers configuration parameters in greater depth. Finally, it makes use of advancements in NORM, using automatically generated functions.
IT professionals working in PostgreSQL who want to develop performant and scalable applications, anyone whose job title contains the words “database developer” or “database administrator" or who is a backend developer charged with programming database calls, and system architects involved in the overall design of application systems running against a PostgreSQL database