High Performance Python: Practical Performant Programming for Humans, 3rd Edition

Authors: Micha Gorelick, Ian Ozsvald

Release date: 2025

Publisher: O’Reilly Media, Inc.

Number of pages: 984

File size: 4.9 MB

File type: PDF

Foreword

Preface

Who This Book Is For

Who This Book Is Not For

What You’ll Learn

Python 3

License

How to Make an Attribution

Using Code Examples

Errata and Feedback

Conventions Used in This Book

O’Reilly Online Learning

How to Contact Us

Acknowledgments

1. Understanding Performant Python

The Fundamental Computer System

Computing Units

Memory Units

Communications Layers

Idealized Computing Versus the Python Virtual Machine

Idealized Computing

Python’s Virtual Machine

So Why Use Python?

How to Be a Highly Performant Programmer

Good Working Practices

Optimizing for the Team Rather than the Code Block

The Remote Performant Programmer

Some Thoughts on Good Notebook Practice

Your Work

The Future of Python

Where Did the GIL Go?

Does Python Have a JIT?

Wrap-Up

2. Profiling to Find Bottlenecks

Profiling Efficiently

Introducing the Julia Set

Calculating the Full Julia Set

Simple Approaches to Timing—print and a Decorator

Simple Timing Using the Unix time Command

Using the cProfile Module

Visualizing cProfile Output with SnakeViz

Using line_profiler for Line-by-Line Measurements

Using memory_profiler to Diagnose Memory Usage

Combining CPU and Memory Profiling with Scalene

Introspecting an Existing Process with PySpy

VizTracer for an Interactive Time-Based Call Stack

Bytecode: Under the Hood

Using the dis Module to Examine CPython Bytecode

Digging into Bytecode Specialization with Specialist

Different Approaches, Different Complexity

Unit Testing During Optimization to Maintain Correctness

No-op @profile Decorator

Strategies to Profile Your Code Successfully

Wrap-Up

3. Lists and Tuples

A More Efficient Search

Lists Versus Tuples

Lists as Dynamic Arrays

Tuples as Static Arrays

Wrap-Up

4. Dictionaries and Sets

How Do Dictionaries and Sets Work?

Inserting and Retrieving

Deletion

Resizing

Hash Functions and Entropy

Wrap-Up

5. Iterators and Generators

Iterators for Infinite Series

Lazy Generator Evaluation

Wrap-Up

6. Matrix and Vector Computation

Introduction to the Problem

Aren’t Python Lists Good Enough?

Problems with Allocating Too Much

Memory Fragmentation

Understanding perf

Making Decisions with perf’s Output

Enter numpy

Applying numpy to the Diffusion Problem

Memory Allocations and In-Place Operations

Selective Optimizations: Finding What Needs to Be Fixed

numexpr: Making In-Place Operations Faster and Easier

Graphics Processing Units (GPUs)

Dynamic Graphs: PyTorch

GPU Speed and Numerical Precision

GPU-Specific Operations

Basic GPU Profiling

Performance Considerations of GPUs

When to Use GPUs

Deep Learning Performance Considerations

A Cautionary Tale: Verify “Optimizations” (scipy)

Lessons from Matrix Optimizations

Wrap-Up

7. Pandas, Dask, and Polars

Pandas

Pandas’s Internal Model

Arrow and NumPy

Applying a Function to Many Rows of Data

Numba to Compile NumPy for Pandas

Building from Partial Results Rather than Concatenating

There’s More Than One (and Possibly a Faster) Way to Do a Job

Advice for Effective Pandas Development

Dask for Distributed Data Structures and DataFrames

Diagnostics

Parallel Pandas with Dask

Parallelized apply with Swifter on Dask

Polars for Fast DataFrames

Wrap-Up

8. Compiling to C

What Sort of Speed Gains Are Possible?

JIT Versus AOT Compilers

Why Does Type Information Help the Code Run Faster?

Using a C Compiler

Reviewing the Julia Set Example

Cython

Compiling a Pure Python Version Using Cython

pyximport

Cython Annotations to Analyze a Block of Code

Adding Some Type Annotations

Cython and numpy

Parallelizing the Solution with OpenMP on One Machine

Numba

PyPy

Garbage Collection Differences

Running PyPy and Installing Modules

A Summary of Speed Improvements

When to Use Each Technology

Foreign Function Interfaces

ctypes

cffi

f2py

CPython Extensions: C

CPython Extensions: Rust

Wrap-Up

9. Asynchronous I/O

Introduction to Asynchronous Programming

How Does async/await Work?

Serial Web Crawler

Asynchronous Web Crawler

Shared CPU–I/O Workload

Serial CPU Workload

Batched CPU Workload

Fully Asynchronous CPU Workload

Wrap-Up

10. The multiprocessing Module

An Overview of the multiprocessing Module

Estimating Pi Using the Monte Carlo Method

Estimating Pi Using Processes and Threads

Using Python Objects

Replacing multiprocessing with Joblib

Random Numbers in Parallel Systems

Using numpy

Finding Prime Numbers

Queues of Work

Asynchronously Adding Jobs to the Queue

Verifying Primes Using Interprocess Communication

Serial Solution

Naive Pool Solution

A Less Naive Pool Solution

Using manager.Value as a Flag

Using Redis as a Flag

Using RawValue as a Flag

Using mmap as a Flag

Using mmap as a Flag Redux

Sharing numpy Data with multiprocessing

Synchronizing File and Variable Access

File Locking

Locking a Value

Wrap-Up

11. Clusters and Job Queues

Benefits of Clustering

Drawbacks of Clustering

$462 Million Wall Street Loss Through Poor Cluster Upgrade Strategy

Skype’s 24-Hour Global Outage

Common Cluster Designs

How to Start a Clustered Solution

Ways to Avoid Pain When Using Clusters

Two Clustering Solutions

Using IPython Parallel to Support Research

Message Brokering for Cluster Efficiency

Other Clustering Tools to Look At

Docker

Docker’s Performance

Advantages of Docker

Wrap-Up

12. Using Less RAM

Objects for Primitives Are Expensive

The array Module Stores Many Primitive Objects Cheaply

Using Less RAM in NumPy with NumExpr

Understanding the RAM Used in a Collection

Bytes Versus Unicode

Efficiently Storing Lots of Text in RAM

Trying These Approaches on 11 Million Tokens

Modeling More Text with scikit-learn’s FeatureHasher

Introducing DictVectorizer and FeatureHasher

Comparing DictVectorizer and FeatureHasher on a Real Problem

SciPy’s Sparse Matrices

Tips for Using Less RAM

Probabilistic Data Structures

Very Approximate Counting with a 1-Byte Morris Counter

K-Minimum Values

Bloom Filters

LogLog Counter

Real-World Example

Wrap-Up

13. Lessons from the Field

Developing a High Performance Machine Learning Algorithm

High Performance Computing in Journalism

Lessons from the Field of Cyber Reinsurance

Python in Quant Finance

Maintain Flexibility to Achieve High Performance

Streamlining Feature Engineering Pipelines with Feature-engine (2020)

Highly Performant Data Science Teams (2020)

Numba (2020)

Optimizing Versus Thinking (2020)

Making Deep Learning Fly with RadimRehurek.com (2014)

Large-Scale Social Media Analysis at Smesh (2014)

Index

About the Authors

Your Python code may run correctly, but what if you need it to run faster? This practical book shows you how to locate performance bottlenecks and significantly speed up your code in high-data-volume programs. By explaining the fundamental theory behind design choices, this expanded edition of High Performance Python helps experienced Python programmers gain a deeper understanding of Python's implementation.

How do you take advantage of multicore architectures or compilation? Or build a system that scales up beyond RAM limits or with a GPU? Authors Micha Gorelick and Ian Ozsvald reveal concrete solutions to many issues and include war stories from companies that use high-performance Python for GenAI data extraction, productionized machine learning, and more.

Get a better grasp of NumPy, Cython, and profilers
Learn how Python abstracts the underlying computer architecture
Use profiling to find bottlenecks in CPU time and memory usage
Write efficient programs by choosing appropriate data structures
Speed up matrix and vector computations
Process DataFrames quickly with Pandas, Dask, and Polars
Speed up your neural networks and GPU computations
Use tools to compile Python down to machine code
Manage multiple I/O and computational operations concurrently
Convert multiprocessing code to run on local or remote clusters
