The Art of Writing Efficient Programs: An advanced programmer's guide to efficient hardware utilization and compiler optimizations using C++ examples

The Art of Writing Efficient Programs: An advanced programmer's guide to efficient hardware utilization and compiler optimizations using C++ examples

The Art of Writing Efficient Programs: An advanced programmer's guide to efficient hardware utilization and compiler optimizations using C++ examples
Автор: Pikus Fedor G
Дата выхода: 2021
Издательство: Packt Publishing Limited
Количество страниц: 701
Размер файла: 13.8 MB
Тип файла: PDF
Добавил: codelibs
 Проверить на вирусы  Дополнительные материалы 

The Art of Writing Efficient Programs....2

Contributors....3

About the author....3

About the reviewer....5

Preface....25

Who is this book for?....27

What this book covers....28

To get the most out of this book....31

Download the example code files....33

Download the color images....33

Conventions used....33

Get in touch....34

Share Your Thoughts....35

Section 1 – Performance Fundamentals....36

Chapter 1: Introduction to Performance and Concurrency....37

Why focus on performance?....38

Why performance matters....42

What is performance?....46

Performance as throughput....47

Performance as power consumption....49

Performance for real-time applications....50

Performance as dependent on context....52

Evaluating, estimating, and predicting performance....53

Learning about high performance....55

Summary....59

Questions....59

Chapter 2: Performance Measurements....61

Technical requirements....62

Performance measurements by example....63

Performance benchmarking....75

C++ chrono timers....76

High-resolution timers....77

Performance profiling....85

The perf profiler....87

Detailed profiling with perf....91

The Google Performance profiler....96

Profiling with call graphs....99

Optimization and inlining....104

Practical profiling....107

Micro-benchmarking....110

Basics of micro-benchmarking....111

Micro-benchmarking and compiler optimizations....115

Google Benchmark....121

Micro-benchmarks are lies....127

Summary....133

Questions....135

Chapter 3: CPU Architecture, Resources, and Performance....136

Technical requirements....136

The performance begins with the CPU....137

Probing performance with micro-benchmarks....140

Visualizing instruction-level parallelism....151

Data dependencies and pipelining....155

Pipelining and branches....164

Branch prediction....170

Profiling for branch mispredictions....175

Speculative execution....179

Optimization of complex conditions....181

Branchless computing....189

Loop unrolling....190

Branchless selection....191

Branchless computing examples....193

Summary....200

Questions....202

Chapter 4: Memory Architecture and Performance....203

Technical requirements....203

The performance begins with the CPU but does not end there....204

Measuring memory access speed....208

Memory architecture....210

Measuring memory and cache speeds....213

The speed of memory: the numbers....219

The speed of random memory access....219

The speed of sequential memory access....225

Memory performance optimizations in hardware....228

Optimizing memory performance....234

Memory-efficient data structures....234

Profiling memory performance....240

Optimizing algorithms for memory performance....244

The ghost in the machine....254

What is Spectre?....255

Spectre by example....259

Spectre, unleashed....265

Summary....271

Questions....272

Chapter 5: Threads, Memory, and Concurrency....274

Technical requirements....274

Understanding threads and concurrency....275

What is a thread?....276

Symmetric multi-threading....277

Threads and memory....279

Memory-bound programs and concurrency....285

Understanding the cost of memory synchronization....288

Why data sharing is expensive....296

Learning about concurrency and order....307

The need for order....308

Memory order and memory barriers....311

Memory order in C++....322

Memory model....326

Summary....333

Questions....334

Section 2 – Advanced Concurrency....336

Chapter 6: Concurrency and Performance....337

Technical requirements....338

What is needed to use concurrency effectively?....338

Locks, alternatives, and their performance....341

Lock-based, lock-free, and wait-free programs....345

Different locks for different problems....349

Lock-based versus lock-free, what is the real difference?....356

Building blocks for concurrent programming....360

The basics of concurrent data structures....362

Counters and accumulators....367

Publishing protocol....377

Smart pointers for concurrent programming....381

Summary....392

Questions....392

Chapter 7: Data Structures for Concurrency....394

Technical requirements....394

What is a thread-safe data structure?....395

The best kind of thread safety....395

The real thread safety....399

The thread-safe stack....400

Interface design for thread safety....401

Performance of mutex-guarded data structures....405

Performance requirements for different uses....408

Stack performance in detail....415

Performance estimates for synchronization schemes....420

Lock-free stack....425

The thread-safe queue....437

Lock-free queue....439

Non-sequentially consistent data structures....449

Memory management for concurrent data structures....454

The thread-safe list....458

Lock-free list....463

Summary....475

Questions....477

Chapter 8: Concurrency in C++....478

Technical requirements....478

Concurrency support in C++11....479

Concurrency support in C++17....481

Concurrency support in C++20....489

The foundations of coroutines....489

Coroutine C++ syntax....498

Coroutine examples....500

Summary....511

Questions....512

Section 3 – Designing and Coding High-Performance Programs....513

Chapter 9: High-Performance C++....514

Technical requirements....514

What is the efficiency of a programming language?....515

Unnecessary copying....517

Copying and argument passing....518

Copying as an implementation technique....521

Copying to store data....523

Copying of return values....525

Using pointers to avoid copying....531

How to avoid unnecessary copying....533

Inefficient memory management....535

Unnecessary memory allocations....536

Memory management in concurrent programs....542

Avoiding memory fragmentation....544

Optimization of conditional execution....550

Summary....554

Questions....555

Chapter 10: Compiler Optimizations in C++....557

Technical requirements....557

Compilers optimizing code....558

Basics of compiler optimizations....558

Function inlining....563

What does the compiler really know?....573

Lifting knowledge from runtime to compile time....584

Summary....588

Questions....589

Chapter 11: Undefined Behavior and Performance....590

Technical requirements....591

What is undefined behavior?....591

Why have undefined behavior?....597

Undefined behavior and C++ optimization....599

Using undefined behavior for efficient design....613

Summary....620

Questions....622

Chapter 12: Design for Performance....623

Technical requirements....624

Interaction between the design and performance....624

Design for performance....626

The minimum information principle....627

The maximum information principle....630

API design considerations....642

API design for concurrency....642

Copying and sending data....651

Design for optimal data access....655

Performance trade-offs....660

Interface design....661

Component design....663

Errors and undefined behavior....665

Making informed design decisions....668

Summary....672

Questions....673

Assessments....674

Chapter 1:....675

Chapter 2:....676

Chapter 3:....678

Chapter 4:....680

Chapter 5:....682

Chapter 6:....684

Chapter 7:....686

Chapter 8:....688

Chapter 9:....690

Chapter 10:....692

Chapter 11:....694

Chapter 12:....696

Why subscribe?....697

Other Books You May Enjoy....698

Packt is searching for authors like you....700

Share Your Thoughts....700

The great free lunch of "performance taking care of itself" is over. Until recently, programs got faster by themselves as CPUs were upgraded, but that doesn't happen anymore. The clock frequency of new processors has almost peaked, and while new architectures provide small improvements to existing programs, this only helps slightly. To write efficient software, you now have to know how to program by making good use of the available computing resources, and this book will teach you how to do that.

The Art of Efficient Programming covers all the major aspects of writing efficient programs, such as using CPU resources and memory efficiently, avoiding unnecessary computations, measuring performance, and how to put concurrency and multithreading to good use. You'll also learn about compiler optimizations and how to use the programming language (C++) more efficiently. Finally, you'll understand how design decisions impact performance.

By the end of this book, you'll not only have enough knowledge of processors and compilers to write efficient programs, but you'll also be able to understand which techniques to use and what to measure while improving performance. At its core, this book is about learning how to learn.

What you will learn

  • Discover how to use the hardware computing resources in your programs effectively
  • Understand the relationship between memory order and memory barriers
  • Familiarize yourself with the performance implications of different data structures and organizations
  • Assess the performance impact of concurrent memory accessed and how to minimize it
  • Discover when to use and when not to use lock-free programming techniques
  • Explore different ways to improve the effectiveness of compiler optimizations
  • Design APIs for concurrent data structures and high-performance data structures to avoid inefficiencies

Who this book is for

This book is for experienced developers and programmers who work on performance-critical projects and want to learn new techniques to improve the performance of their code. Programmers in algorithmic trading, gaming, bioinformatics, computational genomics, or computational fluid dynamics communities will get the most out of the examples in this book, but the techniques are fairly universal. Although this book uses the C++ language, the concepts demonstrated in the book can be easily transferred or applied to other compiled languages such as C, Java, Rust, Go, and more.


Похожее:

Список отзывов:

Нет отзывов к книге.