Title Page....2
Copyright and Credits....3
Dedication....4
About Packt....5
Contributors....6
Table of Contents....8
Preface....19
Section 1: Software Engineering and the Software Development Life Cycle....25
Chapter 1: A Bird's-Eye View of Software Engineering....26
What is software engineering?....27
Types of software engineering roles....28
The role of the software engineer (SWE)....28
The role of the software development engineer in test (SDET)....30
The role of the site reliability engineer (SRE)....31
The role of the release engineer (RE)....32
The role of the system architect....33
A list of software development models that all engineers should know....33
Waterfall....34
Iterative enhancement....36
Spiral....37
Agile....39
Lean....39
Eliminate waste....40
Create knowledge....40
Defer commitment....41
Build in quality....41
Deliver fast....41
Respect and empower people....42
See and optimize the whole....42
Scrum....43
Scrum roles....44
Essential Scrum events....45
Kanban....45
DevOps....46
The CAMS model....47
The three ways model....49
Summary....50
Questions....51
Further reading....51
Section 2: Best Practices for Maintainable and Testable Go Code....53
Chapter 2: Best Practices for Writing Clean and Maintainable Go Code....54
The SOLID principles of object-oriented design....55
Single responsibility....55
Open/closed principle....58
Liskov substitution....61
Interface segregation....64
Dependency inversion....66
Applying the SOLID principles....67
Organizing code into packages....67
Naming conventions for Go packages....68
Circular dependencies....69
Breaking circular dependencies via implicit interfaces....69
Sometimes, code repetition is not a bad idea!....71
Tips and tools for writing lean and easy-to-maintain Go code....73
Optimizing function implementations for readability....73
Variable naming conventions....74
Using Go interfaces effectively....76
Zero values are your friends....80
Using tools to analyze and manipulate Go programs....82
Taking care of formatting and imports (gofmt, goimports)....84
Refactoring code across packages (gorename, gomvpkg, fix)....85
Improving code quality metrics with the help of linters....87
Summary....90
Questions....91
Further reading....92
Chapter 3: Dependency Management....95
What's all the fuss about software versioning?....96
Semantic versioning....96
Comparing semantic versions....97
Applying semantic versioning to Go packages....98
Managing the source code for multiple package versions....102
Single repository with versioned folders....102
Single repository – multiple branches....104
Vendoring – the good, the bad, and the ugly....107
Benefits of vendoring dependencies....108
Is vendoring always a good idea?....109
Strategies and tools for vendoring dependencies....109
The dep tool....110
The Gopkg.toml file....112
The Gopkg.lock file....114
Go modules – the way forward....114
Fork packages....118
Summary....119
Questions....119
Further reading....120
Chapter 4: The Art of Testing....121
Technical requirements....121
Unit testing....122
Mocks, stubs, fakes, and spies – commonalities and differences....123
Stubs and spies!....123
Mocks....128
Introducing gomock....128
Exploring the details of the project we want to write tests for....130
Leveraging gomock to write a unit test for our application....132
Fake objects....134
Black-box versus white-box testing for Go packages – an example....136
The services behind the facade....137
Writing black-box tests....138
Boosting code coverage via white-box tests....140
Table-driven tests versus subtests....145
Table-driven tests....145
Subtests....148
The best of both worlds....149
Using third-party testing frameworks....150
Integration versus functional testing....153
Integration tests....153
Functional tests....154
Functional tests part deux – testing in production!....155
Smoke tests....156
Chaos testing – breaking your systems in fun and interesting ways!....158
Tips and tricks for writing tests....160
Using environment variables to set up or skip tests....160
Speeding up testing for local development....161
Excluding classes of tests via build flags....162
This is not the output you are looking for – mocking calls to external binaries....164
Testing timeouts is easy when you have all the time in the world!....167
Summary....172
Questions....173
Further reading....173
Section 3: Designing and Building a Multi-Tier System from Scratch....174
Chapter 5: The Links 'R' Us Project....175
System overview – what are we going to be building?....176
Selecting an SDLC model for our project....177
Iterating faster using an Agile framework....177
Elephant carpaccio – how to iterate even faster!....178
Requirements analysis....179
Functional requirements....179
User story – link submission....180
User story – search....181
User story – crawl link graph....181
User story – calculate PageRank scores....182
User story – monitor Links 'R' Us health....182
Non-functional requirements....182
Service-level objectives....183
Security considerations....184
Being good netizens....186
System component modeling....187
The crawler....189
The link filter....189
The link fetcher....189
The content extractor....190
The link extractor....190
The content indexer....191
The link provider....191
The link graph....191
The PageRank calculator....192
The metrics store....193
The frontend....193
Monolith or microservices? The ultimate question....194
Summary....195
Questions....196
Further reading....196
Chapter 6: Building a Persistence Layer....198
Technical requirements....198
Running tests that require CockroachDB....199
Running tests that require Elasticsearch....200
Exploring a taxonomy of database systems....200
Key-value stores....201
Relational databases....202
NoSQL databases....204
Document databases....206
Understanding the need for a data layer abstraction....207
Designing the data layer for the link graph component....208
Creating an ER diagram for the link graph store....209
Listing the required set of operations for the data access layer....210
Defining a Go interface for the link graph....211
Partitioning links and edges for processing the graph in parallel....212
Iterating Links and Edges....213
Verifying graph implementations using a shared test suite....214
Implementing an in-memory graph store....215
Upserting links....217
Upserting edges....218
Looking up links....219
Iterating links/edges....220
Removing stale edges....223
Setting up a test suite for the graph implementation....223
Scaling across with a CockroachDB-backed graph implementation....224
Dealing with DB migrations....225
An overview of the DB schema for the CockroachDB implementation....226
Upserting links....227
Upserting edges....228
Looking up links....229
Iterating links/edges....229
Removing stale edges....231
Setting up a test suite for the CockroachDB implementation....232
Designing the data layer for the text indexer component....233
A model for indexed documents....234
Listing the set of operations that the text indexer needs to support....235
Defining the Indexer interface....235
Verifying indexer implementations using a shared test suite....237
An in-memory Indexer implementation using bleve....238
Indexing documents....239
Looking up documents and updating their PageRank score....240
Searching the index....241
Iterating the list of search results....242
Setting up a test suite for the in-memory indexer....244
Scaling across an Elasticsearch indexer implementation....245
Creating a new Elasticsearch indexer instance....246
Indexing and looking up documents....247
Performing paginated searches....250
Updating the PageRank score for a document....252
Setting up a test suite for the Elasticsearch indexer....253
Summary....254
Questions....254
Further reading....255
Chapter 7: Data-Processing Pipelines....257
Technical requirements....257
Building a generic data-processing pipeline in Go....258
Design goals for the pipeline package....259
Modeling pipeline payloads....260
Multistage processing....261
Stageless pipelines – is that even possible?....263
Strategies for handling errors....263
Accumulating and returning all errors....263
Using a dead-letter queue....264
Terminating the pipeline's execution if an error occurs....264
Synchronous versus asynchronous pipelines....266
Synchronous pipelines....266
Asynchronous pipelines....267
Implementing a stage worker for executing payload processors....269
FIFO....270
Fixed and dynamic worker pools....272
1-to-N broadcasting....277
Implementing the input source worker....279
Implementing the output sink worker....280
Putting it all together – the pipeline API....281
Building a crawler pipeline for the Links 'R' Us project....285
Defining the payload for the crawler....286
Implementing a source and a sink for the crawler....289
Fetching the contents of graph links....291
Extracting outgoing links from retrieved webpages....294
Extracting the title and text from retrieved web pages....299
Inserting discovered outgoing links to the graph....300
Indexing the contents of retrieved web pages....303
Assembling and running the pipeline....304
Summary....306
Questions....307
Further reading....307
Chapter 8: Graph-Based Data Processing....309
Technical requirements....310
Exploring the Bulk Synchronous Parallel model....310
Building a graph processing system in Go....313
Queueing and delivering messages....314
The Message interface....315
Queues and message iterators....315
Implementing an in-memory, thread-safe queue....316
Modeling the vertices and edges of graphs....319
Defining the Vertex and Edge types....319
Inserting vertices and edges into the graph....321
Sharing global graph state through data aggregation....322
Defining the Aggregator interface....323
Registering and looking up aggregators....324
Implementing a lock-free accumulator for float64 values....324
Sending and receiving messages....327
Implementing graph-based algorithms using compute functions....329
Achieving vertical scaling by executing compute functions in parallel....330
Orchestrating the execution of super-steps....333
Creating and managing Graph instances....336
Solving interesting graph problems....339
Searching graphs for the shortest path....340
The sequential Dijkstra algorithm....341
Leveraging a gossip protocol to run Dijkstra in parallel....343
Graph coloring....345
A sequential greedy algorithm for coloring undirected graphs....346
Exploiting parallelism for undirected graph coloring....347
Calculating PageRank scores....350
The model of the random surfer....351
An iterative approach to PageRank score calculation....352
Reaching convergence – when should we stop iterating?....353
Web graphs in the real world – dealing with dead ends....354
Defining an API for the PageRank calculator....355
Implementing a compute function to calculate PageRank scores....361
Summary....363
Further reading....363
Chapter 9: Communicating with the Outside World....365
Technical requirements....366
Designing robust, secure, and backward-compatible REST APIs....366
Using human-readable paths for RESTful resources....367
Controlling access to API endpoints....369
Basic HTTP authentication....370
Securing TLS connections from eavesdropping....371
Authenticating to external service providers using OAuth2....375
Dealing with API versions....384
Including the API version as a route prefix....385
Negotiating API versions via HTTP Accept headers....386
Building RESTful APIs in Go....387
Building RPC-based APIs with the help of gRPC....388
Comparing gRPC to REST....388
Defining messages using protocol buffers....389
Defining messages....390
Versioning message definitions....392
Representing collections....393
Modeling field unions....393
The Any type....395
Implementing RPC services....397
Unary RPCs....397
Server-streaming RPCs....399
Client-streaming RPCs....400
Bi-directional streaming RPCs....402
Security considerations for gRPC APIs....405
Decoupling Links 'R' Us components from the underlying data stores....407
Defining RPCs for accessing a remote link-graph instance....408
Defining RPCs for accessing a text-indexer instance....410
Creating high-level clients for accessing data stores over gRPC....412
Summary....414
Questions....414
Further reading....414
Chapter 10: Building, Packaging, and Deploying Software....416
Technical requirements....417
Building and packaging Go services using Docker....418
Benefits of containerization....418
Best practices for dockerizing Go applications....419
Selecting a suitable base container for your application....421
A gentle introduction to Kubernetes....422
Peeking under the hood....422
Summarizing the most common Kubernetes resource types....424
Running a Kubernetes cluster on your laptop!....426
Building and deploying a monolithic version of Links 'R' Us....429
Distributing computation across application instances....429
Carving the UUID space into non-overlapping partitions....430
Assigning a partition range to each pod....434
Building wrappers for the application services....435
The crawler service....436
The PageRank calculator service....438
Serving a fully functioning frontend to users....439
Specifying the endpoints for the frontend application....441
Performing searches and paginating results....442
Generating convincing summaries for search results....443
Highlighting search keywords....444
Orchestrating the execution of individual services....446
Putting everything together....447
Terminating the application in a clean way....448
Dockerizing and starting a single instance of the monolith....449
Deploying and scaling the monolith on Kubernetes....450
Setting up the required namespaces....451
Deploying CockroachDB and Elasticsearch using Helm....452
Deploying Links 'R' Us....453
Summary....456
Questions....456
Further reading....456
Section 4: Scaling Out to Handle a Growing Number of Users....458
Chapter 11: Splitting Monoliths into Microservices....459
Technical requirements....460
Monoliths versus service-oriented architectures....461
Is there something inherently wrong with monoliths?....461
Microservice anti-patterns and how to deal with them....462
Monitoring the state of your microservices....463
Tracing requests through distributed systems....464
The OpenTracing project....464
Stepping through a distributed tracing example....465
The provider service....466
The aggregator service....468
The gateway....470
Putting it all together....470
Capturing and visualizing traces using Jaeger....472
Making logging your trusted ally....476
Logging best practices....476
The devil is in the (logging) details....479
Shipping and indexing logs inside Kubernetes....480
Running a log collector on each Kubernetes node....481
Using a sidecar container to collect logs....482
Shipping logs directly from the application....483
Introspecting live Go services....483
Building a microservice-based version of Links 'R' Us....487
Decoupling access to the data stores....488
Breaking down the monolith into distinct services....489
Deploying the microservices that comprise the Links 'R' Us project....491
Deploying the link-graph and text-indexer API services....491
Deploying the web crawler....491
Deploying the PageRank service....492
Deploying the frontend service....492
Locking down access to our Kubernetes cluster using network policies....492
Summary....496
Questions....496
Further reading....496
Chapter 12: Building Distributed Graph-Processing Systems....498
Technical requirements....499
Introducing the master/worker model....499
Ensuring that masters are highly available....500
The leader-follower configuration....500
The multi-master configuration....501
Strategies for discovering nodes....501
Recovering from errors....502
Out-of-core distributed graph processing....502
Describing the system architecture, requirements, and limitations....503
Modeling a state machine for executing graph computations....505
Establishing a communication protocol between workers and masters....507
Defining a job queue RPC service....508
Establishing protocol buffer definitions for worker payloads....509
Establishing protocol buffer definitions for master payloads....510
Defining abstractions for working with bi-directional gRPC streams....511
Remote worker stream....512
Remote master stream....516
Creating a distributed barrier for the graph execution steps....518
Implementing a step barrier for individual workers....519
Implementing a step barrier for the master....522
Creating custom executor factories for wrapping existing graph instances....526
The workers' executor factory....527
The master's executor factory....530
Coordinating the execution of a graph job....532
Simplifying end user interactions with the dbspgraph package....532
The worker job coordinator....534
Running a new job....535
Transitioning through the stages of the graph's state machine....536
Handling incoming payloads from the master....537
Using the master as an outgoing message relay....539
The master job coordinator....540
Running a new job....542
Transitioning through the stages for the graph's state machine....544
Handling incoming worker payloads....545
Relaying messages between workers....546
Defining package-level APIs for working with master and worker nodes....548
Instantiating and operating worker nodes....548
Instantiating and operating master nodes....550
Handling incoming gRPC connections....551
Running a new job....553
Deploying a distributed version of the Links 'R' Us PageRank calculator....555
Retrofitting master and worker capabilities to the PageRank calculator service....555
Serializing PageRank messages and aggregator values....556
Defining job runners for the master and the worker....559
Implementing the job runner for master nodes....559
The worker job runner....560
Deploying the final Links 'R' Us version to Kubernetes....562
Summary....564
Questions....564
Further reading....565
Chapter 13: Metrics Collection and Visualization....566
Technical requirements....567
Monitoring from the perspective of a site reliability engineer....567
Service-level indicators (SLIs)....568
Service-level objectives (SLOs)....568
Service-level agreements (SLAs)....569
Exploring options for collecting and aggregating metrics....569
Comparing push versus pull systems....569
Capturing metrics using Prometheus....571
Supported metric types....572
Automating the detection of scrape targets....573
Static and file-based scrape target configuration....573
Querying the underlying cloud provider....574
Leveraging the API exposed by Kubernetes....574
Instrumenting Go code....576
Registering metrics with Prometheus....576
Vector-based metrics....578
Exporting metrics for scraping....578
Visualizing collected metrics using Grafana....580
Using Prometheus as an end-to-end solution for alerting....582
Using Prometheus as a source for alert events....583
Handling alert events....585
Grouping alerts together....585
Selectively muting alerts....585
Configuring alert receivers....587
Routing alerts to receivers....588
Summary....589
Questions....590
Further reading....590
Chapter 14: Epilogue....591
Appendix: Assessments....593
Other Books You May Enjoy....610
Index....613
Over the last few years, Go has become one of the most popular languages for building scalable and distributed systems. Its opinionated design and built-in concurrency features make it easy for engineers to write code that efficiently utilizes all available CPU cores.
This book distills industry best practices for writing lean Go code that is easy to test and maintain, and puts them into practice by building a multi-tier application called Links 'R' Us from scratch. You'll be guided through all the steps involved in designing, implementing, testing, deploying, and scaling an application. Starting with a monolithic architecture, you'll iteratively transform the project into a service-oriented architecture (SOA) that supports the efficient out-of-core processing of large link graphs. You'll learn about advanced software engineering techniques such as building extensible data processing pipelines, designing APIs using gRPC, and running distributed graph processing algorithms at scale. Finally, you'll learn how to compile and package your Go services using Docker and automate their deployment to a Kubernetes cluster.
By the end of this book, you'll know how to think like a professional software developer or engineer and write lean and efficient Go code.
This book is for developers and software engineers who want to use Go to design and build scalable distributed systems. Knowledge of Go programming and basic networking principles is required.