Copyright
Table of Contents
Foreword
Preface
Why I Wrote This Book and Why Now
Who Is This Book For?
How to Read or Use This Book
Conventions Used in This Book
O’Reilly Online Learning
How to Contact Us
Acknowledgments
Chapter 1. The Journey to Becoming Data-Driven
Recent Technology Developments and Industry Trends
Data Management
Analytics Is Fragmenting the Data Landscape
The Speed of Software Delivery Is Changing
The Cloud’s Impact on Data Management Is Immeasurable
Privacy and Security Concerns Are a Top Priority
Operational and Analytical Systems Need to Be Integrated
Organizations Operate in Collaborative Ecosystems
Enterprises Are Saddled with Outdated Data Architectures
The Enterprise Data Warehouse: A Single Source of Truth
The Data Lake: A Centralized Repository for Structured and Unstructured Data
The Pain of Centralization
Defining a Data Strategy
Wrapping Up
Chapter 2. Organizing Data Using Data Domains
Application Design Starting Points
Each Application Has a Data Store
Applications Are Always Unique
Golden Sources
The Data Integration Dilemma
Application Roles
Inspirations from Software Architecture
Data Domains
Domain-Driven Design
Business Architecture
Domain Characteristics
Principles for Distributed and Domain-Oriented Data Management
Design Principles for Data Domains
Best Practices for Data Providers
Domain Ownership Responsibilities
Transitioning Toward Distributed and Domain-Oriented Data Management
Wrapping Up
Chapter 3. Mapping Domains to a Technology Architecture
Domain Topologies: Managing Problem Spaces
Fully Federated Domain Topology
Governed Domain Topology
Partially Federated Domain Topology
Value Chain–Aligned Domain Topology
Coarse-Grained Domain Topology
Coarse-Grained and Partially Governed Domain Topology
Centralized Domain Topology
Picking the Right Topology
Landing Zone Topologies: Managing Solution Spaces
Single Data Landing Zone
Source- and Consumer-Aligned Landing Zones
Hub Data Landing Zone
Multiple Data Landing Zones
Multiple Data Management Landing Zones
Practical Landing Zones Example
Wrapping Up
Chapter 4. Data Product Management
What Are Data Products?
Problems with Combining Code, Data, Metadata, and Infrastructure
Data Products as Logical Entities
Data Product Design Patterns
What Is CQRS?
Read Replicas as Data Products
Design Principles for Data Products
Resource-Oriented Read-Optimized Design
Data Product Data Is Immutable
Using the Ubiquitous Language
Capture Directly from the Source
Clear Interoperability Standards
No Raw Data
Don’t Conform to Consumers
Missing Values, Defaults, and Data Types
Semantic Consistency
Atomicity
Compatibility
Abstract Volatile Reference Data
New Data Means New Ownership
Data Security Patterns
Establish a Metamodel
Allow Self-Service
Cross-Domain Relationships
Enterprise Consistency
Historization, Redeliveries, and Overwrites
Business Capabilities with Multiple Owners
Operating Model
Data Product Architecture
High-Level Platform Design
Capabilities for Capturing and Onboarding Data
Data Quality
Data Historization
Solution Design
Real-World Example
Alignment with Storage Accounts
Alignment with Data Pipelines
Capabilities for Serving Data
Data Serving Services
File Manipulation Service
De-Identification Service
Distributed Orchestration
Intelligent Consumption Services
Direct Usage Considerations
Getting Started
Wrapping Up
Chapter 5. Services and API Management
Introducing API Management
What Is Service-Oriented Architecture?
Enterprise Application Integration
Service Orchestration
Service Choreography
Public Services and Private Services
Service Models and Canonical Data Models
Parallels with Enterprise Data Warehousing Architecture
A Modern View of API Management
Federated Responsibility Model
API Gateway
API as a Product
Composite Services
API Contracts
API Discoverability
Microservices
Functions
Service Mesh
Microservice Domain Boundaries
Ecosystem Communication
Experience APIs
GraphQL
Backend for Frontend
Practical Example
Metadata Management
Read-Oriented APIs Serving Data Products
Wrapping Up
Chapter 6. Event and Notification Management
Introduction to Events
Notifications Versus Carried State
The Asynchronous Communication Model
What Do Modern Event-Driven Architectures Look Like?
Message Queues
Event Brokers
Event Processing Styles
Event Producers
Event Consumers
Event Streaming Platforms
Governance Model
Event Stores as Data Product Stores
Event Stores as Application Backends
Streaming as the Operational Backbone
Guarantees and Consistency
Consistency Level
Processing Methods
Message Order
Dead Letter Queue
Streaming Interoperability
Governance and Self-Service
Wrapping Up
Chapter 7. Connecting the Dots
Cross-Domain Interoperability
Quick Recap
Data Distribution Versus Application Integration
Data Distribution Patterns
Application Integration Patterns
Consistency and Discoverability
Inspiring, Motivating, and Guiding for Change
Setting Domain Boundaries
Exception Handling
Organizational Transformation
Team Topologies
Organizational Planning
Wrapping Up
Chapter 8. Data Governance and Data Security
Data Governance
The Governance Framework
Processes: Data Governance Activities
Making Governance Effective and Pragmatic
Supporting Services for Data Governance
Data Contracts
Data Security
Current Siloed Approach
Trust Boundaries
Data Classifications and Labels
Data Usage Classifications
Unified Data Security
Identity Providers
Real-World Example
Typical Security Process Flow
Securing API-Based Architectures
Securing Event-Driven Architectures
Wrapping Up
Chapter 9. Democratizing Data with Metadata
Metadata Management
The Enterprise Metadata Model
Practical Example of a Metamodel
Data Domains and Data Products
Data Models
Data Lineage
Other Metadata Areas
The Metalake Architecture
Role of the Catalog
Role of the Knowledge Graph
Wrapping Up
Chapter 10. Modern Master Data Management
Master Data Management Styles
Data Integration
Designing a Master Data Management Solution
Domain-Oriented Master Data Management
Reference Data
Master Data
MDM and Data Quality as a Service
MDM and Data Curation
Knowledge Exchange
Integrated Views
Reusable Components and Integration Logic
Republishing Data Through Integration Hubs
Republishing Data Through Aggregates
Data Governance Recommendations
Wrapping Up
Chapter 11. Turning Data into Value
The Challenges of Turning Data into Value
Domain Data Stores
Granularity of Consumer-Aligned Use Cases
DDSs Versus Data Products
Best Practices
Business Requirements
Target Audience and Operating Model
Nonfunctional Requirements
Data Pipelines and Data Models
Scoping the Role Your DDSs Play
Business Intelligence
Semantic Layers
Self-Service Tools and Data
Best Practices
Advanced Analytics (MLOps)
Initiating a Project
Experimentation and Tracking
Data Engineering
Model Operationalization
Exceptions
Wrapping Up
Chapter 12. Putting Theory into Practice
A Brief Reflection on Your Data Journey
Centralized or Decentralized?
Making It Real
Opportunistic Phase: Set Strategic Direction
Transformation Phase: Lay Out the Foundation
Optimization Phase: Professionalize Your Capabilities
Data-Driven Culture
DataOps
Governance and Literacy
The Role of Enterprise Architects
Blueprints and Diagrams
Modern Skills
Control and Governance
Last Words
Index
About the Author
Colophon
As data management continues to evolve rapidly, managing all of your data in a central place, such as a data warehouse, is no longer scalable. Today's world is about quickly turning data into value. This requires a paradigm shift in the way we federate responsibilities, manage data, and make it available to others. With this practical book, you'll learn how to design a next-gen data architecture that accounts for the scale your organization needs.
Executives, architects and engineers, analytics teams, and compliance and governance staff will learn how to build a next-gen data landscape. Author Piethein Strengholt provides blueprints, principles, observations, best practices, and patterns to get you up to speed.