Designing Data-Intensive Applications

Latest recommended article published on 2022-01-01 16:39:13

wuhuaiyu

Looking for collaborators to translate the remaining chapters of this book. WeChat: 18600166191

-----------------------------------

Designing Data-Intensive Applications

The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Beijing Boston Farnham Sebastopol Tokyo

Martin Kleppmann

Designing Data-Intensive Applications

by Martin Kleppmann

Copyright © 2017 Martin Kleppmann. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com/safari). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editors: Ann Spencer and Marie Beaugureau
Production Editor: Kristen Brown
Copyeditor: Rachel Head
Proofreader: Amanda Kersey
Indexer: Ellen Troutman-Zaig
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

March 2017: First Edition

Revision History for the First Edition

2017-03-01: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781449373320 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Designing Data-Intensive Applications, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-449-37332-0 [LSI]

Technology is a powerful force in our society. Data, software, and communication can be used for bad: to entrench unfair power structures, to undermine human rights, and to protect vested interests. But they can also be used for good: to make underrepresented people’s voices heard, to create opportunities for everyone, and to avert disasters. This book is dedicated to everyone working toward the good.

Computing is pop culture. [...] Pop culture holds a disdain for history. Pop culture is all about identity and feeling like you’re participating. It has nothing to do with cooperation, the past or the future; it’s living in the present. I think the same is true of most people who write code for money. They have no idea where [their culture came from].

- Alan Kay, in interview with Dr Dobb’s Journal (2012)

(Translator’s note: if you are a programmer with ambition, you should dig into how things work under the hood.)

Table of Contents

Preface

Part I. Foundations of Data Systems

1. Reliable, Scalable, and Maintainable Applications
  Thinking About Data Systems
  Reliability
  Scalability
    Describing Load
    Describing Performance
    Approaches for Coping with Load
  Maintainability
    Operability: Making Life Easy for Operations
    Simplicity: Managing Complexity
    Evolvability: Making Change Easy
  Summary

2. Data Models and Query Languages
  Relational Model Versus Document Model
    The Birth of NoSQL
    The Object-Relational Mismatch
    Many-to-One and Many-to-Many Relationships
    Are Document Databases Repeating History?
    Relational Versus Document Databases Today
  Query Languages for Data
    Declarative Queries on the Web
    MapReduce Querying
  Graph-Like Data Models
    Property Graphs
    The Cypher Query Language
    Graph Queries in SQL
    Triple-Stores and SPARQL
    The Foundation: Datalog
  Summary

3. Storage and Retrieval
  Data Structures That Power Your Database
    Hash Indexes
    SSTables and LSM-Trees
    B-Trees
    Comparing B-Trees and LSM-Trees
    Other Indexing Structures
  Transaction Processing or Analytics?
    Data Warehousing
    Stars and Snowflakes: Schemas for Analytics
  Column-Oriented Storage
    Column Compression
    Sort Order in Column Storage
    Writing to Column-Oriented Storage
    Aggregation: Data Cubes and Materialized Views
  Summary

4. Encoding and Evolution
  Formats for Encoding Data
    Language-Specific Formats
    JSON, XML, and Binary Variants
    Thrift and Protocol Buffers
    Avro
    The Merits of Schemas
  Modes of Dataflow
    Dataflow Through Databases
    Dataflow Through Services: REST and RPC
    Message-Passing Dataflow
  Summary

Part II. Distributed Data

5. Replication
  Leaders and Followers
    Synchronous Versus Asynchronous Replication
    Setting Up New Followers
    Handling Node Outages
    Implementation of Replication Logs
  Problems with Replication Lag
    Reading Your Own Writes
    Monotonic Reads
    Consistent Prefix Reads
    Solutions for Replication Lag
  Multi-Leader Replication
    Use Cases for Multi-Leader Replication
    Handling Write Conflicts
    Multi-Leader Replication Topologies
  Leaderless Replication
    Writing to the Database When a Node Is Down
    Limitations of Quorum Consistency
    Sloppy Quorums and Hinted Handoff
    Detecting Concurrent Writes
  Summary

6. Partitioning
  Partitioning and Replication
  Partitioning of Key-Value Data
    Partitioning by Key Range
    Partitioning by Hash of Key
    Skewed Workloads and Relieving Hot Spots
  Partitioning and Secondary Indexes
    Partitioning Secondary Indexes by Document
    Partitioning Secondary Indexes by Term
  Rebalancing Partitions
    Strategies for Rebalancing
    Operations: Automatic or Manual Rebalancing
  Request Routing
    Parallel Query Execution
  Summary

7. Transactions
  The Slippery Concept of a Transaction
    The Meaning of ACID
    Single-Object and Multi-Object Operations
  Weak Isolation Levels
    Read Committed
    Snapshot Isolation and Repeatable Read
    Preventing Lost Updates
    Write Skew and Phantoms
  Serializability
    Actual Serial Execution
    Two-Phase Locking (2PL)
    Serializable Snapshot Isolation (SSI)
  Summary

8. The Trouble with Distributed Systems
  Faults and Partial Failures
    Cloud Computing and Supercomputing
  Unreliable Networks
    Network Faults in Practice
    Detecting Faults
    Timeouts and Unbounded Delays
    Synchronous Versus Asynchronous Networks
  Unreliable Clocks
    Monotonic Versus Time-of-Day Clocks
    Clock Synchronization and Accuracy
    Relying on Synchronized Clocks
    Process Pauses
  Knowledge, Truth, and Lies
    The Truth Is Defined by the Majority
    Byzantine Faults
    System Model and Reality
  Summary

9. Consistency and Consensus
  Consistency Guarantees
  Linearizability
    What Makes a System Linearizable?
    Relying on Linearizability
    Implementing Linearizable Systems
    The Cost of Linearizability
  Ordering Guarantees
    Ordering and Causality
    Sequence Number Ordering
    Total Order Broadcast
  Distributed Transactions and Consensus
    Atomic Commit and Two-Phase Commit (2PC)
    Distributed Transactions in Practice
    Fault-Tolerant Consensus
    Membership and Coordination Services
  Summary

Part III. Derived Data

10. Batch Processing
  Batch Processing with Unix Tools
    Simple Log Analysis
    The Unix Philosophy
  MapReduce and Distributed Filesystems
    MapReduce Job Execution
    Reduce-Side Joins and Grouping
    Map-Side Joins
    The Output of Batch Workflows
    Comparing Hadoop to Distributed Databases
  Beyond MapReduce
    Materialization of Intermediate State
    Graphs and Iterative Processing
    High-Level APIs and Languages
  Summary

11. Stream Processing
  Transmitting Event Streams
    Messaging Systems
    Partitioned Logs
  Databases and Streams
    Keeping Systems in Sync
    Change Data Capture
    Event Sourcing
    State, Streams, and Immutability
  Processing Streams
    Uses of Stream Processing
    Reasoning About Time
    Stream Joins
    Fault Tolerance
  Summary

12. The Future of Data Systems
  Data Integration
    Combining Specialized Tools by Deriving Data
    Batch and Stream Processing
  Unbundling Databases
    Composing Data Storage Technologies
    Designing Applications Around Dataflow
    Observing Derived State
  Aiming for Correctness
    The End-to-End Argument for Databases
    Enforcing Constraints
    Timeliness and Integrity
    Trust, but Verify
  Doing the Right Thing
    Predictive Analytics
    Privacy and Tracking
  Summary

Glossary

Index

Preface

If you have worked in software engineering in recent years, especially in server-side and backend systems, you have probably been bombarded with a plethora of buzzwords relating to storage and processing of data. NoSQL! Big Data! Web-scale! Sharding! Eventual consistency! ACID! CAP theorem! Cloud services! MapReduce! Real-time!

In the last decade we have seen many interesting developments in databases, in distributed systems, and in the ways we build applications on top of them. There are various driving forces for these developments:

• Internet companies such as Google, Yahoo!, Amazon, Facebook, LinkedIn, Microsoft, and Twitter are handling huge volumes of data and traffic, forcing them to create new tools that enable them to efficiently handle such scale.

• Businesses need to be agile, test hypotheses cheaply, and respond quickly to new market insights by keeping development cycles short and data models flexible.

• Free and open source software has become very successful and is now preferred to commercial or bespoke in-house software in many environments.

• CPU clock speeds are barely increasing, but multi-core processors are standard, and networks are getting faster. This means parallelism is only going to increase.

• Even if you work on a small team, you can now build systems that are distributed across many machines and even multiple geographic regions, thanks to infrastructure as a service (IaaS) such as Amazon Web Services.

• Many services are now expected to be highly available; extended downtime due to outages or maintenance is becoming increasingly unacceptable.

Data-intensive applications are pushing the boundaries of what is possible by making use of these technological developments. We call an application data-intensive if data is its primary challenge (the quantity of data, the complexity of data, or the speed at which it is changing), as opposed to compute-intensive, where CPU cycles are the bottleneck.

The tools and technologies that help data-intensive applications store and process data have been rapidly adapting to these changes. New types of database systems (“NoSQL”) have been getting lots of attention, but message queues, caches, search indexes, frameworks for batch and stream processing, and related technologies are very important too. Many applications use some combination of these.

The buzzwords that fill this space are a sign of enthusiasm for the new possibilities, which is a great thing. However, as software engineers and architects, we also need to have a technically accurate and precise understanding of the various technologies and their trade-offs if we want to build good applications. For that understanding, we have to dig deeper than buzzwords.

Fortunately, behind the rapid changes in technology, there are enduring principles that remain true, no matter which version of a particular tool you are using. If you understand those principles, you’re in a position to see where each tool fits in, how to make good use of it, and how to avoid its pitfalls. That’s where this book comes in.

The goal of this book is to help you navigate the diverse and fast-changing landscape of technologies for processing and storing data. This book is not a tutorial for one particular tool, nor is it a textbook full of dry theory. Instead, we will look at examples of successful data systems: technologies that form the foundation of many popular applications and that have to meet scalability, performance, and reliability requirements in production every day.

We will dig into the internals of those systems, tease apart their key algorithms, discuss their principles and the trade-offs they have to make. On this journey, we will try to find useful ways of thinking about data systems: not just how they work, but also why they work that way, and what questions we need to ask.

After reading this book, you will be in a great position to decide which kind of technology is appropriate for which purpose, and understand how tools can be combined to form the foundation of a good application architecture. You won’t be ready to build your own database storage engine from scratch, but fortunately that is rarely necessary. You will, however, develop a good intuition for what your systems are doing under the hood so that you can reason about their behavior, make good design decisions, and track down any problems that may arise.

Who Should Read This Book?

If you develop applications that have some kind of server/backend for storing or processing data, and your applications use the internet (e.g., web applications, mobile apps, or internet-connected sensors), then this book is for you.

This book is for software engineers, software architects, and technical managers who love to code. It is especially relevant if you need to make decisions about the architecture of the systems you work on: for example, if you need to choose tools for solving a given problem and figure out how best to apply them. But even if you have no choice over your tools, this book will help you better understand their strengths and weaknesses.

You should have some experience building web-based applications or network services, and you should be familiar with relational databases and SQL. Any non-relational databases and other data-related tools you know are a bonus, but not required. A general understanding of common network protocols like TCP and HTTP is helpful. Your choice of programming language or framework makes no difference for this book.

If any of the following are true for you, you’ll find this book valuable:

• You want to learn how to make data systems scalable, for example, to support web or mobile apps with millions of users.

• You need to make applications highly available (minimizing downtime) and operationally robust.

• You are looking for ways of making systems easier to maintain in the long run, even as they grow and as requirements and technologies change.

• You have a natural curiosity for the way things work and want to know what goes on inside major websites and online services. This book breaks down the internals of various databases and data processing systems, and it’s great fun to explore the bright thinking that went into their design.

Sometimes, when discussing scalable data systems, people make comments along the lines of, “You’re not Google or Amazon. Stop worrying about scale and just use a relational database.” There is truth in that statement: building for scale that you don’t need is wasted effort and may lock you into an inflexible design. In effect, it is a form of premature optimization. However, it’s also important to choose the right tool for the job, and different technologies each have their own strengths and weaknesses. As we shall see, relational databases are important but not the final word on dealing with data.

Scope of This Book

This book does not attempt to give detailed instructions on how to install or use specific software packages or APIs, since there is already plenty of documentation for those things. Instead we discuss the various principles and trade-offs that are fundamental to data systems, and we explore the different design decisions taken by different products.

In the ebook editions we have included links to the full text of online resources. All links were verified at the time of publication, but unfortunately links tend to break frequently due to the nature of the web. If you come across a broken link, or if you are reading a print copy of this book, you can look up references using a search engine. For academic papers, you can search for the title in Google Scholar to find open-access PDF files. Alternatively, you can find all of the references at https://github.com/ept/ddia-references, where we maintain up-to-date links.

We look primarily at the architecture of data systems and the ways they are integrated into data-intensive applications. This book doesn’t have space to cover deployment, operations, security, management, and other areas; those are complex and important topics, and we wouldn’t do them justice by making them superficial side notes in this book. They deserve books of their own.

Many of the technologies described in this book fall within the realm of the Big Data buzzword. However, the term “Big Data” is so overused and underdefined that it is not useful in a serious engineering discussion. This book uses less ambiguous terms, such as single-node versus distributed systems, or online/interactive versus offline/batch processing systems.

This book has a bias toward free and open source software (FOSS), because reading, modifying, and executing source code is a great way to understand how something works in detail. Open platforms also reduce the risk of vendor lock-in. However, where appropriate, we also discuss proprietary software (closed-source software, software as a service, or companies’ in-house software that is only described in literature but not released publicly).

Outline of This Book

This book is arranged into three parts:

1. In Part I, we discuss the fundamental ideas that underpin the design of data-intensive applications. We start in Chapter 1 by discussing what we’re actually trying to achieve: reliability, scalability, and maintainability; how we need to think about them; and how we can achieve them. In Chapter 2 we compare several different data models and query languages, and see how they are appropriate to different situations. In Chapter 3 we talk about storage engines: how databases arrange data on disk so that we can find it again efficiently. Chapter 4 turns to formats for data encoding (serialization) and evolution of schemas over time.

2. In Part II, we move from data stored on one machine to data that is distributed across multiple machines. This is often necessary for scalability, but brings with it a variety of unique challenges. We first discuss replication (Chapter 5), partitioning/sharding (Chapter 6), and transactions (Chapter 7). We then go into more detail on the problems with distributed systems (Chapter 8) and what it means to achieve consistency and consensus in a distributed system (Chapter 9).

3. In Part III, we discuss systems that derive some datasets from other datasets. Derived data often occurs in heterogeneous systems: when there is no one database that can do everything well, applications need to integrate several different databases, caches, indexes, and so on. In Chapter 10 we start with a batch processing approach to derived data, and we build upon it with stream processing in Chapter 11. Finally, in Chapter 12 we put everything together and discuss approaches for building reliable, scalable, and maintainable applications in the future.

References and Further Reading

Most of what we discuss in this book has already been said elsewhere in some form or another: in conference presentations, research papers, blog posts, code, bug trackers, mailing lists, and engineering folklore. This book summarizes the most important ideas from many different sources, and it includes pointers to the original literature throughout the text. The references at the end of each chapter are a great resource if you want to explore an area in more depth, and most of them are freely available online.

O’Reilly Safari

Members have access to thousands of books, training videos, Learning Paths, interactive tutorials, and curated playlists from over 250 publishers, including O’Reilly Media, Harvard Business Review, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Adobe, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, and Course Technology, among others.

For more information, please visit http://oreilly.com/safari.

Safari (formerly Safari Books Online) is a membership-based training and reference platform for enterprise, government, educators, and individuals.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/designing-data-intensive-apps.

To comment or ask technical questions about this book, send email to bookquestions@oreilly.com.

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia

Acknowledgments

This book is an amalgamation and systematization of a large number of other people’s ideas and knowledge, combining experience from both academic research and industrial practice. In computing we tend to be attracted to things that are new and shiny, but I think we have a huge amount to learn from things that have been done before. This book has over 800 references to articles, blog posts, talks, documentation, and more, and they have been an invaluable learning resource for me. I am very grateful to the authors of this material for sharing their knowledge.

I have also learned a lot from personal conversations, thanks to a large number of people who have taken the time to discuss ideas or patiently explain things to me. In particular, I would like to thank Joe Adler, Ross Anderson, Peter Bailis, Márton Balassi, Alastair Beresford, Mark Callaghan, Mat Clayton, Patrick Collison, Sean Cribbs, Shirshanka Das, Niklas Ekström, Stephan Ewen, Alan Fekete, Gyula Fóra, Camille Fournier, Andres Freund, John Garbutt, Seth Gilbert, Tom Haggett, Pat Helland, Joe Hellerstein, Jakob Homan, Heidi Howard, John Hugg, Julian Hyde, Conrad Irwin, Evan Jones, Flavio Junqueira, Jessica Kerr, Kyle Kingsbury, Jay Kreps, Carl Lerche, Nicolas Liochon, Steve Loughran, Lee Mallabone, Nathan Marz, Caitie McCaffrey, Josie McLellan, Christopher Meiklejohn, Ian Meyers, Neha Narkhede, Neha Narula, Cathy O’Neil, Onora O’Neill, Ludovic Orban, Zoran Perkov, Julia Powles, Chris Riccomini, Henry Robinson, David Rosenthal, Jennifer Rullmann, Matthew Sackman, Martin Scholl, Amit Sela, Gwen Shapira, Greg Spurrier, Sam Stokes, Ben Stopford, Tom Stuart, Diana Vasile, Rahul Vohra, Pete Warden, and Brett Wooldridge.

Several more people have been invaluable to the writing of this book by reviewing drafts and providing feedback. For these contributions I am particularly indebted to Raul Agepati, Tyler Akidau, Mattias Andersson, Sasha Baranov, Veena Basavaraj, David Beyer, Jim Brikman, Paul Carey, Raul Castro Fernandez, Joseph Chow, Derek Elkins, Sam Elliott, Alexander Gallego, Mark Grover, Stu Halloway, Heidi Howard, Nicola Kleppmann, Stefan Kruppa, Bjorn Madsen, Sander Mak, Stefan Podkowinski, Phil Potter, Hamid Ramazani, Sam Stokes, and Ben Summers. Of course, I take all responsibility for any remaining errors or unpalatable opinions in this book.

For helping this book become real, and for their patience with my slow writing and unusual requests, I am grateful to my editors Marie Beaugureau, Mike Loukides, Ann Spencer, and all the team at O’Reilly. For helping find the right words, I thank Rachel Head. For giving me the time and freedom to write in spite of other work commitments, I thank Alastair Beresford, Susan Goodhue, Neha Narkhede, and Kevin Scott.

Very special thanks are due to Shabbir Diwan and Edie Freedman, who illustrated with great care the maps that accompany the chapters. It’s wonderful that they took on the unconventional idea of creating maps, and made them so beautiful and compelling.

Finally, my love goes to my family and friends, without whom I would not have been able to get through this writing process that has taken almost four years. You’re the best.
