在理论计算机科学中,CAP定理(CAP theorem),又被称作布鲁尔定理(Brewer's theorem),它指出对于一个分布式计算系统来说,不可能同时满足以下三点:[1][2]
- 一致性(Consistency)(等同于所有节点访问同一份最新的数据副本)
- 可用性(Availability)(对数据更新具备高可用性)
- 容忍网络分区(Partition tolerance)(以实际效果而言,分区相当于对通信的时限要求。系统如果不能在时限内达成数据一致性,就意味着发生了分区的情况,必须就当前操作在C和A之间做出选择[3]。)
根据定理,分布式系统只能满足三项中的两项而不可能满足全部三项[4]。理解CAP理论的最简单方式是想象两个节点分处分区两侧。允许至少一个节点更新状态会导致数据不一致,即丧失了C性质。如果为了保证数据一致性,将分区一侧的节点设置为不可用,那么又丧失了A性质。除非两个节点可以互相通信,才能既保证C又保证A,这又会导致丧失P性质。
The CAP/Brewer Conjecture
It is commonly desirable for distributed systems to exhbit Consistency, Availability, and Partition tolerance.
- By consistency we mean that all participating systems share the same view of the data. For example, if one system observes the value five, all systems would, if they looked, observe the value five at that time. None, for example, would be more stale or more fresh than others.
- By Availability we mean that the system is able to respond quickly enough for the user's needs. For example, if a Web page times out, or users abort before seeing the results, it is not available.
- By Partition tolerance we mean that, in the event of the failure or isolation of some participants, the other participants can continue to do whatever they can. For example, the loss of certain nodes might necessitate disconnecting certain clients or the inability to return certain results -- but should not unduely interfere with the ability of the functioning nodes to service clients and/or return results.
The CAP Conjecture, attributable to Eric Brewer of UC-B, is that we can build systems that guarantee up to two of these properties -- but not necessarily all three. Let's consider why.
Imagine a system with too much of a queue. We can fix that by adding more systems and dividing the work as seen below:
The picture above is nice, because we have availability. And, in the event of a partitioning, the reachable nodes can still respond, so we have partition tolerance. The problem, though, is that we've lostconsistency. Each host is operating independently, so the values can diverge if updates differ. This is the "AP"/"PA" case.
To add back consistency, we'll need to have communications among the hosts such that they can sync values:
But, notice what has happened. We gained consistency through communication. If we break that communication, we're back where we started. So, we now have consistency and availability, but not also partition tolerance. This is the "CA"/"AC" case.
What about the "PC"/"CP" case? How can we have consistency and partition tolerance without availability, at least in any meaningful way? One ansewer, which is I think a good example, is that we enable reads, but not updates. Now we have sacraficed the avaialability of writes in order to ensure maintaining consistency in light of a partitioning.
Note: You'll often see what I've called the "CAP Conjecture" described as the "CAP Theorem". I do not believe this to be correct. The conjecture, is, I think, important because it allows us to frame the trade-offs we're likely to see in real world systems. Attempts to "prove" the conjecture into a theorem have introduced constraints and formalisms that, I think, separate it from the generally applicable cases. As it turns out, there is a natural tension at work here: One can only prove something that is defined, but fuzzy edges makes things more applicable and gives them a broader reach.