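Preface
These notes cover Causal Consistency, a database design that achieves good performance while still providing some consistency guarantees. Its notion of dependency goes some way toward handling the fact that writes land on different servers and propagate asynchronously.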
1. Geo-replication
1.1 Spanner
– strong consistency, guaranteed via Paxos
– Two phase commit
– no one site can write on its own
– but it has read transactions that are consistent and fairly fast
1.2 Memcache
– writes can only go through the primary site
– reads are very fast
1.3 New requirements
– writes can be performed at any server
– performance comes first, consistency second
2. Strawman One
Each server accepts writes locally and then propagates them to the other servers.
This is an "eventually consistent" design.
Properties
- clients may see updates in different orders
- if no writes for long enough, all clients see same data
a pretty loose spec, many ways to implement, easy to get good performance
used in deployed systems, e.g. Dynamo and Cassandra
but can be tricky for app programmers
Example
C1 uploads photo, adds reference to public list:
C1: put(photo) put(list)
C2 reads:
C2: get(list) get(photo)
Here C2 is not guaranteed to see the photo, because C1's two operations, put(photo) and put(list), are not guaranteed to reach every other server in their original order.
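The anomaly above can be reproduced with a small simulation. This is a minimal sketch, not any real system's API: the `Replica` class and the key names are hypothetical, and "asynchronous replication" is modelled by simply applying only the later write at the remote site.

```python
class Replica:
    """Hypothetical single-site key/value store (illustration only)."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


local, remote = Replica(), Replica()

# C1 writes at its local site: the photo first, then the list that references it.
writes = [("photo", "cat.jpg"), ("list", ["photo"])]
for key, value in writes:
    local.apply(key, value)

# Replication is asynchronous and unordered: assume that so far only the
# later write (the list) has reached the remote site.
remote.apply(*writes[1])

# C2 reads at the remote site: the list mentions a photo that is not there yet.
seen_list = remote.get("list")
seen_photo = remote.get("photo")
```

At this point `seen_list` refers to the photo while `seen_photo` is still `None`, which is exactly the situation an application programmer has to defend against under eventual consistency.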
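Deciding which write is the most recent is hard, yet eventual consistency requires every server to choose the same final value.
Comparing timestamps is one solution: append the writing server's unique ID to the timestamp so that two writes made at the same time can still be told apart.
However, each server's clock can be off, possibly by as much as an hour; a write stamped by a clock that runs an hour fast would prevent any other update for an hour.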
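A minimal sketch of this "last writer wins" rule follows, assuming a version is the pair `(timestamp, server_id)` and that pairs are compared lexicographically. The `Replica` class and the concrete numbers are made up for illustration.

```python
class Replica:
    """Hypothetical replica that keeps, per key, the write with the highest version."""

    def __init__(self):
        self.store = {}  # key -> ((timestamp, server_id), value)

    def apply(self, key, version, value):
        current = self.store.get(key)
        # Tuples compare lexicographically: timestamp first, server_id breaks ties.
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def get(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None


# Two writes with the same timestamp from servers 1 and 2,
# delivered to replicas a and b in opposite orders.
w1 = ("k", (1000, 1), "from-s1")
w2 = ("k", (1000, 2), "from-s2")
a, b = Replica(), Replica()
a.apply(*w1); a.apply(*w2)
b.apply(*w2); b.apply(*w1)
converged = a.get("k") == b.get("k")  # both replicas keep the server-2 write

# The clock problem: server 1's clock runs an hour (3600 s) fast.
c = Replica()
c.apply("k", (1000 + 3600, 1), "written-with-fast-clock")
c.apply("k", (1001, 2), "written-later-in-real-time")
stuck_value = c.get("k")  # the later real-time write is ignored
```

The replicas converge regardless of delivery order, but the last part shows the weakness: the write that happened later in real time loses to the one stamped by the fast clock.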
Improved timestamp scheme
each server implements a “Lamport clock” or “logical clock”
Tmax = highest v# seen (from self and others)
T = max(Tmax + 1, wall-clock time)
if some server has a fast clock, everyone who sees a version from that server will advance their Lamport clock
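The two rules above can be sketched as a small class. This is an illustrative sketch only: the class name, the `observe`/`next_version` method names, and the injectable `wall_clock` parameter are assumptions made so the example can run with fake clocks.

```python
import time


class LamportClock:
    """Hypothetical logical clock following T = max(Tmax + 1, wall-clock time)."""

    def __init__(self, wall_clock=time.time):
        self.tmax = 0                 # highest version number seen so far
        self.wall_clock = wall_clock  # injectable so tests can use fake clocks

    def observe(self, version):
        """Record a version number seen from another server."""
        self.tmax = max(self.tmax, version)

    def next_version(self):
        """Assign a version number to a new local write."""
        t = max(self.tmax + 1, int(self.wall_clock()))
        self.tmax = t
        return t


# One server's wall clock is far ahead of the other's.
fast = LamportClock(wall_clock=lambda: 5000)
slow = LamportClock(wall_clock=lambda: 1000)

v_fast = fast.next_version()  # taken from the fast wall clock
slow.observe(v_fast)          # the slow server sees that version...
v_slow = slow.next_version()  # ...so its next write is ordered after it
```

Because the slow server advances past every version it has seen, its later write is no longer shadowed by the fast server's earlier one.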
3. Strawman Two
provide a sync(k, v#) operation
– sync() does not return until every datacenter has at least v# for k
This is an "eventual plus barriers" design.
However, sync() is slow, because it has to wait for the other datacenters.
– it’s a straightforward, efficient design
– if you don’t need transactions, the semantics are pretty good
– it makes the photo list example work though requires some thought to get order and sync()s right
– read performance is excellent
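How sync() repairs the photo list example can be sketched with a single-threaded simulation. Everything here is an assumption for illustration: the `Datacenter` and `Cluster` classes, the single global version counter, and the fact that sync() "waits" by draining the remote inbox rather than blocking on a network.

```python
from collections import deque


class Datacenter:
    def __init__(self):
        self.store = {}       # key -> (version, value)
        self.inbox = deque()  # replicated writes that have not been applied yet

    def apply(self, key, version, value):
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def version_of(self, key):
        entry = self.store.get(key)
        return entry[0] if entry else 0


class Cluster:
    def __init__(self, names):
        self.dcs = {name: Datacenter() for name in names}
        self.next_version = 0  # simplification: one global version counter

    def put(self, local, key, value):
        self.next_version += 1
        v = self.next_version
        self.dcs[local].apply(key, v, value)
        for name, dc in self.dcs.items():
            if name != local:
                dc.inbox.append((key, v, value))  # forwarded asynchronously
        return v

    def sync(self, key, version):
        """Return only once every datacenter has at least `version` for `key`."""
        for dc in self.dcs.values():
            while dc.version_of(key) < version:
                dc.apply(*dc.inbox.popleft())  # stands in for waiting on the network


cluster = Cluster(["east", "west"])
west = cluster.dcs["west"]

v = cluster.put("east", "photo", "cat.jpg")
cluster.sync("photo", v)                    # barrier: photo is now everywhere
photo_visible_first = "photo" in west.store
cluster.put("east", "list", ["photo"])      # only now publish the reference
```

The cost is visible in the structure: the writer cannot issue put(list) until the slowest datacenter has acknowledged the photo, while readers never wait.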
4. COPS
– we want to forward puts asynchronously (no sync() or waiting)
– we want each shard to forward puts independently (no central log server)
COPS introduces the notion of a context: every write carries its dependencies.
Example
get(X)->v2
context: Xv2
get(Y)->v4
context: Xv2, Yv4
put(Z, -)->v3
client sends Xv2, Yv4 to shard server along with new Z
context: Xv2, Yv4, Zv3
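The context bookkeeping in this example can be sketched as follows. This is a hedged illustration, not the COPS client library: the `Shard` and `Client` classes and their methods are hypothetical, and a single in-memory shard stands in for the whole store.

```python
class Shard:
    """Hypothetical shard server: assigns versions and records dependencies."""

    def __init__(self, initial=None):
        self.store = dict(initial or {})  # key -> (version, value, deps)

    def get(self, key):
        version, value, _ = self.store[key]
        return version, value

    def put(self, key, value, deps):
        version = self.store[key][0] + 1 if key in self.store else 1
        self.store[key] = (version, value, dict(deps))  # copy deps at write time
        return version


class Client:
    """Hypothetical client library that tracks the context: key -> version."""

    def __init__(self, shard):
        self.shard = shard
        self.context = {}

    def get(self, key):
        version, value = self.shard.get(key)
        self.context[key] = version  # a read becomes a dependency
        return value

    def put(self, key, value):
        # The current context travels to the shard along with the new value.
        version = self.shard.put(key, value, self.context)
        self.context[key] = version
        return version


# Starting state chosen so the versions match the example above.
shard = Shard({"X": (2, "x", {}), "Y": (4, "y", {}), "Z": (2, "z", {})})
client = Client(shard)
client.get("X")
client.get("Y")
z_version = client.put("Z", "new-z")
z_deps = shard.store["Z"][2]
```

After these calls the client's context holds Xv2, Yv4 and Zv3, and the shard has recorded Xv2 and Yv4 as the dependencies of Zv3.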
Problems
- COPS only enforces the dependencies it is aware of, so a dependency can sometimes be missed
- ever-growing client contexts
  put(K)->vN sends context, then clears context, replaces with KvN
  so next put(), e.g. put(L), depends only on KvN
- ordered puts/gets aren't sufficient
  a photo list with an ACL (Access Control List)
  get(ACL), then get(list)
  someone deletes you from ACL, then adds a photo between the two get()s
  solved by the COPS-GT get_trans() approach
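The context-clearing rule for ever-growing contexts (put(K)->vN sends the context, then replaces it with KvN) can be sketched as below. The `PruningClient` class and its `on_get`/`on_put` hooks are hypothetical names; the `sent` list exists only so the example can inspect what each put would have sent.

```python
class PruningClient:
    """Hypothetical client that resets its context after every put."""

    def __init__(self):
        self.context = {}  # key -> version
        self.sent = []     # (key, dependencies sent with that put), for inspection

    def on_get(self, key, version):
        self.context[key] = version

    def on_put(self, key, version):
        # Send the whole current context as the new write's dependencies...
        self.sent.append((key, dict(self.context)))
        # ...then keep only the new write. Anything that depends on it
        # transitively depends on everything that was just sent.
        self.context = {key: version}


c = PruningClient()
c.on_get("X", 2)
c.on_get("Y", 4)
c.on_put("K", 7)  # sends Xv2, Yv4; context becomes Kv7
c.on_put("L", 1)  # sends only Kv7; context becomes Lv1
```

The second put carries a single dependency instead of three, which is what keeps the context from growing without bound.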
5. Limitations
- conflicting writes are a serious difficulty
- awkward for clients to track causality
- limited notion of “transaction”
- significant overhead to track, communicate, obey causal dependencies
Summary
Causal consistency sits between eventual consistency and strong consistency, and it has not seen much adoption in industry.