Distributed Systems: ZooKeeper Use Cases
"Slow" lock
(From the 6.824 lecture notes)
- Use a single ephemeral znode, e.g. /app/lock.
- Acquire: create the ephemeral znode.
- Release: delete the znode.
- Wait: call getData with a watch set.
Example usage 1: slow lock
acquire lock:
  retry:
    r = create("/app/lock", "", ephemeral)
    if r:
      return
    else:
      getData("/app/lock", watch=True)
  watch_event:
    goto retry
release lock: (voluntarily or session timeout)
    delete("/app/lock")
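The retry loop above can be traced with a toy model. This is only a sketch: `FakeZK` is a hypothetical in-memory stand-in written for this note, not a real ZooKeeper client, and it ignores sessions and network failures. It shows why the lock is "slow": one release wakes every waiter, and all but one go straight back to waiting.

```python
class FakeZK:
    """Hypothetical in-memory stand-in for one ZooKeeper server (not a real client)."""

    def __init__(self):
        self.nodes = {}      # path -> data
        self.watches = {}    # path -> callbacks, each fired at most once

    def create(self, path, data=""):
        if path in self.nodes:
            return False     # someone else already holds the znode
        self.nodes[path] = data
        return True

    def get_data(self, path, watch=None):
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.nodes.get(path)

    def delete(self, path):
        self.nodes.pop(path, None)
        # One-shot watches: hand every waiter its event, then forget them.
        for callback in self.watches.pop(path, []):
            callback(path)


LOCK = "/app/lock"


def try_acquire(zk, client, log):
    """One pass of the retry loop for `client`."""
    if zk.create(LOCK, client):
        log.append(client + " acquired")
    else:
        log.append(client + " waiting")
        # When the znode is deleted, run the retry loop again.
        zk.get_data(LOCK, watch=lambda _path: try_acquire(zk, client, log))


zk, log = FakeZK(), []
for name in ("a", "b", "c"):
    try_acquire(zk, name, log)
zk.delete(LOCK)  # "a" releases the lock
print(log)
# ['a acquired', 'b waiting', 'c waiting', 'b acquired', 'c waiting']
```

With N waiters, each release triggers N create attempts, of which N - 1 fail.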
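"Ticket" lock
(From the 6.824 lecture notes)
- Use ephemeral + sequential znodes.
- Acquire: n = create(). Call getChildren on the lock znode and find the child with the lowest sequence number. If our n is the lowest, we hold the lock; otherwise we have to wait.
- Release: delete our znode.
- Wait: use exists() to set a watch, then wait until the znode whose sequence number is one less than ours is deleted.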
Example usage 2: "ticket" locks
acquire lock:
    n = create("/app/lock/request-", "", ephemeral|sequential)
  retry:
    requests = getChildren("/app/lock", false)
    if n is lowest znode in requests:
      return
    p = "/app/lock/request-%d" % (n - 1)
    if exists(p, watch = True)
      wait for watch event
    else
      goto retry
  watch_event:
    goto retry
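The "am I lowest, and if not, whom do I watch" decision can be written as a pure function over the result of getChildren. A minimal sketch, assuming child names of the form `request-<10-digit sequence>` (ZooKeeper zero-pads the suffix it appends) and an unordered child list; `ticket_lock_step` is a name made up for this note. It watches the nearest predecessor still present, which also covers gaps left by waiters whose sessions expired.

```python
def sequence_number(name):
    """Extract the numeric suffix ZooKeeper appends to sequential znodes."""
    return int(name.rsplit("-", 1)[1])


def ticket_lock_step(my_name, children):
    """Return None if we hold the lock, else the child name to watch."""
    ordered = sorted(children, key=sequence_number)
    position = ordered.index(my_name)
    if position == 0:
        return None
    return ordered[position - 1]


children = ["request-0000000007", "request-0000000003", "request-0000000005"]
print(ticket_lock_step("request-0000000003", children))  # None
print(ticket_lock_step("request-0000000007", children))  # request-0000000005
```

Each waiter watches exactly one znode, so a release wakes only the next client in line.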
Read-write lock
(From the ZooKeeper paper)
acquire write lock:
    n = create("/app/lock/write-request-", "", ephemeral|sequential)
  retry:
    requests = getChildren("/app/lock", false)
    if n is lowest znode in requests:
      return
    p = the request (read or write) with sequential number just before n
    if exists(p, watch = true)
      wait for watch event
    else
      goto retry
acquire read lock:
    n = create("/app/lock/read-request-", "", ephemeral|sequential)
  retry:
    requests = getChildren("/app/lock", false)
    if there is no write request before n:
      return
    p = the write request with sequential number just before n
    if exists(p, watch = true)
      wait for watch event
    else
      goto retry
  watch_event:
    goto retry
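Both acquire paths reduce to the same question: which earlier request blocks me? A minimal sketch, assuming child names `read-request-<seq>` and `write-request-<seq>` as in the pseudocode above; `rw_lock_step` is a hypothetical helper, not part of any ZooKeeper API.

```python
def seq(name):
    """Numeric suffix of a sequential znode name."""
    return int(name.rsplit("-", 1)[1])


def rw_lock_step(my_name, children):
    """Return None if the lock is ours, else the child name to watch."""
    ahead = sorted((c for c in children if seq(c) < seq(my_name)), key=seq)
    if my_name.startswith("read-"):
        # Readers only wait for writers queued ahead of them.
        ahead = [c for c in ahead if c.startswith("write-")]
    return ahead[-1] if ahead else None


children = [
    "read-request-0000000001",
    "read-request-0000000002",
    "write-request-0000000003",
    "read-request-0000000004",
]
for name in children:
    print(name, "->", rw_lock_step(name, children))
```

In this example the two leading readers hold the lock together, the writer waits for the reader just ahead of it, and the last reader waits for the writer.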
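Configuration management
(From the ZooKeeper paper)
- In its simplest form, the configuration is stored in a single znode, z_c.
- A process that starts up reads the configuration with getData(path, watch=true).
- Watches make sure a process has the most recent information: when z_c is updated, the process is notified, reads the new configuration, and sets the watch again.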
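Because a ZooKeeper watch fires only once, the reader has to set it again on every read. The sketch below models that loop with a hypothetical in-memory `FakeConfigNode` (not a real client); the `Worker` class and the `timeout=...` values are made up for illustration.

```python
class FakeConfigNode:
    """Hypothetical in-memory stand-in for the znode z_c (not a real client)."""

    def __init__(self, data):
        self.data = data
        self.watchers = []

    def get_data(self, watch=None):
        if watch is not None:
            self.watchers.append(watch)
        return self.data

    def set_data(self, data):
        self.data = data
        # Watches are one-shot: clear the list before notifying.
        pending, self.watchers = self.watchers, []
        for callback in pending:
            callback()


class Worker:
    def __init__(self, node):
        self.node = node
        self.config = None
        self.reload()

    def reload(self):
        # Reading with a watch re-arms the notification for the next change.
        self.config = self.node.get_data(watch=self.reload)


z_c = FakeConfigNode("timeout=5")
worker = Worker(z_c)
z_c.set_data("timeout=10")
z_c.set_data("timeout=30")
print(worker.config)  # timeout=30
```

If `reload` read the data without passing a watch, the worker would see the first update and miss every later one.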
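Group membership
(From the ZooKeeper paper)
- A znode z_g represents the group.
- When a member process starts, it creates a new ephemeral child under z_g to represent itself. Because the child is ephemeral, it disappears automatically if the process fails.
- To list the members, read the children of z_g with getChildren. To monitor membership changes, set watch = true.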
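A children watch only reports that the child list changed, not how it changed, so a monitor typically calls getChildren again and compares the result with what it saw before. A small sketch of that comparison; `membership_change` and the `worker-*` names are hypothetical.

```python
def membership_change(previous, current):
    """Compare two getChildren results and report who joined and who left."""
    before, after = set(previous), set(current)
    return sorted(after - before), sorted(before - after)


old = ["worker-a", "worker-b", "worker-c"]
new = ["worker-b", "worker-c", "worker-d"]
joined, left = membership_change(old, new)
print(joined, left)  # ['worker-d'] ['worker-a']
```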
Double Barrier
(Written from the description in the ZooKeeper paper)
enter barrier:
    n = create("/app/barrier/" + my_id, "", ephemeral)    # my_id: any name unique to this process
    children = getChildren("/app/barrier", false)
    if len(children) > threshold
      create("/app/barrier/ready", "", ephemeral)
    else if not exists("/app/barrier/ready", true)
      wait                                                # resumes at watch_event
    start_computation()
  watch_event:                                            # "ready" was created
    start_computation()
leave barrier:
    delete(n)
  retry:
    children = getChildren("/app/barrier", false)
    if len(children) == 0
      return
    e = any element in children
    if exists(e, true)
      wait
    goto retry
  watch_event:
    goto retry
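The enter and leave conditions can be isolated as pure functions over the child list. This is a sketch under assumptions that are mine, not the paper's: the marker child is named `ready`, it is not counted as a member, and the function names and action strings are hypothetical. The `>` comparison follows the pseudocode above.

```python
READY = "ready"


def members(children):
    """Children that represent processes, ignoring the ready marker."""
    return [c for c in children if c != READY]


def enter_action(children, threshold):
    """Decide what a process does right after registering itself."""
    if READY in children:
        return "start"
    if len(members(children)) > threshold:
        return "create_ready"   # we are the one who fills the barrier
    return "wait_for_ready"


def leave_action(children):
    """Return None when everyone has left, else a member znode to watch."""
    remaining = members(children)
    return remaining[0] if remaining else None


print(enter_action(["p1", "p2"], threshold=2))        # wait_for_ready
print(enter_action(["p1", "p2", "p3"], threshold=2))  # create_ready
print(leave_action(["p2", "ready"]))                  # p2
```

Skipping the marker in `leave_action` keeps a leftover `ready` znode from blocking the exit condition.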
Leader Election (Paxos)
(From the Spinnaker paper)
    clean up old state under /r if necessary
    let n = this machine
    let n.lst = our last LSN
    zk.create("/r/candidates/" + n, n.lst, ephemeral | sequential)
  retry:
    c = getChildren("/r/candidates", true)
    if len(c) >= majority
      read the lst of every candidate; if this node has the largest lst,
      try to create the /r/leader znode; if the create succeeds, we are the leader
      otherwise read /r/leader to find out who the new leader is
    else
      wait for watch event
  watch_event:
    goto retry
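The selection rule in the middle of the loop can be sketched as a pure function. The input is a dict from candidate znode name to its stored last LSN, as a client would assemble it after reading every child. `pick_leader` is a hypothetical helper, and the tie-break (smallest name) is an assumption of this sketch; the notes above do not say how ties are resolved.

```python
def pick_leader(candidates, cluster_size):
    """candidates maps znode name -> last LSN. Return the winner, or None."""
    majority = cluster_size // 2 + 1
    if len(candidates) < majority:
        return None  # keep waiting for more candidates
    # Highest LSN wins; assumed tie-break: the lexicographically smallest name.
    return min(candidates, key=lambda name: (-candidates[name], name))


print(pick_leader({"n1": 10}, cluster_size=3))            # None
print(pick_leader({"n1": 10, "n2": 12}, cluster_size=3))  # n2
```

Every candidate evaluates the same rule on the same data, so they agree on the winner; only that node tries to create /r/leader.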
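Master election
(From the 6.824 lecture notes, "Check: can we just do this with lab 3's KV service?")
- master:
    create("/master", my-ip:port, ephemeral)
  Only one server can create the znode successfully; that server becomes the master.
  If the master fails, the znode is deleted automatically.
- worker:
    getData("/master", true)
  Workers and standby masters can watch this znode for changes to learn who the current master is.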
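Failover depends on the ephemeral znode disappearing when its owner's session ends. The toy model below makes that visible. `FakeZK`, its method names, and the addresses are hypothetical, written for this note only; a real client would also have to handle reconnects and the gap between a crash and the session timeout.

```python
class FakeZK:
    """Hypothetical in-memory stand-in for ZooKeeper with ephemeral znodes."""

    def __init__(self):
        self.nodes = {}     # path -> (data, owner session)
        self.watches = {}   # path -> one-shot callbacks

    def create_ephemeral(self, path, data, session):
        if path in self.nodes:
            return False
        self.nodes[path] = (data, session)
        return True

    def watch_deletion(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def expire(self, session):
        """Simulate a session timeout: drop that session's ephemeral znodes."""
        owned = [p for p, (_, s) in self.nodes.items() if s == session]
        for path in owned:
            del self.nodes[path]
            for callback in self.watches.pop(path, []):
                callback()


def run_for_master(zk, address, session):
    if not zk.create_ephemeral("/master", address, session):
        # Lost the race: stand by and try again once the znode disappears.
        zk.watch_deletion("/master", lambda: run_for_master(zk, address, session))


zk = FakeZK()
run_for_master(zk, "10.0.0.1:8000", session="s1")
run_for_master(zk, "10.0.0.2:8000", session="s2")
print(zk.nodes["/master"][0])  # 10.0.0.1:8000
zk.expire("s1")
print(zk.nodes["/master"][0])  # 10.0.0.2:8000
```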
See also
https://pdos.csail.mit.edu/6.824/notes/l-zookeeper.txt
The ZooKeeper paper
The Spinnaker paper