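After ZooKeeper elects a Leader, and before the ensemble starts serving clients, the Leader has to synchronize state with its Followers. I want to understand this process and verify how ZooKeeper solves the two problems mentioned earlier: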
1) Never forget delivered messages
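The Leader crashes before its COMMIT reaches any follower, so only the Leader itself has committed the transaction. The new Leader must make sure this transaction is committed as well.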
2) Let go of messages that are skipped
The Leader generates a proposal but crashes before any follower sees it. When that server recovers, it must discard the proposal.
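To make the two cases concrete, here is a minimal sketch. It is a toy model, not ZooKeeper's actual code: a log is just an ordered list of zxids, and the class `ZabSyncSketch` with its helpers `zxid`, `electLog`, and `sync` are hypothetical names I made up for illustration. It assumes the new Leader is the quorum member with the highest last zxid, and that a syncing server keeps only the prefix it shares with the Leader before copying the rest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (hypothetical, not ZooKeeper code): a log is an ordered list of zxids.
class ZabSyncSketch {
    // A zxid carries the epoch in its high 32 bits and a counter in its low 32 bits.
    static long zxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    // Last zxid in a log, or 0 for an empty log.
    static long last(List<Long> log) {
        return log.isEmpty() ? 0L : log.get(log.size() - 1);
    }

    // Assumption: the new Leader is the quorum member whose last zxid is highest.
    static List<Long> electLog(List<List<Long>> quorumLogs) {
        List<Long> best = quorumLogs.get(0);
        for (List<Long> log : quorumLogs) {
            if (last(log) > last(best)) {
                best = log;
            }
        }
        return best;
    }

    // Keep the prefix shared with the Leader, drop everything after it,
    // then append the Leader's remaining entries.
    static List<Long> sync(List<Long> leaderLog, List<Long> serverLog) {
        int keep = 0;
        while (keep < leaderLog.size() && keep < serverLog.size()
                && leaderLog.get(keep).equals(serverLog.get(keep))) {
            keep++;
        }
        List<Long> result = new ArrayList<>(serverLog.subList(0, keep));
        result.addAll(leaderLog.subList(keep, leaderLog.size()));
        return result;
    }

    public static void main(String[] args) {
        long z11 = zxid(1, 1), z12 = zxid(1, 2), z13 = zxid(1, 3), z21 = zxid(2, 1);

        // Case 1: z12 was logged by follower A, but no COMMIT arrived before the crash.
        List<Long> followerA = Arrays.asList(z11, z12);
        List<Long> followerB = Arrays.asList(z11);
        List<Long> newLeader = electLog(Arrays.asList(followerB, followerA));
        System.out.println("B keeps z12 after sync: " + sync(newLeader, followerB).contains(z12));

        // Case 2: z13 exists only on the old Leader; the new Leader has moved on to epoch 2.
        List<Long> oldLeader = Arrays.asList(z11, z12, z13);
        List<Long> epoch2Leader = Arrays.asList(z11, z12, z21);
        System.out.println("old Leader keeps z13 after sync: " + sync(epoch2Leader, oldLeader).contains(z13));
    }
}
```

In case 1 the transaction survives because at least one member of the new quorum already has it in its log. In case 2 the proposal disappears because it is not part of the new Leader's history.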
Figure from the paper: Zab: High-performance broadcast for primary-backup systems
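1) Finding the Leader:
The first thing every server does after starting up is look for the Leader.
Once the Leader has been elected, it starts accepting connections from Followers. //Leader.java#449 lead()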
// Start thread that waits for connection requests from
// new followers.
cnxAcceptor = new LearnerCnxAcceptor();
cnxAcceptor.setName("LearnerCnxAcceptor-" + ss.getLocalSocketAddress());
cnxAcceptor.start();
// Build the NEWLEADER packet
newLeaderProposal.packet = new QuorumPacket(NEWLEADER, zk.getZxid(), null, null);
// Wait for acknowledgements
// We have to get at least a majority of servers in sync with
// us. We do this by waiting for the NEWLEADER packet to get
// acknowledged
waitForEpochAck(self.getId(), leaderStateSummary); // LearnerHandler.run also calls this
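The "wait until a majority has acknowledged" step above can be sketched roughly as follows. This is a simplified illustration, not the body of `waitForEpochAck`: the class `QuorumAckSketch` and its methods are hypothetical names, and it assumes a plain majority quorum in which the Leader counts its own acknowledgement.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a majority-ack barrier; not ZooKeeper's implementation.
class QuorumAckSketch {
    private final int ensembleSize;
    private final Set<Long> acked = new HashSet<>();

    QuorumAckSketch(int ensembleSize) {
        this.ensembleSize = ensembleSize;
    }

    // Record an acknowledgement from server `sid` and wake any waiter once a majority exists.
    synchronized void ack(long sid) {
        acked.add(sid);
        if (hasQuorum()) {
            notifyAll();
        }
    }

    // Assumption: a quorum is strictly more than half of the ensemble.
    synchronized boolean hasQuorum() {
        return acked.size() > ensembleSize / 2;
    }

    // Block until a majority has acknowledged or the timeout expires.
    // Returns true if the quorum was reached, false on timeout or interruption.
    synchronized boolean waitForQuorum(long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!hasQuorum()) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        QuorumAckSketch barrier = new QuorumAckSketch(3);
        barrier.ack(1L); // the Leader acknowledges itself
        barrier.ack(2L); // one follower acknowledges
        System.out.println("quorum reached: " + barrier.waitForQuorum(100));
    }
}
```

The same barrier can be called from the Leader's own thread and from each per-follower handler thread, which is how I read the note that LearnerHandler.run also calls `waitForEpochAck`.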
The Follower connects to the Leader and sends FOLLOWERINFO //Follower.java#74 registerWithLeader