- The observeLeader() flow
Once a Leader has been elected, a ZooKeeper server in the OBSERVING state creates an Observer instance and calls its observeLeader() method, which runs the flow below:

    void observeLeader() throws Exception {
        zk.registerJMX(new ObserverBean(this, zk), self.jmxLocalPeerBean);
        try {
            self.setZabState(QuorumPeer.ZabState.DISCOVERY);
            QuorumServer master = findLearnerMaster();
            try {
                connectToLeader(master.addr, master.hostname);
                long newLeaderZxid = registerWithLeader(Leader.OBSERVERINFO);
                if (self.isReconfigStateChange()) {
                    throw new Exception("learned about role change");
                }
                self.setLeaderAddressAndId(master.addr, master.getId());
                self.setZabState(QuorumPeer.ZabState.SYNCHRONIZATION);
                syncWithLeader(newLeaderZxid);
                self.setZabState(QuorumPeer.ZabState.BROADCAST);
                QuorumPacket qp = new QuorumPacket();
                while (this.isRunning() && nextLearnerMaster.get() == null) {
                    readPacket(qp);
                    processPacket(qp);
                }
            } catch (Exception e) {
                LOG.warn("Exception when observing the leader", e);
                closeSocket();
                // clear pending revalidations
                pendingRevalidations.clear();
            }
        } finally {
            currentLearnerMaster = null;
            zk.unregisterJMX(this);
        }
    }

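Step by step:
- 1. Register the JMX bean.
- 2. Find a LearnerMaster.
- 3. Connect to the LearnerMaster (covered in the previous chapter).
- 4. Register as a Learner and obtain the new cluster epoch, returned as newLeaderZxid (covered in the previous chapter).
- 5. Synchronize data between the Observer and the LearnerMaster. The details of synchronization are covered in a later chapter.
- 6. Loop, reading and processing the packets sent by the LearnerMaster, until the Observer stops running or a new LearnerMaster is requested through nextLearnerMaster.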
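The control flow of the steps above can be sketched without any ZooKeeper classes. This is a minimal illustration, not the real implementation: `ObserveFlowSketch`, `Step`, and the string trace are hypothetical names introduced here. It only shows how the nested try/catch/finally behaves: a failure in any step is caught, the socket is closed, and the JMX cleanup still runs.

```java
import java.util.ArrayList;
import java.util.List;

final class ObserveFlowSketch {
    /** Hypothetical stand-in for one step of the flow (connect and register, sync, packet loop). */
    interface Step {
        void run() throws Exception;
    }

    /** Returns the phases reached, in order, ending with the cleanup that always runs. */
    static List<String> observe(Step connectAndRegister, Step sync, Step packetLoop) {
        List<String> trace = new ArrayList<>();
        try {
            trace.add("DISCOVERY");
            try {
                connectAndRegister.run();
                trace.add("SYNCHRONIZATION");
                sync.run();
                trace.add("BROADCAST");
                packetLoop.run();
            } catch (Exception e) {
                // Mirrors the inner catch: log, close the socket, clear pending revalidations.
                trace.add("closeSocket");
            }
        } finally {
            // Mirrors the outer finally: reset currentLearnerMaster and unregister JMX.
            trace.add("unregisterJMX");
        }
        return trace;
    }

    public static void main(String[] args) {
        Step ok = () -> { };
        Step fail = () -> { throw new Exception("simulated failure"); };
        System.out.println(observe(ok, ok, ok));
        System.out.println(observe(ok, fail, ok));
    }
}
```

Note that the exception never leaves the method in this sketch, just as in the listing: the caller only sees observeLeader() return and can then decide what to do next.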
Finding the LearnerMaster: findLearnerMaster()

    // Observer: choose the server to connect to
    private QuorumServer findLearnerMaster() {
        QuorumPeer.QuorumServer prescribedLearnerMaster = nextLearnerMaster.getAndSet(null);
        if (prescribedLearnerMaster != null && self.validateLearnerMaster(Long.toString(prescribedLearnerMaster.id)) == null) {
            LOG.warn("requested next learner master {} is no longer valid", prescribedLearnerMaster);
            prescribedLearnerMaster = null;
        }
        final QuorumPeer.QuorumServer master = (prescribedLearnerMaster == null) ?
                self.findLearnerMaster(findLeader()) :
                prescribedLearnerMaster;
        currentLearnerMaster = master;
        if (master == null) {
            LOG.warn("No learner master found");
        } else {
            LOG.info("Observing new leader sid={} addr={}", master.id, master.addr);
        }
        return master;
    }

    // QuorumPeer: the methods below are the ones called above through self
    QuorumServer validateLearnerMaster(String desiredMaster) {
        if (useObserverMasters()) {
            Long sid;
            try {
                sid = Long.parseLong(desiredMaster);
            } catch (NumberFormatException e) {
                sid = null;
            }
            for (QuorumServer server : observerMasters) {
                if (sid == null) {
                    String serverAddr = server.addr.getAddress().getHostAddress() + ':' + server.addr.getPort();
                    if (serverAddr.startsWith(desiredMaster)) {
                        return server;
                    }
                } else {
                    if (sid.equals(server.id)) {
                        return server;
                    }
                }
            }
            if (sid == null) {
                LOG.info("could not find learner master address={}", desiredMaster);
            } else {
                LOG.warn("could not find learner master sid={}", sid);
            }
        } else {
            LOG.info("cannot validate request, observer masters not enabled");
        }
        return null;
    }

    QuorumServer findLearnerMaster(QuorumServer leader) {
        if (useObserverMasters()) {
            return nextObserverMaster();
        } else {
            // Add delay jitter to reduce the load on the leader
            if (isRunning()) {
                Observer.waitForReconnectDelay();
            }
            return leader;
        }
    }

    private boolean useObserverMasters() {
        return getLearnerType() == LearnerType.OBSERVER && observerMasters.size() > 0;
    }

    private int nextObserverMaster = 0;

    private QuorumServer nextObserverMaster() {
        if (nextObserverMaster >= observerMasters.size()) {
            nextObserverMaster = 0;
            // Add a reconnect delay only after the observer
            // has exhausted trying to connect to all the masters
            // from the observerMasterList
            if (isRunning()) {
                Observer.waitForReconnectDelay();
            }
        }
        return observerMasters.get(nextObserverMaster++);
    }

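An Observer looks for its Leader slightly differently from a Follower. As mentioned earlier, a Follower in followLeader decides whether to start the ObserverMaster service depending on whether an ObserverMaster port is configured. The same switch applies here: when ObserverMasters are in use, the Observer connects to an ObserverMaster instead of the Leader, taking the next entry of the observerMasters list on each attempt and wrapping around (after a reconnect delay) once every entry has been tried. ObserverMaster is disabled by default, so by default the Observer connects to the cluster Leader.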
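The selection rule can be tried out in isolation. The sketch below is a simplified model, not the QuorumPeer code: `ObserverMasterPicker` is a hypothetical class, servers are plain strings, the LearnerType check and the isRunning() guard are left out, and a counter stands in for each place where the real code would call Observer.waitForReconnectDelay().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ObserverMasterPicker {
    private final List<String> observerMasters; // assumption: addresses as plain strings
    private int next = 0;                       // plays the role of nextObserverMaster
    private int delays = 0;                     // counts simulated reconnect delays

    ObserverMasterPicker(List<String> observerMasters) {
        this.observerMasters = observerMasters;
    }

    /** Returns the server to connect to: the next ObserverMaster if any exist, else the leader. */
    String pick(String leader) {
        if (observerMasters.isEmpty()) {
            delays++; // without ObserverMasters, every attempt against the leader is delayed
            return leader;
        }
        if (next >= observerMasters.size()) {
            next = 0;
            delays++; // delay only after the whole list has been tried once
        }
        return observerMasters.get(next++);
    }

    int delays() {
        return delays;
    }

    public static void main(String[] args) {
        ObserverMasterPicker picker = new ObserverMasterPicker(Arrays.asList("om1:2191", "om2:2191"));
        for (int i = 0; i < 5; i++) {
            System.out.println(picker.pick("leader:2888") + " (delays so far: " + picker.delays() + ")");
        }
        ObserverMasterPicker noMasters = new ObserverMasterPicker(Collections.emptyList());
        System.out.println(noMasters.pick("leader:2888"));
    }
}
```

The asymmetry is the point of the design: with ObserverMasters, a failed attempt moves straight on to the next candidate and the delay is paid once per full pass, whereas with only the Leader available every reconnect is delayed to keep load off it.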
Processing the packets sent by the LearnerMaster

    protected void processPacket(QuorumPacket qp) throws Exception{
        switch (qp.getType()) {
        case Leader.PING:
            ping(qp);
            break;
        case Leader.PROPOSAL:
            LOG.warn("Ignoring proposal");
            break;
        case Leader.COMMIT:
            LOG.warn("Ignoring commit");
            break;
        case Leader.UPTODATE:
            LOG.error("Received an UPTODATE message after Observer started");
            break;
        case Leader.REVALIDATE:
            revalidate(qp);
            break;
        case Leader.SYNC:
            ((ObserverZooKeeperServer)zk).sync();
            break;
        case Leader.INFORM:
            ServerMetrics.getMetrics().LEARNER_COMMIT_RECEIVED_COUNT.add(1);
            TxnHeader hdr = new TxnHeader();
            Record txn = SerializeUtils.deserializeTxn(qp.getData(), hdr);
            Request request = new Request (hdr.getClientId(), hdr.getCxid(), hdr.getType(), hdr, txn, 0);
            request.logLatency(ServerMetrics.getMetrics().COMMIT_PROPAGATION_LATENCY);
            ObserverZooKeeperServer obs = (ObserverZooKeeperServer)zk;
            obs.commitRequest(request);
            break;
        case Leader.INFORMANDACTIVATE:
            hdr = new TxnHeader();
            // get new designated leader from (current) leader's message
            ByteBuffer buffer = ByteBuffer.wrap(qp.getData());
            long suggestedLeaderId = buffer.getLong();
            byte[] remainingdata = new byte[buffer.remaining()];
            buffer.get(remainingdata);
            txn = SerializeUtils.deserializeTxn(remainingdata, hdr);
            QuorumVerifier qv = self.configFromString(new String(((SetDataTxn)txn).getData()));
            request = new Request (hdr.getClientId(), hdr.getCxid(), hdr.getType(), hdr, txn, 0);
            obs = (ObserverZooKeeperServer)zk;
            boolean majorChange =
                self.processReconfig(qv, suggestedLeaderId, qp.getZxid(), true);
            obs.commitRequest(request);
            if (majorChange) {
                throw new Exception("changes proposed in reconfig");
            }
            break;
        default:
            LOG.warn("Unknown packet type: {}", LearnerHandler.packetToString(qp));
            break;
        }
    }

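Packet by packet:
- PING: handles the ping packet (covered in the followLeader chapter).
- PROPOSAL: an Observer does not take part in proposals, so the packet is ignored with a warning.
- COMMIT: an Observer does not process commit requests either, so the packet is ignored with a warning.
- UPTODATE: the Leader's "you are up to date" message. The Observer is already running by this point, so it only logs an error and does nothing else.
- REVALIDATE: re-validates and activates a session when a client connects to this server (covered in the followLeader chapter).
- SYNC: handles a sync request by calling ObserverZooKeeperServer.sync().
- INFORM: a special "commit request" that carries the transaction itself.
- INFORMANDACTIVATE: handles a reconfig transaction and commits the new configuration. If processReconfig reports a major change, an exception is thrown after the commit, which ends the packet loop.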
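How INFORM differs from COMMIT, and INFORMANDACTIVATE from COMMITANDACTIVATE
- Because an Observer processes neither PROPOSAL nor COMMIT packets, it holds no copy of a transaction while voting and commit are in progress. Once the transaction has been committed, the Leader therefore sends an INFORM packet to tell the Observer to commit it. What arrives is only the serialized bytes of the transaction, so the Observer has to deserialize them itself before committing.
- The same reasoning applies to INFORMANDACTIVATE versus COMMITANDACTIVATE: the INFORMANDACTIVATE packet carries the serialized reconfig transaction, preceded by the suggested leader id, and the Observer deserializes it before committing.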
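The INFORMANDACTIVATE case reads its payload as an 8-byte long followed by the remaining bytes. The sketch below reproduces just that framing with `java.nio.ByteBuffer`. It is an illustration under assumptions: `InformAndActivatePayloadSketch` and its `encode` method are hypothetical (the sending side is not shown in this chapter), and the transaction bytes are a placeholder string rather than a real serialized txn.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class InformAndActivatePayloadSketch {
    /** Assumed sender side: 8-byte suggested leader id, then opaque transaction bytes. */
    static byte[] encode(long suggestedLeaderId, byte[] txnBytes) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES + txnBytes.length);
        buffer.putLong(suggestedLeaderId);
        buffer.put(txnBytes);
        return buffer.array();
    }

    /** Receiver side, step 1: the leading long is the suggested leader id. */
    static long suggestedLeaderId(byte[] payload) {
        return ByteBuffer.wrap(payload).getLong();
    }

    /** Receiver side, step 2: everything after the long is what gets deserialized as the txn. */
    static byte[] remainingData(byte[] payload) {
        ByteBuffer buffer = ByteBuffer.wrap(payload);
        buffer.getLong(); // skip the leader id
        byte[] remaining = new byte[buffer.remaining()];
        buffer.get(remaining);
        return remaining;
    }

    public static void main(String[] args) {
        byte[] placeholderTxn = "placeholder-config".getBytes(StandardCharsets.UTF_8);
        byte[] payload = encode(3L, placeholderTxn);
        System.out.println("suggested leader id: " + suggestedLeaderId(payload));
        System.out.println("txn bytes: " + new String(remainingData(payload), StandardCharsets.UTF_8));
    }
}
```

An INFORM payload has no such prefix: in the listing, qp.getData() is passed to SerializeUtils.deserializeTxn as is.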