Hands-On ZooKeeper Source Code Walkthrough: How the Ping Liveness Check Works Between ZooKeeper Cluster Nodes

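The previous two articles,

Hands-On ZooKeeper Source Code Walkthrough: How ZooKeeper Recovers Data After a Failure Restart (Part 1)

Hands-On ZooKeeper Source Code Walkthrough: How ZooKeeper Recovers Data After a Failure Restart (Part 2)

covered in detail how ZooKeeper recovers its data when it restarts after a failure. This article picks up from there: once the leader and a follower have finished synchronizing data, what else does the leader do, and how is the ping mechanism implemented?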

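As described in the previous article, once the leader has finished synchronizing data it sends an UPTODATE packet to the follower, telling it that synchronization is complete and that it can start serving clients.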

queuedPackets.add(new QuorumPacket(Leader.UPTODATE, -1, null, null));

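The code below comes right after that line. For every follower that connects, the leader creates a dedicated LearnerHandler thread to talk to it. After this thread finishes synchronizing data, it enters a while(true) loop and waits for packets from the follower: forwarded write requests, ACKs, ping replies, and so on.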

// 2PC + majority-write mechanism: wait for write requests
            while (true) {
                qp = new QuorumPacket();
                ia.readRecord(qp, "packet");

                long traceMask = ZooTrace.SERVER_PACKET_TRACE_MASK;
                if (qp.getType() == Leader.PING) {
                    traceMask = ZooTrace.SERVER_PING_TRACE_MASK;
                }
                if (LOG.isTraceEnabled()) {
                    ZooTrace.logQuorumPacket(LOG, traceMask, 'i', qp);
                }
                tickOfNextAckDeadline = leader.self.tick.get() + leader.self.syncLimit;


                ByteBuffer bb;
                long sessionId;
                int cxid;
                int type;

                switch (qp.getType()) {
                case Leader.ACK:
                    if (this.learnerType == LearnerType.OBSERVER) {
                        if (LOG.isDebugEnabled()) {
                            LOG.debug("Received ACK from Observer  " + this.sid);
                        }
                    }
                    syncLimitCheck.updateAck(qp.getZxid());
                    leader.processAck(this.sid, qp.getZxid(), sock.getLocalSocketAddress());
                    break;
                case Leader.PING:
                    // Process the touches
                    ByteArrayInputStream bis = new ByteArrayInputStream(qp
                            .getData());
                    DataInputStream dis = new DataInputStream(bis);
                    while (dis.available() > 0) {
                        long sess = dis.readLong();
                        int to = dis.readInt();
                        leader.zk.touch(sess, to);
                    }
                    break;
                case Leader.REVALIDATE:
                    bis = new ByteArrayInputStream(qp.getData());
                    dis = new DataInputStream(bis);
                    long id = dis.readLong();
                    int to = dis.readInt();
                    ByteArrayOutputStream bos = new ByteArrayOutputStream();
                    DataOutputStream dos = new DataOutputStream(bos);
                    dos.writeLong(id);
                    boolean valid = leader.zk.touch(id, to);
                    if (valid) {
                        try {
                            //set the session owner
                            // as the follower that
                            // owns the session
                            leader.zk.setOwner(id, this);
                        } catch (SessionExpiredException e) {
                            LOG.error("Somehow session " + Long.toHexString(id) + " expired right after being renewed! (impossible)", e);
                        }
                    }
                    if (LOG.isTraceEnabled()) {
                        ZooTrace.logTraceMessage(LOG,
                                                 ZooTrace.SESSION_TRACE_MASK,
                                                 "Session 0x" + Long.toHexString(id)
                                                 + " is valid: "+ valid);
                    }
                    dos.writeBoolean(valid);
                    qp.setData(bos.toByteArray());
                    queuedPackets.add(qp);
                    break;
                case Leader.REQUEST:                    
                    bb = ByteBuffer.wrap(qp.getData());
                    sessionId = bb.getLong();
                    cxid = bb.getInt();
                    type = bb.getInt();
                    bb = bb.slice();
                    Request si;
                    if(type == OpCode.sync){
                        si = new LearnerSyncRequest(this, sessionId, cxid, type, bb, qp.getAuthinfo());
                    } else {
                        si = new Request(null, sessionId, cxid, type, bb, qp.getAuthinfo());
                    }
                    si.setOwner(this);
                    leader.zk.submitRequest(si);
                    break;
                default:
                    LOG.warn("unexpected quorum packet, type: {}", packetToString(qp));
                    break;
                }
            }
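
The line `tickOfNextAckDeadline = leader.self.tick.get() + leader.self.syncLimit;` in the loop above pushes the handler's deadline forward every time a packet arrives from the learner. As a rough illustration of that bookkeeping, here is a minimal sketch. The class `AckDeadlineSketch` and its method names are hypothetical and are not part of ZooKeeper; I am also assuming that the liveness check simply compares the current tick with the stored deadline.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of the deadline bookkeeping; not ZooKeeper's LearnerHandler.
class AckDeadlineSketch {
    private final AtomicLong tick;   // shared tick counter, advanced by the leader's main loop
    private final int syncLimit;     // number of ticks a learner may stay silent
    private volatile long tickOfNextAckDeadline;

    AckDeadlineSketch(AtomicLong tick, int syncLimit) {
        this.tick = tick;
        this.syncLimit = syncLimit;
        this.tickOfNextAckDeadline = tick.get() + syncLimit;
    }

    // Called whenever any packet is read from the learner.
    void onPacketReceived() {
        tickOfNextAckDeadline = tick.get() + syncLimit;
    }

    // True while the learner has been heard from recently enough (assumed check).
    boolean withinDeadline() {
        return tick.get() <= tickOfNextAckDeadline;
    }
}
```

With a syncLimit of 2, a learner that stays silent while the tick advances from 0 to 3 falls outside the deadline, and a single incoming packet brings it back inside.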

That completes the analysis of the LearnerHandler thread's startup code. Next we return to Leader.lead() and keep reading from where we left off.

            readyToStart = true;
            // get the epoch
            long epoch = getEpochToPropose(self.getId(), self.getAcceptedEpoch());
            
            zk.setZxid(ZxidUtils.makeZxid(epoch, 0));
            
            synchronized(this){
                lastProposed = zk.getZxid();
            }
            
            newLeaderProposal.packet = new QuorumPacket(NEWLEADER, zk.getZxid(),
                    null, null);


            // wait for the epoch ACKs
            waitForEpochAck(self.getId(), leaderStateSummary);
            self.setCurrentEpoch(epoch);

            try {
                // wait for the NEWLEADER ACKs
                waitForNewLeaderAck(self.getId(), zk.getZxid());
            } catch (InterruptedException e) {
                shutdown("Waiting for a quorum of followers, only synced with sids: [ "
                        + getSidSetString(newLeaderProposal.ackSet) + " ]");
                HashSet<Long> followerSet = new HashSet<Long>();
                for (LearnerHandler f : learners)
                    followerSet.add(f.getSid());
                    
                if (self.getQuorumVerifier().containsQuorum(followerSet)) {
                    LOG.warn("Enough followers present. "
                            + "Perhaps the initTicks need to be increased.");
                }
                Thread.sleep(self.tickTime);
                self.tick.incrementAndGet();
                return;
            }

This code was analyzed in the previous two articles, so we won't go through it again here.

Next, let's focus on the following line:

startZkServer();

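This starts the ZooKeeper server. When we analyzed follower startup earlier, we saw that the follower calls zk.startup() to initialize its processor chain. If that part is unclear, see this earlier article: Hands-On ZooKeeper Source Code Walkthrough: ZooKeeper's sessionId Generation Strategy and Follower Processor Chain Initialization

We can guess that this method likewise initializes the leader's processor chain, so that when a write request arrives it is passed step by step through the processors that were set up here.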

    private synchronized void startZkServer() {
        // Update lastCommitted and Db's zxid to a value representing the new epoch
        lastCommitted = zk.getZxid();
        // look familiar?
        zk.startup();
        self.updateElectionVote(getEpoch());

        zk.getZKDatabase().setlastProcessedZxid(zk.getZxid());
    }

Does zk.startup() look familiar? It, too, ends up in the startup() method of the parent class ZooKeeperServer.

    public synchronized void startup() {
        if (sessionTracker == null) {
            createSessionTracker();
        }
        // start the periodic task that cleans up sessions
        startSessionTracker();
        // initialize the processor chain of the leader, follower, or observer
        setupRequestProcessors();

        registerJMX();

        setState(State.RUNNING);
        notifyAll();
    }

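The only difference is which implementation of setupRequestProcessors() gets called: the leader uses LeaderZooKeeperServer, the follower uses FollowerZooKeeperServer, and the observer uses ObserverZooKeeperServer. Earlier articles covered this; if you are interested, see: Hands-On ZooKeeper Source Code Walkthrough: What Does a Follower Do When It Starts Up in ZooKeeper?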

Let's step into LeaderZooKeeperServer.setupRequestProcessors():

@Override
    protected void setupRequestProcessors() {
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        RequestProcessor toBeAppliedProcessor = new Leader.ToBeAppliedRequestProcessor(
                finalProcessor, getLeader().toBeApplied);
        
        commitProcessor = new CommitProcessor(toBeAppliedProcessor,
                Long.toString(getServerId()), false,
                getZooKeeperServerListener());
        commitProcessor.start();
        
        ProposalRequestProcessor proposalProcessor = new ProposalRequestProcessor(this,
                commitProcessor);
        proposalProcessor.initialize();
        firstProcessor = new PrepRequestProcessor(this, proposalProcessor);
        ((PrepRequestProcessor)firstProcessor).start();
    }

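Laid out as a chain, the wiring above is: PrepRequestProcessor -> ProposalRequestProcessor -> CommitProcessor -> ToBeAppliedRequestProcessor -> FinalRequestProcessor.

That is roughly the series of initializations and thread starts that takes place. How the chain is actually invoked will be analyzed in detail when we look at how the ZooKeeper cluster handles a request sent by a client.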
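
The wiring in setupRequestProcessors() is built back to front: each processor receives the next one in its constructor. The sketch below shows only that chaining pattern. `StageSketch`, `NamedStage`, and `ProcessorChainSketch` are hypothetical stand-ins, and the sketch is synchronous, whereas the real PrepRequestProcessor and CommitProcessor are started as threads in the code above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not ZooKeeper's processor classes.
interface StageSketch {
    void process(String request);
}

class NamedStage implements StageSketch {
    private final String name;
    private final StageSketch next;     // null for the last stage in the chain
    private final List<String> trace;   // records the order in which stages run

    NamedStage(String name, StageSketch next, List<String> trace) {
        this.name = name;
        this.next = next;
        this.trace = trace;
    }

    @Override
    public void process(String request) {
        trace.add(name);
        if (next != null) {
            next.process(request);
        }
    }
}

class ProcessorChainSketch {
    // Builds the chain from the last stage to the first, then sends one request through it.
    static List<String> run(String request) {
        List<String> trace = new ArrayList<>();
        StageSketch fin = new NamedStage("Final", null, trace);
        StageSketch toBeApplied = new NamedStage("ToBeApplied", fin, trace);
        StageSketch commit = new NamedStage("Commit", toBeApplied, trace);
        StageSketch proposal = new NamedStage("Proposal", commit, trace);
        StageSketch first = new NamedStage("Prep", proposal, trace);
        first.process(request);
        return trace;
    }
}
```

Calling `ProcessorChainSketch.run("create /a")` visits the stages in the order Prep, Proposal, Commit, ToBeApplied, Final, which is the reverse of the order in which they were constructed.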

The code that follows:

        String initialZxid = System.getProperty("zookeeper.testingonly.initialZxid");
            if (initialZxid != null) {
                long zxid = Long.parseLong(initialZxid);
                zk.setZxid((zk.getZxid() & 0xffffffff00000000L) | zxid);
            }
            
            if (!System.getProperty("zookeeper.leaderServes", "yes").equals("no")) {
                self.cnxnFactory.setZooKeeperServer(zk);
            }

There is not much to analyze here and it is not important for this article, so we skip it.

            boolean tickSkip = true;
            
            while (true) {
                Thread.sleep(self.tickTime / 2);
                if (!tickSkip) {
                    self.tick.incrementAndGet();
                }
                HashSet<Long> syncedSet = new HashSet<Long>();

                // lock on the followers when we use it.
                syncedSet.add(self.getId());

                for (LearnerHandler f : getLearners()) {
                    if (f.synced() && f.getLearnerType() == LearnerType.PARTICIPANT) {
                        syncedSet.add(f.getSid());
                    }
                    f.ping();
                }

                // check leader running status
                if (!this.isRunning()) {
                    shutdown("Unexpected internal error");
                    return;
                }

              if (!tickSkip && !self.getQuorumVerifier().containsQuorum(syncedSet)) {
                //if (!tickSkip && syncedCount < self.quorumPeers.size() / 2) {
                    return;
              } 
              tickSkip = !tickSkip;
            }

The leader then enters a while loop that runs once every tickTime / 2, that is, twice per tickTime. The key code is:

                for (LearnerHandler f : getLearners()) {
                    // Synced set is used to check we have a supporting quorum, so only
                    // PARTICIPANT, not OBSERVER, learners should be used
                    if (f.synced() && f.getLearnerType() == LearnerType.PARTICIPANT) {
                        syncedSet.add(f.getSid());
                    }
                    f.ping();
                }
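
The effect of the tickSkip flag in the loop is easy to miss: learners are pinged on every iteration, but the tick counter is advanced and the quorum is checked only on every second iteration. The following simulation reproduces just that cadence. `HalfTickLoopSketch` is a hypothetical name, and the sleeping, networking, and real quorum verification are deliberately left out.

```java
// Hypothetical simulation of the half-tick cadence; no sleeping or networking is involved.
class HalfTickLoopSketch {
    // Returns {pingRounds, tickIncrements, quorumChecks} after the given number of iterations.
    static int[] simulate(int iterations) {
        boolean tickSkip = true;
        int pingRounds = 0;
        int tickIncrements = 0;
        int quorumChecks = 0;
        for (int i = 0; i < iterations; i++) {
            // The real loop sleeps for tickTime / 2 at this point.
            if (!tickSkip) {
                tickIncrements++;   // corresponds to self.tick.incrementAndGet()
            }
            pingRounds++;           // every learner is pinged on each iteration
            if (!tickSkip) {
                quorumChecks++;     // containsQuorum(syncedSet) is only evaluated here
            }
            tickSkip = !tickSkip;
        }
        return new int[] {pingRounds, tickIncrements, quorumChecks};
    }
}
```

Four iterations, which correspond to two full ticks, give four ping rounds, two tick increments, and two quorum checks.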

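getLearners() returns the LearnerHandlers kept in the leader's learners set, one for every learner that has connected to the leader. If this is unclear, see the previous article: when a follower connects to the leader, the leader creates and starts a LearnerHandler thread to handle reads and writes with that follower, and when the thread starts it is added to the learners set.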

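The loop iterates over every LearnerHandler. If the handler's learner is still in sync (f.synced()) and is a PARTICIPANT, its sid is added to syncedSet, which holds the voting members used for the quorum check. Then comes the key step: f.ping().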

    public void ping() {
        long id;
        if (syncLimitCheck.check(System.nanoTime())) {
            synchronized(leader) {
                id = leader.lastProposed;
            }
            QuorumPacket ping = new QuorumPacket(Leader.PING, id, null, null);
            queuePacket(ping);
        } else {
            LOG.warn("Closing connection to peer due to transaction timeout.");
            shutdown();
        }
    }

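In other words, it wraps a PING packet and queues it to be sent to the follower or observer. If syncLimitCheck.check() fails, the connection to that peer is closed instead.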
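
The article does not show the implementation of syncLimitCheck, so the following is only a guess at the kind of check that guards ping(). It assumes that the leader remembers when the oldest unacknowledged proposal was sent and treats the peer as timed out once that proposal has waited longer than tickTime * syncLimit. The class `SyncLimitSketch` and all of its method names are hypothetical.

```java
// Hypothetical, simplified timeout check; the real SyncLimitCheck is not shown in this article.
class SyncLimitSketch {
    private final long limitNanos;   // assumed budget: tickTime * syncLimit, in nanoseconds
    private long pendingZxid = 0;
    private long pendingSince = 0;   // 0 means no proposal is waiting for an ACK

    SyncLimitSketch(long tickTimeMs, int syncLimit) {
        this.limitNanos = tickTimeMs * syncLimit * 1_000_000L;
    }

    // Remember the oldest proposal that has not been acknowledged yet.
    synchronized void onProposalSent(long zxid, long nowNanos) {
        if (pendingSince == 0) {
            pendingZxid = zxid;
            pendingSince = nowNanos;
        }
    }

    // An ACK for the tracked proposal clears the pending state.
    synchronized void onAck(long zxid) {
        if (zxid == pendingZxid) {
            pendingZxid = 0;
            pendingSince = 0;
        }
    }

    // True if nothing is pending or the pending proposal is still within budget.
    synchronized boolean check(long nowNanos) {
        return pendingSince == 0 || nowNanos - pendingSince < limitNanos;
    }
}
```

Under these assumptions, a peer that keeps acknowledging proposals always passes the check, while a peer that sits on one proposal for the whole budget causes ping() to take the shutdown branch.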

 

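Next, let's see how the follower handles the ping.

As analyzed in the previous article, once the follower has finished synchronizing data with the leader, it runs the following loop and waits for packets from the leader: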

                 while (this.isRunning()) {
                    readPacket(qp); // read a packet from the leader
                    processPacket(qp);
                 }

readPacket(qp) reads the data sent by the leader and deserializes it into the qp object.

    void readPacket(QuorumPacket pp) throws IOException {
        synchronized (leaderIs) {
            leaderIs.readRecord(pp, "packet");
        }
        long traceMask = ZooTrace.SERVER_PACKET_TRACE_MASK;
        if (pp.getType() == Leader.PING) {
            traceMask = ZooTrace.SERVER_PING_TRACE_MASK;
        }
        if (LOG.isTraceEnabled()) {
            ZooTrace.logQuorumPacket(LOG, traceMask, 'i', pp);
        }
    }

Right after that, processPacket(qp) is called to handle the packet that was just read.

Here is only the small part of it that matters for ping:

    protected void processPacket(QuorumPacket qp) throws IOException{
        switch (qp.getType()) {
        case Leader.PING:            
            ping(qp);            
            break;
        }
    }

If qp.getType() == Leader.PING, the ping method is called directly.

    protected void ping(QuorumPacket qp) throws IOException {
        // Send back the ping with our session data
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(bos);
        HashMap<Long, Integer> touchTable = zk
                .getTouchSnapshot();
        for (Entry<Long, Integer> entry : touchTable.entrySet()) {
            dos.writeLong(entry.getKey());
            dos.writeInt(entry.getValue());
        }
        qp.setData(bos.toByteArray());
        writePacket(qp, true);
    }

Put simply, it packs some data into the packet and sends it back to the leader. The part worth a closer look is zk.getTouchSnapshot():

    protected HashMap<Long, Integer> getTouchSnapshot() {
        if (sessionTracker != null) {
            return ((LearnerSessionTracker) sessionTracker).snapshot();
        }
        return new HashMap<Long, Integer>();
    }

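This fetches the session information the follower currently tracks in its touch table (each session's ID and timeout). On every ping, each follower sends this locally maintained session information back to the leader.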
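
The payload layout used by ping() is simple: a sequence of (long session ID, int timeout) pairs with no header or count. The round trip below mirrors the follower's write loop and the leader's read loop using only java.io. `PingPayloadSketch` is a hypothetical helper, and the session IDs and timeouts in the usage note are made-up sample values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative round trip of the PING payload: repeated (long sessionId, int timeout) pairs.
class PingPayloadSketch {
    // Follower side: serialize the touch table the same way ping() does.
    static byte[] encode(Map<Long, Integer> touchTable) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream dos = new DataOutputStream(bos);
            for (Map.Entry<Long, Integer> entry : touchTable.entrySet()) {
                dos.writeLong(entry.getKey());
                dos.writeInt(entry.getValue());
            }
            dos.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Leader side: read pairs until the buffer is exhausted, as the PING branch does.
    static Map<Long, Integer> decode(byte[] data) {
        try {
            Map<Long, Integer> result = new LinkedHashMap<>();
            DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data));
            while (dis.available() > 0) {
                long sess = dis.readLong();
                int to = dis.readInt();
                result.put(sess, to);
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each entry occupies 12 bytes (8 for the session ID plus 4 for the timeout), so a table with two sessions encodes to 24 bytes, and decoding those bytes yields the same two entries.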

 

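Now we can go back and see how the leader handles the reply. The code is the loop shown at the top of this article, and the following branch is executed: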

                case Leader.PING:
                    // Process the touches
                    ByteArrayInputStream bis = new ByteArrayInputStream(qp
                            .getData());
                    DataInputStream dis = new DataInputStream(bis);
                    while (dis.available() > 0) {
                        long sess = dis.readLong();
                        int to = dis.readInt();
                        leader.zk.touch(sess, to);
                    }
                    break;

For each session, this calls leader.zk.touch(), which leads to the following method:

synchronized public boolean touchSession(long sessionId, int timeout) {
        SessionImpl s = sessionsById.get(sessionId);
        // Return false, if the session doesn't exists or marked as closing
        if (s == null || s.isClosing()) {
            return false;
        }
        // compute the session's next expiry time
        long expireTime = roundToInterval(Time.currentElapsedTime() + timeout);
        // the session is already scheduled to expire at or after that time
        if (s.tickTime >= expireTime) {
            // Nothing needs to be done
            return true;
        }
        SessionSet set = sessionSets.get(s.tickTime);
        if (set != null) {
            set.sessions.remove(s);
        }
        s.tickTime = expireTime;
        set = sessionSets.get(s.tickTime);
        if (set == null) {
            set = new SessionSet();
            sessionSets.put(expireTime, set);
        }
        set.sessions.add(s);

        return true;
    }

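This is org.apache.zookeeper.server.SessionTrackerImpl#touchSession, where the call finally ends up.

The logic is that the leader uses the sessions sent by the follower to refresh the sessions it maintains locally: each touched session is moved to a later expiry bucket, while sessions that are not refreshed eventually expire and are removed. We will analyze that part later; for now it is enough to know the overall flow.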
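
To make the bucket movement in touchSession more concrete, here is a small self-contained model of it. `SessionBucketSketch` is hypothetical and much simpler than SessionTrackerImpl. The article does not show roundToInterval, so the rounding formula used here, rounding up to the next multiple of the expiration interval, is an assumption, and the current time is passed in explicitly instead of being read from a clock.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of expiry bucketing; not the real SessionTrackerImpl.
class SessionBucketSketch {
    private final long expirationInterval;   // bucket width in milliseconds
    private final Map<Long, Long> expiryBySession = new HashMap<>();
    private final Map<Long, Set<Long>> buckets = new HashMap<>();

    SessionBucketSketch(long expirationInterval) {
        this.expirationInterval = expirationInterval;
    }

    // Assumed rounding rule: round up to the next bucket boundary.
    long roundToInterval(long time) {
        return (time / expirationInterval + 1) * expirationInterval;
    }

    void register(long sessionId, long now, int timeout) {
        long expire = roundToInterval(now + timeout);
        expiryBySession.put(sessionId, expire);
        buckets.computeIfAbsent(expire, k -> new HashSet<>()).add(sessionId);
    }

    // Mirrors the shape of touchSession: move the session to a later bucket if needed.
    boolean touch(long sessionId, long now, int timeout) {
        Long current = expiryBySession.get(sessionId);
        if (current == null) {
            return false;                     // unknown session
        }
        long expire = roundToInterval(now + timeout);
        if (current >= expire) {
            return true;                      // nothing needs to be done
        }
        Set<Long> old = buckets.get(current);
        if (old != null) {
            old.remove(sessionId);
        }
        expiryBySession.put(sessionId, expire);
        buckets.computeIfAbsent(expire, k -> new HashSet<>()).add(sessionId);
        return true;
    }

    long expiryOf(long sessionId) {
        return expiryBySession.get(sessionId);
    }
}
```

With a 2000 ms interval, a session registered at time 1000 with a 3000 ms timeout lands in the 6000 bucket. A touch at time 1500 leaves it there, and a touch at time 3500 moves it to the 8000 bucket.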

 
