ZooKeeper (7): Cluster Call Chains (End of the Source Code Series)

Latest recommended article published on 2021-12-06 08:41:08

xqcode

Category: zookeeper  Tags: java

Article link: https://blog.csdn.net/qq_37822914/article/details/115320821

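Here is the diagram again:
[Figure]
For the request call chains, refer to the diagram above and the source code below. We start by analyzing the flow for actual client requests.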

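1. The client issues a non-transactional request, and it lands on a follower

The call chain in this case is:
[Figure]

Let's first look at the follower's overall processing logic.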
org.apache.zookeeper.server.quorum.FollowerZooKeeperServer#setupRequestProcessors

    @Override
    protected void setupRequestProcessors() {
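        // FollowerRequestProcessor: decides whether the current request is transactional.
        // If it is, the follower forwards it to the leader, which submits it to its own
        // processor chain and handles it like any other transactional request.
        // CommitProcessor: the commit processor. Non-transactional requests are handed
        // straight to nextProcessor; transactional requests wait until the ensemble's vote
        // on the Proposal allows it to be committed. This keeps transactional requests in order.
        // SendAckRequestProcessor: acknowledges transaction logging. Once the transaction
        // has been written to the log, it sends an ACK to the leader to report that this
        // server has finished logging it.
        // Chain 1: FollowerRequestProcessor-->CommitProcessor-->FinalRequestProcessor
        // SyncRequestProcessor: writes the transaction log and takes snapshots.
        // AckRequestProcessor (used on the leader): after SyncRequestProcessor has logged the
        // transaction, it sends an ACK to the Proposal vote collector to indicate that the
        // leader itself has logged the Proposal.
        // Chain 2: SyncRequestProcessor-->SendAckRequestProcessor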
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        commitProcessor = new CommitProcessor(finalProcessor,
                Long.toString(getServerId()), true,
                getZooKeeperServerListener());
        commitProcessor.start();
        firstProcessor = new FollowerRequestProcessor(this, commitProcessor);
        ((FollowerRequestProcessor) firstProcessor).start();
        syncProcessor = new SyncRequestProcessor(this,
                new SendAckRequestProcessor((Learner)getFollower()));
        syncProcessor.start();
    }

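The comments above are fairly detailed and worth reading carefully.
There are two processing chains here: FollowerRequestProcessor-->CommitProcessor-->FinalRequestProcessor and SyncRequestProcessor-->SendAckRequestProcessor. The second one is used when the leader issues a proposal, which does not happen in this scenario, so for now we only follow the first chain:
FollowerRequestProcessor-->CommitProcessor-->FinalRequestProcessor.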
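
The wiring order in setupRequestProcessors is easier to see in a stripped-down model. The sketch below is a hypothetical illustration, not ZooKeeper code: the Processor interface, the stage helper and the ProcessorChainSketch class are invented names, and each stage only records its own name before delegating. It shows that the chain is built from the tail (FinalRequestProcessor) back to the head (FollowerRequestProcessor), so a request entering the head visits the stages in the order listed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of a request-processor chain; not ZooKeeper's real classes. */
class ProcessorChainSketch {
    /** Minimal stand-in for the idea behind ZooKeeper's RequestProcessor interface. */
    interface Processor {
        void process(String request, List<String> trace);
    }

    /** A stage that records its name and then delegates to the next stage, if there is one. */
    static Processor stage(String name, Processor next) {
        return (request, trace) -> {
            trace.add(name);
            if (next != null) {
                next.process(request, trace);
            }
        };
    }

    /** Builds the follower's client-facing chain from tail to head. */
    static Processor followerChain() {
        Processor finalProcessor = stage("FinalRequestProcessor", null);
        Processor commitProcessor = stage("CommitProcessor", finalProcessor);
        return stage("FollowerRequestProcessor", commitProcessor);
    }

    /** Sends one request through the chain and returns the stage names in visiting order. */
    static List<String> run(String request) {
        List<String> trace = new ArrayList<>();
        followerChain().process(request, trace);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run("getData /foo"));
    }
}
```

In the real server each stage is a thread with its own queue, so the hand-off is asynchronous; the sketch keeps only the ordering.
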
Let's start with what FollowerRequestProcessor does.

    @Override
    public void run() {
        try {
            //https://www.cnblogs.com/sunshine-2015/p/10977200.html
            //https://blog.csdn.net/u014634338/article/details/106039491
            while (!finished) {
                Request request = queuedRequests.take();
                if (LOG.isTraceEnabled()) {
                    ZooTrace.logRequest(LOG, ZooTrace.CLIENT_REQUEST_TRACE_MASK,
                            'F', request, "");
                }
                if (request == Request.requestOfDeath) {
                    break;
                }
                // We want to queue the request to be processed before we submit
                // the request to the leader so that we are ready to receive
                // the response
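                /**
                 * Does this invoke the next processor's logic directly?
                 * Not really: processRequest only adds the request to the queuedRequests queue
                 * and wakes up all waiting threads. The real work happens in the CommitProcessor
                 * thread that was launched by its start().
                 * So what is the notifyAll() in processRequest for, and what does start() set in motion?
                 * See CommitProcessor's run() method.
                 * The TestCommitProcessor test is also worth a look; its logic does not match
                 * exactly, but the idea is the same.
                 */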
                nextProcessor.processRequest(request);
                
                // We now ship the request to the leader. As with all
                // other quorum operations, sync also follows this code
                // path, but different from others, we need to keep track
                // of the sync operations this follower has pending, so we
                // add it to pendingSyncs.
                switch (request.type) {
                case OpCode.sync:
                    zks.pendingSyncs.add(request);
                    zks.getFollower().request(request);
                    break;
                case OpCode.create:
                case OpCode.delete:
                case OpCode.setData:
                case OpCode.setACL:
                case OpCode.createSession:
                case OpCode.closeSession:
                case OpCode.multi:
                    // forward the write request to the leader
                    zks.getFollower().request(request);
                    break;
                }
            }
        } catch (Exception e) {
            handleException(this.getName(), e);
        }
        LOG.info("FollowerRequestProcessor exited loop!");
    }

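From the code and comments, FollowerRequestProcessor does one main thing: it checks whether the request is transactional and, if so, forwards a copy to the leader. Every request, transactional or not, is also passed to CommitProcessor's processRequest method, whose implementation is shown below. That method calls notifyAll() to wake any thread waiting on the CommitProcessor, which tells us that CommitProcessor's run method must call wait() somewhere.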

    synchronized public void processRequest(Request request) {
        // request.addRQRec(">commit");
        if (LOG.isDebugEnabled()) {
            LOG.debug("Processing request:: " + request);
        }
        
        if (!finished) {
            queuedRequests.add(request);
            notifyAll();
        }
    }
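
The enqueue-then-notifyAll pattern above pairs with a wait() loop on the consumer side. The following is a minimal sketch of that hand-off under simplifying assumptions: WaitNotifySketch, submit, drainUntilPoison and the POISON marker are hypothetical names, and requests are plain strings. It is not CommitProcessor itself, only the monitor pattern it relies on.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Hypothetical sketch of the synchronized/wait/notifyAll hand-off; not ZooKeeper code. */
class WaitNotifySketch {
    static final String POISON = "__stop__"; // hypothetical shutdown marker
    private final LinkedList<String> queued = new LinkedList<>();
    private final List<String> handled = new ArrayList<>();

    /** Producer side: enqueue the request, then wake the waiting worker. */
    synchronized void submit(String request) {
        queued.add(request);
        notifyAll();
    }

    /** Consumer side: sleeps in wait() until submit() signals that work is available. */
    void drainUntilPoison() throws InterruptedException {
        while (true) {
            String next;
            synchronized (this) {
                while (queued.isEmpty()) {
                    wait(); // releases the monitor so submit() can run
                }
                next = queued.remove();
                if (!POISON.equals(next)) {
                    handled.add(next);
                }
            }
            if (POISON.equals(next)) {
                return;
            }
        }
    }

    /** Starts a worker, submits two requests plus the stop marker, and returns what was handled. */
    static List<String> demo() {
        WaitNotifySketch sketch = new WaitNotifySketch();
        Thread worker = new Thread(() -> {
            try {
                sketch.drainUntilPoison();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        sketch.submit("a");
        sketch.submit("b");
        sketch.submit(POISON);
        try {
            worker.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        synchronized (sketch) {
            return new ArrayList<>(sketch.handled);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The wait() call sits inside a loop that rechecks the condition, which guards against spurious wakeups and against wakeups meant for a different condition.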

Picking up that question: how does CommitProcessor carry on from there?
org.apache.zookeeper.server.quorum.CommitProcessor#run

    @Override
    public void run() {
        try {
            Request nextPending = null;            
            while (!finished) {
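                // Assume there is only a single request.
                // First pass: len is certainly 0, because nothing has been added to toProcess yet.
                // Second pass: if len > 0, the request is non-transactional, so FinalRequestProcessor's
                // processRequest is called directly and the result is returned. If queuedRequests is now
                // empty (we assumed a single request, so it is), the thread blocks in wait().
                // When is it woken up?
                // By a new request being added to queuedRequests. Remember processRequest? The previous
                // processor calls it, and it wakes this thread with notifyAll().
                //      If len == 0, the request is transactional. The final processor is not called and
                //      the thread blocks in wait(). When is it woken up in this case?
                // Think of a database: a transaction is either committed or rolled back, so there has to
                // be a commit somewhere. Indeed, this class's commit method makes committedRequests
                // non-empty and wakes the thread. So when is commit called?
                // Once the leader has confirmed the proposal it sends a COMMIT, and the leader, followers
                // and observers all commit the request. On a follower, trace down from followLeader; the
                // others are similar. The last question is when the leader sends that COMMIT.
                // See the comment on this class's commit method.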
                int len = toProcess.size();
                for (int i = 0; i < len; i++) {
                    nextProcessor.processRequest(toProcess.get(i));
                }
                toProcess.clear();
                synchronized (this) {
                    // First pass: the previous processor called processRequest, so queuedRequests is
                    // non-empty and nextPending is null; the first operand is false, so the condition is false.
                    if ((queuedRequests.size() == 0 || nextPending != null)
                            && committedRequests.size() == 0) {
                        wait();
                        continue;
                    }
                    // First check and see if the commit came in for the pending
                    // request
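                    // First pass: same as above.
                    // Second pass: what situation is this? The queued requests have been handled, but an
                    // earlier one had not been committed yet. The entry is now taken from committedRequests
                    // and added to toProcess so that it goes to the next processor.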
                    if ((queuedRequests.size() == 0 || nextPending != null)
                            && committedRequests.size() > 0) {
                        Request r = committedRequests.remove();
                        /*
                         * We match with nextPending so that we can move to the
                         * next request when it is committed. We also want to
                         * use nextPending because it has the cnxn member set
                         * properly.
                         */
                        if (nextPending != null
                                && nextPending.sessionId == r.sessionId
                                && nextPending.cxid == r.cxid) {
                            // we want to send our version of the request.
                            // the pointer to the connection in the request
                            nextPending.hdr = r.hdr;
                            nextPending.txn = r.txn;
                            nextPending.zxid = r.zxid;
                            toProcess.add(nextPending);
                            nextPending = null;
                        } else {
                            // this request came from someone else so just
                            // send the commit packet
                            toProcess.add(r);
                        }
                    }
                }

                // We haven't matched the pending requests, so go back to
                // waiting
                if (nextPending != null) {
                    continue;
                }

                synchronized (this) {
                    // Process the next requests in the queuedRequests
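                    // The first pass satisfies this condition, so the request is taken off the queue.
                    // Suppose it is a create request: nextPending is set, we break, and the outer while
                    // loop starts its second pass.
                    // Second pass: normally not entered, because non-transactional requests were all
                    // drained in one go, and for a transactional request nextPending is already set.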
                    while (nextPending == null && queuedRequests.size() > 0) {
                        Request request = queuedRequests.remove();
                        switch (request.type) {
                        case OpCode.create:
                        case OpCode.delete:
                        case OpCode.setData:
                        case OpCode.multi:
                        case OpCode.setACL:
                        case OpCode.createSession:
                        case OpCode.closeSession:
                            nextPending = request;
                            break;
                        case OpCode.sync:
                            if (matchSyncs) {
                                nextPending = request;
                            } else {
                                toProcess.add(request);
                            }
                            break;
                        default:
                            // none of the above, so this is a non-transactional request (effectively a read)
                            toProcess.add(request);
                        }
                    }
                }
            }
        } catch (InterruptedException e) {
            LOG.warn("Interrupted exception while waiting", e);
        } catch (Throwable e) {
            LOG.error("Unexpected exception causing CommitProcessor to exit", e);
        }
        LOG.info("CommitProcessor exited loop!");
    }

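Read these comments together with the code. In short: read requests are passed straight to the next processor, while write requests wait until the ensemble's vote on the Proposal allows it to be committed. With CommitProcessor, every server can keep transactional requests strictly ordered. Transactional requests are more involved, so we leave them aside for now.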
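
To make the ordering rule concrete, here is a single-threaded model of the decision logic in run(). It is a sketch under stated assumptions, not the real class: CommitOrderingSketch, submit, commit and pump are hypothetical names, requests are plain strings, a request counts as a write if it starts with create, delete or setData, and the threading and wait() calls are replaced by an explicit pump() that runs until it would block.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical single-threaded model of CommitProcessor's ordering rule; not ZooKeeper code. */
class CommitOrderingSketch {
    private final Deque<String> queuedRequests = new ArrayDeque<>();
    private final Deque<String> committedRequests = new ArrayDeque<>();
    private final List<String> forwarded = new ArrayList<>();
    private String nextPending; // the write currently waiting for its commit, or null

    /** Assumption for this sketch: writes are recognised by their leading verb. */
    static boolean isWrite(String request) {
        return request.startsWith("create") || request.startsWith("delete")
                || request.startsWith("setData");
    }

    void submit(String request) { queuedRequests.add(request); pump(); }

    void commit(String request) { committedRequests.add(request); pump(); }

    /** Requests handed to the next processor so far, in order. */
    List<String> forwarded() { return new ArrayList<>(forwarded); }

    /** Runs the loop body until the real thread would have to wait(). */
    private void pump() {
        while (true) {
            if (nextPending != null) {
                if (committedRequests.isEmpty()) {
                    return; // blocked: a write is pending and no commit has arrived
                }
                String committed = committedRequests.remove();
                forwarded.add(committed);
                if (committed.equals(nextPending)) {
                    nextPending = null; // our own write was committed; reads may flow again
                }
            } else if (!queuedRequests.isEmpty()) {
                String request = queuedRequests.remove();
                if (isWrite(request)) {
                    nextPending = request; // hold everything behind this write
                } else {
                    forwarded.add(request); // reads pass straight through
                }
            } else if (!committedRequests.isEmpty()) {
                forwarded.add(committedRequests.remove()); // commit for someone else's request
            } else {
                return; // nothing to do
            }
        }
    }

    public static void main(String[] args) {
        CommitOrderingSketch sketch = new CommitOrderingSketch();
        sketch.submit("getData /a");
        sketch.submit("create /b");
        sketch.submit("getData /c"); // stuck behind the pending create
        System.out.println(sketch.forwarded());
        sketch.commit("create /b");
        System.out.println(sketch.forwarded());
    }
}
```

The read submitted after the write is not forwarded until the write's commit arrives, which is the per-server ordering guarantee described above.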

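Finally we reach FinalRequestProcessor's processRequest method, at org.apache.zookeeper.server.FinalRequestProcessor#processRequest. It is long and full of branches, so it is not reproduced here; remembering what it does is enough.
FinalRequestProcessor: performs the last steps before a response goes back to the client, including building the response. For transactional requests it also applies the transaction to the in-memory database. If the Request object carries transaction data, the processor applies the change to the ZooKeeper data tree; otherwise it reads from the data tree and returns the data to the client.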

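2. The client issues a non-transactional request, and it lands on the leader

The call chain in this case is:
[Figure]
Here we consider read requests only, not writes.
Start with the code and its comments, following only the read-request call chain and ignoring transactional requests.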

    @Override
    protected void setupRequestProcessors() {
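        // PrepRequestProcessor: the leader's request pre-processor. It does preparatory work such as
        // creating the transaction header and body and checking ACLs and versions.
        // ProposalRequestProcessor: the leader's proposal processor and the initiator of the transaction
        // flow. Non-transactional requests are passed straight on to CommitProcessor. Transactional
        // requests are also passed to CommitProcessor; in addition, a Proposal is created for the
        // request and sent to all followers to start an ensemble-wide vote, and the request is handed
        // to SyncRequestProcessor so that it is written to the transaction log.
        // SyncRequestProcessor: the transaction-log processor. It writes transactional requests to the
        // transaction log file and, when certain conditions are met, triggers a data snapshot.
        // AckRequestProcessor: after SyncRequestProcessor has logged the transaction, it sends an ACK to
        // the Proposal vote collector to indicate that the leader itself has logged the Proposal.
        // CommitProcessor: the commit processor. Non-transactional requests are handed straight to
        // nextProcessor; transactional requests wait until the ensemble's vote on the Proposal allows
        // it to be committed. This keeps transactional requests in order.
        // ToBeAppliedRequestProcessor: holds a toBeApplied queue of Proposals that CommitProcessor has
        // already handled and that can be committed. It passes these requests to FinalRequestProcessor
        // and removes them from toBeApplied once they have been processed.
        // FinalRequestProcessor: performs the last steps before a response goes back to the client,
        // including building the response; for transactional requests it also applies the transaction
        // to the in-memory database.
        // PrepRequestProcessor (pre-process)-->ProposalRequestProcessor (propose)-->CommitProcessor
        // (commit; this is where the majority acknowledgement from the followers is awaited)-->
        // Leader.ToBeAppliedRequestProcessor (apply)-->FinalRequestProcessor (respond)
        // (making a transactional request durable) PrepRequestProcessor (pre-process)-->
        // ProposalRequestProcessor (propose)-->SyncRequestProcessor (persist)-->AckRequestProcessor (persisted)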
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        RequestProcessor toBeAppliedProcessor = new Leader.ToBeAppliedRequestProcessor(
                finalProcessor, getLeader().toBeApplied);
        commitProcessor = new CommitProcessor(toBeAppliedProcessor,
                Long.toString(getServerId()), false,
                getZooKeeperServerListener());
        commitProcessor.start();
        ProposalRequestProcessor proposalProcessor = new ProposalRequestProcessor(this,
                commitProcessor);
        proposalProcessor.initialize();
        firstProcessor = new PrepRequestProcessor(this, proposalProcessor);
        ((PrepRequestProcessor)firstProcessor).start();
    }

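Together with the call chain this gives the general flow. The implementation details are not shown because they closely resemble the follower's, so here we only describe what happens to a non-transactional request as it moves through the chain.
1. PrepRequestProcessor: for a read request this is mostly validation; nothing substantive happens.
2. ProposalRequestProcessor: a non-transactional request is passed straight to the next processor.
3. CommitProcessor: a non-transactional request is likewise taken off the queue and handed to the next processor.
4. ToBeAppliedRequestProcessor: again, a non-transactional request is simply handed to the next processor.
5. FinalRequestProcessor: performs the last steps before a response goes back to the client, including building the response. For transactional requests it also applies the transaction to the in-memory database. If the Request object carries transaction data, the processor applies the change to the ZooKeeper data tree; otherwise it reads from the data tree and returns the data to the client.
That is all there is to the simple read path. Next comes the more complex write path.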

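3. The client issues a transactional request, and it lands on the leader

The flow is now much more complicated, as shown below:
[Figure]
As we saw above, a non-transactional request essentially just flows through each stage of the chain without any real processing. With a transactional request, nearly every component that was idle before comes into play.
Looking at the whole chain, the processors that do the most work and are hardest to follow are ProposalRequestProcessor, CommitProcessor, SyncRequestProcessor and AckRequestProcessor. What the others do should be clear by now, and they matter less, so we focus on these four. ProposalRequestProcessor is the most important of them.
ProposalRequestProcessor: we already know that PrepRequestProcessor's run method calls ProposalRequestProcessor's processRequest, so the request flows on to CommitProcessor. In addition, setupRequestProcessors calls proposalProcessor.initialize(), which starts the SyncRequestProcessor thread.
Its implementation is: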

    /**
     * initialize this processor
     */
    public void initialize() {
        syncProcessor.start();
    }

org.apache.zookeeper.server.SyncRequestProcessor#run

    @Override
    public void run() {
        try {
            int logCount = 0;

            // we do this in an attempt to ensure that not all of the servers
            // in the ensemble take a snapshot at the same time
            // make sure that not all servers take a snapshot at the same time
            setRandRoll(r.nextInt(snapCount/2));
            while (true) {
                Request si = null;
                if (toFlush.isEmpty()) {
                    // This take() blocks. When is something put in? By processRequest, which enqueues the request.
                    // On the leader it is ProposalRequestProcessor#processRequest that calls
                    // SyncRequestProcessor's processRequest.
                    si = queuedRequests.take();
                } else {
                    si = queuedRequests.poll();
                    if (si == null) {
                        flush(toFlush); // commits the batched log entries to disk, then passes them to the next processor, which sends the ACK
                        continue;
                    }
                }
                if (si == requestOfDeath) {
                    break;
                }
                if (si != null) {
                    // track the number of records written to the log
                    if (zks.getZKDatabase().append(si)) {
                        logCount++;
                        if (logCount > (snapCount / 2 + randRoll)) {
                            setRandRoll(r.nextInt(snapCount/2));
                            // roll the log
                            zks.getZKDatabase().rollLog();
                            // take a snapshot
                            if (snapInProcess != null && snapInProcess.isAlive()) {
                                LOG.warn("Too busy to snap, skipping");
                            } else {
                                snapInProcess = new ZooKeeperThread("Snapshot Thread") {
                                        public void run() {
                                            try {
                                                zks.takeSnapshot();
                                            } catch(Exception e) {
                                                LOG.warn("Unexpected exception", e);
                                            }
                                        }
                                    };
                                snapInProcess.start();
                            }
                            logCount = 0;
                        }
                    } else if (toFlush.isEmpty()) {
                        // optimization for read heavy workloads
                        // iff this is a read, and there are no pending
                        // flushes (writes), then just pass this to the next
                        // processor
                        if (nextProcessor != null) {
                            nextProcessor.processRequest(si);
                            if (nextProcessor instanceof Flushable) {
                                ((Flushable)nextProcessor).flush();
                            }
                        }
                        continue;
                    }
                    toFlush.add(si);
                    if (toFlush.size() > 1000) {
                        flush(toFlush);
                    }
                }
            }
        } catch (Throwable t) {
            handleException(this.getName(), t);
            running = false;
        }
        LOG.info("SyncRequestProcessor exited!");
    }

So the request also travels down the SyncRequestProcessor chain.
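
The batching in run() amounts to a group commit: entries accumulate in toFlush and are flushed together either when the input queue runs dry or when the batch grows past a limit (1000 in the code above). The sketch below models only that policy. GroupCommitSketch, append, queueDrained and flushSizes are hypothetical names, and the flush merely records the batch size where the real code syncs the log and forwards each request.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of SyncRequestProcessor's batching policy; not ZooKeeper code. */
class GroupCommitSketch {
    private final int maxBatch;
    private final List<String> toFlush = new ArrayList<>();
    private final List<Integer> flushSizes = new ArrayList<>();

    GroupCommitSketch(int maxBatch) {
        this.maxBatch = maxBatch;
    }

    /** A transaction was appended to the log; flush early if the batch exceeds the limit. */
    void append(String txn) {
        toFlush.add(txn);
        if (toFlush.size() > maxBatch) {
            flush();
        }
    }

    /** The input queue is empty, so nothing more can join the batch: flush what we have. */
    void queueDrained() {
        flush();
    }

    /** Stand-in for one disk sync that covers every entry in the batch. */
    private void flush() {
        if (toFlush.isEmpty()) {
            return;
        }
        flushSizes.add(toFlush.size());
        toFlush.clear();
    }

    /** Sizes of the batches flushed so far, in order. */
    List<Integer> flushSizes() {
        return new ArrayList<>(flushSizes);
    }

    public static void main(String[] args) {
        GroupCommitSketch sketch = new GroupCommitSketch(3);
        for (int i = 1; i <= 5; i++) {
            sketch.append("txn-" + i);
        }
        sketch.queueDrained();
        System.out.println(sketch.flushSizes());
    }
}
```

With a limit of 3, five appends followed by a drained queue produce two flushes instead of five, which is the point of batching: one disk sync is shared by several transactions.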

One chain is still missing: sending the proposal to the followers. How is that done? It happens in org.apache.zookeeper.server.quorum.ProposalRequestProcessor#processRequest.

    public void processRequest(Request request) throws RequestProcessorException {
        // LOG.warn("Ack>>> cxid = " + request.cxid + " type = " +
        // request.type + " id = " + request.sessionId);
        // request.addRQRec(">prop");
                
        
        /* In the following IF-THEN-ELSE block, we process syncs on the leader. 
         * If the sync is coming from a follower, then the follower
         * handler adds it to syncHandler. Otherwise, if it is a client of
         * the leader that issued the sync command, then syncHandler won't 
         * contain the handler. In this case, we add it to syncHandler, and 
         * call processRequest on the next processor.
         */
        
        if(request instanceof LearnerSyncRequest){
            zks.getLeader().processSync((LearnerSyncRequest)request);
        } else {
            nextProcessor.processRequest(request);
            if (request.hdr != null) {
                // We need to sync and get consensus on any transactions
                try {
                    zks.getLeader().propose(request);
                } catch (XidRolloverException e) {
                    throw new RequestProcessorException(e.getMessage(), e);
                }
                syncProcessor.processRequest(request);
            }
        }
    }

Note this line:

zks.getLeader().propose(request);

Its implementation is org.apache.zookeeper.server.quorum.Leader#propose:

    /**
     * create a proposal and send it out to all the members
     * 
     * @param request
     * @return the proposal that is queued to send to all the members
     */
    public Proposal propose(Request request) throws XidRolloverException {
        /**
         * Address the rollover issue. All lower 32bits set indicate a new leader
         * election. Force a re-election instead. See ZOOKEEPER-1277
         */
        // the low 32 bits of the zxid are used up; like a president whose term has expired, a new election must be held
        if ((request.zxid & 0xffffffffL) == 0xffffffffL) {
            String msg =
                    "zxid lower 32 bits have rolled over, forcing re-election, and therefore new epoch start";
            shutdown(msg);
            throw new XidRolloverException(msg);
        }
        byte[] data = SerializeUtils.serializeRequest(request);
        proposalStats.setLastProposalSize(data.length);
        QuorumPacket pp = new QuorumPacket(Leader.PROPOSAL, request.zxid, data, null);
        
        Proposal p = new Proposal();
        p.packet = pp;
        p.request = request;
        synchronized (this) {
            if (LOG.isDebugEnabled()) {
                LOG.debug("Proposing:: " + request);
            }

            lastProposed = p.packet.getZxid();
            outstandingProposals.put(lastProposed, p);
            sendPacket(pp);
        }
        return p;
    }

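propose first creates Proposal p = new Proposal() and fills it in, then sends the packet to the followers with sendPacket(pp).
With that, all three paths out of ProposalRequestProcessor are accounted for: CommitProcessor, SyncRequestProcessor, and the proposal sent to the followers.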
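
The rollover check at the top of propose makes more sense with the zxid layout in mind: the high 32 bits carry the epoch and the low 32 bits a per-epoch counter. The helper below is a small sketch of that arithmetic. The class and method names are hypothetical and chosen for illustration; only the bit mask is taken from the code above.

```java
/** Hypothetical helpers illustrating the zxid bit layout; names are not ZooKeeper's. */
class ZxidSketch {
    private static final long LOW_32 = 0xffffffffL;

    /** Packs an epoch (high 32 bits) and a counter (low 32 bits) into one zxid. */
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & LOW_32);
    }

    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    static long counterOf(long zxid) {
        return zxid & LOW_32;
    }

    /** Same test as in propose: true when every low bit is set and the counter cannot advance. */
    static boolean lowBitsExhausted(long zxid) {
        return (zxid & LOW_32) == LOW_32;
    }

    public static void main(String[] args) {
        long zxid = makeZxid(5, 7);
        System.out.println("zxid = 0x" + Long.toHexString(zxid));
        System.out.println("epoch = " + epochOf(zxid) + ", counter = " + counterOf(zxid));
        System.out.println("exhausted = " + lowBitsExhausted(makeZxid(5, LOW_32)));
    }
}
```

Incrementing an exhausted counter would carry into the epoch bits, which is why the leader shuts down and forces a re-election (and thus a new epoch) instead.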

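Now for the SyncRequestProcessor chain. There are two paths: the first goes to the local SyncRequestProcessor, the second to the followers'. The local path has been covered, so let's see how a request reaches a follower's SyncRequestProcessor.
As mentioned in an earlier article, initializing a follower calls follower.followLeader(). In org.apache.zookeeper.server.quorum.Follower#followLeader there is a very important piece of code: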

                while (this.isRunning()) {
                    readPacket(qp);
                    processPacket(qp);
                }

As the names suggest, readPacket reads a packet and processPacket processes it. Reading needs no explanation, so we focus on processing: org.apache.zookeeper.server.quorum.Follower#processPacket

    protected void processPacket(QuorumPacket qp) throws IOException{
        switch (qp.getType()) {
        case Leader.PING:  // ping: liveness check
            ping(qp);            
            break;
        case Leader.PROPOSAL: // a proposal has arrived
            TxnHeader hdr = new TxnHeader();
            Record txn = SerializeUtils.deserializeTxn(qp.getData(), hdr);
            if (hdr.getZxid() != lastQueued + 1) {
                LOG.warn("Got zxid 0x"
                        + Long.toHexString(hdr.getZxid())
                        + " expected 0x"
                        + Long.toHexString(lastQueued + 1));
            }
            lastQueued = hdr.getZxid();
            fzk.logRequest(hdr, txn);
            break;
        case Leader.COMMIT: // commit
            fzk.commit(qp.getZxid()); // the leader's COMMIT has arrived
            break;
        case Leader.UPTODATE: // up to date
            LOG.error("Received an UPTODATE message after Follower started");
            break;
        case Leader.REVALIDATE: // revalidate
            revalidate(qp);
            break;
        case Leader.SYNC: // sync
            fzk.sync();
            break;
        default:
            LOG.error("Invalid packet type: {} received by Observer", qp.getType());
        }
    }

The case to look at is Leader.PROPOSAL. The packet the leader sent was QuorumPacket pp = new QuorumPacket(Leader.PROPOSAL, request.zxid, data, null);, so this is where it arrives. The last call in that case is fzk.logRequest(hdr, txn);, whose full reference is org.apache.zookeeper.server.quorum.FollowerZooKeeperServer#logRequest

    public void logRequest(TxnHeader hdr, Record txn) {
        Request request = new Request(null, hdr.getClientId(), hdr.getCxid(),
                hdr.getType(), null, null);
        request.hdr = hdr;
        request.txn = txn;
        request.zxid = hdr.getZxid();
        if ((request.zxid & 0xffffffffL) != 0) {
            pendingTxns.add(request);
        }
        syncProcessor.processRequest(request);
    }

As you can see, it ends by calling syncProcessor.processRequest(request);, which matches the diagram above.
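
logRequest also parks the request in pendingTxns, and the COMMIT case in processPacket later calls fzk.commit(zxid). The article does not show that commit method, so the sketch below is an assumption about how the two fit together: it supposes that COMMITs arrive in the same order as PROPOSALs and that the follower matches each COMMIT against the head of the pending queue. PendingTxnSketch, onProposal and onCommit are hypothetical names, and zxids stand in for full requests.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical model of pairing PROPOSAL and COMMIT on a follower; an assumption, not ZooKeeper code. */
class PendingTxnSketch {
    private final Deque<Long> pendingTxns = new ArrayDeque<>();
    private final List<Long> handedToCommitProcessor = new ArrayList<>();

    /** PROPOSAL received: remember the transaction until its COMMIT shows up. */
    void onProposal(long zxid) {
        pendingTxns.add(zxid);
    }

    /** COMMIT received: assumed to refer to the oldest pending transaction. */
    void onCommit(long zxid) {
        Long head = pendingTxns.peek();
        if (head == null) {
            return; // nothing pending; this sketch simply ignores the commit
        }
        if (head != zxid) {
            throw new IllegalStateException(
                    "commit for 0x" + Long.toHexString(zxid)
                    + " but head of queue is 0x" + Long.toHexString(head));
        }
        handedToCommitProcessor.add(pendingTxns.remove());
    }

    /** Transactions released to the commit stage so far, in order. */
    List<Long> committed() {
        return new ArrayList<>(handedToCommitProcessor);
    }

    public static void main(String[] args) {
        PendingTxnSketch follower = new PendingTxnSketch();
        follower.onProposal(1L);
        follower.onProposal(2L);
        follower.onCommit(1L);
        follower.onCommit(2L);
        System.out.println(follower.committed());
    }
}
```

Under this assumption an out-of-order COMMIT is treated as a protocol violation, which the sketch signals with an exception.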

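The logic of SyncRequestProcessor and AckRequestProcessor that follows is fairly clear; remembering what they do is enough.
SyncRequestProcessor: the transaction-log processor. It writes transactional requests to the transaction log file and, when certain conditions are met, triggers a data snapshot.
AckRequestProcessor: after SyncRequestProcessor has logged the transaction, it sends an ACK to the Proposal vote collector to indicate that the leader itself has logged the Proposal.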

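Last is CommitProcessor, which receives the commit triggered by the followers' ACKs and by the leader's own. How a follower sends its ACK back and how the leader receives that packet mirrors how the leader sends out the proposal packet. A quick recap:
Follower side, sending: org.apache.zookeeper.server.quorum.SendAckRequestProcessor#processRequest-->org.apache.zookeeper.server.quorum.Learner#writePacket sends it to the leader.
Leader side, receiving: Leader.lead()-->LearnerCnxAcceptor.run()-->LearnerHandler.run()-->Leader.processAck()
org.apache.zookeeper.server.quorum.Leader#processAck performs the majority check and then calls zk.commitProcessor.commit(p.request); locally. That makes the committedRequests queue non-empty and wakes the CommitProcessor thread. The leader also sends the corresponding packets to the other servers (COMMIT to followers, INFORM to observers), whose CommitProcessors then run their own logic. What happens after that needs no further explanation.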
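
The majority check in processAck can be pictured as counting distinct ACKs per zxid and committing once more than half of the ensemble has answered. The sketch below shows only that counting rule. QuorumAckSketch and its methods are hypothetical names, the ensemble is reduced to a fixed size, and the sketch does not model other details of the real method such as ordering between proposals.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of majority ACK counting; not ZooKeeper's Leader#processAck. */
class QuorumAckSketch {
    private final int ensembleSize;
    private final Map<Long, Set<Long>> acksByZxid = new HashMap<>();
    private final Set<Long> committed = new HashSet<>();

    QuorumAckSketch(int ensembleSize) {
        this.ensembleSize = ensembleSize;
    }

    /**
     * Records an ACK from one server for one proposal.
     * Returns true exactly once per zxid: on the ACK that completes a majority.
     */
    boolean ack(long zxid, long serverId) {
        if (committed.contains(zxid)) {
            return false; // late ACK for an already committed proposal
        }
        Set<Long> voters = acksByZxid.computeIfAbsent(zxid, key -> new HashSet<>());
        voters.add(serverId); // a set, so duplicate ACKs from one server count once
        if (voters.size() > ensembleSize / 2) {
            acksByZxid.remove(zxid);
            committed.add(zxid);
            return true; // here the real leader would commit locally and notify the others
        }
        return false;
    }

    boolean isCommitted(long zxid) {
        return committed.contains(zxid);
    }

    public static void main(String[] args) {
        QuorumAckSketch leader = new QuorumAckSketch(5);
        System.out.println(leader.ack(1L, 1L)); // the leader's own ACK
        System.out.println(leader.ack(1L, 2L));
        System.out.println(leader.ack(1L, 3L)); // third of five completes the majority
    }
}
```

The leader's own ACK, produced by AckRequestProcessor, counts toward the majority like any follower's.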