When studying a middleware component, the first thing to learn is how to use it (locks, leader election), so let's start by reading the source code on the client/application side:
Curator: wraps the native ZooKeeper API and provides a higher-level abstraction than zkClient.
1. Distributed lock with Curator:
Usage:
CuratorFramework curatorFramework = CuratorFrameworkFactory.builder()
.connectString(CONNECTION_STR)
.retryPolicy(new RetryForever(3))
.sessionTimeoutMs(3000)
.build();
curatorFramework.start();
//create the lock
InterProcessMutex lock = new InterProcessMutex(curatorFramework, "/lock");
try
{
    lock.acquire(); // acquire the lock (blocks until it is obtained)
    System.out.println(Thread.currentThread().getName() + " acquired the lock");
    try
    {
        Thread.sleep(300000); // simulate work while holding the lock
    }
    finally
    {
        lock.release(); // release the lock even if the work throws
        System.out.println(Thread.currentThread().getName() + " released the lock");
    }
}
catch (Exception e)
{
    e.printStackTrace();
}
Now let's dig into the Curator source; it is actually fairly straightforward.
The InterProcessMutex constructor essentially just builds an internal helper object (LockInternals).
Acquiring the lock, lock.acquire(): acquire -> internalLock -> attemptLock -> createsTheLock / internalLockLoop -> getsTheLock
//acquire the lock; the API mirrors the JDK Lock interface
public void acquire() throws Exception
{
if ( !internalLock(-1, null) )
{
throw new IOException("Lost connection while trying to acquire lock: " + basePath);
}
}
==>internalLock
private boolean internalLock(long time, TimeUnit unit) throws Exception
{
/*
Note on concurrency: a given lockData instance
can be only acted on by a single thread so locking isn't necessary
*/
Thread currentThread = Thread.currentThread();
LockData lockData = threadData.get(currentThread);
// a ConcurrentHashMap (threadData) keyed by thread is what makes the lock reentrant
if ( lockData != null )
{
// re-entering
lockData.lockCount.incrementAndGet();
return true;
}
// core method: attemptLock
String lockPath = internals.attemptLock(time, unit, getLockNodeBytes());
if ( lockPath != null )
{
LockData newLockData = new LockData(currentThread, lockPath);
threadData.put(currentThread, newLockData);
return true;
}
return false;
}
===> attemptLock
String attemptLock(long time, TimeUnit unit, byte[] lockNodeBytes) throws Exception
{
final long startMillis = System.currentTimeMillis();
final Long millisToWait = (unit != null) ? unit.toMillis(time) : null;
final byte[] localLockNodeBytes = (revocable.get() != null) ? new byte[0] : lockNodeBytes;
int retryCount = 0;
String ourPath = null;
boolean hasTheLock = false;
boolean isDone = false;
while ( !isDone )
{
isDone = true;
try
{
// create an ephemeral sequential node that represents this lock request
ourPath = driver.createsTheLock(client, path, localLockNodeBytes);
// based on this node's index among the sorted children and maxLeases, the smallest index gets the lock
hasTheLock = internalLockLoop(startMillis, millisToWait, ourPath);
}
catch ( KeeperException.NoNodeException e )
{
// gets thrown by StandardLockInternalsDriver when it can't find the lock node
// this can happen when the session expires, etc. So, if the retry allows, just try it all again
if (
client.getZookeeperClient().getRetryPolicy().allowRetry(retryCount++,
System.currentTimeMillis() - startMillis,
RetryLoop.getDefaultRetrySleeper()) )
{
isDone = false;
}
else
{
throw e;
}
}
}
if ( hasTheLock )
{
return ourPath;
}
return null;
}
===> createsTheLock: create an EPHEMERAL_SEQUENTIAL node
public String createsTheLock(CuratorFramework client, String path, byte[] lockNodeBytes) throws Exception
{
String ourPath;
if ( lockNodeBytes != null )
{
ourPath = client.create().creatingParentContainersIfNeeded().withProtection().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath(path, lockNodeBytes);
}
else
{
ourPath = client.create().creatingParentContainersIfNeeded().withProtection().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath(path);
}
return ourPath;
}
===> internalLockLoop: the internal acquire loop
private boolean internalLockLoop(long startMillis, Long millisToWait, String ourPath) throws Exception
{
boolean haveTheLock = false;
boolean doDelete = false;
try
{
if ( revocable.get() != null )
{
client.getData().usingWatcher(revocableWatcher).forPath(ourPath);
}
while ( (client.getState() == CuratorFrameworkState.STARTED) && !haveTheLock )
{
// children sorted by sequence number
List<String> children = getSortedChildren();
// strip the base path to get this node's sequence name
String sequenceNodeName =
ourPath.substring(basePath.length() + 1); // +1 to include the slash
// decide based on sequenceNodeName's position in the sorted list
PredicateResults predicateResults = driver.getsTheLock(client, children, sequenceNodeName, maxLeases);
if ( predicateResults.getsTheLock() )
{
haveTheLock = true; // this node has the smallest index, so it owns the lock
}
else
{
//didn't get the lock; watch the previous node and wait
String previousSequencePath = basePath + "/" + predicateResults.getPathToWatch();
synchronized(this)
{
try
{
// use getData() instead of exists() to avoid leaving unneeded watchers which is a type of resource leak
client.getData().usingWatcher(watcher).forPath(previousSequencePath);
if ( millisToWait != null )
{
//wait at most the remaining time
millisToWait -= (System.currentTimeMillis() - startMillis);
startMillis = System.currentTimeMillis();
if ( millisToWait <= 0 )
{
doDelete = true; // timed out - delete our node
break;
}
wait(millisToWait);
}
else
{
//wait indefinitely; the registered watcher calls notifyAll when the watched lock node changes
wait();
}
}
catch ( KeeperException.NoNodeException e )
{
// it has been deleted (i.e. lock released). Try to acquire again
}
}
}
}
}
catch ( Exception e )
{
ThreadUtils.checkInterrupted(e);
doDelete = true;
throw e;
}
finally
{
if ( doDelete )
{
deleteOurPath(ourPath);
}
}
return haveTheLock;
}
===> getsTheLock
@Override
public PredicateResults getsTheLock(CuratorFramework client, List<String> children, String sequenceNodeName, int maxLeases) throws Exception
{
int ourIndex = children.indexOf(sequenceNodeName);
validateOurIndex(sequenceNodeName, ourIndex);
boolean getsTheLock = ourIndex < maxLeases;
String pathToWatch = getsTheLock ? null : children.get(ourIndex - maxLeases);
return new PredicateResults(pathToWatch, getsTheLock);
}
Releasing the lock: release
public void release() throws Exception
{
/*
Note on concurrency: a given lockData instance
can be only acted on by a single thread so locking isn't necessary
*/
Thread currentThread = Thread.currentThread();
LockData lockData = threadData.get(currentThread);
if ( lockData == null )
{
throw new IllegalMonitorStateException("You do not own the lock: " + basePath);
}
//the lock is reentrant, so the node is only deleted after every nested acquire has been released
int newLockCount = lockData.lockCount.decrementAndGet();
if ( newLockCount > 0 )
{
return;
}
if ( newLockCount < 0 )
{
throw new IllegalMonitorStateException("Lock count has gone negative for lock: " + basePath);
}
try
{
// actually release: delete the znode
internals.releaseLock(lockData.lockPath);
}
finally
{
threadData.remove(currentThread);
}
}
====> releaseLock
void releaseLock(String lockPath) throws Exception
{
revocable.set(null);
deleteOurPath(lockPath);
}
//the essence: delete this client's lock node
private void deleteOurPath(String ourPath) throws Exception
{
try
{
client.delete().guaranteed().forPath(ourPath); //delete the lock node
}
catch ( KeeperException.NoNodeException e )
{
// ignore - already deleted (possibly expired session, etc.)
}
}
To summarize: acquiring the lock means creating an ephemeral sequential node under the lock path; per maxLeases (default 1), only the node with the smallest index owns the lock. Every other node calls wait() and relies on the watch set on its predecessor; when that watch fires, notifyAll wakes it up to re-check its position.
Releasing the lock: once every reentrant acquire has been decremented, the ephemeral sequential node is deleted, which fires the watch on the next waiter and wakes it up.
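Seen from the caller's side, the same behaviour looks like this. A minimal sketch, assuming the same curatorFramework instance and "/lock" path as the example above (the 3-second timeout is arbitrary); it shows the timed acquire(time, unit) overload and reentrancy within one thread:
// requires java.util.concurrent.TimeUnit
InterProcessMutex mutex = new InterProcessMutex(curatorFramework, "/lock");
if (mutex.acquire(3, TimeUnit.SECONDS)) // timed acquire: returns false instead of blocking forever
{
    try
    {
        mutex.acquire(); // re-entering from the same thread only bumps lockCount; no new znode is created
        try
        {
            System.out.println("doing work while holding /lock");
        }
        finally
        {
            mutex.release(); // lockCount 2 -> 1, znode kept
        }
    }
    finally
    {
        mutex.release(); // lockCount 1 -> 0, ephemeral sequential znode deleted
    }
}
else
{
    System.out.println("could not obtain /lock within 3 seconds");
}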
That covers roughly how Curator acquires a lock. Besides distributed locks, ZooKeeper can back other distributed coordination services; leader election is another one (Kafka used ZooKeeper internally for its controller election).
Usage:
public class LeaderSelectorClient extends LeaderSelectorListenerAdapter implements Closeable {
    private final String name;
    private LeaderSelector leaderSelector;

    public LeaderSelectorClient(String name) {
        this.name = name;
    }

    public void setLeaderSelector(LeaderSelector leaderSelector) {
        this.leaderSelector = leaderSelector;
    }

    public void start() {
        leaderSelector.start(); // begin participating in the election
    }

    @Override
    public void takeLeadership(CuratorFramework client) throws Exception {
        // Entering this method means this process has acquired the lock; the method is called back after the lock is obtained.
        // When this method returns, leadership is relinquished.
        System.out.println(name + " -> is now the leader");
        // countDownLatch.await(); // block here to keep holding leadership
    }

    @Override
    public void close() throws IOException {
        leaderSelector.close();
    }

    public static void main(String[] args) throws IOException {
        CuratorFramework curatorFramework = CuratorFrameworkFactory.builder().
                connectString(CONNECTION_STR).sessionTimeoutMs(50000000).
                retryPolicy(new ExponentialBackoffRetry(1000, 3)).build();
        curatorFramework.start();
        LeaderSelectorClient leaderSelectorClient = new LeaderSelectorClient("ClientA");
        LeaderSelector leaderSelector = new LeaderSelector(curatorFramework, "/leader", leaderSelectorClient);
        leaderSelectorClient.setLeaderSelector(leaderSelector);
        leaderSelectorClient.start(); // start the election
        System.in.read();
    }
}
Core method: leaderSelector.start()
Main call path:
requeue() -> internalRequeue() -> doWork():
void doWork() throws Exception
{
hasLeadership = false;
try
{
//leader election is just competing for the lock; whoever gets it becomes the leader
mutex.acquire();
hasLeadership = true;
try
{
if ( debugLeadershipLatch != null )
{
debugLeadershipLatch.countDown();
}
if ( debugLeadershipWaitLatch != null )
{
debugLeadershipWaitLatch.await();
}
//the thread that got the lock calls our overridden method; when it returns, the lock is released, so block here to hold leadership
listener.takeLeadership(client);
}
catch ( InterruptedException e )
{
Thread.currentThread().interrupt();
throw e;
}
catch ( Throwable e )
{
ThreadUtils.checkInterrupted(e);
}
finally
{
clearIsQueued();
}
}
catch ( InterruptedException e )
{
Thread.currentThread().interrupt();
throw e;
}
finally
{
if ( hasLeadership )
{
hasLeadership = false;
try
{
mutex.release(); //release the lock
}
catch ( Exception e )
{
ThreadUtils.checkInterrupted(e);
log.error("The leader threw an exception", e);
// ignore errors - this is just a safety
}
}
}
}
Summary: Curator's leader election on ZooKeeper reuses the distributed-lock mechanism; the first node registered becomes the leader. When takeLeadership returns, the leader's lock node is released.
The ZAB protocol:
ZooKeeper borrows ideas from the Paxos protocol (as used by Google's Chubby) to provide a distributed consistency service.
ZAB has two modes:
crash recovery and message broadcast
Message broadcast:
It is essentially a two-phase commit (2PC); many systems use the same idea (e.g. MySQL's redo log).
Steps: 1. The leader receives a transaction request, writes it to its local transaction log, and generates a monotonically increasing zxid (the high 32 bits are the epoch, the low 32 bits a counter; see the sketch after these steps)
2. It sends a proposal carrying the zxid to all followers
3. Each follower logs the proposal locally, prepares to commit, and sends an ACK back to the leader
4. Once the leader has ACKs from more than half the ensemble, it commits locally and sends COMMIT to all followers
5. On receiving COMMIT, each follower commits the transaction, completing the sync
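As a small illustration of the zxid layout in step 1 (this is only a sketch of the bit arithmetic, not ZooKeeper code; the epoch and counter values are made up):
long epoch = 5;       // hypothetical current leader epoch
long counter = 42;    // hypothetical transaction counter within this epoch
long zxid = (epoch << 32) | (counter & 0xFFFFFFFFL); // high 32 bits: epoch, low 32 bits: counter

long epochOfZxid = zxid >>> 32;           // 5 again
long counterOfZxid = zxid & 0xFFFFFFFFL;  // 42 again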
Crash recovery:
It does two things: 1. elect a leader; 2. synchronize data.
Leader election source code:
Find the program entry point:
zkServer.sh starts QuorumPeerMain, so open that class.
1. main.initializeAndRun(args);
2. Load the config file into config:
QuorumPeerConfig config = new QuorumPeerConfig();
if (args.length == 1) {
config.parse(args[0]);
}
If the config file defines servers, run in quorum (cluster) mode:
if (args.length == 1 && config.servers.size() > 0) {
runFromConfig(config); // with a single config-file argument and configured servers, this branch runs
}
3. public void runFromConfig(QuorumPeerConfig config) throws IOException {
try {
ManagedUtil.registerLog4jMBeans();
} catch (JMException e) {
LOG.warn("Unable to register log4j JMX control", e);
}
LOG.info("Starting quorum peer");
try {
//build the ServerCnxnFactory used to accept client connections
ServerCnxnFactory cnxnFactory = ServerCnxnFactory.createFactory();
cnxnFactory.configure(config.getClientPortAddress(),
config.getMaxClientCnxns());
quorumPeer = getQuorumPeer();
//getView()
quorumPeer.setQuorumPeers(config.getServers()); //the server.N entries parsed from zoo.cfg
quorumPeer.setTxnFactory(new FileTxnSnapLog(
new File(config.getDataLogDir()),
new File(config.getDataDir())));
quorumPeer.setElectionType(config.getElectionAlg());
quorumPeer.setMyid(config.getServerId());
quorumPeer.setTickTime(config.getTickTime());
.......
quorumPeer.setQuorumCnxnThreadsSize(config.quorumCnxnThreadsSize);
quorumPeer.initialize();
quorumPeer.start(); //everything above just configures the QuorumPeer object; this is the core
quorumPeer.join(); // the main thread waits for the QuorumPeer thread to exit
} catch (InterruptedException e) {
// warn, but generally this is ok
LOG.warn("Quorum Peer interrupted", e);
}
}
4. Load and initialize state:
public synchronized void start() {
loadDataBase(); //load data from the snapshot/transaction log files
cnxnFactory.start(); //NIO or Netty, handles client communication -> listens on the client port (2181)
startLeaderElection(); //start leader election -> start the vote listener and initialize the election algorithm
super.start(); //QuorumPeer extends Thread, so Thread.start() ends up in QuorumPeer.run()
}
5. start() is a Thread method, so run() eventually executes:
public void run() {
....
try {
/*
* Main loop
* (an endless loop while the server is running)
*/
while (running) {
switch (getPeerState()) {//on first startup the state is LOOKING
case LOOKING:
LOG.info("LOOKING");
if (Boolean.getBoolean("readonlymode.enabled")) {
.....readonly.....
} else {
try {
setBCVote(null);
//setCurrentVote -> records which server has been determined to be the leader
setCurrentVote(makeLEStrategy().lookForLeader());
} catch (Exception e) {
LOG.warn("Unexpected exception", e);
setPeerState(ServerState.LOOKING);
}
}
break;
case OBSERVING:
......
break;
case FOLLOWING:
......
break;
case LEADING:
......
break;
}
}
}
}
6. lookForLeader() ultimately determines the leader; this is the key method:
public Vote lookForLeader() throws InterruptedException {
try {
//votes received from other servers in the current election round
HashMap<Long, Vote> recvset = new HashMap<Long, Vote>();
//votes from servers that are already FOLLOWING or LEADING, i.e. outside the current election
HashMap<Long, Vote> outofelection = new HashMap<Long, Vote>();
int notTimeout = finalizeWait;
synchronized(this){
//logical clock -> the election epoch
logicalclock.incrementAndGet();
//initial proposal: vote for ourselves
updateProposal(getInitId(), getInitLastLoggedZxid(), getPeerEpoch());
}
sendNotifications();//broadcast my own vote: put it on the send queue to be sent out
/*
* Loop in which we exchange notifications until we find a leader
*/
//keep receiving votes while still in the LOOKING state
while ((self.getPeerState() == ServerState.LOOKING) &&
(!stop)){
/*
* Remove next notification from queue, times out after 2 times
* the termination time
*/
//recvqueue holds the Notifications received from other servers over the network
Notification n = recvqueue.poll(notTimeout,
TimeUnit.MILLISECONDS);
/*
* Sends more notifications if haven't received enough.
* Otherwise processes new notification.
* If nothing was received, resend once to guard against transient network problems.
*/
if(n == null){
if(manager.haveDelivered()){
sendNotifications();
} else {
manager.connectAll();//reconnect to all servers in the cluster
}
/*
* Exponential backoff
*/
int tmpTimeOut = notTimeout*2;
notTimeout = (tmpTimeOut < maxNotificationInterval?
tmpTimeOut : maxNotificationInterval);
LOG.info("Notification time out: " + notTimeout);
}
else if(validVoter(n.sid) && validVoter(n.leader)) {//only process votes where both the sender and the proposed leader are valid voters
/*
* Only proceed if the vote comes from a replica in the
* voting view for a replica in the voting view.
*/
switch (n.state) {
case LOOKING: //on first startup every server hits this case
// If notification > current, replace and send messages out
// the received electionEpoch is newer than ours: switch to that round and re-evaluate
if (n.electionEpoch > logicalclock.get()) {
logicalclock.set(n.electionEpoch);
recvset.clear();//discard votes from the stale round
//after receiving a vote, decide whose vote this server should follow:
//it might side with server1, server2, or server3
//the ZAB leader-election comparison
if(totalOrderPredicate(n.leader, n.zxid, n.peerEpoch,
getInitId(), getInitLastLoggedZxid(), getPeerEpoch())) {
//adopt the other server's vote as our own, so the next notification we send carries the new vote
updateProposal(n.leader, n.zxid, n.peerEpoch);
} else {
//the received vote loses to this node's vote; keep sending our own vote next time
updateProposal(getInitId(),
getInitLastLoggedZxid(),
getPeerEpoch());
}
//broadcast notifications again
sendNotifications();
} else if (n.electionEpoch < logicalclock.get()) { //the received vote belongs to an older round and is stale
if(LOG.isDebugEnabled()){
LOG.debug("Notification election epoch is smaller than logicalclock. n.electionEpoch = 0x"
+ Long.toHexString(n.electionEpoch)
+ ", logicalclock=0x" + Long.toHexString(logicalclock.get()));
}
break;
} else if (totalOrderPredicate(n.leader, n.zxid, n.peerEpoch,
proposedLeader, proposedZxid, proposedEpoch)) {
updateProposal(n.leader, n.zxid, n.peerEpoch);
sendNotifications();
}
if(LOG.isDebugEnabled()){
LOG.debug("Adding vote: from=" + n.sid +
", proposed leader=" + n.leader +
", proposed zxid=0x" + Long.toHexString(n.zxid) +
", proposed election epoch=0x" + Long.toHexString(n.electionEpoch));
}
recvset.put(n.sid, new Vote(n.leader, n.zxid, n.electionEpoch, n.peerEpoch));
//decision point: check whether our current proposal has a quorum among the votes collected in recvset
if (termPredicate(recvset,
new Vote(proposedLeader, proposedZxid,
logicalclock.get(), proposedEpoch))) {
// Verify if there is any change in the proposed leader
while((n = recvqueue.poll(finalizeWait,
TimeUnit.MILLISECONDS)) != null){
if(totalOrderPredicate(n.leader, n.zxid, n.peerEpoch,
proposedLeader, proposedZxid, proposedEpoch)){
recvqueue.put(n);
break;
}
}
/*
* This predicate is true once we don't read any new
* relevant message from the reception queue
*/
if (n == null) {
self.setPeerState((proposedLeader == self.getId()) ?
ServerState.LEADING: learningState());
Vote endVote = new Vote(proposedLeader,
proposedZxid,
logicalclock.get(),
proposedEpoch);
leaveInstance(endVote);
return endVote;
}
}
break;
case OBSERVING:
LOG.debug("Notification from observer: " + n.sid);
break;
case FOLLOWING:
case LEADING:
/*
* Consider all notifications from the same epoch
* together.
*/
if(n.electionEpoch == logicalclock.get()){
recvset.put(n.sid, new Vote(n.leader,
n.zxid,
n.electionEpoch,
n.peerEpoch));
if(ooePredicate(recvset, outofelection, n)) {
self.setPeerState((n.leader == self.getId()) ?
ServerState.LEADING: learningState());
Vote endVote = new Vote(n.leader,
n.zxid,
n.electionEpoch,
n.peerEpoch);
leaveInstance(endVote);
return endVote;
}
}
/*
* Before joining an established ensemble, verify
* a majority is following the same leader.
*/
outofelection.put(n.sid, new Vote(n.version,
n.leader,
n.zxid,
n.electionEpoch,
n.peerEpoch,
n.state));
if(ooePredicate(outofelection, outofelection, n)) {
synchronized(this){
logicalclock.set(n.electionEpoch);
self.setPeerState((n.leader == self.getId()) ?
ServerState.LEADING: learningState());
}
Vote endVote = new Vote(n.leader,
n.zxid,
n.electionEpoch,
n.peerEpoch);
leaveInstance(endVote);
return endVote;
}
break;
default:
LOG.warn("Notification state unrecognized: {} (n.state), {} (n.sid)",
n.state, n.sid);
break;
}
} else {
if (!validVoter(n.leader)) {
LOG.warn("Ignoring notification for non-cluster member sid {} from sid {}", n.leader, n.sid);
}
if (!validVoter(n.sid)) {
LOG.warn("Ignoring notification for sid {} from non-quorum member sid {}", n.leader, n.sid);
}
}
}
return null;
} finally {
try {
if(self.jmxLeaderElectionBean != null){
MBeanRegistry.getInstance().unregister(
self.jmxLeaderElectionBean);
}
} catch (Exception e) {
LOG.warn("Failed to unregister with JMX", e);
}
self.jmxLeaderElectionBean = null;
LOG.debug("Number of connection processing threads: {}",
manager.getConnectionThreadCount());
}
}
7. totalOrderPredicate: how two votes are compared
The core logic: compare epoch first; if equal, compare zxid; if still equal, compare myid (server.id, which is configured manually, so it can never tie):
((newEpoch > curEpoch) ||
((newEpoch == curEpoch) &&
((newZxid > curZxid) || ((newZxid == curZxid) && (newId > curId)))));
8. When does voting terminate? termPredicate(): a candidate backed by more than half the votes becomes leader.
(set.size() > half); once some candidate's votes exceed half of the ensemble, it wins outright.
That is roughly how ZooKeeper elects its own leader.
The idea: in the first round every server votes for itself, then votes are compared against each other; the winning vote collects the other servers' support, and once it has more than half it is the leader. A minimal simulation of this comparison and quorum check is sketched below.
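A minimal, self-contained simulation of that comparison and quorum rule (illustration only: the Vote record, the vote values, and the wins helper are invented here and simply mirror the logic quoted above; this is not ZooKeeper code):
import java.util.HashMap;
import java.util.Map;

public class ElectionSketch {
    record Vote(long leaderId, long zxid, long epoch) {}

    // mirrors the totalOrderPredicate expression quoted in step 7
    static boolean wins(Vote candidate, Vote current) {
        return (candidate.epoch() > current.epoch())
                || (candidate.epoch() == current.epoch()
                    && (candidate.zxid() > current.zxid()
                        || (candidate.zxid() == current.zxid()
                            && candidate.leaderId() > current.leaderId())));
    }

    public static void main(String[] args) {
        int ensembleSize = 3;
        Vote myVote = new Vote(1, 90, 1);    // first round: vote for myself
        Vote received = new Vote(3, 100, 1); // a vote received from another server

        if (wins(received, myVote)) {
            myVote = received;               // adopt the stronger vote, like updateProposal(...)
        }

        // received votes, keyed by sender sid (like recvset); here everyone ends up agreeing
        Map<Long, Vote> recvset = new HashMap<>();
        recvset.put(1L, myVote);
        recvset.put(2L, new Vote(3, 100, 1));
        recvset.put(3L, new Vote(3, 100, 1));

        // mirrors termPredicate: does our proposal have more than half the ensemble behind it?
        Vote proposed = myVote;
        long supporters = recvset.values().stream().filter(proposed::equals).count();
        System.out.println("server 3 elected: " + (supporters > ensembleSize / 2));
    }
}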
Network interaction diagram:
Send flow: 1. enqueue our own vote: sendqueue.offer(notmsg)
2. WorkerSender keeps polling the queue: sendqueue.poll()
3. Based on the sid, decide whether the vote is addressed to ourselves or to another server; our own vote goes straight onto the receive queue, otherwise it goes out over a socket via queueSendMap
4. SendWorker.start();
5. write + flush over the socket
Receive flow:
1. RecvWorker's run() loop listens on the socket
2. recvQueue.add(msg); the message is placed on recvQueue
3. During leader election, recvqueue.poll() pulls the notification out for comparison (a sketch of this queue handoff follows)
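The exchange between the election logic and the network workers is a plain producer/consumer handoff over blocking queues. A minimal sketch of that pattern (the class and variable names are invented; this is not QuorumCnxManager code):
import java.util.concurrent.LinkedBlockingQueue;

public class QueueHandoffSketch {
    // stand-ins for sendqueue / recvqueue in FastLeaderElection
    static final LinkedBlockingQueue<String> sendQueue = new LinkedBlockingQueue<>();
    static final LinkedBlockingQueue<String> recvQueue = new LinkedBlockingQueue<>();
    static final long MY_SID = 1;

    public static void main(String[] args) throws InterruptedException {
        // "WorkerSender": drains sendQueue; a vote addressed to ourselves is looped straight back
        Thread workerSender = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    String vote = sendQueue.take();
                    long targetSid = MY_SID; // simplification: everything targets ourselves here
                    if (targetSid == MY_SID) {
                        recvQueue.add(vote); // loop back, no socket involved
                    } else {
                        // otherwise: hand the message to the per-peer socket queue (like queueSendMap)
                    }
                }
            } catch (InterruptedException ignored) { }
        });
        workerSender.setDaemon(true);
        workerSender.start();

        sendQueue.offer("vote: leader=1, zxid=0x100");        // election thread broadcasts its vote
        System.out.println("received: " + recvQueue.take());  // election thread polls received votes
    }
}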
Above we looked at what happens in the LOOKING state. Once the leader has been elected there is still more to do; a quick look:
makeFollower: builds a FollowerZooKeeperServer
follower.followLeader():
Key calls:
connectToLeader(leaderServer.addr, leaderServer.hostname); — connect to the leader and set up the socket
syncWithLeader(newEpochZxid); — synchronize data from the leader
For a follower, syncing with the leader means handling each sync command the leader sends differently (a sketch of that dispatch follows).
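A rough sketch of that dispatch (DIFF / TRUNC / SNAP match the sync strategies used between leader and follower, but the enum and handler methods here are invented for illustration; this is not ZooKeeper code):
class FollowerSyncSketch {
    enum SyncCommand { DIFF, TRUNC, SNAP }

    void syncWithLeader(SyncCommand cmd) {
        switch (cmd) {
            case DIFF:  // leader is ahead: replay the missing proposals incrementally
                applyDiffFromLeader();
                break;
            case TRUNC: // follower has proposals the leader never committed: truncate its log first
                truncateLocalLog();
                break;
            case SNAP:  // too far behind: discard local state and load a full snapshot
                loadSnapshotFromLeader();
                break;
        }
    }

    private void applyDiffFromLeader()    { /* apply proposals one by one */ }
    private void truncateLocalLog()       { /* roll the log back to the leader's zxid */ }
    private void loadSnapshotFromLeader() { /* deserialize a full snapshot */ }
}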
How does the server handle requests coming from clients?
Request flow: once the ServerCnxn has established the socket connection, it listens for read and write events and reacts accordingly.
At this point the request buffer has been read; processing begins:
// after a request is received
Request si = new Request(cnxn, cnxn.getSessionId(), h.getXid(),
h.getType(), incomingBuffer, cnxn.getAuthInfo());
si.setOwner(ServerCnxn.me);
submitRequest(si); // submit the request into the processor chain
public void submitRequest(Request si) {
......
firstProcessor.processRequest(si);
.....
}
How firstProcessor is built:
protected void setupRequestProcessors() {
// build the processor chain:
RequestProcessor finalProcessor = new FinalRequestProcessor(this);
RequestProcessor syncProcessor = new SyncRequestProcessor(this,
finalProcessor);
((SyncRequestProcessor)syncProcessor).start();
firstProcessor = new PrepRequestProcessor(this, syncProcessor);
((PrepRequestProcessor)firstProcessor).start();
}
From this we can see that firstProcessor handles requests through a chain of responsibility, built as shown above:
PrepRequestProcessor(SyncRequestProcessor(FinalRequestProcessor))
PrepRequestProcessor: mainly deserializes the request header and prepares the stat information such as zxid, cversion, and ephemeralOwner
SyncRequestProcessor: writes the request to the transaction log / snapshot on disk for durability (ZooKeeper keeps all data in memory)
FinalRequestProcessor: applies the transaction according to the request type (create, delete, setData, ...) and triggers watches; a minimal chain-of-responsibility sketch follows
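A minimal chain-of-responsibility sketch mirroring that structure (the Processor interface and Request class here are simplified stand-ins, not ZooKeeper's actual RequestProcessor API):
public class ProcessorChainSketch {
    static class Request { String type; Request(String type) { this.type = type; } }

    interface Processor { void process(Request r); }

    // each processor does its own step, then hands the request to the next one
    static class PrepProcessor implements Processor {
        private final Processor next;
        PrepProcessor(Processor next) { this.next = next; }
        public void process(Request r) { System.out.println("prepare " + r.type); next.process(r); }
    }
    static class SyncProcessor implements Processor {
        private final Processor next;
        SyncProcessor(Processor next) { this.next = next; }
        public void process(Request r) { System.out.println("log " + r.type + " to disk"); next.process(r); }
    }
    static class FinalProcessor implements Processor {
        public void process(Request r) { System.out.println("apply " + r.type + ", fire watches"); }
    }

    public static void main(String[] args) {
        // same nesting as PrepRequestProcessor(SyncRequestProcessor(FinalRequestProcessor))
        Processor first = new PrepProcessor(new SyncProcessor(new FinalProcessor()));
        first.process(new Request("create"));
    }
}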
Finally, a hand-drawn diagram of the ZooKeeper source flow; it may have small inaccuracies, but the overall direction should be right, so feel free to use it as a reference.