A Brief Look at the ZooKeeper Server-Side Request Handling Flow



Preface

During ZooKeeper server startup, ServerCnxnFactory's start() method is called; this is where the communication channel between client and server is opened. Here we walk through the Netty-based implementation, NettyServerCnxnFactory.

 public void start() {
        LOG.info("binding to port " + localAddress);
        parentChannel = bootstrap.bind(localAddress);
    }

Once the server binds, it is listening on the port and can accept client requests. The localAddress is assigned in QuorumPeerMain's runFromConfig method and is simply the clientPort from the configuration:

  cnxnFactory.configure(config.getClientPortAddress(), config.getMaxClientCnxns());
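For reference, this address comes from the clientPort entry in zoo.cfg. A minimal configuration might look like the following (the tick time and data directory here are only example values):

    tickTime=2000
    dataDir=/var/lib/zookeeper
    clientPort=2181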

Now let's look at the NettyServerCnxnFactory class itself:

  NettyServerCnxnFactory() {
        bootstrap = new ServerBootstrap(
                new NioServerSocketChannelFactory(
                        Executors.newCachedThreadPool(),
                        Executors.newCachedThreadPool()));
        // parent channel
        bootstrap.setOption("reuseAddress", true);
        // child channels
        bootstrap.setOption("child.tcpNoDelay", true);
        /* set socket linger to off, so that socket close does not block */
        bootstrap.setOption("child.soLinger", -1);

        bootstrap.getPipeline().addLast("servercnxnfactory", channelHandler);
    }

The constructor registers a single handler whose implementation class is CnxnChannelHandler. Next, let's see how this class processes requests.


I. CnxnChannelHandler

It defines the handler methods for the various channel events.

  1. Handling a client connection
 public void channelConnected(ChannelHandlerContext ctx,
                ChannelStateEvent e) throws Exception
        {
            if (LOG.isTraceEnabled()) {
                LOG.trace("Channel connected " + e);
            }
            allChannels.add(ctx.getChannel());
            NettyServerCnxn cnxn = new NettyServerCnxn(ctx.getChannel(),
                    zkServer, NettyServerCnxnFactory.this);
            ctx.setAttachment(cnxn);
            addCnxn(cnxn);
        }

When a client connects, a NettyServerCnxn object is created. As you can see, this object wraps the Channel between the client and the server, and it is stored as the channel context's attachment so that later events can retrieve it.

  2. Handling client messages
 public void messageReceived(ChannelHandlerContext ctx, MessageEvent e)
            throws Exception
        {
            if (LOG.isTraceEnabled()) {
                LOG.trace("message received called " + e.getMessage());
            }
            try {
                if (LOG.isDebugEnabled()) {
                    LOG.debug("New message " + e.toString()
                            + " from " + ctx.getChannel());
                }
                /**
                 * The NettyServerCnxn object was already created during connection
                 * setup and stored as the ctx attachment
                 */
                NettyServerCnxn cnxn = (NettyServerCnxn)ctx.getAttachment();
                synchronized(cnxn) {
                    processMessage(e, cnxn);
                }
            } catch(Exception ex) {
                LOG.error("Unexpected exception in receive", ex);
                throw ex;
            }
        }
 private void processMessage(MessageEvent e, NettyServerCnxn cnxn) {
            if (LOG.isDebugEnabled()) {
                LOG.debug(Long.toHexString(cnxn.sessionId) + " queuedBuffer: "
                        + cnxn.queuedBuffer);
            }

            if (e instanceof NettyServerCnxn.ResumeMessageEvent) {
                LOG.debug("Received ResumeMessageEvent");
                if (cnxn.queuedBuffer != null) {
                    if (LOG.isTraceEnabled()) {
                        LOG.trace("processing queue "
                                + Long.toHexString(cnxn.sessionId)
                                + " queuedBuffer 0x"
                                + ChannelBuffers.hexDump(cnxn.queuedBuffer));
                    }
                    cnxn.receiveMessage(cnxn.queuedBuffer);
                    if (!cnxn.queuedBuffer.readable()) {
                        LOG.debug("Processed queue - no bytes remaining");
                        cnxn.queuedBuffer = null;
                    } else {
                        LOG.debug("Processed queue - bytes remaining");
                    }
                } else {
                    LOG.debug("queue empty");
                }
                cnxn.channel.setReadable(true);
            } else {
                ChannelBuffer buf = (ChannelBuffer)e.getMessage();
                if (LOG.isTraceEnabled()) {
                    LOG.trace(Long.toHexString(cnxn.sessionId)
                            + " buf 0x"
                            + ChannelBuffers.hexDump(buf));
                }
                
                if (cnxn.throttled) {
                    LOG.debug("Received message while throttled");
                    // we are throttled, so we need to queue
                    if (cnxn.queuedBuffer == null) {
                        LOG.debug("allocating queue");
                        cnxn.queuedBuffer = dynamicBuffer(buf.readableBytes());
                    }
                    cnxn.queuedBuffer.writeBytes(buf);
                    if (LOG.isTraceEnabled()) {
                        LOG.trace(Long.toHexString(cnxn.sessionId)
                                + " queuedBuffer 0x"
                                + ChannelBuffers.hexDump(cnxn.queuedBuffer));
                    }
                } else {
                    LOG.debug("not throttled");
                    if (cnxn.queuedBuffer != null) {
                        if (LOG.isTraceEnabled()) {
                            LOG.trace(Long.toHexString(cnxn.sessionId)
                                    + " queuedBuffer 0x"
                                    + ChannelBuffers.hexDump(cnxn.queuedBuffer));
                        }
                        cnxn.queuedBuffer.writeBytes(buf);
                        if (LOG.isTraceEnabled()) {
                            LOG.trace(Long.toHexString(cnxn.sessionId)
                                    + " queuedBuffer 0x"
                                    + ChannelBuffers.hexDump(cnxn.queuedBuffer));
                        }

                        cnxn.receiveMessage(cnxn.queuedBuffer);
                        if (!cnxn.queuedBuffer.readable()) {
                            LOG.debug("Processed queue - no bytes remaining");
                            cnxn.queuedBuffer = null;
                        } else {
                            LOG.debug("Processed queue - bytes remaining");
                        }
                    } else {
                        cnxn.receiveMessage(buf);
                        if (buf.readable()) {
                            if (LOG.isTraceEnabled()) {
                                LOG.trace("Before copy " + buf);
                            }
                            cnxn.queuedBuffer = dynamicBuffer(buf.readableBytes()); 
                            cnxn.queuedBuffer.writeBytes(buf);
                            if (LOG.isTraceEnabled()) {
                                LOG.trace("Copy is " + cnxn.queuedBuffer);
                                LOG.trace(Long.toHexString(cnxn.sessionId)
                                        + " queuedBuffer 0x"
                                        + ChannelBuffers.hexDump(cnxn.queuedBuffer));
                            }
                        }
                    }
                }
            }
        }

processMessage distinguishes two cases. If the connection is throttled, reads have been disabled and the incoming bytes are simply appended to queuedBuffer; a later ResumeMessageEvent replays the queued bytes and re-enables reads. Otherwise the buffer (preceded by any previously queued bytes, to preserve ordering) is handed to NettyServerCnxn's receiveMessage method, which is where the actual request handling happens.
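The queuedBuffer handling above boils down to a simple pattern: while reads are throttled, bytes accumulate in a queue; when reads resume, the queue is drained before any new bytes so ordering is preserved. Below is a minimal, self-contained sketch of that pattern; the class and method names are illustrative, not ZooKeeper's:

    import java.io.ByteArrayOutputStream;

    public class ThrottleQueueDemo {
        private ByteArrayOutputStream queued;   // plays the role of cnxn.queuedBuffer
        private boolean throttled;              // plays the role of cnxn.throttled

        void receive(byte[] chunk) {
            if (throttled) {
                // reads are disabled, so just accumulate the bytes
                if (queued == null) {
                    queued = new ByteArrayOutputStream();
                }
                queued.write(chunk, 0, chunk.length);
            } else if (queued != null) {
                // queued bytes must be processed before the new chunk to keep ordering
                queued.write(chunk, 0, chunk.length);
                process(queued.toByteArray());
                queued = null;
            } else {
                process(chunk);                 // fast path: nothing queued
            }
        }

        void process(byte[] data) {
            System.out.println("processing " + data.length + " bytes");
        }

        public static void main(String[] args) {
            ThrottleQueueDemo demo = new ThrottleQueueDemo();
            demo.throttled = true;
            demo.receive("abc".getBytes());     // queued while throttled
            demo.throttled = false;             // reads resumed
            demo.receive("def".getBytes());     // drains the queue plus the new chunk together
        }
    }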

II. NettyServerCnxn

  1. receiveMessage
  public void receiveMessage(ChannelBuffer message) {
        try {
            while(message.readable() && !throttled) {
                if (bb != null) {
                    if (LOG.isTraceEnabled()) {
                        LOG.trace("message readable " + message.readableBytes()
                                + " bb len " + bb.remaining() + " " + bb);
                        ByteBuffer dat = bb.duplicate();
                        dat.flip();
                        LOG.trace(Long.toHexString(sessionId)
                                + " bb 0x"
                                + ChannelBuffers.hexDump(
                                        ChannelBuffers.copiedBuffer(dat)));
                    }

                    if (bb.remaining() > message.readableBytes()) {
                        int newLimit = bb.position() + message.readableBytes();
                        bb.limit(newLimit);
                    }
                    message.readBytes(bb);
                    bb.limit(bb.capacity());

                    if (LOG.isTraceEnabled()) {
                        LOG.trace("after readBytes message readable "
                                + message.readableBytes()
                                + " bb len " + bb.remaining() + " " + bb);
                        ByteBuffer dat = bb.duplicate();
                        dat.flip();
                        LOG.trace("after readbytes "
                                + Long.toHexString(sessionId)
                                + " bb 0x"
                                + ChannelBuffers.hexDump(
                                        ChannelBuffers.copiedBuffer(dat)));
                    }
                    // remaining() returns the number of elements left between the current position and the limit
                    if (bb.remaining() == 0) {
                        packetReceived();
                        bb.flip();
                        /**
                         * Depending on the election result, zkServer may be a LeaderZooKeeperServer,
                         * FollowerZooKeeperServer or ObserverZooKeeperServer
                         */
                        ZooKeeperServer zks = this.zkServer;
                        /**
                         * The ZooKeeper server is not running yet
                         */
                        if (zks == null || !zks.isRunning()) {
                            throw new IOException("ZK down");
                        }
                        if (initialized) {
                            zks.processPacket(this, bb);

                            if (zks.shouldThrottle(outstandingCount.incrementAndGet())) {
                                disableRecvNoWait();
                            }
                        } else {
                            LOG.debug("got conn req request from "
                                    + getRemoteSocketAddress());
                            zks.processConnectRequest(this, bb);
                            initialized = true;
                        }
                        bb = null;
                    }
                } else {
                    if (LOG.isTraceEnabled()) {
                        LOG.trace("message readable "
                                + message.readableBytes()
                                + " bblenrem " + bbLen.remaining());
                        ByteBuffer dat = bbLen.duplicate();
                        dat.flip();
                        LOG.trace(Long.toHexString(sessionId)
                                + " bbLen 0x"
                                + ChannelBuffers.hexDump(
                                        ChannelBuffers.copiedBuffer(dat)));
                    }

                    if (message.readableBytes() < bbLen.remaining()) {
                        bbLen.limit(bbLen.position() + message.readableBytes());
                    }
                    message.readBytes(bbLen);
                    bbLen.limit(bbLen.capacity());
                    if (bbLen.remaining() == 0) {
                        bbLen.flip();

                        if (LOG.isTraceEnabled()) {
                            LOG.trace(Long.toHexString(sessionId)
                                    + " bbLen 0x"
                                    + ChannelBuffers.hexDump(
                                            ChannelBuffers.copiedBuffer(bbLen)));
                        }
                        int len = bbLen.getInt();
                        if (LOG.isTraceEnabled()) {
                            LOG.trace(Long.toHexString(sessionId)
                                    + " bbLen len is " + len);
                        }

                        bbLen.clear();
                        if (!initialized) {
                            if (checkFourLetterWord(channel, message, len)) {
                                return;
                            }
                        }
                        if (len < 0 || len > BinaryInputArchive.maxBuffer) {
                            throw new IOException("Len error " + len);
                        }
                        bb = ByteBuffer.allocate(len);
                    }
                }
            }
        } catch(IOException e) {
            LOG.warn("Closing connection to " + getRemoteSocketAddress(), e);
            close();
        }
    }
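Note the checkFourLetterWord call above: while the connection is still uninitialized, the four bytes that would otherwise be the length prefix are checked against the four-letter admin commands (ruok, stat, and so on). A quick way to exercise that path with plain sockets; the host and port are assumptions for a locally running server:

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.Socket;

    public class FourLetterWordDemo {
        public static void main(String[] args) throws Exception {
            try (Socket socket = new Socket("127.0.0.1", 2181)) {
                OutputStream out = socket.getOutputStream();
                out.write("ruok".getBytes());   // interpreted by checkFourLetterWord, not as a length
                out.flush();

                InputStream in = socket.getInputStream();
                byte[] reply = new byte[4];
                int n = in.read(reply);         // a healthy server answers "imok"
                System.out.println(n > 0 ? new String(reply, 0, n) : "no reply");
            }
        }
    }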

The loop in receiveMessage first reads a 4-byte length prefix into bbLen, then allocates bb with exactly that length and fills it from the incoming ChannelBuffer. Once bb is full, the packet is dispatched: the very first packet on a connection (initialized == false) goes to processConnectRequest, and every later packet goes to processPacket on the ZooKeeperServer referenced by zkServer.

Note that zkServer starts out as an empty reference; it is only set once leader election has finished and the role-specific ZooKeeperServer (Leader/Follower/Observer) has been installed.
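The framing itself is easy to model outside of Netty. Below is a small, self-contained sketch (not ZooKeeper code) of the same 4-byte-length-then-payload scheme that bbLen and bb implement above:

    import java.nio.ByteBuffer;

    public class FramingSketch {

        // Wrap a serialized request in a length-prefixed frame, as the client does.
        static ByteBuffer frame(byte[] payload) {
            ByteBuffer out = ByteBuffer.allocate(4 + payload.length);
            out.putInt(payload.length);     // the value bbLen.getInt() reads back on the server
            out.put(payload);
            out.flip();
            return out;
        }

        // Read one frame back, mirroring the bbLen -> bb = ByteBuffer.allocate(len) steps.
        static byte[] unframe(ByteBuffer in) {
            int len = in.getInt();
            byte[] body = new byte[len];
            in.get(body);
            return body;
        }

        public static void main(String[] args) {
            ByteBuffer wire = frame("request-bytes".getBytes());
            System.out.println("payload length = " + unframe(wire).length);
        }
    }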

III. ZooKeeperServer: processPacket handles the request

   public void processPacket(ServerCnxn cnxn, ByteBuffer incomingBuffer) throws IOException {
        // We have the request, now process and setup for next
        InputStream bais = new ByteBufferInputStream(incomingBuffer);
        BinaryInputArchive bia = BinaryInputArchive.getArchive(bais);
        /**
         * RequestHeader fields:
         *   int xid;
         *   int type;
         */
        RequestHeader h = new RequestHeader();
        h.deserialize(bia, "header");
        // Through the magic of byte buffers, txn will not be
        // pointing
        // to the start of the txn
        incomingBuffer = incomingBuffer.slice();
        if (h.getType() == OpCode.auth) {
            LOG.info("got auth packet " + cnxn.getRemoteSocketAddress());
            /**
             * AuthPacket fields:
             *    int type;
             *    ustring scheme;
             *    buffer auth;
             */
            AuthPacket authPacket = new AuthPacket();
            ByteBufferInputStream.byteBuffer2Record(incomingBuffer, authPacket);
            String scheme = authPacket.getScheme();
            AuthenticationProvider ap = ProviderRegistry.getProvider(scheme);
            Code authReturn = KeeperException.Code.AUTHFAILED;
            if(ap != null) {
                try {
                    authReturn = ap.handleAuthentication(cnxn, authPacket.getAuth());
                } catch(RuntimeException e) {
                    LOG.warn("Caught runtime exception from AuthenticationProvider: " + scheme + " due to " + e);
                    authReturn = KeeperException.Code.AUTHFAILED;                   
                }
            }
            if (authReturn!= KeeperException.Code.OK) {
                if (ap == null) {
                    LOG.warn("No authentication provider for scheme: "
                            + scheme + " has "
                            + ProviderRegistry.listProviders());
                } else {
                    LOG.warn("Authentication failed for scheme: " + scheme);
                }
                // send a response...
                ReplyHeader rh = new ReplyHeader(h.getXid(), 0,
                        KeeperException.Code.AUTHFAILED.intValue());
                cnxn.sendResponse(rh, null, null);
                // ... and close connection
                cnxn.sendBuffer(ServerCnxnFactory.closeConn);
                cnxn.disableRecv();
            } else {
                if (LOG.isDebugEnabled()) {
                    LOG.debug("Authentication succeeded for scheme: "
                              + scheme);
                }
                LOG.info("auth success " + cnxn.getRemoteSocketAddress());
                ReplyHeader rh = new ReplyHeader(h.getXid(), 0,
                        KeeperException.Code.OK.intValue());
                cnxn.sendResponse(rh, null, null);
            }
            return;
        } else {
            if (h.getType() == OpCode.sasl) {
                Record rsp = processSasl(incomingBuffer,cnxn);
                ReplyHeader rh = new ReplyHeader(h.getXid(), 0, KeeperException.Code.OK.intValue());
                cnxn.sendResponse(rh,rsp, "response"); // not sure about 3rd arg..what is it?
                return;
            }
            else {
                /**
                 * Build the Request object for this packet
                 */
                Request si = new Request(cnxn, cnxn.getSessionId(), h.getXid(),
                  h.getType(), incomingBuffer, cnxn.getAuthInfo());
                si.setOwner(ServerCnxn.me);
                /**
                 * Submit the request: it is handed to the RequestProcessor chain.
                 * OpCode is the operation code of the request.
                 */
                submitRequest(si);
            }
        }
        cnxn.incrOutstandingRequests(h);
    }
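As an aside, the OpCode.auth branch above is what serves a client-side addAuthInfo() call. A minimal client-side example; the connection string, credentials and znode path are made up for illustration:

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class AuthDemo {
        public static void main(String[] args) throws Exception {
            // The connect packet is handled by processConnectRequest on the server.
            ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 30000, event -> { });

            // Sends an AuthPacket (scheme + auth bytes) that processPacket routes
            // to the "digest" AuthenticationProvider.
            zk.addAuthInfo("digest", "user:password".getBytes());

            // An ordinary request: it goes through the final else branch and submitRequest.
            zk.create("/auth-demo", new byte[0],
                    ZooDefs.Ids.CREATOR_ALL_ACL, CreateMode.PERSISTENT);
            zk.close();
        }
    }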

The logic for ordinary commands (create, getData, setData, and so on) is the final else branch above: a Request is built from the connection, session id, header and remaining buffer, and handed to submitRequest.

  // ZooKeeperServer  
  public void submitRequest(Request si) {
        if (firstProcessor == null) {
            synchronized (this) {
                try {
                    // Since all requests are passed to the request
                    // processor it should wait for setting up the request
                    // processor chain. The state will be updated to RUNNING
                    // after the setup.
                    while (state == State.INITIAL) {
                        wait(1000);
                    }
                } catch (InterruptedException e) {
                    LOG.warn("Unexpected interruption", e);
                }
                if (firstProcessor == null || state != State.RUNNING) {
                    throw new RuntimeException("Not started");
                }
            }
        }
        try {
            touch(si.cnxn);
            boolean validpacket = Request.isValid(si.type);
            if (validpacket) {
                /**
                 * firstProcessor is a processor thread that has already been started
                 */
                firstProcessor.processRequest(si);
                if (si.cnxn != null) {
                    incInProcess();
                }
            } else {
                LOG.warn("Received packet at server of unknown type " + si.type);
                new UnimplementedRequestProcessor().processRequest(si);
            }
        } catch (MissingSessionException e) {
            if (LOG.isDebugEnabled()) {
                LOG.debug("Dropping request: " + e.getMessage());
            }
        } catch (RequestProcessorException e) {
            LOG.error("Unable to process request:" + e.getMessage(), e);
        }
    }

submitRequest hands the request to the first RequestProcessor in the ZooKeeperServer's processor chain by calling its processRequest method.

ZooKeeper ships several different RequestProcessor implementations. Each concrete implementation holds a reference to the next processor, so together they form a linked chain. Which processors make up the chain depends on the server's role, but for every role the final stage is FinalRequestProcessor.
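The RequestProcessor interface itself only exposes processRequest(Request) and shutdown(); the chaining comes purely from each implementation holding a reference to the next stage. Here is a minimal, self-contained sketch of that chain-of-responsibility pattern; the class names are illustrative, not ZooKeeper's own:

    interface Processor {
        void process(String request);
    }

    class PreCheckProcessor implements Processor {
        private final Processor next;                  // reference to the next stage
        PreCheckProcessor(Processor next) { this.next = next; }
        public void process(String request) {
            System.out.println("pre-checking " + request);
            next.process(request);                     // hand off, like nextProcessor.processRequest(request)
        }
    }

    class FinalStageProcessor implements Processor {
        public void process(String request) {          // end of the chain, builds the reply
            System.out.println("building response for " + request);
        }
    }

    public class ChainDemo {
        public static void main(String[] args) {
            Processor first = new PreCheckProcessor(new FinalStageProcessor());
            first.process("create /foo");
        }
    }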

IV. How each node in the cluster builds its RequestProcessor chain

As we know, once the election is over each node is a Follower, a Leader or an Observer, and each role holds its own ZooKeeperServer subclass:

Node type    ZooKeeperServer subclass
Follower     FollowerZooKeeperServer
Leader       LeaderZooKeeperServer
Observer     ObserverZooKeeperServer
 protected Follower makeFollower(FileTxnSnapLog logFactory) throws IOException {
        return new Follower(this, new FollowerZooKeeperServer(logFactory, 
                this,new ZooKeeperServer.BasicDataTreeBuilder(), this.zkDb));
    }
     
    protected Leader makeLeader(FileTxnSnapLog logFactory) throws IOException {
        return new Leader(this, new LeaderZooKeeperServer(logFactory,
                this,new ZooKeeperServer.BasicDataTreeBuilder(), this.zkDb));
    }
    
    protected Observer makeObserver(FileTxnSnapLog logFactory) throws IOException {
        return new Observer(this, new ObserverZooKeeperServer(logFactory,
                this, new ZooKeeperServer.BasicDataTreeBuilder(), this.zkDb));
    }

Each of these ZooKeeperServer objects owns a RequestProcessor chain (QuorumPeer creates the role-specific server via the factory methods above once the election settles). When exactly is the chain set up? Let's look at each role.

1. Leader

  • After the election finishes, leader.lead() is called
  • lead() calls startZkServer()
  • That ends up in ZooKeeperServer's setupRequestProcessors() method (overridden by LeaderZooKeeperServer);
    setupRequestProcessors() starts PrepRequestProcessor and CommitProcessor
  • firstProcessor is set to PrepRequestProcessor

The Leader's processor chain:

PrepRequestProcessor -> ProposalRequestProcessor -> CommitProcessor -> ToBeAppliedRequestProcessor -> FinalRequestProcessor


2. Observer

  • After the election finishes, observer.observeLeader() is called
  • During synchronization, syncWithLeader() calls zk.startup()
  • That ends up in ZooKeeperServer's setupRequestProcessors() method (overridden by ObserverZooKeeperServer);
    setupRequestProcessors() starts ObserverRequestProcessor and CommitProcessor
  • firstProcessor is set to ObserverRequestProcessor

The Observer's processor chain:

ObserverRequestProcessor -> CommitProcessor -> FinalRequestProcessor


3. Follower

  • After the election finishes, follower.followLeader() is called
  • During synchronization, syncWithLeader() calls zk.startup()
  • That ends up in ZooKeeperServer's setupRequestProcessors() method (overridden by FollowerZooKeeperServer);
    setupRequestProcessors() starts SyncRequestProcessor, FollowerRequestProcessor and CommitProcessor
  • firstProcessor is set to FollowerRequestProcessor

The Follower's processor chain:

FollowerRequestProcessor -> CommitProcessor -> FinalRequestProcessor


A SyncRequestProcessor is also created at the same time, with SendAckRequestProcessor as its next stage.

4. What each Processor does

  • PrepRequestProcessor: the Leader's request pre-processor and the first processor in the Leader chain. It performs preliminary checks such as session checks and ACL checks, and identifies whether a request is a transactional (write) request.
  • ProposalRequestProcessor: the Leader's proposal processor. Non-transactional requests are forwarded straight to CommitProcessor; for transactional requests it additionally creates a Proposal and sends it to all Followers for voting.
  • SyncRequestProcessor: the logging processor; it records requests to disk (transaction log and snapshots).
  • CommitProcessor: the transaction commit processor; non-transactional requests are handed straight to the next processor.
  • ToBeAppliedRequestProcessor: the processor that follows the commit stage on the Leader; it keeps the committed Proposals that CommitProcessor has processed and that can now be applied.
  • FinalRequestProcessor: the last processor on every server type; it builds the response sent back to the client.
  • ObserverRequestProcessor: the Observer's first processor; transactional requests (those that may change server state) are forwarded to the Leader via its request() method.
  • FollowerRequestProcessor: the Follower's first processor; transactional requests (those that may change server state) are forwarded to the Leader via its request() method.
  • SendAckRequestProcessor: the processor after SyncRequestProcessor on a Follower; it sends ACKs back for the Proposals the Leader has sent.