ZooKeeper Source Code Study (2): Leader Election

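This article takes a close look at ZooKeeper's leader election mechanism, covering WorkerSender, WorkerReceiver, and the election states (LOOKING, OBSERVING, FOLLOWING, LEADING). It focuses on how the FastLeaderElection algorithm updates the logical clock, compares votes, and decides how to vote during an election, and on how a server behaves in each state.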

Previous article: ZooKeeper Source Code Study (1): Startup Flow

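First, a quick recap. As noted in the previous article, calling start on FastLeaderElection simply starts two threads, WorkerSender and WorkerReceiver. WorkerSender sends this node's votes to all other nodes; WorkerReceiver receives and processes the votes that other nodes send to this node. Let's look at what the run method of each thread actually does.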

WorkerSender

Start with WorkerSender#run:

public void run() {
    while (!stop) {
        try {
            ToSend m = sendqueue.poll(3000, TimeUnit.MILLISECONDS);
            if (m == null) continue;

            process(m);
        } catch (InterruptedException e) {
            break;
        }
    }
    LOG.info("WorkerSender is down");
}
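The run loop above is a timed-poll consumer: it blocks on the queue for a bounded time so that the stop flag is re-checked even when no messages arrive. The following is a minimal, self-contained sketch of that pattern, not ZooKeeper's code. The class PollingWorkerSketch, its String messages, the 50 ms timeout, and the demo helper are all hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a WorkerSender-style thread; not ZooKeeper's class. */
public class PollingWorkerSketch implements Runnable {
    private final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new CopyOnWriteArrayList<>();
    private volatile boolean stop = false;

    public void submit(String msg) { queue.offer(msg); }
    public void shutdown() { stop = true; }
    public List<String> processed() { return new ArrayList<>(processed); }

    @Override
    public void run() {
        while (!stop) {
            try {
                // Wake up at least every 50 ms so the stop flag is re-checked.
                String m = queue.poll(50, TimeUnit.MILLISECONDS);
                if (m == null) continue; // timed out, nothing to send
                processed.add(m);        // stands in for process(m)
            } catch (InterruptedException e) {
                break;
            }
        }
    }

    /** Runs a worker on its own thread, feeds it messages, returns what it handled. */
    public static List<String> demo(String... msgs) {
        PollingWorkerSketch worker = new PollingWorkerSketch();
        Thread t = new Thread(worker, "polling-worker-sketch");
        t.start();
        for (String m : msgs) worker.submit(m);
        try {
            long deadline = System.currentTimeMillis() + 2000;
            while (worker.processed.size() < msgs.length
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(5);
            }
            worker.shutdown();
            t.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.processed();
    }
}
```

Because the poll is bounded, shutdown() takes effect within one timeout period without needing to interrupt the thread.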

run takes a ToSend message from the sendqueue queue and, if it is not null, hands it to the process method. FastLeaderElection.Messenger.WorkerSender#process:

void process(ToSend m) {
    ByteBuffer requestBuffer = buildMsg(m.state.ordinal(),
                                        m.leader,
                                        m.zxid,
                                        m.electionEpoch,
                                        m.peerEpoch,
                                        m.configData);

    manager.toSend(m.sid, requestBuffer);
}
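process passes six ToSend fields to buildMsg, which packs them into a ByteBuffer. The sketch below shows one way such a fixed-header-plus-payload message could be laid out and read back. It is an illustration under assumptions: the class VoteMessageSketch, the exact field order, and the 40-byte header are hypothetical, and the real buildMsg may write additional fields.

```java
import java.nio.ByteBuffer;

/** Hypothetical vote-message codec; the layout is an assumption, not ZooKeeper's. */
public class VoteMessageSketch {
    // 4 (state) + 8 (leader) + 8 (zxid) + 8 (electionEpoch)
    // + 8 (peerEpoch) + 4 (config length)
    public static final int FIXED_BYTES = 40;

    public static ByteBuffer encode(int state, long leader, long zxid,
                                    long electionEpoch, long peerEpoch,
                                    byte[] configData) {
        ByteBuffer buf = ByteBuffer.allocate(FIXED_BYTES + configData.length);
        buf.putInt(state);
        buf.putLong(leader);
        buf.putLong(zxid);
        buf.putLong(electionEpoch);
        buf.putLong(peerEpoch);
        buf.putInt(configData.length); // length prefix for the variable part
        buf.put(configData);
        buf.flip();                    // switch from writing to reading
        return buf;
    }

    /** Reads the fixed header without consuming the caller's buffer. */
    public static long[] decodeHeader(ByteBuffer wire) {
        ByteBuffer buf = wire.duplicate();
        return new long[] {
            buf.getInt(),  // state
            buf.getLong(), // leader
            buf.getLong(), // zxid
            buf.getLong(), // electionEpoch
            buf.getLong(), // peerEpoch
            buf.getInt()   // config length
        };
    }
}
```

Writing the config length before the config bytes lets the receiver know how many bytes to read for the variable-length part.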

process converts the ToSend into the required message format and delegates the sending to QuorumCnxManager's toSend. QuorumCnxManager#toSend:

public void toSend(Long sid, ByteBuffer b) {
    /*
     * If sending message to myself, then simply enqueue it (loopback).
     */
    if (this.mySid == sid) {
        b.position(0);
        addToRecvQueue(new Message(b.duplicate(), sid));
        /*
         * Otherwise send to the corresponding thread to send.
         */
    } else {
        /*
         * Start a new connection if doesn't have one already.
         */
        // SEND_CAPACITY is 1, so if an earlier message is still waiting
        // to be sent, it is dropped and the new one is sent instead.
        ArrayBlockingQueue<ByteBuffer> bq = new ArrayBlockingQueue<ByteBuffer>(
            SEND_CAPACITY);
        ArrayBlockingQueue<ByteBuffer> oldq = queueSendMap.putIfAbsent(sid, bq);
        if (oldq != null) {
            addToSendQueue(oldq, b);
        } else {
            addToSendQueue(bq, b);
        }
        connectOne(sid);
    }
}
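The comment on SEND_CAPACITY in toSend describes a "latest message wins" queue: with capacity 1, a newer vote replaces an older one that has not been sent yet. Here is a minimal sketch of that behavior on a plain ArrayBlockingQueue. The class LatestOnlyQueueSketch and the method addReplacingOldest are hypothetical names, and the body is an assumption about how such a helper could work, not a copy of addToSendQueue.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;

/** Hypothetical helper showing a bounded queue that evicts its oldest entry. */
public class LatestOnlyQueueSketch {
    /**
     * Adds item to the queue; if the queue is full, the head is dropped first.
     * Returns true if the item ended up in the queue.
     */
    public static <T> boolean addReplacingOldest(ArrayBlockingQueue<T> queue, T item) {
        if (queue.remainingCapacity() == 0) {
            try {
                queue.remove(); // discard the oldest pending element
            } catch (NoSuchElementException ignored) {
                // A consumer drained the queue between the check and the remove.
            }
        }
        // offer() returns false instead of throwing if a concurrent
        // producer filled the slot in the meantime.
        return queue.offer(item);
    }
}
```

For election traffic this trade-off makes sense: only the most recent vote matters, so dropping a stale unsent vote loses nothing.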

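toSend first checks whether the destination sid is this server itself. If it is, the message goes straight into recvQueue, the queue of received votes, where it waits for a thread to read it; if that queue is full, the method removes one element from the head of the queue and then inserts the new element at the tail. If the destination is another server, the message is placed in the send queue for that sid, and toSend finally calls QuorumCnxManager#connectOne():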

synchronized void connectOne(long sid) {
    if (senderWorkerMap.get(sid) != null) {
        LOG.debug("There is a connection already for server " + sid);
        return;
    }
    synchronized (self.QV_LOCK) {
        boolean knownId = false;
        // Resolve hostname for the remote server before attempting to
        // connect in case the underlying ip address has changed.
        self.recreateSocketAddresses(sid);
        Map<Long, QuorumPeer.QuorumServer> lastCommittedView = self.getView();
        QuorumVerifier lastSeenQV = self.getLastSeenQuorumVerifier();
        Map<Long, QuorumPeer.QuorumServer> lastProposedView = lastSeenQV.getAllMembers();
        if (lastCommittedView.containsKey(sid)) {
            knownId = true;
            if (connectOne(sid, lastCommittedView.get(sid).electionAddr))
                return;
        }
        if (lastSeenQV != null && lastProposedView.containsKey(sid)
                && (!knownId || (lastProposedView.get(sid).electionAddr !=
                lastCommittedView.get(sid).electionAddr))) {
            knownId = true;
            if (connectOne(sid, lastProposedView.get(sid).electionAddr))
                return;
        }
        if (!knownId) {
            LOG.warn("Invalid server id: " + sid);
            return;
        }
    }
}

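connectOne first looks in senderWorkerMap to see whether a connection to the server with this id already exists. If not, it calls QuorumCnxManager#connectOne(long, java.net.InetSocketAddress) to connect; once the socket has been established, that method synchronously calls QuorumCnxManager#initiateConnection: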

public void initiateConnection(final Socket sock, final Long sid) {
    try {
        startConnection(sock, sid);
    } catch (IOException e) {
        LOG.error("Exception while connecting, id: {}, addr: {}, closing learner connection",
                new Object[] { sid, sock.getRemoteSocketAddress() }, e);
        closeSocket(sock);
        return;
    }
}


private boolean startConnection(Socket sock, Long sid)
        throws IOException {
    DataOutputStream dout = null;
    DataInputStream din = null;
    try {
        // Use BufferedOutputStream to reduce the number of IP packets. This is
        // important for x-DC scenarios.
        BufferedOutputStream buf = new BufferedOutputStream(sock.getOutputStream());
        dout = new DataOutputStream(buf);

        // Sending id and challenge
        // represents protocol version (in other words - message type)
        dout.writeLong(PROTOCOL_VERSION);
        dout.writeLong(self.getId());
        String addr = self.getElectionAddress().getHostString() + ":" + self.getElectionAddress().getPort();
        byte[] addr_bytes = addr.getBytes();
        dout.writeInt(addr_bytes.length);
        dout.write(addr_bytes);
        dout.flush();

        din = new DataInputStream(
                new BufferedInputStream(sock.getInputStream()));
    } catch (IOException e) {
        LOG.warn("Ignoring exception reading or writing challenge: ", e);
        closeSocket(sock);
        return false;
    }

    // authenticate learner
    QuorumPeer.QuorumServer qps = self.getVotingView().get(sid);
    if (qps != null) {
        // TODO - investigate why reconfig makes qps null.
        authLearner.authenticate(sock, qps.hostname);
    }

    // If lost the challenge, then drop the new connection
    if (sid > self.getId()) {
        LOG.info("Have smaller server identifier, so dropping the " +
                "connection: (" + sid + ", " + self.getId() + ")");
        closeSocket(sock);
        // Otherwise proceed with the connection
    } else {
        SendWorker sw = new SendWorker(sock, sid);
        RecvWorker rw = new RecvWorker(sock, din, sid, sw);
        sw.setRecv(rw);

        SendWorker vsw = senderWorkerMap.get(sid);

        if (vsw != null)
            vsw.finish();

        senderWorkerMap.put(sid, sw);
        queueSendMap.putIfAbsent(sid, new ArrayBlockingQueue<ByteBuffer>(
                SEND_CAPACITY));

        sw.start();
        rw.start();

        return true;
    }
    return false;
}
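startConnection does two things worth isolating: it writes a small handshake (protocol version, own server id, length-prefixed election address), and it applies a tie-break so that the initiator keeps the connection only when the remote sid is not larger than its own id. The sketch below reproduces both ideas against in-memory streams. HandshakeSketch, Hello, and initiatorKeepsConnection are hypothetical names, the protocol version is passed in by the caller rather than taken from ZooKeeper, and UTF-8 is an assumption (the original calls getBytes() with the platform default charset).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical handshake codec modeled on the writes in startConnection. */
public class HandshakeSketch {
    /** Decoded handshake fields (illustrative holder type). */
    public static final class Hello {
        public final long protocolVersion;
        public final long sid;
        public final String electionAddr;

        Hello(long protocolVersion, long sid, String electionAddr) {
            this.protocolVersion = protocolVersion;
            this.sid = sid;
            this.electionAddr = electionAddr;
        }
    }

    public static byte[] encode(long protocolVersion, long myId, String host, int port) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream dout = new DataOutputStream(bytes);
            dout.writeLong(protocolVersion);
            dout.writeLong(myId);
            byte[] addr = (host + ":" + port).getBytes(StandardCharsets.UTF_8);
            dout.writeInt(addr.length); // length prefix for the address
            dout.write(addr);
            dout.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static Hello decode(byte[] wire) {
        try {
            DataInputStream din = new DataInputStream(new ByteArrayInputStream(wire));
            long version = din.readLong();
            long sid = din.readLong();
            byte[] addr = new byte[din.readInt()];
            din.readFully(addr);
            return new Hello(version, sid, new String(addr, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Mirrors the "sid > self.getId()" check: the initiator drops the socket if the remote id is larger. */
    public static boolean initiatorKeepsConnection(long myId, long remoteSid) {
        return remoteSid <= myId;
    }
}
```

With this rule, of two servers that dial each other at the same time, only the connection opened by the server with the larger id survives on the initiating side, which avoids ending up with duplicate sockets between the same pair.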
        