Analysis of the Kafka broker shutdown process

Controlled shutdown stops a specified broker by sending a command to the controller.

The implementation is somewhat unusual: the controller does not expose this operation over a socket or HTTP interface. Instead it registers a JMX MBean, and the command-line tool remotely invokes the controller's shutdownBroker operation via JMX:

import javax.management.ObjectName
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}

val jmxUrl = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi".format(controllerHost, controllerJmxPort))
info("Connecting to jmx url " + jmxUrl)
val jmxc = JMXConnectorFactory.connect(jmxUrl, null)
val mbsc = jmxc.getMBeanServerConnection
// Invoke the shutdownBroker operation on the controller's MBean; it returns the set of
// partitions for which this broker is still the leader.
val leaderPartitionsRemaining = mbsc.invoke(new ObjectName(KafkaController.MBeanName),
                                            "shutdownBroker",
                                            Array(params.brokerId),
                                            Array(classOf[Int].getName)).asInstanceOf[Set[TopicAndPartition]]

 

The logic of shutdownBroker

  • Check that the controller receiving the request is still active.
  • Check that the target broker is still alive; if so, add it to the shuttingDownBrokerIds set in controllerContext.
  • Fetch all partitions hosted on this broker.
  • Handle each of these partitions, in one of two ways:

     (1) the partition's leader is this broker: call partitionStateMachine.handleStateChanges;

     (2) the partition's leader is not this broker: send this broker a StopReplicaRequest and call replicaStateMachine.handleStateChanges.

def shutdownBroker(id: Int): Set[TopicAndPartition] = {
    if (!isActive()) {
      throw new ControllerMovedException("Controller moved to another broker. Aborting controlled shutdown")
    }
    controllerContext.brokerShutdownLock synchronized {
      info("Shutting down broker " + id)
      inLock(controllerContext.controllerLock) {
        if (!controllerContext.liveOrShuttingDownBrokerIds.contains(id))
          throw new BrokerNotAvailableException("Broker id %d does not exist.".format(id))
        controllerContext.shuttingDownBrokerIds.add(id)
        debug("All shutting down brokers: " + controllerContext.shuttingDownBrokerIds.mkString(","))
        debug("Live brokers: " + controllerContext.liveBrokerIds.mkString(","))
      }

      // Get all partitions on this broker together with their replication factors
      val allPartitionsAndReplicationFactorOnBroker: Set[(TopicAndPartition, Int)] =
        inLock(controllerContext.controllerLock) {
          controllerContext.partitionsOnBroker(id)
            .map(topicAndPartition => (topicAndPartition, controllerContext.partitionReplicaAssignment(topicAndPartition).size))
        }

      allPartitionsAndReplicationFactorOnBroker.foreach {
        case (topicAndPartition, replicationFactor) =>
        // Move leadership serially to relinquish lock.
        inLock(controllerContext.controllerLock) {
          controllerContext.partitionLeadershipInfo.get(topicAndPartition).foreach { currLeaderIsrAndControllerEpoch =>
            if (currLeaderIsrAndControllerEpoch.leaderAndIsr.leader == id) {
              // If the broker leads the topic partition, transition the leader and update isr. Updates zk and
              // notifies all affected brokers
              partitionStateMachine.handleStateChanges(Set(topicAndPartition), OnlinePartition,
                controlledShutdownPartitionLeaderSelector)
            }
            else {
              // Stop the replica first. The state change below initiates ZK changes which should take some time
              // before which the stop replica request should be completed (in most cases)
              // All requests are sent in batches, grouped by broker
              brokerRequestBatch.newBatch()
              brokerRequestBatch.addStopReplicaRequestForBrokers(Seq(id), topicAndPartition.topic,
                topicAndPartition.partition, deletePartition = false)
              brokerRequestBatch.sendRequestsToBrokers(epoch, controllerContext.correlationId.getAndIncrement)
              // If the broker is a follower, updates the isr in ZK and notifies the current leader
              replicaStateMachine.handleStateChanges(Set(PartitionAndReplica(topicAndPartition.topic,
                topicAndPartition.partition, id)), OfflineReplica)
            }
          }
        }
      }

      def replicatedPartitionsBrokerLeads() = inLock(controllerContext.controllerLock) {
        trace("All leaders = " + controllerContext.partitionLeadershipInfo.mkString(","))
        controllerContext.partitionLeadershipInfo.filter {
          case (topicAndPartition, leaderIsrAndControllerEpoch) =>
            leaderIsrAndControllerEpoch.leaderAndIsr.leader == id && controllerContext.partitionReplicaAssignment(topicAndPartition).size > 1
        }.map(_._1)
      }
      replicatedPartitionsBrokerLeads().toSet
    }
  }

 

Processing logic of partitionStateMachine.handleStateChanges (a sketch of the leader selection follows this list)

  • Read the partition's controller epoch from ZK, to guard against the controller having changed and another controller having already updated the partition's state.
  • Read the partition's state from ZK, remove the shutting-down broker from the current ISR, pick the first remaining ISR broker as the new leader, and return the partition's new state (leader, ISR, live replicas).
  • Write the new partition state back to ZK.
  • Update the partition information cached in controllerContext.
  • Update the partition's state in partitionStateMachine (OnlinePartition).
  • Send a LeaderAndIsrRequest to the partition's currently available replicas (telling them who the new leader is), and an UpdateMetadataRequest to all brokers. (If this step fails, other brokers may be left with inconsistent state; another state change has to be triggered to fix it. The code marks this as a TODO.)
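The key step in this path is the controlledShutdownPartitionLeaderSelector: drop the shutting-down broker from the ISR and promote the first surviving ISR member. Below is a minimal, self-contained sketch of that selection rule; the LeaderAndIsr case class and the function name are simplified stand-ins for the controller's internal types, not the exact Kafka source.

// Simplified stand-in for the controller's internal partition state.
case class LeaderAndIsr(leader: Int, leaderEpoch: Int, isr: List[Int], zkVersion: Int)

// Minimal sketch of the controlled-shutdown leader selection rule:
// remove the shutting-down broker(s) from the ISR and promote the first survivor.
def selectLeaderOnControlledShutdown(current: LeaderAndIsr,
                                     shuttingDownBrokerIds: Set[Int]): Option[LeaderAndIsr] = {
  val newIsr = current.isr.filterNot(shuttingDownBrokerIds.contains)
  // If the ISR becomes empty, no leader can be chosen and the caller raises a state-change error.
  newIsr.headOption.map { newLeader =>
    LeaderAndIsr(newLeader, current.leaderEpoch + 1, newIsr, current.zkVersion + 1)
  }
}

// Example: broker 1 is shutting down and currently leads the partition.
// selectLeaderOnControlledShutdown(LeaderAndIsr(1, 5, List(1, 2, 3), 10), Set(1))
// => Some(LeaderAndIsr(2, 6, List(2, 3), 11))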

Processing logic of replicaStateMachine.handleStateChanges (a sketch of the ISR-shrinking step follows this list)

  • Send a StopReplicaRequest to this broker.
  • Call controller.removeReplicaFromIsr: read the partition's current state from ZK, remove this broker from the ISR, and write the updated state back to ZK. (If the leader is this broker, the new leader is set to -1, meaning no leader. Why isn't another broker in the ISR elected leader here?)
  • Send a LeaderAndIsrRequest to the partition's leader, and an UpdateMetadataRequest to all brokers.
  • Update the replica's state in replicaStateMachine (OfflineReplica).
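The removeReplicaFromIsr step is essentially a conditional read-modify-write against the partition's state in ZK. The sketch below illustrates the behavior described above, not the actual Kafka source: readState and conditionalUpdate are hypothetical helpers standing in for the controller's ZooKeeper utilities, and LeaderAndIsr reuses the simplified case class from the previous sketch.

val NoLeader = -1

// Sketch of the ISR-shrinking step in removeReplicaFromIsr (hypothetical helpers, not the Kafka source):
//   readState          reads (partition state, znode version) from ZK
//   conditionalUpdate  writes the new state only if the znode version still matches; returns true on success
def removeReplicaFromIsr(replicaId: Int,
                         readState: () => (LeaderAndIsr, Int),
                         conditionalUpdate: (LeaderAndIsr, Int) => Boolean): LeaderAndIsr = {
  var result: Option[LeaderAndIsr] = None
  while (result.isEmpty) {
    val (current, zkVersion) = readState()
    if (!current.isr.contains(replicaId)) {
      result = Some(current) // replica already out of the ISR, nothing to do
    } else {
      // If the shutting-down replica is the leader, the partition is left with no leader (-1)
      // instead of electing another ISR member at this point.
      val newLeader = if (current.leader == replicaId) NoLeader else current.leader
      val newState = LeaderAndIsr(newLeader, current.leaderEpoch + 1,
                                  current.isr.filterNot(_ == replicaId), zkVersion + 1)
      // Retry the whole read-modify-write if someone else updated the znode in the meantime.
      if (conditionalUpdate(newState, zkVersion)) result = Some(newState)
    }
  }
  result.get
}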

Quoted from Chen Shang'an's wiki.
