Kafka's Replica Assignment Strategy, Part 2: What Happens When the Replica Count Drops to 0

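This article looks at what happens in a Kafka cluster when the replica count drops to 0. After a broker fails, Kafka does not automatically assign a new machine as partition leader; it marks the partition as having no leader. Experiments show that even if the entire ISR of one partition goes down, the partition can still provide service, but if all partitions go down, the whole topic stops serving. The solution is to pick a new leader from the remaining replicas. A newly added broker does not hold the old messages and has to pull them from other brokers.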
This article discusses how a Kafka cluster reassigns the replicas and partitions hosted on a broker when the set of brokers changes, for example when a broker crashes or exits.

Before getting into that, you need to understand how leaders and followers divide the work in a Kafka cluster. See my earlier article, Kafka's Leader Election Process.

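In the earlier walkthrough of Kafka's election process, I mentioned that the successfully elected leader (the controller) registers a number of watches with ZooKeeper. Among them: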

replicaStateMachine.registerListeners()    // "/brokers/ids": the key one, watches brokers joining and leaving the cluster
This line registers a watch on /brokers/ids. Following the call:

private def registerBrokerChangeListener() = {
  zkUtils.zkClient.subscribeChildChanges(ZkUtils.BrokerIdsPath, brokerChangeListener)
}
This brings us here. ZkUtils.BrokerIdsPath is the path /brokers/ids, so the part that matters is brokerChangeListener. This listener is defined in ReplicaStateMachine.scala in the kafka.controller package.
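To make the subscription pattern concrete, here is a minimal sketch of a child-change watch. ToyZk, ChildListener, and ToyZkDemo are hypothetical names invented for illustration; this is an in-memory stand-in, not the real ZkClient or ZkUtils API, and it assumes a listener fires only when the child set actually changes.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a ZooKeeper client: it only tracks child names per path
// and notifies subscribers when they change. This is NOT the real ZkClient API.
trait ChildListener {
  def handleChildChange(parentPath: String, currentChildren: List[String]): Unit
}

class ToyZk {
  private val children =
    mutable.Map.empty[String, Set[String]].withDefaultValue(Set.empty)
  private val listeners =
    mutable.Map.empty[String, List[ChildListener]].withDefaultValue(Nil)

  def subscribeChildChanges(path: String, listener: ChildListener): Unit =
    listeners(path) = listener :: listeners(path)

  // Replace the children of `path` and notify subscribers only if the set changed.
  def setChildren(path: String, newChildren: Set[String]): Unit =
    if (children(path) != newChildren) {
      children(path) = newChildren
      listeners(path).foreach(_.handleChildChange(path, newChildren.toList.sorted))
    }
}

object ToyZkDemo {
  val BrokerIdsPath = "/brokers/ids"

  // Returns every child list the listener observed, in order.
  def run(): List[List[String]] = {
    val zk = new ToyZk
    val seen = mutable.ListBuffer.empty[List[String]]
    zk.subscribeChildChanges(BrokerIdsPath, new ChildListener {
      def handleChildChange(parentPath: String, currentChildren: List[String]): Unit =
        seen += currentChildren
    })
    zk.setChildren(BrokerIdsPath, Set("0", "1", "2")) // three brokers register
    zk.setChildren(BrokerIdsPath, Set("0", "2"))      // broker 1 disappears
    seen.toList
  }
}
```

Calling ToyZkDemo.run() yields the two child lists the listener saw: first all three broker ids, then the list without broker 1. The listener receives only the current children, so working out who joined or left is the listener's own job.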

class BrokerChangeListener() extends IZkChildListener with Logging {
  this.logIdent = "[BrokerChangeListener on Controller " + controller.config.brokerId + "]: "
  def handleChildChange(parentPath : String, currentBrokerList : java.util.List[String]) {
    info("Broker change listener fired for path %s with children %s".format(parentPath, currentBrokerList.sorted.mkString(",")))
    inLock(controllerContext.controllerLock) {
      if (hasStarted.get) {
        ControllerStats.leaderElectionTimer.time {
          try {
            val curBrokers = currentBrokerList.map(_.toInt).toSet.flatMap(zkUtils.getBrokerInfo)
            val curBrokerIds = curBrokers.map(_.id)
            val liveOrShuttingDownBrokerIds = controllerContext.liveOrShuttingDownBrokerIds
            val newBrokerIds = curBrokerIds -- liveOrShuttingDownBrokerIds
            val deadBrokerIds = liveOrShuttingDownBrokerIds -- curBrokerIds
            val newBrokers = curBrokers.filter(broker => newBrokerIds(broker.id))
            //The lines above are straightforward: they pick out the newly joined brokers and the brokers that have left
            controllerContext.liveBrokers = curBrokers
            val newBrokerIdsSorted = newBrokerIds.toSeq.sorted
            val deadBrokerIdsSorted = deadBrokerIds.toSeq.sorted
            val liveBrokerIdsSorted = curBrokerIds.toSeq.sorted
            info("Newly added brokers: %s, deleted brokers: %s, all live brokers: %s"
              .format(newBrokerIdsSorted.mkString(","), deadBrokerIdsSorted.mkString(","), liveBrokerIdsSorted.mkString(",")))
            newBrokers.foreach(controllerContext.controllerChannelManager.addBroker)
            deadBrokerIds.foreach(controllerContext.controllerChannelManager.removeBroker)
            //The two lines above maintain the map that holds broker information
            if(newBrokerIds.nonEmpty)
              controller.onBrokerStartup(newBrokerIdsSorted)
            if(deadBrokerIds.nonEmpty)
              //This is the key line: how a failed broker is handled
              controller.onBrokerFailure(deadBrokerIdsSorted)
          } catch {
            case e: Throwable => error("Error while handling broker changes", e)
          }
        }
      }
    }
  }
}

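As you can see, when the set of brokers changes, the Kafka controller infers which nodes have joined and which have left from the change in the znode's children. For the nodes that have left, it calls onBrokerFailure.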
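The inference boils down to two set differences between the broker ids the controller already knew about and the ids currently present under the znode. A minimal sketch, assuming plain Set[Int] inputs; BrokerDiff and its parameter names are hypothetical and not part of Kafka:

```scala
object BrokerDiff {
  // Returns (newBrokerIds, deadBrokerIds).
  // `known` plays the role of liveOrShuttingDownBrokerIds in the listener,
  // `current` the role of curBrokerIds read back from ZooKeeper.
  def diff(known: Set[Int], current: Set[Int]): (Set[Int], Set[Int]) =
    (current -- known, known -- current)
}
```

For example, with known = Set(0, 1, 2) and current = Set(0, 2, 3), the result is (Set(3), Set(1)): broker 3 has joined and broker 1 has left. A broker present in both sets appears in neither result, so it triggers no startup or failure handling.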

Following the call further: only part of the onBrokerFailure source is excerpted here, and we will go through it piece by piece.

def onBrokerFailure(deadBrokers: Seq[Int]){
.....
....
val deadBrokersSet = deadBrokers.toSet
// trigger OfflinePartition state for all partitions whose current leader is one amongst the dead brokers
//Pick out all partitions whose leader is on one of the dead brokers
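
The step described in the comment above, selecting the partitions whose leader sits on a dead broker, can be illustrated with a toy model. This is only a sketch under simplifying assumptions: ToyPartition, LeaderlessPartitions, and the plain Map from partition to leader id are hypothetical, and the real controller keeps richer state than this.

```scala
// Hypothetical, simplified partition key; not Kafka's own type.
final case class ToyPartition(topic: String, partition: Int)

object LeaderlessPartitions {
  // Select the partitions whose current leader is one of the dead brokers.
  // `leaders` maps each partition to the broker id of its current leader (assumed shape).
  def ledByDeadBrokers(leaders: Map[ToyPartition, Int],
                       deadBrokers: Seq[Int]): Set[ToyPartition] = {
    val deadBrokersSet = deadBrokers.toSet
    leaders.collect {
      case (tp, leader) if deadBrokersSet.contains(leader) => tp
    }.toSet
  }
}
```

With leaders Map(ToyPartition("t", 0) -> 1, ToyPartition("t", 1) -> 2) and deadBrokers Seq(1), only ToyPartition("t", 0) is returned. Those are the partitions that would need to go offline and have a new leader chosen.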