Original post: https://fxbing.github.io/2021/06/05/kafka%E6%BA%90%E7%A0%81%E5%AD%A6%E4%B9%A0%EF%BC%9AKafkaApis-LEADER-AND-ISR/

The source code in this article is based on Kafka 0.10.2.

Whenever a relevant state change happens on the controller, it sends a LeaderAndIsrRequest to brokers via the sendRequestsToBrokers method. This article walks through how the Kafka server handles that request.
# LEADER_AND_ISR

## Overall logic flow
```scala
case ApiKeys.LEADER_AND_ISR => handleLeaderAndIsrRequest(request)
```
When the server receives a LEADER_AND_ISR request, it calls the `handleLeaderAndIsrRequest` method to process it; the processing flow of that method is outlined below.
## Source code
### handleLeaderAndIsrRequest

The logic of the `handleLeaderAndIsrRequest` function breaks down into the following parts:
- Construct the callback function `onLeadershipChange`, which calls back into the coordinator to handle partitions that newly became leaders or followers
- Check the request's authorization. If the check passes, call `replicaManager.becomeLeaderOrFollower(correlationId, leaderAndIsrRequest, metadataCache, onLeadershipChange)` for further processing (this is the main flow of the function); otherwise, return the error code `Errors.CLUSTER_AUTHORIZATION_FAILED.code` directly
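The unauthorized branch simply maps every partition in the request to the same error code. A minimal self-contained sketch of that mapping (plain Scala types, hypothetical, not the real Kafka request/response classes):

```scala
// Hypothetical simplified model of the authorization-failure response:
// every partition in the request gets the same error code.
type TopicPartition = (String, Int)

// 31 is CLUSTER_AUTHORIZATION_FAILED in the 0.10.2 protocol error table.
val ClusterAuthorizationFailed: Short = 31

def unauthorizedResponse(partitions: Set[TopicPartition]): Map[TopicPartition, Short] =
  partitions.map(tp => tp -> ClusterAuthorizationFailed).toMap
```

For example, `unauthorizedResponse(Set(("topic-a", 0), ("topic-a", 1)))` yields a map where both partitions carry the failure code, mirroring the `partitionStates.asScala.keys.map(...)` expression in the real code.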
```scala
def handleLeaderAndIsrRequest(request: RequestChannel.Request) {
  // ensureTopicExists is only for client facing requests
  // We can't have the ensureTopicExists check here since the controller sends it as an advisory to all brokers so they
  // stop serving data to clients for the topic being deleted
  val correlationId = request.header.correlationId
  val leaderAndIsrRequest = request.body.asInstanceOf[LeaderAndIsrRequest]

  try {
    def onLeadershipChange(updatedLeaders: Iterable[Partition], updatedFollowers: Iterable[Partition]) {
      // for each new leader or follower, call coordinator to handle consumer group migration.
      // this callback is invoked under the replica state change lock to ensure proper order of
      // leadership changes
      updatedLeaders.foreach { partition =>
        if (partition.topic == Topic.GroupMetadataTopicName)
          coordinator.handleGroupImmigration(partition.partitionId)
      }
      updatedFollowers.foreach { partition =>
        if (partition.topic == Topic.GroupMetadataTopicName)
          coordinator.handleGroupEmigration(partition.partitionId)
      }
    }

    val leaderAndIsrResponse =
      if (authorize(request.session, ClusterAction, Resource.ClusterResource)) {
        val result = replicaManager.becomeLeaderOrFollower(correlationId, leaderAndIsrRequest, metadataCache, onLeadershipChange)
        new LeaderAndIsrResponse(result.errorCode, result.responseMap.mapValues(new JShort(_)).asJava)
      } else {
        val result = leaderAndIsrRequest.partitionStates.asScala.keys.map((_, new JShort(Errors.CLUSTER_AUTHORIZATION_FAILED.code))).toMap
        new LeaderAndIsrResponse(Errors.CLUSTER_AUTHORIZATION_FAILED.code, result.asJava)
      }

    requestChannel.sendResponse(new Response(request, leaderAndIsrResponse))
  } catch {
    case e: KafkaStorageException =>
      fatal("Disk error during leadership change.", e)
      Runtime.getRuntime.halt(1)
  }
}
```
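The `onLeadershipChange` callback only cares about partitions of the internal group-metadata topic (`__consumer_offsets`): new leaders cause the coordinator to load the corresponding group state, new followers cause it to unload that state. A minimal self-contained model of this filtering (hypothetical simplified `Partition` type, not the real class):

```scala
// Hypothetical simplified stand-in for kafka.cluster.Partition.
case class Partition(topic: String, partitionId: Int)

val GroupMetadataTopicName = "__consumer_offsets"

// Returns (partition ids the coordinator would load, partition ids it would unload),
// mirroring handleGroupImmigration / handleGroupEmigration in the callback above.
def coordinatorMigrations(updatedLeaders: Iterable[Partition],
                          updatedFollowers: Iterable[Partition]): (Seq[Int], Seq[Int]) = {
  val toLoad = updatedLeaders.collect {
    case p if p.topic == GroupMetadataTopicName => p.partitionId
  }.toSeq
  val toUnload = updatedFollowers.collect {
    case p if p.topic == GroupMetadataTopicName => p.partitionId
  }.toSeq
  (toLoad, toUnload)
}
```

Partitions of ordinary topics pass through the callback untouched; only group-metadata partitions trigger consumer-group migration.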
### becomeLeaderOrFollower

The main work of `ReplicaManager.becomeLeaderOrFollower` consists of the following parts (see the inline comments in the code for the exact locations):
- Validate the controller epoch: only process requests whose controller epoch is newer than the locally recorded one, and only for topic-partitions that have a replica on this broker
- Call `makeLeaders` and `makeFollowers` to set up the partitions that newly became leaders or followers (this is the main logic, covered in detail in later sections)
- On the first request received, start the thread that periodically checkpoints the high watermark
- Shut down idle fetcher threads
- Invoke the callback so the coordinator can handle the partitions that newly became leaders or followers
```scala
def becomeLeaderOrFollower(correlationId: Int,
                           leaderAndISRRequest: LeaderAndIsrRequest,
                           metadataCache: MetadataCache,
                           onLeadershipChange: (Iterable[Partition], Iterable[Partition]) => Unit): BecomeLeaderOrFollowerResult = {
  leaderAndISRRequest.partitionStates.asScala.foreach { case (topicPartition, stateInfo) =>
    stateChangeLogger.trace("Broker %d received LeaderAndIsr request %s correlation id %d from controller %d epoch %d for partition [%s,%d]"
      .format(localBrokerId, stateInfo, correlationId,
        leaderAndISRRequest.controllerId, leaderAndISRRequest.controllerEpoch
```
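The snippet above is cut off, but the two core decisions `becomeLeaderOrFollower` makes can be modeled with plain collections: reject requests carrying a stale controller epoch, then split the partition states into those this broker should lead and those it should follow. A simplified sketch (hypothetical types, not the real `ReplicaManager`):

```scala
// Hypothetical minimal stand-in for the per-partition state in the request.
case class PartitionState(controllerEpoch: Int, leader: Int)

// Either a stale-epoch rejection, or (partitions to lead, partitions to follow).
def splitLeaderAndFollower(localBrokerId: Int,
                           requestControllerEpoch: Int,
                           currentControllerEpoch: Int,
                           states: Map[String, PartitionState])
    : Either[String, (Map[String, PartitionState], Map[String, PartitionState])] =
  if (requestControllerEpoch < currentControllerEpoch)
    // Requests from an older controller generation are rejected outright.
    Left("STALE_CONTROLLER_EPOCH")
  else
    // A partition whose designated leader is this broker goes to makeLeaders;
    // everything else goes to makeFollowers.
    Right(states.partition { case (_, s) => s.leader == localBrokerId })
```

This mirrors the real flow: the epoch guard comes first, and only then are the surviving partition states handed to `makeLeaders` and `makeFollowers`.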