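The Controller registers a TopicDeletionListener on the ZooKeeper node /admin/delete_topics. The listener is implemented by TopicDeletionHandler, which watches for topic deletion requests and kicks off the actual deletion of the topic.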
class TopicDeletionHandler(eventManager: ControllerEventManager) extends ZNodeChildChangeHandler {
  // ZooKeeper node being watched: /admin/delete_topics
  override val path: String = DeleteTopicsZNode.path
  // Put a TopicDeletion event on the controller event queue
  override def handleChildChange(): Unit = eventManager.put(TopicDeletion)
}
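The handler itself does no deletion work; it only enqueues an event for a separate thread to process. A minimal sketch of that hand-off is shown here, assuming a plain blocking queue. `EventManager`, `ChildChangeHandler` and the `Event` types are simplified, hypothetical stand-ins, not the Kafka classes.

```scala
import java.util.concurrent.LinkedBlockingQueue

object EventQueueSketch {
  // Hypothetical stand-ins for the controller event types.
  sealed trait Event
  case object TopicDeletion extends Event

  // Hypothetical, simplified event manager: a thread-safe FIFO queue of events.
  final class EventManager {
    private val queue = new LinkedBlockingQueue[Event]()

    def put(event: Event): Unit = queue.put(event)

    // Returns the next queued event, or None if the queue is empty.
    def poll(): Option[Event] = Option(queue.poll())
  }

  // Mirrors the shape of TopicDeletionHandler: the callback only enqueues an event.
  final class ChildChangeHandler(val path: String, eventManager: EventManager) {
    def handleChildChange(): Unit = eventManager.put(TopicDeletion)
  }
}
```

Keeping the callback this small means the ZooKeeper watcher thread returns immediately, and all state changes happen on the single event-processing thread.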
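Deletion flow:
- A topic deletion is requested by creating a child node named after the topic under /admin/delete_topics, for example /admin/delete_topics/xxx.
- As soon as the creation of that child node is detected, the handleChildChange method of TopicDeletionHandler is triggered.
- The Controller puts a TopicDeletion event on the event queue.
- The controller's event-processing thread (ControllerEventThread) then executes the processTopicDeletion method: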
private def processTopicDeletion(): Unit = {
  if (!isActive) return
  // Fetch the topics queued for deletion from ZooKeeper
  var topicsToBeDeleted = zkClient.getTopicDeletions.toSet
  debug(s"Delete topics listener fired for topics ${topicsToBeDeleted.mkString(",")} to be deleted")
  // Find the requested topics that do not exist in the metadata cache
  val nonExistentTopics = topicsToBeDeleted -- controllerContext.allTopics
  if (nonExistentTopics.nonEmpty) {
    warn(s"Ignoring request to delete non-existing topics ${nonExistentTopics.mkString(",")}")
    zkClient.deleteTopicDeletions(nonExistentTopics.toSeq, controllerContext.epochZkVersion)
  }
  topicsToBeDeleted --= nonExistentTopics
  // delete.topic.enable is set to true
  if (config.deleteTopicEnable) {
    if (topicsToBeDeleted.nonEmpty) {
      info(s"Starting topic deletion for topics ${topicsToBeDeleted.mkString(",")}")
      topicsToBeDeleted.foreach { topic =>
        val partitionReassignmentInProgress = controllerContext.partitionsBeingReassigned.map(_.topic).contains(topic)
        if (partitionReassignmentInProgress)
          topicDeletionManager.markTopicIneligibleForDeletion(
            Set(topic), reason = "topic reassignment in progress")
      }
      // Add the topics to the deletion queue so that TopicDeletionManager takes over
      topicDeletionManager.enqueueTopicsForDeletion(topicsToBeDeleted)
    }
  } else { // Topic deletion is disabled
    info(s"Removing $topicsToBeDeleted since delete topic is disabled")
    // Remove the corresponding child nodes under /admin/delete_topics in ZooKeeper
    zkClient.deleteTopicDeletions(topicsToBeDeleted.toSeq, controllerContext.epochZkVersion)
  }
}
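- The method first reads the children of /admin/delete_topics in ZooKeeper, i.e. the list of topics to delete.
- It compares that list with the topics in the metadata cache, finds the topics that do not exist, and deletes their child nodes under /admin/delete_topics.
- For the topics that do exist, it checks the broker-side parameter delete.topic.enable.
- If the parameter is false, topic deletion is not allowed, and the corresponding child nodes under /admin/delete_topics are removed.
- If the parameter is true, it iterates over the topics to delete and temporarily marks those with a partition reassignment in progress as ineligible for deletion; it then hands the list over to TopicDeletionManager, which carries out the actual deletion logic.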
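The branching in processTopicDeletion can be sketched as a pure function over plain sets. This is only an illustration of the decision logic: `Plan` and `plan` are hypothetical names, and ZooKeeper access, logging and the controller context are left out.

```scala
object ProcessTopicDeletionSketch {
  // Hypothetical result type: what the controller would do with the requested topics.
  final case class Plan(
    removeFromZk: Set[String],   // child nodes to delete under /admin/delete_topics
    markIneligible: Set[String], // topics with a partition reassignment in progress
    enqueue: Set[String]         // topics handed over to TopicDeletionManager
  )

  def plan(requested: Set[String],
           allTopics: Set[String],
           topicsBeingReassigned: Set[String],
           deleteTopicEnable: Boolean): Plan = {
    // Requests for unknown topics are always dropped from ZooKeeper.
    val nonExistent = requested -- allTopics
    val existing = requested -- nonExistent
    if (deleteTopicEnable)
      Plan(nonExistent, existing.intersect(topicsBeingReassigned), existing)
    else
      // Deletion disabled: every request is removed and nothing is enqueued.
      Plan(nonExistent ++ existing, Set.empty, Set.empty)
  }
}
```

Note that a topic under reassignment is still enqueued; it is only flagged as ineligible for the time being, which matches the order of calls in the method above.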
class TopicDeletionManager(config: KafkaConfig, // KafkaConfig, holds the broker-side parameters
                           controllerContext: ControllerContext, // cluster metadata
                           replicaStateMachine: ReplicaStateMachine, // replica state machine, used to set replica states
                           partitionStateMachine: PartitionStateMachine, // partition state machine, used to set partition states
                           client: DeletionClient // performs the topic deletion and follow-up actions such as metadata updates.
                           // This interface defines 4 methods: deleteTopic, deleteTopicDeletions, mutePartitionModifications and sendMetadataUpdate.
                           // ControllerDeletionClient is the class that implements DeletionClient and provides those 4 methods.
                          ) extends Logging {
  this.logIdent = s"[Topic Deletion Manager ${config.brokerId}] "
  // Whether topic deletion is enabled
  val isDeleteTopicEnabled: Boolean = config.deleteTopicEnable
}
class ControllerDeletionClient(controller: KafkaController, zkClient: KafkaZkClient) extends DeletionClient {
  // Delete the given topic by removing every trace of it from ZooKeeper.
  // Calls three KafkaZkClient methods to delete the topic's nodes under
  // /brokers/topics, /config/topics and /admin/delete_topics.
  override def deleteTopic(topic: String, epochZkVersion: Int): Unit = {
    // Delete the /brokers/topics/<topic> node
    zkClient.deleteTopicZNode(topic, epochZkVersion)
    // Delete the /config/topics/<topic> node
    zkClient.deleteTopicConfigs(Seq(topic), epochZkVersion)
    // Delete the /admin/delete_topics/<topic> node
    zkClient.deleteTopicDeletions(Seq(topic), epochZkVersion)
  }
  // Delete the child nodes of the given topics under /admin/delete_topics
  override def deleteTopicDeletions(topics: Seq[String], epochZkVersion: Int): Unit = {
    zkClient.deleteTopicDeletions(topics, epochZkVersion)
  }
  // Stop listening for data changes on the /brokers/topics/<topic> node
  override def mutePartitionModifications(topic: String): Unit = {
    controller.unregisterPartitionModificationsHandlers(Seq(topic))
  }
  // Send a metadata update for the given partitions to the brokers in the cluster,
  // telling them to stop serving the partitions of the deleted topic
  override def sendMetadataUpdate(partitions: Set[TopicPartition]): Unit = {
    // Send an UpdateMetadataRequest to all live or shutting-down brokers
    // to notify them of the state change of the given partitions
    controller.sendUpdateMetadataRequest(controller.controllerContext.liveOrShuttingDownBrokerIds.toSeq, partitions)
  }
}
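Because TopicDeletionManager depends only on the DeletionClient interface, the ZooKeeper and broker side effects can be swapped out, for instance in a unit test. A minimal sketch follows, with the four method signatures taken from the overrides above. `RecordingDeletionClient` is a hypothetical test double, and `TopicPartition` here is a local stand-in for the Kafka class so that the example stays self-contained.

```scala
import scala.collection.mutable

object DeletionClientSketch {
  // Local stand-in for org.apache.kafka.common.TopicPartition.
  final case class TopicPartition(topic: String, partition: Int)

  // The four operations named in the text.
  trait DeletionClient {
    def deleteTopic(topic: String, epochZkVersion: Int): Unit
    def deleteTopicDeletions(topics: Seq[String], epochZkVersion: Int): Unit
    def mutePartitionModifications(topic: String): Unit
    def sendMetadataUpdate(partitions: Set[TopicPartition]): Unit
  }

  // Hypothetical test double: records each call instead of touching ZooKeeper or brokers.
  final class RecordingDeletionClient extends DeletionClient {
    val calls: mutable.Buffer[String] = mutable.ArrayBuffer.empty[String]

    override def deleteTopic(topic: String, epochZkVersion: Int): Unit = {
      calls += s"deleteTopic($topic)"
    }
    override def deleteTopicDeletions(topics: Seq[String], epochZkVersion: Int): Unit = {
      calls += s"deleteTopicDeletions(${topics.mkString(",")})"
    }
    override def mutePartitionModifications(topic: String): Unit = {
      calls += s"mutePartitionModifications($topic)"
    }
    override def sendMetadataUpdate(partitions: Set[TopicPartition]): Unit = {
      calls += s"sendMetadataUpdate(${partitions.size})"
    }
  }
}
```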
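- deleteTopic removes every trace of the given topic from ZooKeeper: its nodes under /brokers/topics, /config/topics and /admin/delete_topics.
- mutePartitionModifications stops listening for data changes on the topic's node under /brokers/topics.
- sendMetadataUpdate sends an UpdateMetadataRequest to the brokers in the cluster, notifying them of the state change of the given partitions so that they stop serving the partitions of the deleted topic.
- If a topic cannot be deleted right away because of some ongoing event, such as a replica reassignment of its partitions, resumeDeletions is called later to restart the deletion: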
private def resumeDeletions(): Unit = {
  // Take a snapshot of the topics queued for deletion from the metadata cache
  val topicsQueuedForDeletion = Set.empty[String] ++ controllerContext.topicsToBeDeleted
  if (topicsQueuedForDeletion.nonEmpty)
    info(s"Handling deletion for topics ${topicsQueuedForDeletion.mkString(",")}")
  topicsQueuedForDeletion.foreach { topic =>
    // if all replicas are marked as deleted successfully, then topic deletion is done
    if (controllerContext.areAllReplicasInState(topic, ReplicaDeletionSuccessful)) {
      // All replicas of the topic are already in ReplicaDeletionSuccessful state,
      // i.e. the topic's replicas have been deleted
      // clear up all state for this topic from controller cache and zookeeper
      // completeDeleteTopic finishes the remaining cleanup
      completeDeleteTopic(topic)
      info(s"Deletion of topic $topic successfully completed")
    } else if (!controllerContext.isAnyReplicaInState(topic, ReplicaDeletionStarted)) {
      // No replica of the topic is in ReplicaDeletionStarted state
      // if you come here, then no replica is in TopicDeletionStarted and all replicas are not in
      // TopicDeletionSuccessful. That means, that either given topic haven't initiated deletion
      // or there is at least one failed replica (which means topic deletion should be retried).
      if (controllerContext.isAnyReplicaInState(topic, ReplicaDeletionIneligible)) {
        // Set up the topic's ineligible replicas for a later deletion retry
        retryDeletionForIneligibleReplicas(topic)
      }
    }
    // Try delete topic if it is eligible for deletion.
    if (isTopicEligibleForDeletion(topic)) {
      // The topic can be deleted
      info(s"Deletion of topic $topic (re)started")
      // topic deletion will be kicked off
      onTopicDeletion(Set(topic))
    }
  }
}
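- If all replicas of the topic are already in the deleted state, completeDeleteTopic is called to finish the job and clear the topic's metadata and ZooKeeper state; otherwise the topic is set up for a later retry.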
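The first half of the per-topic decision in resumeDeletions depends only on the states of the topic's replicas, so it can be sketched as a small function. The state names match the ones used in the code above, while `Action`, `decide` and the assumption of a non-empty replica list are illustrative; the separate isTopicEligibleForDeletion check is not modeled.

```scala
object ResumeDeletionsSketch {
  // Replica states referenced by resumeDeletions, plus one unrelated state for contrast.
  sealed trait ReplicaState
  case object ReplicaDeletionStarted extends ReplicaState
  case object ReplicaDeletionSuccessful extends ReplicaState
  case object ReplicaDeletionIneligible extends ReplicaState
  case object OfflineReplica extends ReplicaState

  // Hypothetical outcome of the state check for one topic.
  sealed trait Action
  case object Complete extends Action                // call completeDeleteTopic
  case object RetryIneligibleReplicas extends Action // call retryDeletionForIneligibleReplicas
  case object NoAction extends Action                // deletion in flight or not yet started

  // Assumes replicaStates is non-empty.
  def decide(replicaStates: Seq[ReplicaState]): Action =
    if (replicaStates.forall(_ == ReplicaDeletionSuccessful)) Complete
    else if (!replicaStates.contains(ReplicaDeletionStarted) &&
             replicaStates.contains(ReplicaDeletionIneligible)) RetryIneligibleReplicas
    else NoAction
}
```

A retry is only considered once no replica deletion is still in flight, which is why a mix of ReplicaDeletionStarted and ReplicaDeletionIneligible yields no action.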
private def completeDeleteTopic(topic: String): Unit = {
  // deregister partition change listener on the deleted topic. This is to prevent the partition change listener
  // firing before the new topic listener when a deleted topic gets auto created
  // Step 1: unregister the partition modification listener, so that a partition data change
  // during deletion cannot trigger it and leave the state inconsistent
  client.mutePartitionModifications(topic)
  // Step 2: collect all replicas of the topic that are in ReplicaDeletionSuccessful state,
  // i.e. all replicas that have been deleted successfully
  val replicasForDeletedTopic = controllerContext.replicasInState(topic, ReplicaDeletionSuccessful)
  // controller will remove this replica from the state machine as well as its partition assignment cache
  // Step 3: use the replica state machine to move these replicas to NonExistentReplica state,
  // which amounts to removing them from the state machine
  replicaStateMachine.handleStateChanges(replicasForDeletedTopic.toSeq, NonExistentReplica)
  // Step 4: update the to-be-deleted and deletion-started topic sets in the metadata cache;
  // the topic has been deleted, so it no longer belongs in either of them
  controllerContext.topicsToBeDeleted -= topic
  controllerContext.topicsWithDeletionStarted -= topic
  // Step 5: remove every trace of the topic from ZooKeeper
  client.deleteTopic(topic, controllerContext.epochZkVersion)
  // Step 6: remove every trace of the topic from the metadata cache
  controllerContext.removeTopic(topic)
}
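- onTopicDeletion finds the topics in the given list whose deletion has not been started yet and starts it; it then deletes their partitions, which removes the physical files on disk.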
private def onTopicDeletion(topics: Set[String]): Unit = {
  info(s"Topic deletion callback for ${topics.mkString(",")}")
  // send update metadata so that brokers stop serving data for topics to be deleted
  val partitions = topics.flatMap(controllerContext.partitionsForTopic)
  // Find the topics in the given list whose deletion has not been started yet
  val unseenTopicsForDeletion = topics -- controllerContext.topicsWithDeletionStarted
  if (unseenTopicsForDeletion.nonEmpty) {
    // Collect all partitions of these topics
    val unseenPartitionsForDeletion = unseenTopicsForDeletion.flatMap(controllerContext.partitionsForTopic)
    // Move these partitions to OfflinePartition and then to NonExistentPartition,
    // which amounts to removing them from the partition state machine
    partitionStateMachine.handleStateChanges(unseenPartitionsForDeletion.toSeq, OfflinePartition)
    partitionStateMachine.handleStateChanges(unseenPartitionsForDeletion.toSeq, NonExistentPartition)
    // adding of unseenTopicsForDeletion to topics with deletion started must be done after the partition
    // state changes to make sure the offlinePartitionCount metric is properly updated
    // Add these topics to the set of topics with deletion started
    controllerContext.beginTopicDeletion(unseenTopicsForDeletion)
  }
  // Send a metadata update to the brokers in the cluster, telling them to stop serving these topics
  client.sendMetadataUpdate(partitions)
  topics.foreach { topic =>
    // Partition deletion performs the underlying removal of the physical files on disk
    onPartitionDeletion(controllerContext.partitionsForTopic(topic))
  }
}
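The bookkeeping at the start of onTopicDeletion can be illustrated with plain collections. The sketch below is hypothetical and works at topic granularity for brevity, whereas the method above transitions individual partitions; `Result` and `startDeletion` are illustrative names and the state names are kept as strings.

```scala
object OnTopicDeletionSketch {
  // Hypothetical result: the updated deletion-started set and the ordered state transitions.
  final case class Result(
    topicsWithDeletionStarted: Set[String],
    transitions: Seq[(String, String)] // (topic, new partition state), in the order applied
  )

  def startDeletion(topics: Set[String], alreadyStarted: Set[String]): Result = {
    // Only topics whose deletion has not been started go through the state transitions.
    val unseen = (topics -- alreadyStarted).toSeq.sorted
    // All unseen topics go offline first, then all of them become non-existent.
    val transitions =
      unseen.map(t => (t, "OfflinePartition")) ++ unseen.map(t => (t, "NonExistentPartition"))
    // The deletion-started set is updated only after the transitions, as in the method above.
    Result(alreadyStarted ++ unseen, transitions)
  }
}
```

A topic that is already in the deletion-started set produces no transitions, so calling the function again for a retried topic does not repeat the state changes.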