Both the messages recording GroupMetadata information and the messages recording consumed offset positions choose their target partition via partitionFor(), so messages belonging to the same group always land in the same partition of the __consumer_offsets topic.
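The partition selection can be sketched as below. This mirrors the shape of the real implementation (a non-negative hash of the group id, modulo the partition count of __consumer_offsets); the count of 50 is only the default of the offsets.topic.num.partitions setting, assumed here for illustration.

```scala
// Hypothetical sketch of partitionFor(): hash the group id, force it
// non-negative, and take it modulo the offsets-topic partition count.
object PartitionForSketch {
  val groupMetadataTopicPartitionCount = 50 // assumed default of offsets.topic.num.partitions

  def partitionFor(groupId: String): Int =
    (groupId.hashCode & 0x7fffffff) % groupMetadataTopicPartitionCount
}
```

Because both message types derive the partition from the same group id, an offset-commit message and the group-metadata message of one group can never be split across partitions.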
GroupMetadataManager provides methods for managing the groupsCache collection; removeGroup() deserves particular attention.
It deletes the GroupMetadata object and writes a message with a null value into __consumer_offsets as a deletion marker (a tombstone).
def removeGroup(group: GroupMetadata) {
  // guard this removal in case of concurrent access (e.g. if a delayed join completes with no members
  // while the group is being removed due to coordinator emigration)
  // Remove the GroupMetadata from groupsCache
  if (groupsCache.remove(group.groupId, group)) {
    // Append the tombstone messages to the partition. It is okay if the replicas don't receive these (say,
    // if we crash or leaders move) since the new leaders will still expire the consumers with heartbeat and
    // retry removing this group.
    // Determine the __consumer_offsets partition id for this consumer group
    val groupPartition = partitionFor(group.groupId)
    val (magicValue, timestamp) = getMessageFormatVersionAndTimestamp(groupPartition)
    // Build the tombstone message: note that its value (bytes) is null and its key is derived from the group id
    val tombstone = new Message(bytes = null, key = GroupMetadataManager.groupMetadataKey(group.groupId),
      timestamp = timestamp, magicValue = magicValue)
    // Look up the Partition object of the offsets topic
    val partitionOpt = replicaManager.getPartition(TopicConstants.GROUP_METADATA_TOPIC_NAME, groupPartition)
    partitionOpt.foreach { partition =>
      val appendPartition = TopicAndPartition(TopicConstants.GROUP_METADATA_TOPIC_NAME, groupPartition)
      trace("Marking group %s as deleted.".format(group.groupId))
      try {
        // do not need to require acks since even if the tombstone is lost,
        // it will be appended again by the new leader
        // TODO KAFKA-2720: periodic purging instead of immediate removal of groups
        // Append the tombstone to the leader replica
        partition.appendMessagesToLeader(new ByteBufferMessageSet(config.offsetsTopicCompressionCodec, tombstone))
      } catch {
        case t: Throwable =>
          error("Failed to mark group %s as deleted in %s.".format(group.groupId, appendPartition), t)
          // ignore and continue
      }
    }
  }
}
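Why does a null-value message count as a deletion marker? __consumer_offsets is a log-compacted topic, and during compaction a record with a non-null key and a null value (a tombstone) causes the key to be removed entirely. The following is a hypothetical illustration of that semantics, not Kafka code, using a Map to model the compacted (latest-value-per-key) view of the log:

```scala
// Hypothetical illustration of log-compaction tombstone semantics.
object CompactionSketch {
  // Apply messages in log order; None models a null-value tombstone.
  def compactedView(log: Seq[(String, Option[String])]): Map[String, String] =
    log.foldLeft(Map.empty[String, String]) {
      case (state, (key, Some(value))) => state + (key -> value) // normal record: keep the latest value
      case (state, (key, None))        => state - key            // tombstone: drop the key
    }
}
```

So once the tombstone written by removeGroup() is compacted, no trace of the group's metadata remains in the topic.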
When GroupMetadataManager is initialized, it creates and starts a KafkaScheduler that periodically invokes the deleteExpiredOffsets() method.
Besides removing the corresponding OffsetAndMetadata objects from the offsetsCache collection, deleteExpiredOffsets() must also append tombstone messages to __consumer_offsets.
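The periodic-purge pattern can be sketched with a plain ScheduledExecutorService standing in for KafkaScheduler; the counter stands in for the real purge logic, and the period and method names here are assumptions for illustration only:

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// A minimal sketch (not Kafka code) of scheduling a recurring purge task,
// analogous to KafkaScheduler periodically calling deleteExpiredOffsets().
object PeriodicPurgeSketch {
  def runFor(durationMs: Long, periodMs: Long): Int = {
    val purges = new AtomicInteger(0)
    val deleteExpiredOffsets = new Runnable {
      def run(): Unit = purges.incrementAndGet() // stand-in for the real purge
    }
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    // initial delay 0, then run every periodMs milliseconds
    scheduler.scheduleAtFixedRate(deleteExpiredOffsets, 0, periodMs, TimeUnit.MILLISECONDS)
    Thread.sleep(durationMs)
    scheduler.shutdown()
    scheduler.awaitTermination(1, TimeUnit.SECONDS)
    purges.get()
  }
}
```

In Kafka the purge interval is governed by the offsets.retention.check.interval.ms configuration.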
private def deleteExpiredOffsets() {
  debug("Collecting expired offsets.")
  val startMs = time.milliseconds()
  // Filter out all expired OffsetAndMetadata entries
  val numExpiredOffsetsRemoved = inWriteLock(offsetExpireLock) {
    val expiredOffsets = offsetsCache.filter { case (groupTopicPartition, offsetAndMetadata) =>
      offsetAndMetadata.expireTimestamp < startMs
    }
    debug("Found %d expired offsets.".format(expiredOffsets.size))
    // delete the expired offsets from the table and generate tombstone messages to remove them from the log
    val tombstonesForPartition = expiredOffsets.map { case (groupTopicAndPartition, offsetAndMetadata) =>
      // Find the offsets-topic partition id that stores this group's offset messages
      val offsetsPartition = partitionFor(groupTopicAndPartition.group)
      trace("Removing expired offset and metadata for %s: %s".format(groupTopicAndPartition, offsetAndMetadata))
      // Remove the corresponding OffsetAndMetadata object
      offsetsCache.remove(groupTopicAndPartition)
      // Build the message key
      val commitKey = GroupMetadataManager.offsetCommitKey(groupTopicAndPartition.group,
        groupTopicAndPartition.topicPartition.topic, groupTopicAndPartition.topicPartition.partition)
      // Get the magic value and timestamp used by this partition
      val (magicValue, timestamp) = getMessageFormatVersionAndTimestamp(offsetsPartition)
      // Yield the partition id together with a tombstone message
      (offsetsPartition, new Message(bytes = null, key = commitKey, timestamp = timestamp, magicValue = magicValue))
    }.groupBy { case (partition, tombstone) => partition }
    // Append the tombstone messages to the offset partitions. It is okay if the replicas don't receive these (say,
    // if we crash or leaders move) since the new leaders will get rid of expired offsets during their own purge cycles.
    // One __consumer_offsets partition may record offsets of multiple groups, so the tombstones
    // were grouped by offsets-topic partition id above to allow a single append per partition
    tombstonesForPartition.flatMap { case (offsetsPartition, tombstones) =>
      val partitionOpt = replicaManager.getPartition(TopicConstants.GROUP_METADATA_TOPIC_NAME, offsetsPartition)
      partitionOpt.map { partition =>
        val appendPartition = TopicAndPartition(TopicConstants.GROUP_METADATA_TOPIC_NAME, offsetsPartition)
        // Collect the tombstone messages destined for this offsets-topic partition
        val messages = tombstones.map(_._2).toSeq
        trace("Marked %d offsets in %s for deletion.".format(messages.size, appendPartition))
        try {
          // do not need to require acks since even if the tombstone is lost,
          // it will be appended again in the next purge cycle
          // Append the tombstone messages
          partition.appendMessagesToLeader(new ByteBufferMessageSet(config.offsetsTopicCompressionCodec, messages: _*))
          tombstones.size
        } catch {
          case t: Throwable =>
            error("Failed to mark %d expired offsets for deletion in %s.".format(messages.size, appendPartition), t)
            // ignore and continue
            0
        }
      }
    }.sum
  }
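The grouping step deserves a closer look: since one __consumer_offsets partition can hold expired offsets of many groups, the (partition id, tombstone) pairs are grouped by partition id so that only one appendMessagesToLeader() call is needed per partition. A minimal sketch of that transformation (not Kafka code, tombstones shown as plain strings):

```scala
// Hypothetical sketch of grouping tombstones by offsets-topic partition id.
object TombstoneGroupingSketch {
  // (partition id, tombstone) pairs -> partition id -> tombstones for that partition
  def groupByPartition[M](tombstones: Seq[(Int, M)]): Map[Int, Seq[M]] =
    tombstones
      .groupBy { case (partition, _) => partition }
      .map { case (partition, pairs) => partition -> pairs.map(_._2) }
}
```

Batching per partition both reduces the number of appends and lets each batch be compressed together with config.offsetsTopicCompressionCodec.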