The source code for Kafka's broker-side consumer group manager lives in the kafka.coordinator.group package
Core overview
1. GroupMetadataManager holds multiple groups
groupMetadataCache = new Pool[String, GroupMetadata]
2. GroupMetadata holds multiple members plus the offset information
members = new mutable.HashMap[String, MemberMetadata]
offsets = new mutable.HashMap[TopicPartition, CommitRecordMetadataAndOffset]
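3. MemberMetadata holds
information about a single member, including its id, host and so on
The analysis is split into four parts
1. Metadata: MemberMetadata.scala & GroupMetadata.scala
2. Metadata management: GroupMetadataManager.scala
3. The __consumer_offsets topic
4. The group coordinator: GroupCoordinator
Part 1: MemberMetadata.scala & GroupMetadata.scala
I. MemberMetadata.scala
The fields are as follows: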
private[group] class MemberMetadata(var memberId: String,
val groupId: String,
val groupInstanceId: Option[String],
val clientId: String,
val clientHost: String,
val rebalanceTimeoutMs: Int,
val sessionTimeoutMs: Int,
val protocolType: String,
var supportedProtocols: List[(String, Array[Byte])]) {
var assignment: Array[Byte] = Array.empty[Byte]
var awaitingJoinCallback: JoinGroupResult => Unit = null
var awaitingSyncCallback: SyncGroupResult => Unit = null
var isLeaving: Boolean = false
var isNew: Boolean = false
val isStaticMember: Boolean = groupInstanceId.isDefined
var heartbeatSatisfied: Boolean = false
As the fields above show, this class represents one member (one consumer) of a group; only the basic core fields need attention
var memberId: String
val groupId: String
val clientId: String
val clientHost: String
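var supportedProtocols: List[(String, Array[Byte])] the partition assignment strategies the member supports, e.g. round-robin
var assignment: Array[Byte] the partitions assigned to this member, in serialized form
The variables prefixed with "is" are self-explanatory and describe the member's state
The methods of this class mostly read or update these fields; take one method as an example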
/**
* Vote for one of the potential group protocols. This takes into account the protocol preference as
* indicated by the order of supported protocols and returns the first one also contained in the set
*/
def vote(candidates: Set[String]): String = {
supportedProtocols.find({ case (protocol, _) => candidates.contains(protocol)}) match {
case Some((protocol, _)) => protocol
case None =>
throw new IllegalArgumentException("Member does not support any of the candidate protocols")
}
}
The method above is how a member votes for a partition assignment strategy
It walks supportedProtocols in order and picks the first strategy that also appears in candidates
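The voting rule can be tried out in isolation with a minimal sketch. The Member case class below is a hypothetical stand-in for MemberMetadata that keeps only the ordered protocol list, and the protocol names "range" and "roundrobin" are used purely as sample values.

```scala
object VoteSketch {
  // Hypothetical, simplified stand-in for MemberMetadata: only the ordered
  // list of (protocol name, protocol metadata) pairs is kept.
  final case class Member(memberId: String, supportedProtocols: List[(String, Array[Byte])]) {
    // Return the first protocol, in the member's own preference order,
    // that is also one of the candidates.
    def vote(candidates: Set[String]): String =
      supportedProtocols
        .collectFirst { case (protocol, _) if candidates.contains(protocol) => protocol }
        .getOrElse(throw new IllegalArgumentException(
          "Member does not support any of the candidate protocols"))
  }

  def main(args: Array[String]): Unit = {
    val member = Member(
      "consumer-1",
      List("range" -> Array.empty[Byte], "roundrobin" -> Array.empty[Byte]))
    println(member.vote(Set("roundrobin", "range"))) // the member lists "range" first
    println(member.vote(Set("roundrobin")))          // only "roundrobin" is on offer
  }
}
```

The order of the candidates set does not matter; only the member's own preference order decides the vote.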
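II. GroupMetadata.scala
1. Group state classes. The English comments on each state spell out how it answers each kind of request and which transitions it allows
Empty: a group that currently has no members
PreparingRebalance: a group whose members are in the middle of joining
CompletingRebalance: a group waiting for the leader to produce the assignment
Stable: a group that has finished rebalancing and is working normally
Dead: a group that has no members and whose metadata is being removed
The only member of each state object, val validPreviousStates, lists the states from which the group may legally enter that state
// Base trait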
private[group] sealed trait GroupState {
val validPreviousStates: Set[GroupState]
}
// Preparing to rebalance
/**
* Group is preparing to rebalance
*
* action: respond to heartbeats with REBALANCE_IN_PROGRESS
* respond to sync group with REBALANCE_IN_PROGRESS
* remove member on leave group request
* park join group requests from new or existing members until all expected members have joined
* allow offset commits from previous generation
* allow offset fetch requests
* transition: some members have joined by the timeout => CompletingRebalance
* all members have left the group => Empty
* group is removed by partition emigration => Dead
*/
private[group] case object PreparingRebalance extends GroupState {
val validPreviousStates: Set[GroupState] = Set(Stable, CompletingRebalance, Empty)
}
// All members have joined; waiting for the leader to send the assignment
/**
* Group is awaiting state assignment from the leader
*
* action: respond to heartbeats with REBALANCE_IN_PROGRESS
* respond to offset commits with REBALANCE_IN_PROGRESS
* park sync group requests from followers until transition to Stable
* allow offset fetch requests
* transition: sync group with state assignment received from leader => Stable
* join group from new member or existing member with updated metadata => PreparingRebalance
* leave group from existing member => PreparingRebalance
* member failure detected => PreparingRebalance
* group is removed by partition emigration => Dead
*/
private[group] case object CompletingRebalance extends GroupState {
val validPreviousStates: Set[GroupState] = Set(PreparingRebalance)
}
// Stable
/**
* Group is stable
*
* action: respond to member heartbeats normally
* respond to sync group from any member with current assignment
* respond to join group from followers with matching metadata with current group metadata
* allow offset commits from member of current generation
* allow offset fetch requests
* transition: member failure detected via heartbeat => PreparingRebalance
* leave group from existing member => PreparingRebalance
* leader join-group received => PreparingRebalance
* follower join-group with new metadata => PreparingRebalance
* group is removed by partition emigration => Dead
*/
private[group] case object Stable extends GroupState {
val validPreviousStates: Set[GroupState] = Set(CompletingRebalance)
}
// Dead: awaiting removal, members can no longer join
/**
* Group has no more members and its metadata is being removed
*
* action: respond to join group with UNKNOWN_MEMBER_ID
* respond to sync group with UNKNOWN_MEMBER_ID
* respond to heartbeat with UNKNOWN_MEMBER_ID
* respond to leave group with UNKNOWN_MEMBER_ID
* respond to offset commit with UNKNOWN_MEMBER_ID
* allow offset fetch requests
* transition: Dead is a final state before group metadata is cleaned up, so there are no transitions
*/
private[group] case object Dead extends GroupState {
val validPreviousStates: Set[GroupState] = Set(Stable, PreparingRebalance, CompletingRebalance, Empty, Dead)
}
// No members, but the group's offsets have not expired yet; new members can still join
/**
* Group has no more members, but lingers until all offsets have expired. This state
* also represents groups which use Kafka only for offset commits and have no members.
*
* action: respond normally to join group from new members
* respond to sync group with UNKNOWN_MEMBER_ID
* respond to heartbeat with UNKNOWN_MEMBER_ID
* respond to leave group with UNKNOWN_MEMBER_ID
* respond to offset commit with UNKNOWN_MEMBER_ID
* allow offset fetch requests
* transition: last offsets removed in periodic expiration task => Dead
* join group from a new member => PreparingRebalance
* group is removed by partition emigration => Dead
* group is removed by expiration => Dead
*/
private[group] case object Empty extends GroupState {
val validPreviousStates: Set[GroupState] = Set(PreparingRebalance)
}
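Since each state only records its legal predecessors, the set of legal successors has to be derived by inverting that relation. The following is a minimal sketch under a few assumptions: the State trait and the helpers canTransition and validNextStates are illustrative names, not part of the Kafka source, and validPreviousStates is declared as a def here so that the mutually referencing objects cannot run into initialization-order problems.

```scala
object GroupStateSketch {
  sealed trait State { def validPreviousStates: Set[State] }

  case object PreparingRebalance extends State {
    def validPreviousStates: Set[State] = Set(Stable, CompletingRebalance, Empty)
  }
  case object CompletingRebalance extends State {
    def validPreviousStates: Set[State] = Set(PreparingRebalance)
  }
  case object Stable extends State {
    def validPreviousStates: Set[State] = Set(CompletingRebalance)
  }
  case object Dead extends State {
    def validPreviousStates: Set[State] =
      Set(Stable, PreparingRebalance, CompletingRebalance, Empty, Dead)
  }
  case object Empty extends State {
    def validPreviousStates: Set[State] = Set(PreparingRebalance)
  }

  val all: Set[State] = Set(PreparingRebalance, CompletingRebalance, Stable, Dead, Empty)

  // A transition is legal when the target state lists the source as a predecessor.
  def canTransition(from: State, to: State): Boolean =
    to.validPreviousStates.contains(from)

  // Invert the predecessor relation to obtain the legal successors of a state.
  def validNextStates(from: State): Set[State] =
    all.filter(to => canTransition(from, to))

  def main(args: Array[String]): Unit =
    all.foreach(s => println(s"$s -> ${validNextStates(s).mkString(", ")}"))
}
```

Read this way, the table shows for example that an Empty group can only move to PreparingRebalance or Dead, and never straight to Stable.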
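2. GroupMetadata, the core group state class
The two most important fields are members and offsets
private[group] class GroupMetadata(
val groupId: String, // group id
initialState: GroupState, // initial state
time: Time) { // clock used to timestamp state changes
type JoinCallback = JoinGroupResult => Unit // callback for a join request
private[group] val lock = new ReentrantLock // lock guarding the group's state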
private var state: GroupState = initialState
var currentStateTimestamp: Option[Long] = Some(time.milliseconds())
var protocolType: Option[String] = None
var protocolName: Option[String] = None
var generationId = 0 // incremented on every rebalance
private var leaderId: Option[String] = None // member id of the leader
private val members = new mutable.HashMap[String, MemberMetadata] // the core field: metadata of every member
private val supportedProtocols = new mutable.HashMap[String, Integer]().withDefaultValue(0) // number of members supporting each assignment strategy
private val offsets = new mutable.HashMap[TopicPartition, CommitRecordMetadataAndOffset] // committed offset per partition
// When protocolType == `consumer`, a set of subscribed topics is maintained. The set is
// computed when a new generation is created or when the group is restored from the log.
private var subscribedTopics: Option[Set[String]] = None // topics the consumers subscribe to
var newMemberAdded: Boolean = false
....
}
3. Group state management
1) State transition
def transitionTo(groupState: GroupState): Unit = {
assertValidTransition(groupState) // check that the current state is a valid predecessor
state = groupState
currentStateTimestamp = Some(time.milliseconds())
}
private def assertValidTransition(targetState: GroupState): Unit = {
if (!targetState.validPreviousStates.contains(state))
throw new IllegalStateException("Group %s should be in the %s states before moving to %s state. Instead it is in %s state"
.format(groupId, targetState.validPreviousStates.mkString(","), targetState, state))
}
2) Query the current state
def currentState = state
3) Whether the group can rebalance
def canRebalance = PreparingRebalance.validPreviousStates.contains(state)
4. Group member management
At its core this is adding, removing, updating and looking up entries in the members field of GroupMetadata
1) add
def add(member: MemberMetadata, callback: JoinCallback = null): Unit = {
if (members.isEmpty)
this.protocolType = Some(member.protocolType)
assert(groupId == member.groupId)
assert(this.protocolType.orNull == member.protocolType)
// Ensure the member's protocol type and supported assignment strategies are compatible with the group's
assert(supportsProtocols(member.protocolType, MemberMetadata.plainProtocolSet(member.supportedProtocols)))
if (leaderId.isEmpty)
leaderId = Some(member.memberId)
members.put(member.memberId, member)
member.supportedProtocols.foreach{ case (protocol, _) => supportedProtocols(protocol) += 1 }
member.awaitingJoinCallback = callback
if (member.isAwaitingJoin)
numMembersAwaitingJoin += 1
}
2) remove
def remove(memberId: String): Unit = {
members.remove(memberId).foreach { member =>
member.supportedProtocols.foreach{ case (protocol, _) => supportedProtocols(protocol) -= 1 }
if (member.isAwaitingJoin)
numMembersAwaitingJoin -= 1
}
if (isLeader(memberId))
leaderId = members.keys.headOption
}
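The bookkeeping done by add and remove (keeping the per-protocol counters in step with the member map and handing leadership over when the leader leaves) can be sketched on its own. The assumptions here are that Member and Group are simplified stand-ins, that candidateProtocols is an illustrative helper not shown in the excerpt above, and that a LinkedHashMap replaces the HashMap of the source so that the next leader is predictable in this example.

```scala
import scala.collection.mutable

object MembershipSketch {
  // Simplified stand-in for MemberMetadata: just an id and protocol names.
  final case class Member(memberId: String, protocols: List[String])

  final class Group {
    // Insertion-ordered only to make leader hand-over deterministic in this sketch.
    private val members = mutable.LinkedHashMap.empty[String, Member]
    private val supportedProtocols = mutable.HashMap.empty[String, Int].withDefaultValue(0)
    private var leaderId: Option[String] = None

    def add(member: Member): Unit = {
      if (leaderId.isEmpty) leaderId = Some(member.memberId) // first member becomes leader
      members.put(member.memberId, member)
      member.protocols.foreach(p => supportedProtocols(p) += 1)
    }

    def remove(memberId: String): Unit = {
      members.remove(memberId).foreach { m =>
        m.protocols.foreach(p => supportedProtocols(p) -= 1)
      }
      // If the leader left, pick any remaining member (or None if the group is empty).
      if (leaderId.contains(memberId)) leaderId = members.keys.headOption
    }

    def leader: Option[String] = leaderId

    // Hypothetical helper: protocols supported by every current member.
    def candidateProtocols: Set[String] =
      supportedProtocols.iterator
        .collect { case (p, n) if n > 0 && n == members.size => p }
        .toSet
  }
}
```

A counter that equals the number of members means every member supports that strategy, which is why the counters are updated in the same place as the member map.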
5. Committed offset management
The field that holds the offsets
private val offsets = new mutable.HashMap[TopicPartition, CommitRecordMetadataAndOffset]
The class that represents a committed offset
case class CommitRecordMetadataAndOffset(appendedBatchOffset: Option[Long], offsetAndMetadata: OffsetAndMetadata) {
def olderThan(that: CommitRecordMetadataAndOffset): Boolean = appendedBatchOffset.get < that.appendedBatchOffset.get
}
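appendedBatchOffset: the offset of the commit record itself within the offsets topic (its position in __consumer_offsets)
offsetAndMetadata: the consumer group's committed offset carried in the commit record
Updating offsets after a commit has been appended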
def onOffsetCommitAppend(topicPartition: TopicPartition, offsetWithCommitRecordMetadata: CommitRecordMetadataAndOffset): Unit = {
if (pendingOffsetCommits.contains(topicPartition)) {
if (offsetWithCommitRecordMetadata.appendedBatchOffset.isEmpty)
throw new IllegalStateException("Cannot complete offset commit write without providing the metadata of the record " +
"in the log.")
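// Store the commit if offsets has no entry for this partition yet, or
// if the cached entry's record sits at a lower offset in the offsets topic than the one being written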
if (!offsets.contains(topicPartition) || offsets(topicPartition).olderThan(offsetWithCommitRecordMetadata))
offsets.put(topicPartition, offsetWithCommitRecordMetadata)
}
pendingOffsetCommits.get(topicPartition) match {
case Some(stagedOffset) if offsetWithCommitRecordMetadata.offsetAndMetadata == stagedOffset =>
pendingOffsetCommits.remove(topicPartition)
case _ =>
// The pendingOffsetCommits for this partition could be empty if the topic was deleted, in which case
// its entries would be removed from the cache by the `removeOffsets` method.
}
}
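The ordering rule in onOffsetCommitAppend (the cache keeps whichever commit record sits latest in the offsets topic, regardless of arrival order) is easier to see with simplified types. In the sketch below the partition is a plain String, the committed offset a plain Long, and Commit, Cache, prepare, onAppend and committed are illustrative names rather than Kafka APIs.

```scala
import scala.collection.mutable

object OffsetCacheSketch {
  // appendedBatchOffset: position of the commit record in the offsets topic.
  // committedOffset: the consumer position carried by that record.
  final case class Commit(appendedBatchOffset: Option[Long], committedOffset: Long) {
    def olderThan(that: Commit): Boolean =
      appendedBatchOffset.get < that.appendedBatchOffset.get
  }

  final class Cache {
    private val offsets = mutable.HashMap.empty[String, Commit]
    private val pending = mutable.HashMap.empty[String, Long]

    // Stage a commit before it is written to the log.
    def prepare(partition: String, committedOffset: Long): Unit =
      pending.put(partition, committedOffset)

    // Called once the commit record has been appended to the log.
    def onAppend(partition: String, commit: Commit): Unit = {
      if (pending.contains(partition)) {
        require(commit.appendedBatchOffset.isDefined,
          "the log offset of the commit record is required")
        // Keep the commit only if nothing is cached or the cached record is older.
        if (offsets.get(partition).forall(_.olderThan(commit)))
          offsets.put(partition, commit)
      }
      // Clear the staged entry only if it is the one that was just appended.
      if (pending.get(partition).contains(commit.committedOffset))
        pending.remove(partition)
    }

    def committed(partition: String): Option[Long] =
      offsets.get(partition).map(_.committedOffset)
  }
}
```

Note that the comparison uses the record's position in the log, not the committed consumer offset, so a commit that rewinds the consumer still wins if its record was appended later.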
Removing expired offsets
def removeExpiredOffsets(currentTimestamp: Long, offsetRetentionMs: Long): Map[TopicPartition, OffsetAndMetadata] = {
def getExpiredOffsets(baseTimestamp: CommitRecordMetadataAndOffset => Long,
subscribedTopics: Set[String] = Set.empty): Map[TopicPartition, OffsetAndMetadata] = {
offsets.filter {
case (topicPartition, commitRecordMetadataAndOffset) =>
!subscribedTopics.contains(topicPartition.topic()) &&
!pendingOffsetCommits.contains(topicPartition) && {
commitRecordMetadataAndOffset.offsetAndMetadata.expireTimestamp match {
case None =>
// current version with no per partition retention
currentTimestamp - baseTimestamp(commitRecordMetadataAndOffset) >= offsetRetentionMs
case Some(expireTimestamp) =>
// older versions with explicit expire_timestamp field => old expiration semantics is used
currentTimestamp >= expireTimestamp
}
}
}.map {
case (topicPartition, commitRecordOffsetAndMetadata) =>
(topicPartition, commitRecordOffsetAndMetadata.offsetAndMetadata)
}.toMap
}
val expiredOffsets: Map[TopicPartition, OffsetAndMetadata] = protocolType match {
case Some(_) if is(Empty) =>
// no consumer exists in the group =>
// - if current state timestamp exists and retention period has passed since group became Empty,
// expire all offsets with no pending offset commit;
// - if there is no current state timestamp (old group metadata schema) and retention period has passed
// since the last commit timestamp, expire the offset
getExpiredOffsets(
commitRecordMetadataAndOffset => currentStateTimestamp
.getOrElse(commitRecordMetadataAndOffset.offsetAndMetadata.commitTimestamp)
)
case Some(ConsumerProtocol.PROTOCOL_TYPE) if subscribedTopics.isDefined =>
// consumers exist in the group =>
// - if the group is aware of the subscribed topics and retention period had passed since the
// the last commit timestamp, expire the offset. offset with pending offset commit are not
// expired
getExpiredOffsets(
_.offsetAndMetadata.commitTimestamp,
subscribedTopics.get
)
case None =>
// protocolType is None => standalone (simple) consumer, that uses Kafka for offset storage only
// expire offsets with no pending offset commit that retention period has passed since their last commit
getExpiredOffsets(_.offsetAndMetadata.commitTimestamp)
case _ =>
Map()
}
if (expiredOffsets.nonEmpty)
debug(s"Expired offsets from group '$groupId': ${expiredOffsets.keySet}")
offsets --= expiredOffsets.keySet
expiredOffsets
}
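The per-offset expiry test inside getExpiredOffsets can be reduced to a small, self-contained sketch. It assumes simplified types (a partition is a (topic, partition) tuple, Commit carries only the two timestamps) and always uses the commit timestamp as the base timestamp, which corresponds to the branches for groups with active consumers and for standalone consumers; the names isExpired and expiredPartitions are illustrative.

```scala
object ExpirySketch {
  final case class Commit(commitTimestamp: Long, expireTimestamp: Option[Long])

  // Without an explicit expire timestamp the retention period is measured
  // from baseTimestamp; with one, the explicit timestamp decides.
  def isExpired(commit: Commit, baseTimestamp: Long, now: Long, retentionMs: Long): Boolean =
    commit.expireTimestamp match {
      case None           => now - baseTimestamp >= retentionMs
      case Some(expireAt) => now >= expireAt
    }

  // Offsets of subscribed topics and offsets with a pending commit never expire here.
  def expiredPartitions(offsets: Map[(String, Int), Commit],
                        subscribedTopics: Set[String],
                        pending: Set[(String, Int)],
                        now: Long,
                        retentionMs: Long): Set[(String, Int)] =
    offsets.filter { case (tp, commit) =>
      !subscribedTopics.contains(tp._1) &&
      !pending.contains(tp) &&
      isExpired(commit, commit.commitTimestamp, now, retentionMs)
    }.keySet
}
```

For an Empty group the source passes the time the group became Empty as the base timestamp when it is available, which in this sketch would mean passing a different baseTimestamp to isExpired.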