1. RPC Use Cases
When a consumer consumes data from Kafka, it needs to save the position it has consumed up to (the offset) so that after a restart it can resume fetching from where it left off. Committing the last consumed offset uses the OFFSET_COMMIT RPC, and reading back the previously committed offset uses the OFFSET_FETCH RPC. Both paths are further split into old-consumer and new-consumer handling.
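The commit/fetch cycle can be pictured as a toy offset store (plain Java, not the Kafka API; `OffsetStore` and its methods are hypothetical names used only for illustration):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the commit/fetch cycle: the broker-side store is a map from
// (group, topic, partition) to the last committed offset.
public class OffsetStoreDemo {
    static class OffsetStore {
        private final Map<String, Long> committed = new HashMap<>();

        private String key(String group, String topic, int partition) {
            return group + "/" + topic + "/" + partition;
        }

        // OFFSET_COMMIT: persist the consumer's progress.
        void commit(String group, String topic, int partition, long offset) {
            committed.put(key(group, topic, partition), offset);
        }

        // OFFSET_FETCH: read back the progress, -1 if nothing was committed.
        long fetch(String group, String topic, int partition) {
            return committed.getOrDefault(key(group, topic, partition), -1L);
        }
    }

    public static void main(String[] args) {
        OffsetStore store = new OffsetStore();
        // First run: consume up to offset 42, then commit before shutting down.
        store.commit("my-group", "events", 0, 42L);
        // After restart: fetch the committed offset and resume from there.
        long resumeFrom = store.fetch("my-group", "events", 0);
        System.out.println(resumeFrom); // prints 42
    }
}
```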
2. OFFSET_COMMIT RPC Source Code Analysis
2.1 Fields Carried by the RPC
private String groupId;
private int generationId;
private String memberId;
private String groupInstanceId;
private long retentionTimeMs;
private List<OffsetCommitRequestTopic> topics;
groupId: the unique identifier of the consuming group; each group's consumption progress is tracked independently
generationId: the current generation (epoch) of the group, used to fence off commits from members of a stale generation
memberId: the identifier the group coordinator assigned to this consumer
groupInstanceId: uniquely identifies a particular static member instance of the group
retentionTimeMs: how long the committed offsets for the group are retained before they expire
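Taken together, a commit request payload can be sketched as a simple holder class (`CommitRequest` is hypothetical, not the real `OffsetCommitRequestData`; the -1 defaults reflect the protocol's convention that -1 means "unknown generation" and "use the broker-side retention default"):

```java
import java.util.List;

// Sketch of the OFFSET_COMMIT payload described above (hypothetical holder,
// not the actual Kafka request class).
public class CommitRequest {
    final String groupId;          // unique id of the consuming group
    final int generationId;        // group generation, fences stale members (-1 = unknown)
    final String memberId;         // id the coordinator assigned to this consumer
    final String groupInstanceId;  // static-membership instance id (may be null)
    final long retentionTimeMs;    // -1 = use the broker's configured offset retention
    final List<String> topics;     // stand-in for the per-topic/partition offset entries

    CommitRequest(String groupId, int generationId, String memberId,
                  String groupInstanceId, long retentionTimeMs, List<String> topics) {
        this.groupId = groupId;
        this.generationId = generationId;
        this.memberId = memberId;
        this.groupInstanceId = groupInstanceId;
        this.retentionTimeMs = retentionTimeMs;
        this.topics = topics;
    }
}
```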
2.2 Client Entry Point
The client sends this RPC via the ConsumerCoordinator.sendOffsetCommitRequest method. Commits happen in two modes: with auto-commit, the client periodically commits the offset of the latest consumed record to the broker; with manual commit, the application decides when to commit (synchronously or asynchronously).
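The two modes are selected through standard consumer configuration (the property names below are the real Kafka consumer configs; the values are examples):

```properties
# Auto-commit: the client commits the latest polled offsets in the background.
enable.auto.commit=true
auto.commit.interval.ms=5000

# Manual commit: disable auto-commit and call commitSync()/commitAsync()
# from application code once processing is complete.
# enable.auto.commit=false
```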
2.3 Server-Side Handling
On the server side, KafkaApis.handleOffsetCommitRequest() processes the client's RPC. Its core logic splits requests into V0 versus versions above V0. For V0, the offset is simply written to the corresponding ZooKeeper node (this path is typically used by the simple consumer):
if (header.apiVersion == 0) {
  // for version 0 always store offsets to ZK
  val responseInfo = authorizedTopicRequestInfo.map {
    case (topicPartition, partitionData) =>
      try {
        if (partitionData.committedMetadata() != null
          && partitionData.committedMetadata().length > config.offsetMetadataMaxSize)
          (topicPartition, Errors.OFFSET_METADATA_TOO_LARGE)
        else {
          zkClient.setOrCreateConsumerOffset(
            offsetCommitRequest.data().groupId(),
            topicPartition,
            partitionData.committedOffset())
          (topicPartition, Errors.NONE)
        }
      } catch {
        case e: Throwable => (topicPartition, Errors.forException(e))
      }
  }
  sendResponseCallback(responseInfo)
}
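setOrCreateConsumerOffset persists the value under the classic consumer path in ZooKeeper. The path layout can be reproduced with a one-line formatter (a sketch; the shape /consumers/&lt;group&gt;/offsets/&lt;topic&gt;/&lt;partition&gt; matches Kafka's old ZK-based offset storage):

```java
public class ZkOffsetPath {
    // ZooKeeper node under which a V0 commit stores the offset for one partition.
    static String offsetPath(String group, String topic, int partition) {
        return String.format("/consumers/%s/offsets/%s/%d", group, topic, partition);
    }

    public static void main(String[] args) {
        System.out.println(offsetPath("my-group", "events", 0));
        // prints /consumers/my-group/offsets/events/0
    }
}
```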
Requests of all other versions are delegated to the group coordinator:
groupCoordinator.handleCommitOffsets(
  offsetCommitRequest.data.groupId,
  offsetCommitRequest.data.memberId,
  Option(offsetCommitRequest.data.groupInstanceId),
  offsetCommitRequest.data.generationId,
  partitionData,
  sendResponseCallback)
3. Server-Side Offset Storage

Inside GroupMetadataManager.storeOffsets, the coordinator first stages the commit in the group's in-memory metadata and then appends the offset records to the log:

if (isTxnOffsetCommit) {
  group.inLock {
    group.prepareTxnOffsetCommit(producerId, offsetMetadata)
  }
} else {
  group.inLock {
    group.prepareOffsetCommit(offsetMetadata)
  }
}
appendForGroup(group, entries, putCacheCallback)
Step 1 stores the data in memory; step 2 appends it to the log on disk for durable storage.
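The two steps can be sketched as an in-memory cache plus an append-only log (plain Java; GroupStore, its log list, and the callback are hypothetical stand-ins for GroupMetadataManager's internals):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Sketch of the coordinator's two-step commit: stage in memory, then append
// to a durable log and complete via callback.
public class GroupStoreDemo {
    static class GroupStore {
        final Map<String, Long> pending = new HashMap<>(); // step 1: in-memory staging
        final List<String> log = new ArrayList<>();        // step 2: stand-in for the offsets log

        void commit(String partition, long offset, Consumer<String> callback) {
            synchronized (this) {                // analogous to group.inLock
                pending.put(partition, offset);  // group.prepareOffsetCommit(...)
            }
            log.add(partition + ":" + offset);   // appendForGroup(...): durable append
            callback.accept("NONE");             // putCacheCallback with no error
        }
    }

    public static void main(String[] args) {
        GroupStore store = new GroupStore();
        store.commit("events-0", 42L, error -> System.out.println("commit result: " + error));
    }
}
```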
4. OFFSET_FETCH RPC Source Code Analysis
4.1 Fields Carried by the RPC
private final String groupId;
private final List<TopicPartition> partitions;
The fields carried by the OFFSET_FETCH RPC are shown above.
groupId is the unique identifier of the consuming group.
partitions lists the partitions whose last consumed positions this request wants to fetch.
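On the consumer side, the fetched offset decides where to resume. A small sketch of that decision (hypothetical names; -1 mirrors OffsetFetchResponse.INVALID_OFFSET, and the fallback corresponds to the auto.offset.reset behavior):

```java
public class ResumeDemo {
    static final long INVALID_OFFSET = -1L;

    // Decide where to start fetching, mirroring what a consumer does with the
    // OFFSET_FETCH response: use the committed offset if one exists, otherwise
    // fall back to a reset position (earliest here).
    static long startingOffset(long fetchedOffset, long logStartOffset) {
        return fetchedOffset == INVALID_OFFSET ? logStartOffset : fetchedOffset;
    }

    public static void main(String[] args) {
        System.out.println(startingOffset(42L, 0L));            // committed offset wins: 42
        System.out.println(startingOffset(INVALID_OFFSET, 0L)); // no commit yet: reset to 0
    }
}
```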
4.2 Server-Side Handling
4.2.1 V0 Requests
if (header.apiVersion == 0) {
  val (authorizedPartitions, unauthorizedPartitions) = offsetFetchRequest.partitions.asScala
    .partition(authorizeTopicDescribe)
  // version 0 reads offsets from ZK
  val authorizedPartitionData = authorizedPartitions.map { topicPartition =>
    try {
      if (!metadataCache.contains(topicPartition))
        (topicPartition, OffsetFetchResponse.UNKNOWN_PARTITION)
      else {
        val payloadOpt = zkClient.getConsumerOffset(offsetFetchRequest.groupId, topicPartition)
        payloadOpt match {
          case Some(payload) =>
            (topicPartition, new OffsetFetchResponse.PartitionData(payload.toLong,
              Optional.empty(), OffsetFetchResponse.NO_METADATA, Errors.NONE))
          case None =>
            (topicPartition, OffsetFetchResponse.UNKNOWN_PARTITION)
        }
The server reads the stored offset from the ZooKeeper node at ${ConsumerPathZNode.path}/${group}/offsets/${topic}/${partition} and returns it to the client.

4.2.2 Requests Above V0

For later versions, the offset is looked up in the group coordinator's in-memory metadata via GroupMetadataManager.getOffsets:
def getOffsets(groupId: String, topicPartitionsOpt: Option[Seq[TopicPartition]]): Map[TopicPartition, OffsetFetchResponse.PartitionData] = {
  trace("Getting offsets of %s for group %s.".format(topicPartitionsOpt.getOrElse("all partitions"), groupId))
  val group = groupMetadataCache.get(groupId)
  if (group == null) {
    topicPartitionsOpt.getOrElse(Seq.empty[TopicPartition]).map { topicPartition =>
      val partitionData = new OffsetFetchResponse.PartitionData(OffsetFetchResponse.INVALID_OFFSET,
        Optional.empty(), "", Errors.NONE)
      topicPartition -> partitionData
    }.toMap
  } else {
    group.inLock {
      if (group.is(Dead)) {
        topicPartitionsOpt.getOrElse(Seq.empty[TopicPartition]).map { topicPartition =>
          val partitionData = new OffsetFetchResponse.PartitionData(OffsetFetchResponse.INVALID_OFFSET,
            Optional.empty(), "", Errors.NONE)
          topicPartition -> partitionData
        }.toMap
      } else {
        topicPartitionsOpt match {
          case None =>
            // Return offsets for all partitions owned by this consumer group. (this only applies to consumers
            // that commit offsets to Kafka.)
            group.allOffsets.map { case (topicPartition, offsetAndMetadata) =>
              topicPartition -> new OffsetFetchResponse.PartitionData(offsetAndMetadata.offset,
                offsetAndMetadata.leaderEpoch, offsetAndMetadata.metadata, Errors.NONE)
            }
          case Some(topicPartitions) =>
            topicPartitions.map { topicPartition =>
              val partitionData = group.offset(topicPartition) match {
                case None =>
                  new OffsetFetchResponse.PartitionData(OffsetFetchResponse.INVALID_OFFSET,
                    Optional.empty(), "", Errors.NONE)
                case Some(offsetAndMetadata) =>
                  new OffsetFetchResponse.PartitionData(offsetAndMetadata.offset,
                    offsetAndMetadata.leaderEpoch, offsetAndMetadata.metadata, Errors.NONE)
              }
              topicPartition -> partitionData
            }.toMap
        }
      }
    }
  }
}
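The branching above reduces to three cases: unknown group, dead group, and live group with a per-partition lookup. A compact sketch of that logic (hypothetical GroupState and lookup names, with -1 standing in for INVALID_OFFSET):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of GroupMetadataManager.getOffsets' three branches:
// unknown group -> invalid offset, dead group -> invalid offset,
// live group -> per-partition lookup in the group's offset map.
public class GetOffsetsDemo {
    static final long INVALID_OFFSET = -1L;

    enum GroupState { STABLE, DEAD }

    static class Group {
        GroupState state = GroupState.STABLE;
        Map<String, Long> offsets = new HashMap<>();
    }

    static long getOffset(Map<String, Group> groupCache, String groupId, String partition) {
        Group group = groupCache.get(groupId);
        if (group == null) return INVALID_OFFSET;                     // group not found
        if (group.state == GroupState.DEAD) return INVALID_OFFSET;    // group is dead
        return group.offsets.getOrDefault(partition, INVALID_OFFSET); // normal lookup
    }

    public static void main(String[] args) {
        Map<String, Group> cache = new HashMap<>();
        Group g = new Group();
        g.offsets.put("events-0", 42L);
        cache.put("my-group", g);
        System.out.println(getOffset(cache, "my-group", "events-0"));    // prints 42
        System.out.println(getOffset(cache, "other-group", "events-0")); // prints -1
    }
}
```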