Flink Source Code Analysis: How startupMode Takes Effect

I had long wondered: if consumer.setStartFromLatest() and kafkaProperties.put("auto.offset.reset", "earliest") are both present, which one actually takes effect? The answer is definitely consumer.setStartFromLatest(). Why? Let's walk through the source code and find out.
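To make the question concrete, here is a minimal sketch of the competing configuration (the topic name, broker address and group id are made up for illustration): the Kafka client property asks for "earliest", while the Flink API asks for the latest offsets.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class StartupModeDemo {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            Properties kafkaProperties = new Properties();
            kafkaProperties.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
            kafkaProperties.put("group.id", "demo-group");              // hypothetical group id
            // Kafka client property: only consulted when the consumer has no explicit position
            kafkaProperties.put("auto.offset.reset", "earliest");

            FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>("demo-topic", new SimpleStringSchema(), kafkaProperties);
            // Flink startup mode: this is what actually decides where reading starts
            consumer.setStartFromLatest();

            env.addSource(consumer).print();
            env.execute("startup-mode-demo");
        }
    }

With a job like this, the first place the startup mode matters is FlinkKafkaConsumerBase#open():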

 

@Override
public void open(Configuration configuration) throws Exception {
    // determine the offset commit mode; distinguishes ON_CHECKPOINTS, DISABLED and KAFKA_PERIODIC.
    // This article focuses on ON_CHECKPOINTS.
    this.offsetCommitMode = OffsetCommitModes.fromConfiguration(
            getIsAutoCommitEnabled(),
            enableCommitOnCheckpoints,
            ((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());

    // create the kafka partition discoverer
    this.partitionDiscoverer = createPartitionDiscoverer(
            topicsDescriptor,
            getRuntimeContext().getIndexOfThisSubtask(),
            getRuntimeContext().getNumberOfParallelSubtasks());
    this.partitionDiscoverer.open();

    subscribedPartitionsToStartOffsets = new HashMap<>();

    // get the partitions of the fixed topics or the topic pattern that belong to this subtask
    final List<KafkaTopicPartition> allPartitions = partitionDiscoverer.discoverPartitions();

    // restoring from a checkpoint
    if (restoredState != null) {
        for (KafkaTopicPartition partition : allPartitions) {
            // new partitions (not present in the checkpoint) start from the earliest offset;
            // old partitions have already been restored from the checkpoint
            if (!restoredState.containsKey(partition)) {
                restoredState.put(partition, KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET);
            }
        }

        for (Map.Entry<KafkaTopicPartition, Long> restoredStateEntry : restoredState.entrySet()) {
            if (!restoredFromOldState) {
                // seed the partition discoverer with the union state while filtering out
                // restored partitions that should not be subscribed by this subtask
                if (KafkaTopicPartitionAssigner.assign(
                    restoredStateEntry.getKey(), getRuntimeContext().getNumberOfParallelSubtasks())
                        == getRuntimeContext().getIndexOfThisSubtask()){
                    subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
                }
            } else {
                // when restoring from older 1.1 / 1.2 state, the restored state would not be the union state;
                // in this case, just use the restored state as the subscribed partitions
                subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
            }
        }

        if (filterRestoredPartitionsWithCurrentTopicsDescriptor) {
            subscribedPartitionsToStartOffsets.entrySet().removeIf(entry -> {
                if (!topicsDescriptor.isMatchingTopic(entry.getKey().getTopic())) {
                    LOG.warn(
                        "{} is removed from subscribed partitions since it is no longer associated with topics descriptor of current execution.",
                        entry.getKey());
                    return true;
                }
                return false;
            });
        }

        LOG.info("Consumer subtask {} will start reading {} partitions with offsets in restored state: {}",
            getRuntimeContext().getIndexOfThisSubtask(), subscribedPartitionsToStartOffsets.size(), subscribedPartitionsToStartOffsets);
    } else {
        // use the partition discoverer to fetch the initial seed partitions,
        // and set their initial offsets depending on the startup mode.
        // for SPECIFIC_OFFSETS and TIMESTAMP modes, we set the specific offsets now;
        // for other modes (EARLIEST, LATEST, and GROUP_OFFSETS), the offset is lazily determined
        // when the partition is actually read.
        switch (startupMode) {
            case SPECIFIC_OFFSETS:
                if (specificStartupOffsets == null) {
                    throw new IllegalStateException(
                        "Startup mode for the consumer set to " + StartupMode.SPECIFIC_OFFSETS +
                            ", but no specific offsets were specified.");
                }

                for (KafkaTopicPartition seedPartition : allPartitions) {
                    // partitions with a user-specified offset start from that offset;
                    // partitions without one fall back to the committed group offset
                    Long specificOffset = specificStartupOffsets.get(seedPartition);
                    if (specificOffset != null) {
                        // since the specified offsets represent the next record to read, we subtract
                        // it by one so that the initial state of the consumer will be correct
                        subscribedPartitionsToStartOffsets.put(seedPartition, specificOffset - 1);
                    } else {
                        // default to group offset behaviour if the user-provided specific offsets
                        // do not contain a value for this partition;
                        // the sentinel corresponding to the startup mode is stored in subscribedPartitionsToStartOffsets
                        subscribedPartitionsToStartOffsets.put(seedPartition, KafkaTopicPartitionStateSentinel.GROUP_OFFSET);
                    }
                }

                break;
            case TIMESTAMP:
                if (startupOffsetsTimestamp == null) {
                    throw new IllegalStateException(
                        "Startup mode for the consumer set to " + StartupMode.TIMESTAMP +
                            ", but no startup timestamp was specified.");
                }

                for (Map.Entry<KafkaTopicPartition, Long> partitionToOffset
                        : fetchOffsetsWithTimestamp(allPartitions, startupOffsetsTimestamp).entrySet()) {
                    subscribedPartitionsToStartOffsets.put(
                        partitionToOffset.getKey(),
                        (partitionToOffset.getValue() == null)
                            // if an offset cannot be retrieved for a partition with the given timestamp,
                            // we default to using the latest offset for the partition
                            ? KafkaTopicPartitionStateSentinel.LATEST_OFFSET
                            // since the specified offsets represent the next record to read, we subtract
                            // it by one so that the initial state of the consumer will be correct
                            : partitionToOffset.getValue() - 1);
                }

                break;
            default:
                // the default is GROUP_OFFSETS
                for (KafkaTopicPartition seedPartition : allPartitions) {
                    subscribedPartitionsToStartOffsets.put(seedPartition, startupMode.getStateSentinel());
                }
        }

        if (!subscribedPartitionsToStartOffsets.isEmpty()) {
            switch (startupMode) {
                case EARLIEST:
                    LOG.info("Consumer subtask {} will start reading the following {} partitions from the earliest offsets: {}",
                        getRuntimeContext().getIndexOfThisSubtask(),
                        subscribedPartitionsToStartOffsets.size(),
                        subscribedPartitionsToStartOffsets.keySet());
                    break;
                case LATEST:
                    LOG.info("Consumer subtask {} will start reading the following {} partitions from the latest offsets: {}",
                        getRuntimeContext().getIndexOfThisSubtask(),
                        subscribedPartitionsToStartOffsets.size(),
                        subscribedPartitionsToStartOffsets.keySet());
                    break;
                case TIMESTAMP:
                    LOG.info("Consumer subtask {} will start reading the following {} partitions from timestamp {}: {}",
                        getRuntimeContext().getIndexOfThisSubtask(),
                        subscribedPartitionsToStartOffsets.size(),
                        startupOffsetsTimestamp,
                        subscribedPartitionsToStartOffsets.keySet());
                    break;
                case SPECIFIC_OFFSETS:
                    LOG.info("Consumer subtask {} will start reading the following {} partitions from the specified startup offsets {}: {}",
                        getRuntimeContext().getIndexOfThisSubtask(),
                        subscribedPartitionsToStartOffsets.size(),
                        specificStartupOffsets,
                        subscribedPartitionsToStartOffsets.keySet());

                    List<KafkaTopicPartition> partitionsDefaultedToGroupOffsets = new ArrayList<>(subscribedPartitionsToStartOffsets.size());
                    for (Map.Entry<KafkaTopicPartition, Long> subscribedPartition : subscribedPartitionsToStartOffsets.entrySet()) {
                        if (subscribedPartition.getValue() == KafkaTopicPartitionStateSentinel.GROUP_OFFSET) {
                            partitionsDefaultedToGroupOffsets.add(subscribedPartition.getKey());
                        }
                    }

                    if (partitionsDefaultedToGroupOffsets.size() > 0) {
                        LOG.warn("Consumer subtask {} cannot find offsets for the following {} partitions in the specified startup offsets: {}" +
                                "; their startup offsets will be defaulted to their committed group offsets in Kafka.",
                            getRuntimeContext().getIndexOfThisSubtask(),
                            partitionsDefaultedToGroupOffsets.size(),
                            partitionsDefaultedToGroupOffsets);
                    }

                    break;
                case GROUP_OFFSETS:
                    LOG.info("Consumer subtask {} will start reading the following {} partitions from the committed group offsets in Kafka: {}",
                        getRuntimeContext().getIndexOfThisSubtask(),
                        subscribedPartitionsToStartOffsets.size(),
                        subscribedPartitionsToStartOffsets.keySet());
            }
        } else {
            LOG.info("Consumer subtask {} initially has no partitions to read from.",
                getRuntimeContext().getIndexOfThisSubtask());
        }
    }
}
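As an aside, the first thing open() does is resolve the offset commit mode via OffsetCommitModes.fromConfiguration(). The decision it makes is roughly the following (a paraphrase for readability, not the verbatim Flink source):

    // Paraphrase of OffsetCommitModes.fromConfiguration(...) -- not copied verbatim from Flink.
    public static OffsetCommitMode fromConfiguration(
            boolean enableAutoCommit,
            boolean enableCommitOnCheckpoint,
            boolean enableCheckpointing) {

        if (enableCheckpointing) {
            // with checkpointing enabled, offsets are either committed on checkpoints or not at all
            return enableCommitOnCheckpoint ? OffsetCommitMode.ON_CHECKPOINTS : OffsetCommitMode.DISABLED;
        } else {
            // without checkpointing, fall back to Kafka's own periodic auto-commit if it is enabled
            return enableAutoCommit ? OffsetCommitMode.KAFKA_PERIODIC : OffsetCommitMode.DISABLED;
        }
    }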

The open() method mainly stores the user-specified topics, their partitions and their start offsets (or sentinel values) into Map<KafkaTopicPartition, Long> subscribedPartitionsToStartOffsets; the sketch right below shows what that map might hold. After that, let's look at the entry method through which Flink actually consumes from Kafka, run().
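For example, with setStartFromLatest() and no restored state, open() leaves one sentinel entry per partition owned by this subtask; only SPECIFIC_OFFSETS and TIMESTAMP store concrete offsets (minus one) at this point. A hypothetical two-partition topic would end up roughly like this (illustration only, using the real sentinel constants from the code above):

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition;
    import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartitionStateSentinel;

    // what open() leaves behind for startupMode = LATEST on a hypothetical topic "demo-topic"
    Map<KafkaTopicPartition, Long> subscribedPartitionsToStartOffsets = new HashMap<>();
    subscribedPartitionsToStartOffsets.put(
        new KafkaTopicPartition("demo-topic", 0), KafkaTopicPartitionStateSentinel.LATEST_OFFSET);
    subscribedPartitionsToStartOffsets.put(
        new KafkaTopicPartition("demo-topic", 1), KafkaTopicPartitionStateSentinel.LATEST_OFFSET);
    // the sentinel is only resolved to a real offset later, in KafkaConsumerThread#reassignPartitions()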



 

@Override
// entry point: this is where the source starts running
public void run(SourceContext<T> sourceContext) throws Exception {
    if (subscribedPartitionsToStartOffsets == null) {
        throw new Exception("The partitions were not set for the consumer");
    }

    // initialize commit metrics and default offset callback method
    this.successfulCommits = this.getRuntimeContext().getMetricGroup().counter(COMMITS_SUCCEEDED_METRICS_COUNTER);
    this.failedCommits = this.getRuntimeContext().getMetricGroup().counter(COMMITS_FAILED_METRICS_COUNTER);

    this.offsetCommitCallback = new KafkaCommitCallback() {
        @Override
        public void onSuccess() {
            successfulCommits.inc();
        }

        @Override
        public void onException(Throwable cause) {
            LOG.warn("Async Kafka commit failed.", cause);
            failedCommits.inc();
        }
    };

    // mark the subtask as temporarily idle if there are no initial seed partitions;
    // once this subtask discovers some partitions and starts collecting records, the subtask's
    // status will automatically be triggered back to be active.
    if (subscribedPartitionsToStartOffsets.isEmpty()) {
        sourceContext.markAsTemporarilyIdle();
    }

    // from this point forward:
    //   - 'snapshotState' will draw offsets from the fetcher,
    //     instead of being built from `subscribedPartitionsToStartOffsets`
    //   - 'notifyCheckpointComplete' will start to do work (i.e. commit offsets to
    //     Kafka through the fetcher, if configured to do so)
    this.kafkaFetcher = createFetcher(
            sourceContext,
            subscribedPartitionsToStartOffsets,
            periodicWatermarkAssigner,
            punctuatedWatermarkAssigner,
            (StreamingRuntimeContext) getRuntimeContext(),
            offsetCommitMode,
            getRuntimeContext().getMetricGroup().addGroup(KAFKA_CONSUMER_METRICS_GROUP),
            useMetrics);

    if (!running) {
        return;
    }

    // depending on whether we were restored with the current state version (1.3),
    // remaining logic branches off into 2 paths:
    // 1) New state - partition discovery loop executed as separate thread, with this
    //    thread running the main fetcher loop
    // 2) Old state - partition discovery is disabled and only the main fetcher loop is executed
    if (discoveryIntervalMillis == PARTITION_DISCOVERY_DISABLED) {
        kafkaFetcher.runFetchLoop();
    } else {
        runWithPartitionDiscovery();
    }
}
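The final branch depends on discoveryIntervalMillis. By default, partition discovery is disabled; it is switched on by setting the property whose key is defined as FlinkKafkaConsumerBase.KEY_PARTITION_DISCOVERY_INTERVAL_MILLIS. A configuration sketch (broker address and group id are made up):

    import java.util.Properties;

    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase;

    Properties kafkaProperties = new Properties();
    kafkaProperties.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker address
    kafkaProperties.setProperty("group.id", "demo-group");              // hypothetical group id
    // check for newly created partitions every 30 seconds; without this property,
    // discoveryIntervalMillis stays PARTITION_DISCOVERY_DISABLED and only runFetchLoop() runs
    kafkaProperties.setProperty(
        FlinkKafkaConsumerBase.KEY_PARTITION_DISCOVERY_INTERVAL_MILLIS, "30000");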

createFetcher() is called with the subscribedPartitionsToStartOffsets map we just built. Following it down, the map is passed as a constructor argument when the KafkaFetcher object is created, and finally reaches the AbstractFetcher constructor:

 

protected AbstractFetcher(
        SourceContext<T> sourceContext,
        Map<KafkaTopicPartition, Long> seedPartitionsWithInitialOffsets,
        SerializedValue<AssignerWithPeriodicWatermarks<T>> watermarksPeriodic,
        SerializedValue<AssignerWithPunctuatedWatermarks<T>> watermarksPunctuated,
        ProcessingTimeService processingTimeProvider,
        long autoWatermarkInterval,
        ClassLoader userCodeClassLoader,
        MetricGroup consumerMetricGroup,
        boolean useMetrics) throws Exception {
    this.sourceContext = checkNotNull(sourceContext);
    this.checkpointLock = sourceContext.getCheckpointLock();
    this.userCodeClassLoader = checkNotNull(userCodeClassLoader);

    this.useMetrics = useMetrics;
    this.consumerMetricGroup = checkNotNull(consumerMetricGroup);
    this.legacyCurrentOffsetsMetricGroup = consumerMetricGroup.addGroup(LEGACY_CURRENT_OFFSETS_METRICS_GROUP);
    this.legacyCommittedOffsetsMetricGroup = consumerMetricGroup.addGroup(LEGACY_COMMITTED_OFFSETS_METRICS_GROUP);

    // figure out what watermark mode we will be using
    this.watermarksPeriodic = watermarksPeriodic;
    this.watermarksPunctuated = watermarksPunctuated;

    if (watermarksPeriodic == null) {
        if (watermarksPunctuated == null) {
            // simple case, no watermarks involved
            timestampWatermarkMode = NO_TIMESTAMPS_WATERMARKS;
        } else {
            timestampWatermarkMode = PUNCTUATED_WATERMARKS;
        }
    } else {
        if (watermarksPunctuated == null) {
            timestampWatermarkMode = PERIODIC_WATERMARKS;
        } else {
            throw new IllegalArgumentException("Cannot have both periodic and punctuated watermarks");
        }
    }

    this.unassignedPartitionsQueue = new ClosableBlockingQueue<>();

    // initialize subscribed partition states with seed partitions, depending on whether
    // a timestamp / watermark assigner is present.
    // subscribedPartitionStates holds a List<KafkaTopicPartitionState<KPH>>;
    // each KafkaTopicPartitionState contains the KafkaTopicPartition, its offset, etc.
    this.subscribedPartitionStates = createPartitionStateHolders(
            seedPartitionsWithInitialOffsets,
            timestampWatermarkMode,
            watermarksPeriodic,
            watermarksPunctuated,
            userCodeClassLoader);

    // check that all seed partition states have a defined offset:
    // whether restored from a checkpoint or set via consumer.setStartFrom...(),
    // every partition must have an initial offset (possibly a sentinel) at this point
    for (KafkaTopicPartitionState partitionState : subscribedPartitionStates) {
        if (!partitionState.isOffsetDefined()) {
            throw new IllegalArgumentException("The fetcher was assigned seed partitions with undefined initial offsets.");
        }
    }

    // all seed partitions are not assigned yet, so should be added to the unassigned partitions queue;
    // at this point the KafkaConsumer has not been assigned any partitions
    for (KafkaTopicPartitionState<KPH> partition : subscribedPartitionStates) {
        unassignedPartitionsQueue.add(partition);
    }

    // register metrics for the initial seed partitions
    if (useMetrics) {
        registerOffsetMetrics(consumerMetricGroup, subscribedPartitionStates);
    }

    // if we have periodic watermarks, kick off the interval scheduler
    if (timestampWatermarkMode == PERIODIC_WATERMARKS) {
        @SuppressWarnings("unchecked")
        PeriodicWatermarkEmitter periodicEmitter = new PeriodicWatermarkEmitter(
                subscribedPartitionStates,
                sourceContext,
                processingTimeProvider,
                autoWatermarkInterval);

        periodicEmitter.start();
    }
}
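For completeness: the watermarksPeriodic / watermarksPunctuated arguments are only non-null if the user attached an assigner to the consumer. A hedged sketch of how watermarksPeriodic ends up non-null, so that timestampWatermarkMode becomes PERIODIC_WATERMARKS (the event format "1590000000000,payload" is made up for illustration):

    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.time.Time;

    // attach a periodic watermark assigner to the FlinkKafkaConsumer built earlier
    consumer.assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(5)) {
                @Override
                public long extractTimestamp(String element) {
                    // hypothetical format: epoch millis before the comma
                    return Long.parseLong(element.split(",")[0]);
                }
            });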

Then, from the constructor of KafkaFetcher (a subclass of AbstractFetcher), we can see that unassignedPartitionsQueue is in turn handed to the KafkaConsumerThread:



 

public KafkaFetcher(
        SourceFunction.SourceContext<T> sourceContext,
        Map<KafkaTopicPartition, Long> assignedPartitionsWithInitialOffsets,
        SerializedValue<AssignerWithPeriodicWatermarks<T>> watermarksPeriodic,
        SerializedValue<AssignerWithPunctuatedWatermarks<T>> watermarksPunctuated,
        ProcessingTimeService processingTimeProvider,
        long autoWatermarkInterval,
        ClassLoader userCodeClassLoader,
        String taskNameWithSubtasks,
        KafkaDeserializationSchema<T> deserializer,
        Properties kafkaProperties,
        long pollTimeout,
        MetricGroup subtaskMetricGroup,
        MetricGroup consumerMetricGroup,
        boolean useMetrics) throws Exception {
    super(
            sourceContext,
            assignedPartitionsWithInitialOffsets,
            watermarksPeriodic,
            watermarksPunctuated,
            processingTimeProvider,
            autoWatermarkInterval,
            userCodeClassLoader,
            consumerMetricGroup,
            useMetrics);

    this.deserializer = deserializer;
    this.handover = new Handover();

    this.consumerThread = new KafkaConsumerThread(
            LOG,
            handover,
            kafkaProperties,
            unassignedPartitionsQueue,
            getFetcherName() + " for " + taskNameWithSubtasks,
            pollTimeout,
            useMetrics,
            consumerMetricGroup,
            subtaskMetricGroup);
}

When the KafkaConsumerThread is started, its run() method executes the following loop (excerpt):

 

// ... (earlier part of KafkaConsumerThread#run omitted)

try {
    // hasAssignedPartitions defaults to false;
    // when new partitions are discovered, they are added to unassignedPartitionsQueue
    // (and to subscribedPartitionsToStartOffsets)
    if (hasAssignedPartitions) {
        newPartitions = unassignedPartitionsQueue.pollBatch();
    }
    else {
        // if no assigned partitions block until we get at least one
        // instead of hot spinning this loop. We rely on a fact that
        // unassignedPartitionsQueue will be closed on a shutdown, so
        // we don't block indefinitely
        newPartitions = unassignedPartitionsQueue.getBatchBlocking();
    }

    // since unassignedPartitionsQueue already contains the seed partitions,
    // newPartitions != null holds and reassignPartitions() is executed
    if (newPartitions != null) {
        reassignPartitions(newPartitions);
    }
} catch (AbortedReassignmentException e) {
    continue;
}

// ...



 

reassignPartitions() is where the startup offsets finally reach the Kafka client:

void reassignPartitions(List<KafkaTopicPartitionState<TopicPartition>> newPartitions) throws Exception {
    if (newPartitions.size() == 0) {
        return;
    }
    hasAssignedPartitions = true;
    boolean reassignmentStarted = false;

    // since the reassignment may introduce several Kafka blocking calls that cannot be interrupted,
    // the consumer needs to be isolated from external wakeup calls in setOffsetsToCommit() and shutdown()
    // until the reassignment is complete.
    final KafkaConsumer<byte[], byte[]> consumerTmp;
    synchronized (consumerReassignmentLock) {
        // hand the consumer reference over to consumerTmp
        consumerTmp = this.consumer;
        this.consumer = null;
    }

    final Map<TopicPartition, Long> oldPartitionAssignmentsToPosition = new HashMap<>();
    try {
        // there can be both new and old partitions because, when
        // KEY_PARTITION_DISCOVERY_INTERVAL_MILLIS is configured, the discoverer periodically
        // checks for newly added partitions and puts them into unassignedPartitionsQueue
        for (TopicPartition oldPartition : consumerTmp.assignment()) {
            oldPartitionAssignmentsToPosition.put(oldPartition, consumerTmp.position(oldPartition));
        }

        final List<TopicPartition> newPartitionAssignments =
            new ArrayList<>(newPartitions.size() + oldPartitionAssignmentsToPosition.size());
        newPartitionAssignments.addAll(oldPartitionAssignmentsToPosition.keySet());
        newPartitionAssignments.addAll(convertKafkaPartitions(newPartitions));

        // reassign with the new partitions
        consumerTmp.assign(newPartitionAssignments);
        reassignmentStarted = true;

        // old partitions should be seeked to their previous position
        for (Map.Entry<TopicPartition, Long> oldPartitionToPosition : oldPartitionAssignmentsToPosition.entrySet()) {
            consumerTmp.seek(oldPartitionToPosition.getKey(), oldPartitionToPosition.getValue());
        }

        // offsets in the state of new partitions may still be placeholder sentinel values if we are:
        //   (1) starting fresh,
        //   (2) checkpoint / savepoint state we were restored with had not completely
        //       been replaced with actual offset values yet, or
        //   (3) the partition was newly discovered after startup;
        // replace those with actual offsets, according to what the sentinel value represent.
        // this is why the Kafka offset-related properties do not take effect: the starting
        // position is decided here by the sentinel, which in turn comes from startupMode.
        // depending on getOffset(), the consumer is seeked to the starting offset; the kind
        // of offset (sentinel or concrete) was determined by startupMode in open().
        for (KafkaTopicPartitionState<TopicPartition> newPartitionState : newPartitions) {
            if (newPartitionState.getOffset() == KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET) {
                consumerTmp.seekToBeginning(Collections.singletonList(newPartitionState.getKafkaPartitionHandle()));
                newPartitionState.setOffset(consumerTmp.position(newPartitionState.getKafkaPartitionHandle()) - 1);
            } else if (newPartitionState.getOffset() == KafkaTopicPartitionStateSentinel.LATEST_OFFSET) {
                consumerTmp.seekToEnd(Collections.singletonList(newPartitionState.getKafkaPartitionHandle()));
                newPartitionState.setOffset(consumerTmp.position(newPartitionState.getKafkaPartitionHandle()) - 1);
            } else if (newPartitionState.getOffset() == KafkaTopicPartitionStateSentinel.GROUP_OFFSET) {
                // the KafkaConsumer by default will automatically seek the consumer position
                // to the committed group offset, so we do not need to do it.
                newPartitionState.setOffset(consumerTmp.position(newPartitionState.getKafkaPartitionHandle()) - 1);
            } else {
                consumerTmp.seek(newPartitionState.getKafkaPartitionHandle(), newPartitionState.getOffset() + 1);
            }
        }
    } catch (WakeupException e) {
        // a WakeupException may be thrown if the consumer was invoked wakeup()
        // before it was isolated for the reassignment. In this case, we abort the
        // reassignment and just re-expose the original consumer.
        synchronized (consumerReassignmentLock) {
            this.consumer = consumerTmp;

            // if reassignment had already started and affected the consumer,
            // we do a full roll back so that it is as if it was left untouched
            if (reassignmentStarted) {
                this.consumer.assign(new ArrayList<>(oldPartitionAssignmentsToPosition.keySet()));
                for (Map.Entry<TopicPartition, Long> oldPartitionToPosition : oldPartitionAssignmentsToPosition.entrySet()) {
                    this.consumer.seek(oldPartitionToPosition.getKey(), oldPartitionToPosition.getValue());
                }
            }

            // no need to restore the wakeup state in this case,
            // since only the last wakeup call is effective anyways
            hasBufferedWakeup = false;

            // re-add all new partitions back to the unassigned partitions queue to be picked up again
            for (KafkaTopicPartitionState<TopicPartition> newPartition : newPartitions) {
                unassignedPartitionsQueue.add(newPartition);
            }

            // this signals the main fetch loop to continue through the loop
            throw new AbortedReassignmentException();
        }
    }

    // reassignment complete; expose the reassigned consumer
    synchronized (consumerReassignmentLock) {
        this.consumer = consumerTmp;

        // restore wakeup state for the consumer if necessary
        if (hasBufferedWakeup) {
            this.consumer.wakeup();
            hasBufferedWakeup = false;
        }
    }
}
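This last loop answers the opening question. Flink assigns partitions explicitly via assign() and then seeks each one to a position derived from the startupMode sentinel, so the consumer always has an explicit position and auto.offset.reset is never consulted; the only case where the Kafka property can still matter is GROUP_OFFSETS mode when no committed offset exists for the group. Stripped of the Flink wrapping, the LATEST case boils down to something like this sketch with the plain Kafka client (topic, broker and group id are made up):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");   // hypothetical broker address
    props.put("group.id", "demo-group");                // hypothetical group id
    props.put("auto.offset.reset", "earliest");         // ignored below, because we seek explicitly
    props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

    KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
    TopicPartition tp = new TopicPartition("demo-topic", 0);

    // assign() + seekToEnd() gives the consumer an explicit position,
    // so the auto.offset.reset fallback never kicks in
    consumer.assign(Collections.singletonList(tp));
    consumer.seekToEnd(Collections.singletonList(tp));
    long startOffset = consumer.position(tp);           // Flink stores this value minus one in the partition state

    ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(100));

So consumer.setStartFromLatest() (and the other setStartFrom...() variants) wins because the startup mode is carried all the way from open() into reassignPartitions(), where it is translated into an explicit seek before any record is fetched.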
