Flink Source Code Analysis: How Flink's startupMode Takes Effect

This article analyzes how the Kafka consumer in Flink chooses its initial read position from startupMode. When consumer.setStartFromLatest() and kafkaProperties.put("auto.offset.reset", "earliest") are both set, consumer.setStartFromLatest() takes effect. In the open method, the startupMode (such as SPECIFIC_OFFSETS, TIMESTAMP, EARLIEST, or LATEST) decides the offset from which each Kafka partition is read. The code shows the handling for each mode, and how offsets and state are handled for new and existing partitions when partitions are reassigned.

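I had long wondered: if consumer.setStartFromLatest() and kafkaProperties.put("auto.offset.reset", "earliest") are both set, which one actually takes effect? For a job that is not restored from a checkpoint, the answer is consumer.setStartFromLatest(). Why? Let's walk through the source of open().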
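
As a rough mental model of that precedence, the decision can be mimicked in a few lines of plain Java. This is a simplified sketch, not Flink's code: the class `StartPositionSketch`, the method `resolve`, and all of its parameters are hypothetical names, and it assumes a fresh start without restored state. It illustrates that an explicit EARLIEST or LATEST startup mode is resolved by seeking to the beginning or end of the partition, so Kafka's auto.offset.reset only matters in GROUP_OFFSETS mode when the group has no committed offset.

```java
import java.util.OptionalLong;

/** Hypothetical, simplified model; not part of Flink or the Kafka client. */
final class StartPositionSketch {

    enum StartupMode { EARLIEST, LATEST, GROUP_OFFSETS }

    /**
     * Returns the next offset to read for a single partition.
     *
     * @param mode            startup mode configured on the consumer
     * @param autoOffsetReset value of "auto.offset.reset" ("none" is not modeled here)
     * @param committed       committed group offset, if one exists
     * @param beginning       first available offset in the partition
     * @param end             offset just past the last record
     */
    static long resolve(StartupMode mode, String autoOffsetReset,
                        OptionalLong committed, long beginning, long end) {
        switch (mode) {
            case EARLIEST:
                // explicit seek to the beginning; auto.offset.reset is never consulted
                return beginning;
            case LATEST:
                // explicit seek to the end; auto.offset.reset is never consulted
                return end;
            default:
                // GROUP_OFFSETS: use the committed offset when there is one,
                // otherwise fall back to auto.offset.reset
                if (committed.isPresent()) {
                    return committed.getAsLong();
                }
                return "earliest".equals(autoOffsetReset) ? beginning : end;
        }
    }
}
```

With the combination from the question, `resolve(StartupMode.LATEST, "earliest", OptionalLong.empty(), 0L, 100L)` yields the end offset, because the reset policy is only reached in the GROUP_OFFSETS branch.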

 

@Override
public void open(Configuration configuration) throws Exception {
    // determine the offset commit mode: ON_CHECKPOINTS, DISABLED, or KAFKA_PERIODIC;
    // this article focuses on ON_CHECKPOINTS
    this.offsetCommitMode = OffsetCommitModes.fromConfiguration(
            getIsAutoCommitEnabled(),
            enableCommitOnCheckpoints,
            ((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());

    // create the kafka partition discoverer
    this.partitionDiscoverer = createPartitionDiscoverer(
            topicsDescriptor,
            getRuntimeContext().getIndexOfThisSubtask(),
            getRuntimeContext().getNumberOfParallelSubtasks());
    this.partitionDiscoverer.open();

    subscribedPartitionsToStartOffsets = new HashMap<>();

    // discover this subtask's partitions, for either a fixed topic list or a topic pattern
    final List<KafkaTopicPartition> allPartitions = partitionDiscoverer.discoverPartitions();

    // restoring from a checkpoint
    if (restoredState != null) {
        for (KafkaTopicPartition partition : allPartitions) {
            // a new partition (one that is not in the checkpoint) starts from the earliest offset;
            // existing partitions already carry their restored offsets in restoredState and are
            // copied into subscribedPartitionsToStartOffsets in the loop below
            if (!restoredState.containsKey(partition)) {
                restoredState.put(partition, KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET);
            }
        }

        for (Map.Entry<KafkaTopicPartition, Long> restoredStateEntry : restoredState.entrySet()) {
            if (!restoredFromOldState) {
                // seed the partition discoverer with the union state while filtering out
                // restored partitions that should not be subscribed by this subtask
                if (KafkaTopicPartitionAssigner.assign(
                        restoredStateEntry.getKey(), getRuntimeContext().getNumberOfParallelSubtasks())
                        == getRuntimeContext().getIndexOfThisSubtask()) {
                    subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
                }
            } else {
                // when restoring from older 1.1 / 1.2 state, the restored state would not be the union state;
                // in this case, just use the restored state as the subscribed partitions
                subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
            }
        }

        if (filterRestoredPartitionsWithCurrentTopicsDescriptor) {
            subscribedPartitionsToStartOffsets.entrySet().removeIf(entry -> {
                if (!topicsDescriptor.isMatchingTopic(entry.getKey().getTopic())) {
                    LOG.warn(
                            "{} is removed from subscribed partitions since it is no longer associated with topics descriptor of current execution.",
                            entry.getKey());
                    return true;
                }
                return false;
            });
        }

        LOG.info("Consumer subtask {} will start reading {} partitions with offsets in restored state: {}",
                getRuntimeContext().getIndexOfThisSubtask(), subscribedPartitionsToStartOffsets.size(), subscribedPartitionsToStartOffsets);
    } else {
        // use the partition discoverer to fetch the initial seed partitions,
        // and set their initial offsets depending on the startup mode.
        // for SPECIFIC_OFFSETS and TIMESTAMP modes, we set the specific offsets now;
        // for other modes (EARLIEST, LATEST, and GROUP_OFFSETS), the offset is lazily determined
        // when the partition is actually read.
        switch (startupMode) {
            case SPECIFIC_OFFSETS:
                if (specificStartupOffsets == null) {
                    throw new IllegalStateException(
                            "Startup mode for the consumer set to " + StartupMode.SPECIFIC_OFFSETS +
                                    ", but no specific offsets were specified.");
                }

                for (KafkaTopicPartition seedPartition : allPartitions) {
                    // partitions with a user-specified offset start from that offset;
                    // partitions without one fall back to the group offsets
                    Long specificOffset = specificStartupOffsets.get(seedPartition);
                    if (specificOffset != null) {
                        // since the specified offsets represent the next record to read, we subtract
                        // it by one so that the initial state of the consumer will be correct
                        subscribedPartitionsToStartOffsets.put(seedPartition, specificOffset - 1);
                    } else {
                        // default to group offset behaviour if the user-provided specific offsets
                        // do not contain a value for this partition
                        // the marker for that startup mode is also stored in subscribedPartitionsToStartOffsets
                        subscribedPartitionsToStartOffsets.put(seedPartition
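
The SPECIFIC_OFFSETS branch can be tried out in isolation with a small stand-alone sketch. Everything in it is an assumption made for illustration: partitions are plain strings, `SpecificOffsetsSketch` and `seed` are hypothetical names, and `GROUP_OFFSET_SENTINEL` is an arbitrary marker standing in for whatever value the connector stores for "use the group offset". It shows the two rules visible in the excerpt: a user-specified offset is stored minus one, and a partition without one is marked for group-offset behaviour.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone model of the SPECIFIC_OFFSETS seeding step. */
final class SpecificOffsetsSketch {

    /** Arbitrary marker for "resolve from the group offset later"; not Flink's actual value. */
    static final long GROUP_OFFSET_SENTINEL = Long.MIN_VALUE;

    /**
     * Builds the start-offset map for the given partitions.
     * A specified offset names the next record to read, so the stored state
     * ("last record already read") is that offset minus one.
     */
    static Map<String, Long> seed(List<String> partitions, Map<String, Long> specificOffsets) {
        Map<String, Long> startOffsets = new HashMap<>();
        for (String partition : partitions) {
            Long specific = specificOffsets.get(partition);
            if (specific != null) {
                startOffsets.put(partition, specific - 1);
            } else {
                // no user-provided offset: defer to group-offset behaviour
                startOffsets.put(partition, GROUP_OFFSET_SENTINEL);
            }
        }
        return startOffsets;
    }
}
```

For example, seeding partitions "orders-0" and "orders-1" with only "orders-0" mapped to 42 stores 41 for "orders-0" and the marker for "orders-1".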

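The restored-state branch of open() can be modeled the same way. The sketch below is a simplification under stated assumptions: partitions are bare integer ids, `RestoreSeedingSketch`, `seed`, and `ownerOf` are hypothetical names, `EARLIEST_SENTINEL` is an arbitrary marker, and `ownerOf` uses a plain modulo that may differ from the formula in Flink's KafkaTopicPartitionAssigner. It also leaves out the topics-descriptor filtering step. What it shows is the order of operations: partitions missing from the checkpoint are first marked to start from the earliest offset, and only then is the union state filtered down to the partitions owned by this subtask.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone model of seeding start offsets from restored state. */
final class RestoreSeedingSketch {

    /** Arbitrary marker for "start from the earliest offset"; not Flink's actual value. */
    static final long EARLIEST_SENTINEL = Long.MIN_VALUE;

    /** Assumed assignment rule (simple modulo); the real assigner may use a different formula. */
    static int ownerOf(int partitionId, int parallelism) {
        return partitionId % parallelism;
    }

    /**
     * @param discovered    partitions discovered for this subtask
     * @param restoredState union state restored from the checkpoint (all subtasks)
     * @param subtaskIndex  index of this subtask
     * @param parallelism   number of parallel subtasks
     */
    static Map<Integer, Long> seed(List<Integer> discovered, Map<Integer, Long> restoredState,
                                   int subtaskIndex, int parallelism) {
        Map<Integer, Long> restored = new HashMap<>(restoredState);
        // partitions that are not in the checkpoint are new: read them from the earliest offset
        for (Integer partition : discovered) {
            restored.putIfAbsent(partition, EARLIEST_SENTINEL);
        }
        // keep only the partitions that this subtask owns
        Map<Integer, Long> startOffsets = new HashMap<>();
        for (Map.Entry<Integer, Long> entry : restored.entrySet()) {
            if (ownerOf(entry.getKey(), parallelism) == subtaskIndex) {
                startOffsets.put(entry.getKey(), entry.getValue());
            }
        }
        return startOffsets;
    }
}
```

Note that the configured startup mode plays no role in this path: restored offsets are used as they are, and new partitions are always marked as earliest.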