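FlinkKafkaConsumer offset read priority
FlinkKafkaConsumer decides where to start reading each partition from three sources, in priority order:
- Checkpoint & Savepoint
- setStartFromGroupOffsets and the other setStartFromXxx methods
- auto.offset.reset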
/**
 * Specifies the consumer to start reading from any committed group offsets found
 * in Zookeeper / Kafka brokers. The "group.id" property must be set in the configuration
 * properties. If no offset can be found for a partition, the behaviour in "auto.offset.reset"
 * set in the configuration properties will be used for the partition.
 *
 * <p>This method does not affect where partitions are read from when the consumer is restored
 * from a checkpoint or savepoint. When the consumer is restored from a checkpoint or
 * savepoint, only the offsets in the restored state will be used.
 *
 * @return The consumer object, to allow function chaining.
 */
public FlinkKafkaConsumerBase<T> setStartFromGroupOffsets() {
    this.startupMode = StartupMode.GROUP_OFFSETS;
    this.startupOffsetsTimestamp = null;
    this.specificStartupOffsets = null;
    return this;
}
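1. Checkpoint & Savepoint
When the job is restored from a checkpoint (enabled with env.enableCheckpointing()) or from a savepoint, the consumer first reads the [KafkaTopicPartition -> offset] mapping from the state backend, and the settings below are ignored.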
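2. setStartFromXxx
The start position can be set explicitly with one of:
kafkaConsumer.setStartFromGroupOffsets() // default; requires group.id to be set
kafkaConsumer.setStartFromLatest() // start from the latest offset of each partition
kafkaConsumer.setStartFromEarliest() // start from the earliest offset of each partition
kafkaConsumer.setStartFromTimestamp(timestamp: long) // start from the given timestamp (epoch milliseconds)
kafkaConsumer.setStartFromSpecificOffsets(Map[KafkaTopicPartition,Long]) // start each listed partition from its specified offset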
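3. Falling back to auto.offset.reset
If neither 1 nor 2 yields a start position, for example when no committed group offset exists for a partition, the position is decided by the auto.offset.reset property in the consumer configuration, or by its default value when the property is not set.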
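The three levels above can be sketched as a small lookup function. This is only a simplified model in plain Java, not Flink's actual implementation: the class OffsetPriorityModel, the PartitionInfo holder and the method resolveStartOffset are hypothetical names invented for illustration, and only three startup modes are modeled. It assumes auto.offset.reset is either "earliest" or "latest" and treats any other value as an error.

```java
import java.util.OptionalLong;

/** Simplified model of the lookup order; this is not Flink's actual implementation. */
class OffsetPriorityModel {

    /** Subset of startup modes, named after the values of Flink's StartupMode. */
    enum StartupMode { GROUP_OFFSETS, EARLIEST, LATEST }

    /** Hypothetical snapshot of one partition's offsets on the broker. */
    static final class PartitionInfo {
        final long earliest;           // first offset still available
        final long latest;             // end of the partition
        final OptionalLong committed;  // committed group offset, if any

        PartitionInfo(long earliest, long latest, OptionalLong committed) {
            this.earliest = earliest;
            this.latest = latest;
            this.committed = committed;
        }
    }

    static long resolveStartOffset(OptionalLong restored, StartupMode mode,
                                   PartitionInfo p, String autoOffsetReset) {
        // Level 1: state restored from a checkpoint or savepoint always wins.
        if (restored.isPresent()) {
            return restored.getAsLong();
        }
        // Level 2: the explicitly configured startup mode.
        switch (mode) {
            case EARLIEST:
                return p.earliest;
            case LATEST:
                return p.latest;
            default: // GROUP_OFFSETS
                if (p.committed.isPresent()) {
                    return p.committed.getAsLong();
                }
        }
        // Level 3: no committed offset was found, so fall back to auto.offset.reset.
        if ("earliest".equals(autoOffsetReset)) {
            return p.earliest;
        }
        if ("latest".equals(autoOffsetReset)) {
            return p.latest;
        }
        throw new IllegalStateException("no start offset for auto.offset.reset=" + autoOffsetReset);
    }

    public static void main(String[] args) {
        PartitionInfo withCommit = new PartitionInfo(0L, 100L, OptionalLong.of(42L));
        PartitionInfo noCommit = new PartitionInfo(0L, 100L, OptionalLong.empty());

        // Restored state overrides the startup mode: prints 7
        System.out.println(resolveStartOffset(OptionalLong.of(7L), StartupMode.EARLIEST, withCommit, "latest"));
        // No restored state, committed group offset exists: prints 42
        System.out.println(resolveStartOffset(OptionalLong.empty(), StartupMode.GROUP_OFFSETS, withCommit, "latest"));
        // No restored state and no committed offset: auto.offset.reset decides, prints 0
        System.out.println(resolveStartOffset(OptionalLong.empty(), StartupMode.GROUP_OFFSETS, noCommit, "earliest"));
    }
}
```

Note that in this model auto.offset.reset is only consulted on the GROUP_OFFSETS path, which matches the Javadoc of setStartFromGroupOffsets() quoted earlier: it applies when no offset can be found for a partition.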