FlinkConsumer: how Kafka partitions map to subtasks, and a walkthrough of FlinkKafkaConsumerBase

Latest recommended article published on 2023-12-21 08:25:51

耿宏胜

Category: flink. Tags: java, big data, flink

Article link: https://blog.csdn.net/weixin_43885038/article/details/108855256

The FlinkKafkaConsumerBase class
- How Flink builds the tasks that consume Kafka partitions

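The FlinkKafkaConsumerBase class

FlinkKafkaConsumerBase is a core class: FlinkKafkaConsumer08, FlinkKafkaConsumer09, FlinkKafkaConsumer10 and the other version-specific consumers all extend it. Let's start with its constructor.

Pay attention to discoveryIntervalMillis, the interval for automatic partition discovery. Its default is public static final long PARTITION_DISCOVERY_DISABLED = Long.MIN_VALUE;, which means partitions are never discovered automatically. With that default, if partitions are added to the Kafka topic, the job has to be restarted before the new partitions are picked up.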

public FlinkKafkaConsumerBase(
			List<String> topics,
			Pattern topicPattern,
			KafkaDeserializationSchema<T> deserializer,
			long discoveryIntervalMillis,
			boolean useMetrics) {
		this.topicsDescriptor = new KafkaTopicsDescriptor(topics, topicPattern);
		this.deserializer = checkNotNull(deserializer, "valueDeserializer");

		checkArgument(
			discoveryIntervalMillis == PARTITION_DISCOVERY_DISABLED || discoveryIntervalMillis >= 0,
			"Cannot define a negative value for the topic / partition discovery interval.");
		this.discoveryIntervalMillis = discoveryIntervalMillis;

		this.useMetrics = useMetrics;
	}
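
As a rough illustration of how the discovery interval can be supplied and validated: the Flink Kafka connector reads it from the consumer properties under the key flink.partition-discovery.interval-millis. The sketch below uses only java.util.Properties. The class DiscoveryIntervalSketch and the method readInterval are hypothetical names made up for this example, and the validation simply mirrors the checkArgument call in the constructor above.

```java
import java.util.Properties;

class DiscoveryIntervalSketch {
    // Same sentinel value as PARTITION_DISCOVERY_DISABLED quoted in the article.
    static final long PARTITION_DISCOVERY_DISABLED = Long.MIN_VALUE;
    // Property key used by the Flink Kafka connector for the discovery interval.
    static final String KEY = "flink.partition-discovery.interval-millis";

    // Hypothetical helper: returns the configured interval, or the "disabled" sentinel if unset.
    static long readInterval(Properties props) {
        String raw = props.getProperty(KEY);
        long interval = (raw == null) ? PARTITION_DISCOVERY_DISABLED : Long.parseLong(raw.trim());
        // Mirrors the constructor check: only the sentinel or a non-negative value is allowed.
        if (interval != PARTITION_DISCOVERY_DISABLED && interval < 0) {
            throw new IllegalArgumentException(
                "Cannot define a negative value for the topic / partition discovery interval.");
        }
        return interval;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Nothing configured: discovery stays disabled.
        System.out.println(readInterval(props) == PARTITION_DISCOVERY_DISABLED);
        // Ask for a discovery pass every 30 seconds.
        props.setProperty(KEY, "30000");
        System.out.println(readInterval(props));
    }
}
```

With the key left unset, new partitions are only noticed after a restart, as described above.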

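Next, look at the open method of FlinkKafkaConsumerBase. It initializes all partitions and contains the code that maps partitions to subtasks, which is the key to how the Flink consumer guarantees that each partition is handled by exactly one thread.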

	public void open(Configuration configuration) throws Exception {
		// determine the offset commit mode
		this.offsetCommitMode = OffsetCommitModes.fromConfiguration(
				getIsAutoCommitEnabled(),
				enableCommitOnCheckpoints,
				((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());

		// create the partition discoverer
		this.partitionDiscoverer = createPartitionDiscoverer(
				topicsDescriptor,
				getRuntimeContext().getIndexOfThisSubtask(),
				getRuntimeContext().getNumberOfParallelSubtasks());
		this.partitionDiscoverer.open();

		subscribedPartitionsToStartOffsets = new HashMap<>();

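enableCommitOnCheckpoints defaults to true, so when checkpointing is enabled the offset commit mode is ON_CHECKPOINTS. Flink currently has three ways of handling offset commits to Kafka:
1. Checkpointing enabled: offsets are committed after a checkpoint completes.
2. Checkpointing enabled, commit on checkpoint disabled: consumer group offsets are not committed.
3. Checkpointing disabled: relies on the Kafka client's auto commit.
A separate article will cover this in detail.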
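
The three cases above boil down to a small decision function. The following is a minimal sketch of that decision, not a copy of Flink's source: the class OffsetCommitModeSketch is a hypothetical name, and the enum constant names follow my reading of Flink's OffsetCommitMode, so treat them as assumptions.

```java
class OffsetCommitModeSketch {
    // Assumed names for the three commit behaviours listed above.
    enum Mode { DISABLED, ON_CHECKPOINTS, KAFKA_PERIODIC }

    static Mode fromConfiguration(
            boolean enableAutoCommit,
            boolean enableCommitOnCheckpoint,
            boolean enableCheckpointing) {
        if (enableCheckpointing) {
            // Cases 1 and 2: with checkpointing on, the Kafka auto-commit setting is ignored.
            return enableCommitOnCheckpoint ? Mode.ON_CHECKPOINTS : Mode.DISABLED;
        }
        // Case 3: without checkpointing, fall back to the Kafka client's periodic auto commit.
        return enableAutoCommit ? Mode.KAFKA_PERIODIC : Mode.DISABLED;
    }

    public static void main(String[] args) {
        System.out.println(fromConfiguration(true, true, true));   // ON_CHECKPOINTS
        System.out.println(fromConfiguration(true, false, true));  // DISABLED
        System.out.println(fromConfiguration(true, true, false));  // KAFKA_PERIODIC
    }
}
```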

Right after that comes a call to discoverPartitions, a method of the PartitionDiscoverer class. This is a key method, and it is the reason behind Flink's partition-to-subtask mapping.


	final List<KafkaTopicPartition> allPartitions = partitionDiscoverer.discoverPartitions();
		if (restoredState != null) {
			for (KafkaTopicPartition partition : allPartitions) {
				if (!restoredState.containsKey(partition)) {
					restoredState.put(partition, KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET);
				}
			}

			for (Map.Entry<KafkaTopicPartition, Long> restoredStateEntry : restoredState.entrySet()) {
				// seed the partition discoverer with the union state while filtering out
				// restored partitions that should not be subscribed by this subtask
				if (KafkaTopicPartitionAssigner.assign(
					// getNumberOfParallelSubtasks: total number of parallel subtasks
					// getIndexOfThisSubtask: index of the current subtask
					restoredStateEntry.getKey(), getRuntimeContext().getNumberOfParallelSubtasks())
						== getRuntimeContext().getIndexOfThisSubtask()){
					subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
				}
			}

			if (filterRestoredPartitionsWithCurrentTopicsDescriptor) {
				subscribedPartitionsToStartOffsets.entrySet().removeIf(entry -> {
					if (!topicsDescriptor.isMatchingTopic(entry.getKey().getTopic())) {
						LOG.warn(
							"{} is removed from subscribed partitions since it is no longer associated with topics descriptor of current execution.",
							entry.getKey());
						return true;
					}
					return false;
				});
			}

			LOG.info("Consumer subtask {} will start reading {} partitions with offsets in restored state: {}",
				getRuntimeContext().getIndexOfThisSubtask(), subscribedPartitionsToStartOffsets.size(), subscribedPartitionsToStartOffsets);

Let's step into the discoverPartitions() method:

public List<KafkaTopicPartition> discoverPartitions() throws WakeupException, ClosedException {
		if (!closed && !wakeup) {
			try {
				List<KafkaTopicPartition> newDiscoveredPartitions;
				// Check whether the consumer was given fixed topic names or a topic pattern; in practice a fixed topic name is the usual case.
				
				
				// (1) get all possible partitions, based on whether we are subscribed to fixed topics or a topic pattern
				if (topicsDescriptor.isFixedTopics()) {
					// fetch all Kafka partitions of the fixed topics
					newDiscoveredPartitions = getAllPartitionsForTopics(topicsDescriptor.getFixedTopics());
				} else {
					List<String> matchedTopics = getAllTopics();

					// retain topics that match the pattern
					Iterator<String> iter = matchedTopics.iterator();
					while (iter.hasNext()) {
						if (!topicsDescriptor.isMatchingTopic(iter.next())) {
							iter.remove();
						}
					}

					if (matchedTopics.size() != 0) {
						// get partitions only for matched topics
						newDiscoveredPartitions = getAllPartitionsForTopics(matchedTopics);
					} else {
						newDiscoveredPartitions = null;
					}
				}
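				// newDiscoveredPartitions now holds every partition of the subscribed topics; it has not yet been filtered down to this subtask.
				// If it is null or empty, no partitions were found for the topic(s), which leads to the "Unable to retrieve any partitions" error below.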
				// (2) eliminate partition that are old partitions or should not be subscribed by this subtask
				if (newDiscoveredPartitions == null || newDiscoveredPartitions.isEmpty()) {
					// no Kafka partitions found: fail during initialization
					throw new RuntimeException("Unable to retrieve any partitions with KafkaTopicsDescriptor: " + topicsDescriptor);
				} else {
					// Note: the loop below keeps only the partitions that belong to this subtask.
					// See setAndCheckDiscoveredPartition for the details.
					Iterator<KafkaTopicPartition> iter = newDiscoveredPartitions.iterator();
					KafkaTopicPartition nextPartition;
					while (iter.hasNext()) {
						nextPartition = iter.next();
						if (!setAndCheckDiscoveredPartition(nextPartition)) {
							iter.remove();
						}
					}
				}

				return newDiscoveredPartitions;
			} catch (WakeupException e) {
				// the actual topic / partition metadata fetching methods
				// may be woken up midway; reset the wakeup flag and rethrow
				wakeup = false;
				throw e;
			}
		} else if (!closed && wakeup) {
			// may have been woken up before the method call
			wakeup = false;
			throw new WakeupException();
		} else {
			throw new ClosedException();
		}
	}

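I have annotated the code above, so it is worth reading through carefully. The method ends up returning a list that contains only the partitions belonging to this subtask. The core logic is wrapped in setAndCheckDiscoveredPartition(), so let's look at that next: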

public boolean setAndCheckDiscoveredPartition(KafkaTopicPartition partition) {
		// a partition that has not been seen before is added to the discoveredPartitions set
		if (isUndiscoveredPartition(partition)) {
			discoveredPartitions.add(partition);

			// keep the partition only if it is assigned to this subtask (indexOfThisSubtask)
			return KafkaTopicPartitionAssigner.assign(partition, numParallelSubtasks) == indexOfThisSubtask;
		}

		return false;
	}
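
To make the behaviour of this method concrete, here is a small self-contained sketch. DiscovererSketch is a hypothetical stand-in for the partition discoverer: it identifies a partition by a plain topic name and partition id instead of KafkaTopicPartition, and it inlines the assignment formula that the article quotes from KafkaTopicPartitionAssigner.

```java
import java.util.HashSet;
import java.util.Set;

class DiscovererSketch {
    private final Set<String> discoveredPartitions = new HashSet<>();
    private final int indexOfThisSubtask;
    private final int numParallelSubtasks;

    DiscovererSketch(int indexOfThisSubtask, int numParallelSubtasks) {
        this.indexOfThisSubtask = indexOfThisSubtask;
        this.numParallelSubtasks = numParallelSubtasks;
    }

    // Same formula as KafkaTopicPartitionAssigner.assign quoted in the article.
    static int assign(String topic, int partition, int numParallelSubtasks) {
        int startIndex = ((topic.hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
        return (startIndex + partition) % numParallelSubtasks;
    }

    boolean setAndCheckDiscoveredPartition(String topic, int partition) {
        // Set.add returns true only when the partition was not known yet.
        if (discoveredPartitions.add(topic + "-" + partition)) {
            return assign(topic, partition, numParallelSubtasks) == indexOfThisSubtask;
        }
        // Already seen: not reported again, even if it belongs to this subtask.
        return false;
    }

    public static void main(String[] args) {
        DiscovererSketch only = new DiscovererSketch(0, 1);
        System.out.println(only.setAndCheckDiscoveredPartition("orders", 0)); // true: new and owned
        System.out.println(only.setAndCheckDiscoveredPartition("orders", 0)); // false: already discovered
    }
}
```

One consequence worth noting: every subtask records every partition it sees as discovered, but only the owning subtask gets true back, so each partition is claimed exactly once.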

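This is where the actual assignment logic comes in, and it explains why Flink can guarantee that every partition is read by exactly one subtask (thread).
How it works:
int startIndex = ((partition.getTopic().hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
(startIndex + partition.getPartition()) % numParallelSubtasks
numParallelSubtasks: the number of parallel subtasks, that is, the parallelism configured for the Flink source.

For example, with 6 partitions and a parallelism of 3, the partitions are spread evenly, two per subtask.

Note, however, that if the parallelism is set higher than the number of partitions, some subtasks get no partition at all and sit idle, which wastes resources.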

public class KafkaTopicPartitionAssigner {

	/**
	 * Returns the index of the target subtask that a specific Kafka partition should be
	 * assigned to.
	 *
	 * <p>The resulting distribution of partitions of a single topic has the following contract:
	 * <ul>
	 *     <li>1. Uniformly distributed across subtasks</li>
	 *     <li>2. Partitions are round-robin distributed (strictly clockwise w.r.t. ascending
	 *     subtask indices) by using the partition id as the offset from a starting index
	 *     (i.e., the index of the subtask which partition 0 of the topic will be assigned to,
	 *     determined using the topic name).</li>
	 * </ul>
	 *
	 * <p>The above contract is crucial and cannot be broken. Consumer subtasks rely on this
	 * contract to locally filter out partitions that it should not subscribe to, guaranteeing
	 * that all partitions of a single topic will always be assigned to some subtask in a
	 * uniformly distributed manner.
	 *
	 * @param partition the Kafka partition
	 * @param numParallelSubtasks total number of parallel subtasks
	 *
	 * @return index of the target subtask that the Kafka partition should be assigned to.
	 */
	public static int assign(KafkaTopicPartition partition, int numParallelSubtasks) {
		int startIndex = ((partition.getTopic().hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;

		// here, the assumption is that the id of Kafka partitions are always ascending
		// starting from 0, and therefore can be used directly as the offset clockwise from the start index
		return (startIndex + partition.getPartition()) % numParallelSubtasks;
	}

}
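
The two scenarios mentioned earlier (6 partitions with parallelism 3, and a parallelism larger than the partition count) can be checked with a few lines of plain Java. This is only a sketch: AssignmentDemo and countPerSubtask are hypothetical names, the topic name "orders" is made up, and the assign method repeats the formula from the class above using a plain topic string and partition id.

```java
import java.util.Arrays;

class AssignmentDemo {
    // Same formula as KafkaTopicPartitionAssigner.assign above.
    static int assign(String topic, int partition, int numParallelSubtasks) {
        int startIndex = ((topic.hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
        return (startIndex + partition) % numParallelSubtasks;
    }

    // Counts how many partitions of one topic each subtask ends up with.
    static int[] countPerSubtask(String topic, int numPartitions, int numParallelSubtasks) {
        int[] counts = new int[numParallelSubtasks];
        for (int p = 0; p < numPartitions; p++) {
            counts[assign(topic, p, numParallelSubtasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 6 partitions, parallelism 3: every subtask gets two partitions.
        System.out.println(Arrays.toString(countPerSubtask("orders", 6, 3))); // [2, 2, 2]

        // 6 partitions, parallelism 8: two subtasks stay idle.
        int[] counts = countPerSubtask("orders", 6, 8);
        long idle = Arrays.stream(counts).filter(c -> c == 0).count();
        System.out.println(idle); // 2
    }
}
```

Which subtasks are idle in the second case depends on the hash of the topic name, but the number of idle subtasks does not.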

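After the steps above, allPartitions holds all the partitions that belong to this parallel subtask.
From here there are two paths: starting without state restored from a checkpoint, and restoring from a checkpoint.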

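How Flink builds the tasks that consume Kafka partitions

The first step has already produced a List, allPartitions, which holds the partitions assigned to this subtask.
Going back to where we started, there is an if (restoredState != null) check. restoredState is the information Flink recovers from saved state. Let's first discuss the case without a checkpoint: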

	// use the partition discoverer to fetch the initial seed partitions,
			// and set their initial offsets depending on the startup mode.
			// for SPECIFIC_OFFSETS and TIMESTAMP modes, we set the specific offsets now;
			// for other modes (EARLIEST, LATEST, and GROUP_OFFSETS), the offset is lazily determined
			// when the partition is actually read.
			switch (startupMode) {
				case SPECIFIC_OFFSETS:
					if (specificStartupOffsets == null) {
						throw new IllegalStateException(
							"Startup mode for the consumer set to " + StartupMode.SPECIFIC_OFFSETS +
								", but no specific offsets were specified.");
					}

					for (KafkaTopicPartition seedPartition : allPartitions) {
						Long specificOffset = specificStartupOffsets.get(seedPartition);
						if (specificOffset != null) {
							// since the specified offsets represent the next record to read, we subtract
							// it by one so that the initial state of the consumer will be correct
							subscribedPartitionsToStartOffsets.put(seedPartition, specificOffset - 1);
						} else {
							// default to group offset behaviour if the user-provided specific offsets
							// do not contain a value for this partition
							subscribedPartitionsToStartOffsets.put(seedPartition, KafkaTopicPartitionStateSentinel.GROUP_OFFSET);
						}
					}

					break;
				case TIMESTAMP:
					if (startupOffsetsTimestamp == null) {
						throw new IllegalStateException(
							"Startup mode for the consumer set to " + StartupMode.TIMESTAMP +
								", but no startup timestamp was specified.");
					}

					for (Map.Entry<KafkaTopicPartition, Long> partitionToOffset
							: fetchOffsetsWithTimestamp(allPartitions, startupOffsetsTimestamp).entrySet()) {
						subscribedPartitionsToStartOffsets.put(
							partitionToOffset.getKey(),
							(partitionToOffset.getValue() == null)
									// if an offset cannot be retrieved for a partition with the given timestamp,
									// we default to using the latest offset for the partition
									? KafkaTopicPartitionStateSentinel.LATEST_OFFSET
									// since the specified offsets represent the next record to read, we subtract
									// it by one so that the initial state of the consumer will be correct
									: partitionToOffset.getValue() - 1);
					}

					break;
				default:
					for (KafkaTopicPartition seedPartition : allPartitions) {
						subscribedPartitionsToStartOffsets.put(seedPartition, startupMode.getStateSentinel());
					}
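
The SPECIFIC_OFFSETS branch is easier to follow with concrete numbers. The sketch below is a simplified stand-in, not Flink code: SpecificOffsetsSketch and initialState are hypothetical names, partitions are plain integers, and GROUP_OFFSET_SENTINEL is a made-up placeholder for the real sentinel kept in KafkaTopicPartitionStateSentinel.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SpecificOffsetsSketch {
    // Hypothetical placeholder value; Flink defines its own sentinel constants.
    static final long GROUP_OFFSET_SENTINEL = Long.MIN_VALUE + 1;

    static Map<Integer, Long> initialState(List<Integer> partitions, Map<Integer, Long> specificStartupOffsets) {
        Map<Integer, Long> state = new LinkedHashMap<>();
        for (int partition : partitions) {
            Long nextToRead = specificStartupOffsets.get(partition);
            if (nextToRead != null) {
                // The state stores the last consumed offset, so "start at N" is stored as N - 1.
                state.put(partition, nextToRead - 1);
            } else {
                // No user-provided offset for this partition: fall back to group offsets.
                state.put(partition, GROUP_OFFSET_SENTINEL);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        Map<Integer, Long> specific = new HashMap<>();
        specific.put(0, 100L); // start reading partition 0 at offset 100
        Map<Integer, Long> state = initialState(Arrays.asList(0, 1), specific);
        System.out.println(state.get(0)); // 99
        System.out.println(state.get(1) == GROUP_OFFSET_SENTINEL); // true
    }
}
```

The minus one is the detail to remember: the stored value always means "last record already processed", which keeps the startup path consistent with offsets restored from a checkpoint.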

Next is the case with a checkpoint. It is largely similar, with some extra validation of the restored state along the way:

if (restoredState != null) {
			for (KafkaTopicPartition partition : allPartitions) {
				// a partition missing from the restored state is new: add it to restoredState and read it from the earliest offset
				if (!restoredState.containsKey(partition)) {
					restoredState.put(partition, KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET);
				}
			}

			for (Map.Entry<KafkaTopicPartition, Long> restoredStateEntry : restoredState.entrySet()) {
				// seed the partition discoverer with the union state while filtering out
				// restored partitions that should not be subscribed by this subtask
				// check whether the restored partition belongs to this subtask; if so, add it to subscribedPartitionsToStartOffsets
				if (KafkaTopicPartitionAssigner.assign(
					// getNumberOfParallelSubtasks: total number of parallel subtasks
					// getIndexOfThisSubtask: index of the current subtask
					restoredStateEntry.getKey(), getRuntimeContext().getNumberOfParallelSubtasks())
						== getRuntimeContext().getIndexOfThisSubtask()){
					subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
				}
			}
			// sanity check: drop restored partitions whose topic no longer matches the current topics descriptor
			if (filterRestoredPartitionsWithCurrentTopicsDescriptor) {
				subscribedPartitionsToStartOffsets.entrySet().removeIf(entry -> {
					if (!topicsDescriptor.isMatchingTopic(entry.getKey().getTopic())) {
						LOG.warn(
							"{} is removed from subscribed partitions since it is no longer associated with topics descriptor of current execution.",
							entry.getKey());
						return true;
					}
					return false;
				});
			}

			LOG.info("Consumer subtask {} will start reading {} partitions with offsets in restored state: {}",
				getRuntimeContext().getIndexOfThisSubtask(), subscribedPartitionsToStartOffsets.size(), subscribedPartitionsToStartOffsets);
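
The restore path can be condensed into a short sketch as well. It assumes a single topic with integer partition ids. RestoreSketch, subscribedAfterRestore and EARLIEST_SENTINEL are hypothetical names and values introduced only for this example, and the topic-descriptor filter is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class RestoreSketch {
    // Hypothetical placeholder for the EARLIEST_OFFSET sentinel.
    static final long EARLIEST_SENTINEL = Long.MIN_VALUE + 2;

    // Same formula as KafkaTopicPartitionAssigner.assign.
    static int assign(String topic, int partition, int numParallelSubtasks) {
        int startIndex = ((topic.hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
        return (startIndex + partition) % numParallelSubtasks;
    }

    static Map<Integer, Long> subscribedAfterRestore(
            String topic,
            Map<Integer, Long> unionState,           // offsets restored for all subtasks
            List<Integer> discoveredByThisSubtask,   // result of discoverPartitions()
            int subtaskIndex,
            int numSubtasks) {
        Map<Integer, Long> restored = new TreeMap<>(unionState);
        // Partitions that were not in the checkpoint are new and start from the earliest offset.
        for (int partition : discoveredByThisSubtask) {
            restored.putIfAbsent(partition, EARLIEST_SENTINEL);
        }
        // Keep only the entries that the assigner maps to this subtask.
        Map<Integer, Long> subscribed = new TreeMap<>();
        for (Map.Entry<Integer, Long> entry : restored.entrySet()) {
            if (assign(topic, entry.getKey(), numSubtasks) == subtaskIndex) {
                subscribed.put(entry.getKey(), entry.getValue());
            }
        }
        return subscribed;
    }

    public static void main(String[] args) {
        Map<Integer, Long> union = new TreeMap<>();
        union.put(0, 10L);
        union.put(1, 20L);
        // Partition 2 was added to the topic after the checkpoint was taken.
        int owner = assign("orders", 2, 2);
        System.out.println(subscribedAfterRestore("orders", union, Arrays.asList(2), owner, 2).keySet()); // [0, 2]
    }
}
```

With two subtasks, partitions 0 and 2 always land on the same subtask, so the owner of the new partition 2 also keeps the restored offset of partition 0 and ignores partition 1.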

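A word on startupMode first: it determines where consumption from Kafka starts. The StartupMode values are listed below. To start from specified offsets, use the SPECIFIC_OFFSETS mode. This article only discusses consuming from the latest offset; the other modes will be covered in later articles.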

	/** Start from committed offsets in ZK / Kafka brokers of a specific consumer group (default). */
	GROUP_OFFSETS(KafkaTopicPartitionStateSentinel.GROUP_OFFSET),

	/** Start from the earliest offset possible. */
	EARLIEST(KafkaTopicPartitionStateSentinel.EARLIEST_OFFSET),

	/** Start from the latest offset. */
	LATEST(KafkaTopicPartitionStateSentinel.LATEST_OFFSET),

	/**
	 * Start from user-supplied timestamp for each partition.
	 * (Different values can be passed in to choose where consumption starts.)
	 * Since this mode will have specific offsets to start with, we do not need a sentinel value;
	 * using Long.MIN_VALUE as a placeholder.
	 */
	TIMESTAMP(Long.MIN_VALUE),

	/**
	 * Start from user-supplied specific offsets for each partition.
	 * Since this mode will have specific offsets to start with, we do not need a sentinel value;
	 * using Long.MIN_VALUE as a placeholder.
	 */
	SPECIFIC_OFFSETS(Long.MIN_VALUE);

	/** The sentinel offset value corresponding to this startup mode. */
	private long stateSentinel;



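The end result is a HashMap, subscribedPartitionsToStartOffsets.

It is built from the allPartitions list produced earlier. In the default branch, each partition is given the sentinel value of the startup mode rather than a concrete offset (for example LATEST_OFFSET when consuming from the latest offset). Finally, every subtask enters the run method: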

@Override
	public void run(SourceContext<T> sourceContext) throws Exception {
		if (subscribedPartitionsToStartOffsets == null) {
			throw new Exception("The partitions were not set for the consumer");
		}

		// initialize commit metrics and default offset callback method
		this.successfulCommits = this.getRuntimeContext().getMetricGroup().counter(COMMITS_SUCCEEDED_METRICS_COUNTER);
		this.failedCommits =  this.getRuntimeContext().getMetricGroup().counter(COMMITS_FAILED_METRICS_COUNTER);
		final int subtaskIndex = this.getRuntimeContext().getIndexOfThisSubtask();

		this.offsetCommitCallback = new KafkaCommitCallback() {
			@Override
			public void onSuccess() {
				successfulCommits.inc();
			}

			@Override
			public void onException(Throwable cause) {
				LOG.warn(String.format("Consumer subtask %d failed async Kafka commit.", subtaskIndex), cause);
				failedCommits.inc();
			}
		};

		// mark the subtask as temporarily idle if there are no initial seed partitions;
		// once this subtask discovers some partitions and starts collecting records, the subtask's
		// status will automatically be triggered back to be active.
		if (subscribedPartitionsToStartOffsets.isEmpty()) {
			sourceContext.markAsTemporarilyIdle();
		}

		LOG.info("Consumer subtask {} creating fetcher with offsets {}.",
			getRuntimeContext().getIndexOfThisSubtask(), subscribedPartitionsToStartOffsets);
		// from this point forward:
		//   - 'snapshotState' will draw offsets from the fetcher,
		//     instead of being built from `subscribedPartitionsToStartOffsets`
		//   - 'notifyCheckpointComplete' will start to do work (i.e. commit offsets to
		//     Kafka through the fetcher, if configured to do so)
		this.kafkaFetcher = createFetcher(
				sourceContext,
				subscribedPartitionsToStartOffsets,
				watermarkStrategy,
				(StreamingRuntimeContext) getRuntimeContext(),
				offsetCommitMode,
				getRuntimeContext().getMetricGroup().addGroup(KAFKA_CONSUMER_METRICS_GROUP),
				useMetrics);

		if (!running) {
			return;
		}

		// depending on whether we were restored with the current state version (1.3),
		// remaining logic branches off into 2 paths:
		//  1) New state - partition discovery loop executed as separate thread, with this
		//                 thread running the main fetcher loop
		//  2) Old state - partition discovery is disabled and only the main fetcher loop is executed
		if (discoveryIntervalMillis == PARTITION_DISCOVERY_DISABLED) {
			kafkaFetcher.runFetchLoop();
		} else {
			runWithPartitionDiscovery();
		}
	}

The details of how this is generated will be covered in a follow-up update.
