How Flink distributes Kafka topic partitions across parallel subtasks: a source-code walkthrough

Introduction

When consuming a single Kafka topic, we usually assume that Kafka partitions map one-to-one onto the parallelism we configure:
if the topic has 12 partitions and we set the parallelism to 12, every parallel subtask receives data and the data is evenly distributed.
If we instead set the parallelism to 15, three subtasks will never receive any data. You can verify this in the web UI: open the source operator and look at Bytes Sent for its SubTasks; three of them stay at 0.

When consuming multiple topics, say two topics with 24 partitions in total and a parallelism of 24, the same reasoning would suggest that all 24 subtasks consume data. In practice that is not what happens: more than ten subtasks may end up consuming nothing at all.

So, with this question in mind, let's look at the source code to see how partitions are actually distributed.

Source code

① The entry point is the object we create in our program to connect Flink to Kafka:

new FlinkKafkaConsumer011(topics, new SimpleStringSchema(), kafkaPro)
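For context, here is a minimal sketch of how this constructor is typically wired into a job. The topic names, broker address, and group id are placeholders, and depending on the Flink version the SimpleStringSchema import path may differ:

import java.util.Arrays;
import java.util.List;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class KafkaSourceDemo {
	public static void main(String[] args) throws Exception {
		// hypothetical Kafka connection properties
		Properties kafkaPro = new Properties();
		kafkaPro.setProperty("bootstrap.servers", "broker1:9092");
		kafkaPro.setProperty("group.id", "demo-group");

		// two hypothetical topics with 12 partitions each
		List<String> topics = Arrays.asList("topic-a", "topic-b");

		StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
		env.setParallelism(24);

		env.addSource(new FlinkKafkaConsumer011<>(topics, new SimpleStringSchema(), kafkaPro))
			.print();

		env.execute("kafka-partition-assignment-demo");
	}
}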

② Constructing a FlinkKafkaConsumer011 ultimately constructs a FlinkKafkaConsumerBase. As the constructor below shows, the topic list we pass in is wrapped into a KafkaTopicsDescriptor object:

/**
	 * Base constructor.
	 *
	 * @param topics fixed list of topics to subscribe to (null, if using topic pattern)
	 * @param topicPattern the topic pattern to subscribe to (null, if using fixed topics)
	 * @param deserializer The deserializer to turn raw byte messages into Java/Scala objects.
	 * @param discoveryIntervalMillis the topic / partition discovery interval, in
	 *                                milliseconds (0 if discovery is disabled).
	 */
	public FlinkKafkaConsumerBase(
			List<String> topics,
			Pattern topicPattern,
			KafkaDeserializationSchema<T> deserializer,
			long discoveryIntervalMillis,
			boolean useMetrics) {
		//wrap the topic list into a KafkaTopicsDescriptor object
		this.topicsDescriptor = new KafkaTopicsDescriptor(topics, topicPattern);
		this.deserializer = checkNotNull(deserializer, "valueDeserializer");

		checkArgument(
			discoveryIntervalMillis == PARTITION_DISCOVERY_DISABLED || discoveryIntervalMillis >= 0,
			"Cannot define a negative value for the topic / partition discovery interval.");
		this.discoveryIntervalMillis = discoveryIntervalMillis;

		this.useMetrics = useMetrics;
	}

③ Next, look at FlinkKafkaConsumerBase.open; the first few lines are all we need. The KafkaTopicsDescriptor from ②, the index of the current subtask, and the total number of subtasks (the parallelism) are passed to createPartitionDiscoverer, which builds an AbstractPartitionDiscoverer named partitionDiscoverer; partitionDiscoverer then calls its discoverPartitions method.

@Override
public void open(Configuration configuration) throws Exception {
	// determine the offset commit mode
	this.offsetCommitMode = OffsetCommitModes.fromConfiguration(
			getIsAutoCommitEnabled(),
			enableCommitOnCheckpoints,
			((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());

	// create the partition discoverer
	this.partitionDiscoverer = createPartitionDiscoverer(
			topicsDescriptor,
			getRuntimeContext().getIndexOfThisSubtask(),
			getRuntimeContext().getNumberOfParallelSubtasks());
	this.partitionDiscoverer.open();

	subscribedPartitionsToStartOffsets = new HashMap<>();
	final List<KafkaTopicPartition> allPartitions = partitionDiscoverer.discoverPartitions();
	......
	}

④ Continue into partitionDiscoverer.discoverPartitions. It first checks whether we subscribed with a fixed topic list or a topic pattern. We passed a list, so getAllPartitionsForTopics is called (jump to ⑤); it returns a collection of KafkaTopicPartition objects, one for every partition of every topic.
Further down, each element of that collection is passed to setAndCheckDiscoveredPartition (jump to ⑥).

/**
	 * Execute a partition discovery attempt for this subtask.
	 * This method lets the partition discoverer update what partitions it has discovered so far.
	 *
	 * @return List of discovered new partitions that this subtask should subscribe to.
	 */
	public List<KafkaTopicPartition> discoverPartitions() throws WakeupException, ClosedException {
		if (!closed && !wakeup) {
			try {
				List<KafkaTopicPartition> newDiscoveredPartitions;

				// (1) get all possible partitions, based on whether we are subscribed to fixed topics or a topic pattern
				if (topicsDescriptor.isFixedTopics()) {
					newDiscoveredPartitions = getAllPartitionsForTopics(topicsDescriptor.getFixedTopics());
				} else {
					List<String> matchedTopics = getAllTopics();

					// retain topics that match the pattern
					Iterator<String> iter = matchedTopics.iterator();
					while (iter.hasNext()) {
						if (!topicsDescriptor.isMatchingTopic(iter.next())) {
							iter.remove();
						}
					}

					if (matchedTopics.size() != 0) {
						// get partitions only for matched topics
						newDiscoveredPartitions = getAllPartitionsForTopics(matchedTopics);
					} else {
						newDiscoveredPartitions = null;
					}
				}

				// (2) eliminate partition that are old partitions or should not be subscribed by this subtask
				if (newDiscoveredPartitions == null || newDiscoveredPartitions.isEmpty()) {
					throw new RuntimeException("Unable to retrieve any partitions with KafkaTopicsDescriptor: " + topicsDescriptor);
				} else {
					Iterator<KafkaTopicPartition> iter = newDiscoveredPartitions.iterator();
					KafkaTopicPartition nextPartition;
					while (iter.hasNext()) {
						nextPartition = iter.next();
						if (!setAndCheckDiscoveredPartition(nextPartition)) {
							iter.remove();
						}
					}
				}

				return newDiscoveredPartitions;
			} catch (WakeupException e) {
				// the actual topic / partition metadata fetching methods
				// may be woken up midway; reset the wakeup flag and rethrow
				wakeup = false;
				throw e;
			}
		} else if (!closed && wakeup) {
			// may have been woken up before the method call
			wakeup = false;
			throw new WakeupException();
		} else {
			throw new ClosedException();
		}
	}

⑤ In getAllPartitionsForTopics, every partition of every topic is wrapped into a KafkaTopicPartition object and appended to a list that is finally returned. For example, if we originally passed two topics with 12 partitions each, this list ends up with 24 elements.

@Override
	protected List<KafkaTopicPartition> getAllPartitionsForTopics(List<String> topics) throws WakeupException, RuntimeException {
		List<KafkaTopicPartition> partitions = new LinkedList<>();

		try {
			for (String topic : topics) {
				final List<PartitionInfo> kafkaPartitions = kafkaConsumer.partitionsFor(topic);

				if (kafkaPartitions == null) {
					throw new RuntimeException("Could not fetch partitions for %s. Make sure that the topic exists.".format(topic));
				}

				for (PartitionInfo partitionInfo : kafkaPartitions) {
					partitions.add(new KafkaTopicPartition(partitionInfo.topic(), partitionInfo.partition()));
				}
			}
		} catch (org.apache.kafka.common.errors.WakeupException e) {
			// rethrow our own wakeup exception
			throw new WakeupException();
		}

		return partitions;
	}

⑥ setAndCheckDiscoveredPartition is where a topic partition gets "claimed" by a subtask.
It first checks whether the partition has already been discovered; if not, it calls KafkaTopicPartitionAssigner.assign.

/**
	 * Sets a partition as discovered. Partitions are considered as new
	 * if its partition id is larger than all partition ids previously
	 * seen for the topic it belongs to. Therefore, for a set of
	 * discovered partitions, the order that this method is invoked with
	 * each partition is important.
	 *
	 * <p>If the partition is indeed newly discovered, this method also returns
	 * whether the new partition should be subscribed by this subtask.
	 *
	 * @param partition the partition to set and check
	 *
	 * @return {@code true}, if the partition wasn't seen before and should
	 *         be subscribed by this subtask; {@code false} otherwise
	 */
	public boolean setAndCheckDiscoveredPartition(KafkaTopicPartition partition) {
		if (isUndiscoveredPartition(partition)) {
			discoveredPartitions.add(partition);

			return KafkaTopicPartitionAssigner.assign(partition, numParallelSubtasks) == indexOfThisSubtask;
		}

		return false;
	}

⑦ KafkaTopicPartitionAssigner.assign returns a subtask index: the subtask to which a given topic partition is assigned.
At this point it is clear how a partition is mapped to a specific subtask:
the topic name's hash (multiplied by 31 and masked to be non-negative) is taken modulo the parallelism to get a start index, then the partition id is added and the sum is taken modulo the parallelism again.

/**
	 * Returns the index of the target subtask that a specific Kafka partition should be
	 * assigned to.
	 *
	 * <p>The resulting distribution of partitions of a single topic has the following contract:
	 * <ul>
	 *     <li>1. Uniformly distributed across subtasks</li>
	 *     <li>2. Partitions are round-robin distributed (strictly clockwise w.r.t. ascending
	 *     subtask indices) by using the partition id as the offset from a starting index
	 *     (i.e., the index of the subtask which partition 0 of the topic will be assigned to,
	 *     determined using the topic name).</li>
	 * </ul>
	 *
	 * <p>The above contract is crucial and cannot be broken. Consumer subtasks rely on this
	 * contract to locally filter out partitions that it should not subscribe to, guaranteeing
	 * that all partitions of a single topic will always be assigned to some subtask in a
	 * uniformly distributed manner.
	 *
	 * @param partition the Kafka partition
	 * @param numParallelSubtasks total number of parallel subtasks
	 *
	 * @return index of the target subtask that the Kafka partition should be assigned to.
	 */
	public static int assign(KafkaTopicPartition partition, int numParallelSubtasks) {
		int startIndex = ((partition.getTopic().hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;

		// here, the assumption is that the id of Kafka partitions are always ascending
		// starting from 0, and therefore can be used directly as the offset clockwise from the start index
		return (startIndex + partition.getPartition()) % numParallelSubtasks;
	}
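To make the formula concrete, here is a small standalone sketch that re-implements the same arithmetic outside of Flink (the topic name is made up). With a single 12-partition topic and a parallelism of 12, the partition id acts as a round-robin offset from a fixed start index, so the 12 partitions land on 12 distinct subtasks and the distribution is uniform:

public class SingleTopicAssignmentDemo {

	// same arithmetic as KafkaTopicPartitionAssigner.assign, re-implemented for illustration
	static int assign(String topic, int partition, int numParallelSubtasks) {
		int startIndex = ((topic.hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
		return (startIndex + partition) % numParallelSubtasks;
	}

	public static void main(String[] args) {
		int parallelism = 12;
		for (int p = 0; p < 12; p++) {
			// each of the 12 partitions maps to a different subtask index
			System.out.println("topic-a / partition " + p + " -> subtask " + assign("topic-a", p, parallelism));
		}
	}
}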

Conclusion

The source code therefore explains the observation: when Flink consumes a single topic with parallelism equal to the number of partitions, the assignment is uniform, one partition per subtask; but when consuming multiple topics with parallelism equal to the total number of partitions, the assignment is generally not uniform, because each topic starts its round-robin at its own hash-derived start index, so the topics' partitions can pile up on some subtasks while leaving others idle.
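The following sketch (topic names are again hypothetical) counts how many of the 24 partitions of two 12-partition topics land on each of 24 subtasks. Each topic occupies a contiguous, wrapping block of 12 subtasks starting at its own start index; unless the two start indices happen to differ by exactly 12, the blocks overlap, the overlapped subtasks get 2 partitions, and an equal number of subtasks get 0 — exactly the idle subtasks seen in the web UI:

import java.util.Arrays;

public class MultiTopicAssignmentDemo {

	// same arithmetic as KafkaTopicPartitionAssigner.assign, re-implemented for illustration
	static int assign(String topic, int partition, int numParallelSubtasks) {
		int startIndex = ((topic.hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
		return (startIndex + partition) % numParallelSubtasks;
	}

	public static void main(String[] args) {
		int parallelism = 24;
		int[] partitionsPerSubtask = new int[parallelism];

		// two hypothetical topics with 12 partitions each
		for (String topic : new String[]{"topic-a", "topic-b"}) {
			for (int p = 0; p < 12; p++) {
				partitionsPerSubtask[assign(topic, p, parallelism)]++;
			}
		}

		// prints how many partitions each subtask ends up with (some 2s, some 0s if the blocks overlap)
		System.out.println(Arrays.toString(partitionsPerSubtask));
	}
}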
