Flink Kafka addSource(consumer) source code study notes

  1. addSource: the function parameter holds the FlinkKafkaConsumer object
public <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName, TypeInformation<OUT> typeInfo) {

	if (function instanceof ResultTypeQueryable) {
		typeInfo = ((ResultTypeQueryable<OUT>) function).getProducedType();
	}
	if (typeInfo == null) {
		try {
			typeInfo = TypeExtractor.createTypeInfo(
					SourceFunction.class,
					function.getClass(), 0, null, null);
		} catch (final InvalidTypesException e) {
			typeInfo = (TypeInformation<OUT>) new MissingTypeInfo(sourceName, e);
		}
	}

	boolean isParallel = function instanceof ParallelSourceFunction;

	clean(function);

	final StreamSource<OUT, ?> sourceOperator = new StreamSource<>(function);
	return new DataStreamSource<>(this, typeInfo, sourceOperator, isParallel, sourceName);
}
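For context, here is a minimal sketch of how a FlinkKafkaConsumer typically ends up as the function argument of addSource. The topic, group id, and bootstrap servers are placeholder values, and the universal FlinkKafkaConsumer class is assumed (older jobs use version-specific classes such as FlinkKafkaConsumer09):

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class KafkaSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "demo-group");              // placeholder

        // The consumer is a SourceFunction, so addSource wraps it in a StreamSource operator.
        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>("demo-topic", new SimpleStringSchema(), props);

        DataStream<String> stream = env.addSource(consumer, "kafka-source");
        stream.print();

        env.execute("kafka source sketch");
    }
}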
  2. Task stage
    When the task starts, it calls performDefaultAction in SourceStreamTask:
protected void performDefaultAction(ActionContext context) throws Exception {
   // Against the usual contract of this method, this implementation is not step-wise but blocking instead for
   // compatibility reasons with the current source interface (source functions run as a loop, not in steps).
   sourceThread.start();

   // We run an alternative mailbox loop that does not involve default actions and synchronizes around actions.
   try {
      runAlternativeMailboxLoop();
   } catch (Exception mailboxEx) {
      // We cancel the source function if some runtime exception escaped the mailbox.
      if (!isCanceled()) {
         cancelTask();
      }
      throw mailboxEx;
   }

   sourceThread.join();
   if (!isFinished) {
      sourceThread.checkThrowSourceExecutionException();
   }

   context.allActionsCompleted();
}

3. StreamSource#run calls userFunction.run(ctx), which enters the Kafka consumer's consumption phase (partial code shown below).
If a user-defined source function is used instead, execution enters that function's run method (see the sketch after the excerpt).

public void run(final Object lockingObject,
			final StreamStatusMaintainer streamStatusMaintainer,
			final Output<StreamRecord<OUT>> collector,
			final OperatorChain<?, ?> operatorChain) throws Exception {
		try {
			userFunction.run(ctx);
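
As referenced above, here is a minimal sketch of a user-defined source whose run method is what userFunction.run(ctx) would invoke; the counter logic is made up purely for illustration:

import org.apache.flink.streaming.api.functions.source.SourceFunction;

// Hypothetical custom source: emits an increasing counter until cancelled.
public class CounterSource implements SourceFunction<Long> {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<Long> ctx) throws Exception {
        long counter = 0L;
        while (running) {
            // emit under the checkpoint lock so records and checkpoints don't interleave
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(counter++);
            }
            Thread.sleep(100);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}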

4. The KafkaConsumerThread thread fetches the data
The KafkaConsumer obtains messages from the Kafka message queue in a pull-based fashion.

public void run() {
   // early exit check
   if (!running) {
      return;
   }

   // this is the means to talk to FlinkKafkaConsumer's main thread
   final Handover handover = this.handover;

   // This method initializes the KafkaConsumer and guarantees it is torn down properly.
   // This is important, because the consumer has multi-threading issues,
   // including concurrent 'close()' calls.
   // create the standalone KafkaConsumer instance for this thread
   try {
      this.consumer = getConsumer(kafkaProperties);
   }
   catch (Throwable t) {
      handover.reportError(t);
      return;
   }

   // from here on, the consumer is guaranteed to be closed properly
   try {
      // register Kafka's very own metrics in Flink's metric reporters
      if (useMetrics) {
         // register Kafka metrics to Flink
         Map<MetricName, ? extends Metric> metrics = consumer.metrics();
         if (metrics == null) {
            // MapR's Kafka implementation returns null here.
            log.info("Consumer implementation does not support metrics");
         } else {
            // we have Kafka metrics, register them
            for (Map.Entry<MetricName, ? extends Metric> metric: metrics.entrySet()) {
               consumerMetricGroup.gauge(metric.getKey().name(), new KafkaMetricWrapper(metric.getValue()));

               // TODO this metric is kept for compatibility purposes; should remove in the future
               subtaskMetricGroup.gauge(metric.getKey().name(), new KafkaMetricWrapper(metric.getValue()));
            }
         }
      }

      // early exit check
      if (!running) {
         return;
      }

      // the latest bulk of records. May carry across the loop if the thread is woken up
      // from blocking on the handover
      ConsumerRecords<byte[], byte[]> records = null;

      // reused variable to hold found unassigned new partitions.
      // found partitions are not carried across loops using this variable;
      // they are carried across via re-adding them to the unassigned partitions queue
      List<KafkaTopicPartitionState<TopicPartition>> newPartitions;

      // main fetch loop
      while (running) {

         // check if there is something to commit
         if (!commitInProgress) {
            // get and reset the work-to-be committed, so we don't repeatedly commit the same
            final Tuple2<Map<TopicPartition, OffsetAndMetadata>, KafkaCommitCallback> commitOffsetsAndCallback =
                  nextOffsetsToCommit.getAndSet(null);

            if (commitOffsetsAndCallback != null) {
               log.debug("Sending async offset commit request to Kafka broker");

               // also record that a commit is already in progress
               // the order here matters! first set the flag, then send the commit command.
               commitInProgress = true;
               consumer.commitAsync(commitOffsetsAndCallback.f0, new CommitCallback(commitOffsetsAndCallback.f1));
            }
         }

         try {
            if (hasAssignedPartitions) {
               newPartitions = unassignedPartitionsQueue.pollBatch();
            }
            else {
               // if no assigned partitions block until we get at least one
               // instead of hot spinning this loop. We rely on a fact that
               // unassignedPartitionsQueue will be closed on a shutdown, so
               // we don't block indefinitely
               newPartitions = unassignedPartitionsQueue.getBatchBlocking();
            }
            if (newPartitions != null) {
               reassignPartitions(newPartitions);
            } 
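
The excerpt above is truncated; after reassigning partitions, the loop goes on to poll records from the consumer and pushes them to the fetcher through the handover. As background for the pull model mentioned in this step, a bare Kafka client poll loop looks roughly like the sketch below. Topic and connection properties are placeholders, and unlike KafkaConsumerThread, which assigns partitions directly, this sketch simply subscribes to a topic:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollLoopSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder
        props.put("group.id", "demo-group");                // placeholder
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic")); // placeholder topic

            while (true) {
                // pull: the client asks the broker for records, blocking up to the timeout
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    System.out.printf("partition=%d offset=%d%n",
                            record.partition(), record.offset());
                }
            }
        }
    }
}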

FlinkKafkaConsumerBase#open initializes the offset commit mode and the partition assignment.
Offset commit mode:
	this.offsetCommitMode = OffsetCommitModes.fromConfiguration(
				getIsAutoCommitEnabled(),
				enableCommitOnCheckpoints,
				((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());

OffsetCommitModes.fromConfiguration returns an OffsetCommitMode, which has three values:

/** Completely disable offset committing. */
	DISABLED,
	/** Commit offsets back to Kafka only when checkpoints are completed. */
	ON_CHECKPOINTS,
	/** Commit offsets periodically back to Kafka, using the auto commit functionality of internal Kafka clients. */
	KAFKA_PERIODIC;

When checkpointing is not enabled, offsets are auto-committed periodically by the Kafka client, as long as auto-commit was enabled in the consumer properties from the start:

properties.put("enable.auto.commit", "true");
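
The mapping from these three flags to a commit mode follows, in essence, the decision logic sketched below (a simplified rendering of OffsetCommitModes.fromConfiguration, not the verbatim source):

// Sketch: checkpointing takes priority; only without checkpointing does
// Kafka's own periodic auto-commit (enable.auto.commit) come into play.
static OffsetCommitMode fromConfiguration(
        boolean enableAutoCommit,
        boolean enableCommitOnCheckpoints,
        boolean enableCheckpointing) {

    if (enableCheckpointing) {
        // commit only when checkpoints complete, or not at all
        return enableCommitOnCheckpoints
                ? OffsetCommitMode.ON_CHECKPOINTS
                : OffsetCommitMode.DISABLED;
    } else {
        // no checkpointing: rely on the Kafka client's periodic auto commit, if enabled
        return enableAutoCommit
                ? OffsetCommitMode.KAFKA_PERIODIC
                : OffsetCommitMode.DISABLED;
    }
}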

5. Partition assignment strategy
Partitions and subtasks
In FlinkKafkaConsumerBase#open, partitions are distributed to subtasks by calling KafkaTopicPartitionAssigner.assign, for example when restoring from checkpointed state:
for (Map.Entry<KafkaTopicPartition, Long> restoredStateEntry : restoredState.entrySet()) {
   if (!restoredFromOldState) {
      // seed the partition discoverer with the union state while filtering out
      // restored partitions that should not be subscribed by this subtask
      if (KafkaTopicPartitionAssigner.assign(
         restoredStateEntry.getKey(), getRuntimeContext().getNumberOfParallelSubtasks())
            == getRuntimeContext().getIndexOfThisSubtask()){
         subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
      }
   } else {
      // when restoring from older 1.1 / 1.2 state, the restored state would not be the union state;
      // in this case, just use the restored state as the subscribed partitions
      subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
   }
}

Partitions are distributed across subtasks using a modulo computation:

public static int assign(KafkaTopicPartition partition, int numParallelSubtasks) {
   int startIndex = ((partition.getTopic().hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;

   // here, the assumption is that the id of Kafka partitions are always ascending
   // starting from 0, and therefore can be used directly as the offset clockwise from the start index
   return (startIndex + partition.getPartition()) % numParallelSubtasks;
}
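
A quick worked example of the modulo distribution above (topic name, partition count, and parallelism are arbitrary):

// Hypothetical: topic "A" with 4 partitions and 3 parallel subtasks.
// startIndex is derived from the topic hash; consecutive partitions then
// land on consecutive subtasks, wrapping around modulo the parallelism.
int numParallelSubtasks = 3;
int startIndex = (("A".hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
for (int partition = 0; partition < 4; partition++) {
    int subtask = (startIndex + partition) % numParallelSubtasks;
    System.out.printf("partition %d -> subtask %d%n", partition, subtask);
}

The topic hash only shifts the starting subtask, so the partitions of one topic still spread evenly across subtasks.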

AbstractPartitionAssignor#assign implements the Kafka consumer's own partition assignment strategy.
The default Kafka strategy is the RangeAssignor, which assigns partitions per topic (RoundRobinAssignor is another built-in option); a custom strategy can be plugged in as well, see the configuration sketch after the excerpt below.

public Map<String, Assignment> assign(Cluster metadata, Map<String, Subscription> subscriptions) {
    Set<String> allSubscribedTopics = new HashSet<>();
    for (Map.Entry<String, Subscription> subscriptionEntry : subscriptions.entrySet())
        allSubscribedTopics.addAll(subscriptionEntry.getValue().topics());

    Map<String, Integer> partitionsPerTopic = new HashMap<>();
    for (String topic : allSubscribedTopics) {
        Integer numPartitions = metadata.partitionCountForTopic(topic);
        if (numPartitions != null && numPartitions > 0)
            partitionsPerTopic.put(topic, numPartitions);
        else
            log.debug("Skipping assignment for topic {} since no metadata is available", topic);
    }
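
The excerpt above stops before the method delegates to the abstract assign(partitionsPerTopic, subscriptions) overload, which is what a custom assignor would implement. In practice, switching strategies usually just means setting partition.assignment.strategy on the consumer properties (a sketch; RoundRobinAssignor is one of the built-in alternatives, and a fully-qualified custom AbstractPartitionAssignor subclass can be named the same way):

Properties props = new Properties();
// replace the default RangeAssignor with the built-in round-robin assignor;
// a custom assignor class name works here as well
props.put("partition.assignment.strategy",
        "org.apache.kafka.clients.consumer.RoundRobinAssignor");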

For example, with a topic A that has two partitions, the AbstractPartitionDiscoverer component produces a collection of two KafkaTopicPartition objects: KafkaTopicPartition(topic:A, partition:0) and KafkaTopicPartition(topic:A, partition:1).

checkpoint
The Flink Kafka Consumer's primary responsibility is to fetch data from Kafka and hand it to downstream processing. Inside the consumer, the AbstractFetcher component does this work; beyond fetching, the fetcher also takes care of committing offsets and maintaining the KafkaTopicPartitionState structures.
Per-partition state is kept in KafkaTopicPartitionState, and ListState is the structure used to persist it.
FlinkKafkaConsumerBase implements CheckpointedFunction, so once checkpointing is enabled on the consumer, the CheckpointedFunction logic takes effect:

public abstract class FlinkKafkaConsumerBase<T> extends RichParallelSourceFunction<T> implements
      CheckpointListener,
      ResultTypeQueryable<T>,
      CheckpointedFunction {

CheckpointedFunction declares the following two methods:

/**
 * This method is called when a snapshot for a checkpoint is requested. This acts as a hook to the function to
 * ensure that all state is exposed by means previously offered through {@link FunctionInitializationContext} when
 * the Function was initialized, or offered now by {@link FunctionSnapshotContext} itself.
 *
 * @param context the context for drawing a snapshot of the operator
 * @throws Exception
 */
void snapshotState(FunctionSnapshotContext context) throws Exception;
/**
 * This method is called when the parallel function instance is created during distributed
 * execution. Functions typically set up their state storing data structures in this method.
 *
 * @param context the context for initializing the operator
 * @throws Exception
 */
void initializeState(FunctionInitializationContext context) throws Exception;
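
To illustrate how these two hooks pair with ListState (the storage structure mentioned above), here is a simplified sketch in the spirit of FlinkKafkaConsumerBase, not its actual code; only the fields and methods are shown, imports are omitted, and currentOffsets() is a hypothetical helper standing in for the fetcher's current position:

// Simplified sketch: persist (partition, offset) pairs in union list state.
private transient ListState<Tuple2<KafkaTopicPartition, Long>> unionOffsetStates;
private final Map<KafkaTopicPartition, Long> restoredState = new HashMap<>();

@Override
public void initializeState(FunctionInitializationContext context) throws Exception {
    OperatorStateStore stateStore = context.getOperatorStateStore();
    unionOffsetStates = stateStore.getUnionListState(
            new ListStateDescriptor<>(
                    "topic-partition-offset-states",
                    TypeInformation.of(new TypeHint<Tuple2<KafkaTopicPartition, Long>>() {})));

    if (context.isRestored()) {
        // rebuild the offset map from the snapshot taken before the failure
        for (Tuple2<KafkaTopicPartition, Long> entry : unionOffsetStates.get()) {
            restoredState.put(entry.f0, entry.f1);
        }
    }
}

@Override
public void snapshotState(FunctionSnapshotContext context) throws Exception {
    unionOffsetStates.clear();
    // currentOffsets() is a placeholder for "the offsets the fetcher is at right now"
    for (Map.Entry<KafkaTopicPartition, Long> entry : currentOffsets().entrySet()) {
        unionOffsetStates.add(Tuple2.of(entry.getKey(), entry.getValue()));
    }
}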

snapshot
The snapshot is driven by the CheckpointingOperation in StreamTask.
AbstractStreamOperator#snapshotState makes calls such as:

snapshotInProgress.setKeyedStateRawFuture(snapshotContext.getKeyedStateStreamFuture());

AbstractUdfStreamOperator#snapshotState calls StreamingFunctionUtils#snapshotFunctionState,
which calls trySnapshotFunctionState, which in turn invokes the user function's snapshotState:

if (userFunction instanceof CheckpointedFunction) {
   ((CheckpointedFunction) userFunction).snapshotState(context);

   return true;

Here snapshotState is FlinkKafkaConsumerBase#snapshotState. Its purpose is to record the current offsets (and queue them for committing) so that the job can recover after a failure:

HashMap<KafkaTopicPartition, Long> currentOffsets = fetcher.snapshotCurrentState();

This captures the current offsets, keyed by KafkaTopicPartition (which simply wraps the topic name and the partition number).

The offsets pending commit are then recorded per checkpoint id:

pendingOffsetsToCommit.put(context.getCheckpointId(), currentOffsets);

pendingOffsetsToCommit holds the offsets of each in-flight checkpoint; once a checkpoint completes, the corresponding offsets are committed back to Kafka, while the snapshotted list state is what gets restored on failure.
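
A rough sketch of how those pending offsets are consumed once a checkpoint completes (simplified from the CheckpointListener side; the real implementation keeps a LinkedMap keyed by checkpoint id and also handles out-of-order notifications):

@Override
public void notifyCheckpointComplete(long checkpointId) throws Exception {
    // look up the offsets that were snapshotted for this checkpoint...
    @SuppressWarnings("unchecked")
    Map<KafkaTopicPartition, Long> offsets =
            (Map<KafkaTopicPartition, Long>) pendingOffsetsToCommit.remove(checkpointId);

    // ...and hand them to the fetcher for an asynchronous commit back to Kafka
    if (offsets != null && !offsets.isEmpty()) {
        fetcher.commitInternalOffsetsToKafka(offsets, offsetCommitCallback);
    }
}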

initializeState
FlinkKafkaConsumerBase#initializeState
During state initialization, the consumer tries to load restorable state from the state backend:

OperatorStateStore stateStore = context.getOperatorStateStore();
ListState<Tuple2<KafkaTopicPartition, Long>> oldRoundRobinListState =
   stateStore.getSerializableListState(DefaultOperatorStateBackend.DEFAULT_OPERATOR_STATE_NAME);

In FlinkKafkaConsumerBase#open, the offset commit mode is resolved:

this.offsetCommitMode = OffsetCommitModes.fromConfiguration(
      getIsAutoCommitEnabled(),
      enableCommitOnCheckpoints,
      ((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());

getIsAutoCommitEnabled() resolves to the version-specific implementation, e.g. FlinkKafkaConsumer09#getIsAutoCommitEnabled():

protected boolean getIsAutoCommitEnabled() {
  return getBoolean(properties, ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true) &&
        PropertiesUtil.getLong(properties, ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 5000) > 0;
}
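
Putting these pieces together, the commit mode a job ends up with depends on how it is configured; a sketch with placeholder addresses and topic, assuming the universal FlinkKafkaConsumer:

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(60_000);  // checkpointing on -> ON_CHECKPOINTS (given the default below)

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");  // placeholder
props.setProperty("group.id", "demo-group");                // placeholder
props.setProperty("enable.auto.commit", "true");            // only relevant when checkpointing is off

FlinkKafkaConsumer<String> consumer =
        new FlinkKafkaConsumer<>("demo-topic", new SimpleStringSchema(), props);
consumer.setCommitOffsetsOnCheckpoints(true);  // the default; setting false yields DISABLED

With checkpointing disabled, the mode instead falls back to KAFKA_PERIODIC as long as enable.auto.commit is true and auto.commit.interval.ms is positive, mirroring getIsAutoCommitEnabled() above.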