- addSource: the function argument here holds the FlinkKafkaConsumer object
public <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName, TypeInformation<OUT> typeInfo) {

    if (function instanceof ResultTypeQueryable) {
        typeInfo = ((ResultTypeQueryable<OUT>) function).getProducedType();
    }
    if (typeInfo == null) {
        try {
            typeInfo = TypeExtractor.createTypeInfo(
                    SourceFunction.class,
                    function.getClass(), 0, null, null);
        } catch (final InvalidTypesException e) {
            typeInfo = (TypeInformation<OUT>) new MissingTypeInfo(sourceName, e);
        }
    }

    boolean isParallel = function instanceof ParallelSourceFunction;

    clean(function);

    final StreamSource<OUT, ?> sourceOperator = new StreamSource<>(function);
    return new DataStreamSource<>(this, typeInfo, sourceOperator, isParallel, sourceName);
}
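For reference, a minimal usage sketch showing how the consumer ends up as the function argument above. The broker address, group id, and topic name are illustrative:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class KafkaSourceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "demo-group");

        // FlinkKafkaConsumer is a SourceFunction, so addSource stores it as 'function'
        DataStream<String> stream = env.addSource(
                new FlinkKafkaConsumer<>("demo-topic", new SimpleStringSchema(), props));

        stream.print();
        env.execute("kafka-source-demo");
    }
}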
- Task phase
When the task starts, it invokes performDefaultAction in SourceStreamTask:
protected void performDefaultAction(ActionContext context) throws Exception {
    // Against the usual contract of this method, this implementation is not step-wise but blocking instead for
    // compatibility reasons with the current source interface (source functions run as a loop, not in steps).
    sourceThread.start();

    // We run an alternative mailbox loop that does not involve default actions and synchronizes around actions.
    try {
        runAlternativeMailboxLoop();
    } catch (Exception mailboxEx) {
        // We cancel the source function if some runtime exception escaped the mailbox.
        if (!isCanceled()) {
            cancelTask();
        }
        throw mailboxEx;
    }

    sourceThread.join();
    if (!isFinished) {
        sourceThread.checkThrowSourceExecutionException();
    }

    context.allActionsCompleted();
}
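The shape of this method is easier to see in isolation. Below is a simplified, self-contained sketch of the same pattern with entirely hypothetical names (not Flink's actual classes): the blocking source loop runs in its own thread while the task thread keeps serving mailbox actions until the source finishes.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class LegacySourcePatternSketch {

    private final BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
    private volatile boolean sourceFinished;

    void run() throws InterruptedException {
        Thread sourceThread = new Thread(() -> {
            // stand-in for userFunction.run(ctx): a blocking loop, not step-wise
            for (int i = 0; i < 1_000; i++) {
                // emit record i ...
            }
            // when the source loop returns, unblock the mailbox loop below
            mailbox.add(() -> sourceFinished = true);
        }, "Legacy Source Thread");
        sourceThread.start();

        // stand-in for runAlternativeMailboxLoop(): keep processing actions
        // (e.g. checkpoint triggers) until the source signals completion
        while (!sourceFinished) {
            mailbox.take().run();
        }
        sourceThread.join();
    }
}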
3. run in StreamSource invokes userFunction.run(ctx), entering the Kafka consumer's consumption phase (partial code below).
If a custom source function was registered instead, this same call enters that function's run method (see the sketch after the snippet).
public void run(final Object lockingObject,
        final StreamStatusMaintainer streamStatusMaintainer,
        final Output<StreamRecord<OUT>> collector,
        final OperatorChain<?, ?> operatorChain) throws Exception {
    // ...
    try {
        userFunction.run(ctx);
        // ...
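As a point of comparison, a minimal custom source function (an illustrative counting source, not from the Flink codebase); userFunction.run(ctx) above would enter this run method:

import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class CountingSource implements SourceFunction<Long> {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<Long> ctx) throws Exception {
        long counter = 0;
        while (running) {
            // emit under the checkpoint lock so snapshots see a consistent view
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(counter++);
            }
            Thread.sleep(100);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}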
4. The KafkaConsumerThread fetches the data.
The KafkaConsumer obtains messages from Kafka by pulling (polling) rather than having the broker push them (a bare poll loop is sketched after the excerpt below):
public void run() {
    // early exit check
    if (!running) {
        return;
    }

    // this is the means to talk to FlinkKafkaConsumer's main thread
    final Handover handover = this.handover;

    // This method initializes the KafkaConsumer and guarantees it is torn down properly.
    // This is important, because the consumer has multi-threading issues,
    // including concurrent 'close()' calls.

    // create this thread's own, independent consumer instance
    try {
        this.consumer = getConsumer(kafkaProperties);
    }
    catch (Throwable t) {
        handover.reportError(t);
        return;
    }

    // from here on, the consumer is guaranteed to be closed properly
    try {
        // register Kafka's very own metrics in Flink's metric reporters
        if (useMetrics) {
            // register Kafka metrics to Flink
            Map<MetricName, ? extends Metric> metrics = consumer.metrics();
            if (metrics == null) {
                // MapR's Kafka implementation returns null here.
                log.info("Consumer implementation does not support metrics");
            } else {
                // we have Kafka metrics, register them
                for (Map.Entry<MetricName, ? extends Metric> metric: metrics.entrySet()) {
                    consumerMetricGroup.gauge(metric.getKey().name(), new KafkaMetricWrapper(metric.getValue()));

                    // TODO this metric is kept for compatibility purposes; should remove in the future
                    subtaskMetricGroup.gauge(metric.getKey().name(), new KafkaMetricWrapper(metric.getValue()));
                }
            }
        }

        // early exit check
        if (!running) {
            return;
        }

        // the latest bulk of records. May carry across the loop if the thread is woken up
        // from blocking on the handover
        ConsumerRecords<byte[], byte[]> records = null;

        // reused variable to hold found unassigned new partitions.
        // found partitions are not carried across loops using this variable;
        // they are carried across via re-adding them to the unassigned partitions queue
        List<KafkaTopicPartitionState<TopicPartition>> newPartitions;

        // main fetch loop
        while (running) {

            // check if there is something to commit
            if (!commitInProgress) {
                // get and reset the work-to-be committed, so we don't repeatedly commit the same
                final Tuple2<Map<TopicPartition, OffsetAndMetadata>, KafkaCommitCallback> commitOffsetsAndCallback =
                        nextOffsetsToCommit.getAndSet(null);

                if (commitOffsetsAndCallback != null) {
                    log.debug("Sending async offset commit request to Kafka broker");

                    // also record that a commit is already in progress
                    // the order here matters! first set the flag, then send the commit command.
                    commitInProgress = true;
                    consumer.commitAsync(commitOffsetsAndCallback.f0, new CommitCallback(commitOffsetsAndCallback.f1));
                }
            }

            try {
                if (hasAssignedPartitions) {
                    newPartitions = unassignedPartitionsQueue.pollBatch();
                }
                else {
                    // if no assigned partitions block until we get at least one
                    // instead of hot spinning this loop. We rely on a fact that
                    // unassignedPartitionsQueue will be closed on a shutdown, so
                    // we don't block indefinitely
                    newPartitions = unassignedPartitionsQueue.getBatchBlocking();
                }
                if (newPartitions != null) {
                    reassignPartitions(newPartitions);
                }
                // ... (rest of the fetch loop omitted)
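For contrast with the Handover-based loop above, here is a bare KafkaConsumer poll loop showing the pull model itself, using the Kafka 2.x client API. The broker address, group id, and topic are illustrative; the real thread hands records to the Handover instead of printing them:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollLoopSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "demo-group");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                // pull: the client asks the broker for records, waiting up to the timeout
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    System.out.printf("partition=%d offset=%d%n", record.partition(), record.offset());
                }
            }
        }
    }
}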
- The open method of FlinkKafkaConsumerBase initializes the offset and partition state.
offset

this.offsetCommitMode = OffsetCommitModes.fromConfiguration(
        getIsAutoCommitEnabled(),
        enableCommitOnCheckpoints,
        ((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());
The result is one of the three values of the OffsetCommitMode enum (resolved by the OffsetCommitModes helper above):

/** Completely disable offset committing. */
DISABLED,

/** Commit offsets back to Kafka only when checkpoints are completed. */
ON_CHECKPOINTS,

/** Commit offsets periodically back to Kafka, using the auto commit functionality of internal Kafka clients. */
KAFKA_PERIODIC;
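The resolution logic of OffsetCommitModes.fromConfiguration is essentially the following (paraphrased from the Flink source):

public static OffsetCommitMode fromConfiguration(
        boolean enableAutoCommit,
        boolean enableCommitOnCheckpoint,
        boolean enableCheckpointing) {

    if (enableCheckpointing) {
        // checkpointing enabled: commit on completed checkpoints, or not at all
        return (enableCommitOnCheckpoint) ? OffsetCommitMode.ON_CHECKPOINTS : OffsetCommitMode.DISABLED;
    } else {
        // no checkpointing: fall back to Kafka's periodic auto commit, if enabled
        return (enableAutoCommit) ? OffsetCommitMode.KAFKA_PERIODIC : OffsetCommitMode.DISABLED;
    }
}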
Without checkpointing, offsets are auto-committed (KAFKA_PERIODIC) when auto commit was enabled in the properties at setup time:

properties.put("enable.auto.commit", "true");
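Note that getIsAutoCommitEnabled() (shown at the end of this section) also requires a positive commit interval, so a fuller illustrative configuration looks like:

Properties properties = new Properties();
properties.put("enable.auto.commit", "true");
// must be > 0, otherwise getIsAutoCommitEnabled() reports auto commit as disabled
properties.put("auto.commit.interval.ms", "5000");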
5. Partition assignment strategy
Partitions and subtasks
In the open method of FlinkKafkaConsumerBase, KafkaTopicPartitionAssigner.assign is called:
for (Map.Entry<KafkaTopicPartition, Long> restoredStateEntry : restoredState.entrySet()) {
    if (!restoredFromOldState) {
        // seed the partition discoverer with the union state while filtering out
        // restored partitions that should not be subscribed by this subtask
        if (KafkaTopicPartitionAssigner.assign(
                restoredStateEntry.getKey(), getRuntimeContext().getNumberOfParallelSubtasks())
                == getRuntimeContext().getIndexOfThisSubtask()){
            subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
        }
    } else {
        // when restoring from older 1.1 / 1.2 state, the restored state would not be the union state;
        // in this case, just use the restored state as the subscribed partitions
        subscribedPartitionsToStartOffsets.put(restoredStateEntry.getKey(), restoredStateEntry.getValue());
    }
}
Partitions are distributed over subtasks by a modulo assignment:
public static int assign(KafkaTopicPartition partition, int numParallelSubtasks) {
    int startIndex = ((partition.getTopic().hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;

    // here, the assumption is that the id of Kafka partitions are always ascending
    // starting from 0, and therefore can be used directly as the offset clockwise from the start index
    return (startIndex + partition.getPartition()) % numParallelSubtasks;
}
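A quick worked example of the formula (topic name and parallelism are made up): consecutive partitions of one topic land on consecutive subtasks, wrapping around from the topic-dependent start index.

int numParallelSubtasks = 2;
int startIndex = (("A".hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
for (int partition = 0; partition < 4; partition++) {
    // e.g. with startIndex == 1: partition 0 -> subtask 1, 1 -> 0, 2 -> 1, 3 -> 0
    System.out.printf("partition %d -> subtask %d%n",
            partition, (startIndex + partition) % numParallelSubtasks);
}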
On the Kafka client side, assign in AbstractPartitionAssignor implements the consumer's partition assignment strategy. Assignment is computed per topic (Kafka's default strategy is the range assignor, RangeAssignor), and the method can be overridden to implement a custom strategy (see the note after the excerpt):
public Map<String, Assignment> assign(Cluster metadata, Map<String, Subscription> subscriptions) {
    Set<String> allSubscribedTopics = new HashSet<>();
    for (Map.Entry<String, Subscription> subscriptionEntry : subscriptions.entrySet())
        allSubscribedTopics.addAll(subscriptionEntry.getValue().topics());

    Map<String, Integer> partitionsPerTopic = new HashMap<>();
    for (String topic : allSubscribedTopics) {
        Integer numPartitions = metadata.partitionCountForTopic(topic);
        if (numPartitions != null && numPartitions > 0)
            partitionsPerTopic.put(topic, numPartitions);
        else
            log.debug("Skipping assignment for topic {} since no metadata is available", topic);
    }
    // ... (then delegates to the strategy-specific assign overload)
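Besides overriding assign in a subclass, the strategy can be switched between Kafka's built-in assignors through consumer configuration, for example (illustrative):

Properties props = new Properties();
props.put("partition.assignment.strategy",
        "org.apache.kafka.clients.consumer.RoundRobinAssignor");

Keep in mind that Flink's Kafka connector assigns partitions to subtasks itself (via the modulo logic above) rather than relying on Kafka's group rebalancing, so this setting matters mainly for plain Kafka consumers.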
For example, for a topic A with two partitions, the AbstractPartitionDiscoverer component produces a collection of two KafkaTopicPartition objects: KafkaTopicPartition(topic:A, partition:0) and KafkaTopicPartition(topic:A, partition:1).
checkpoint
The Flink Kafka consumer's main responsibility is to pull data from Kafka and hand it to downstream operators; inside the consumer, the AbstractFetcher component does this work. The fetcher is also responsible for committing offsets and for maintaining the KafkaTopicPartitionState structures.
Per-partition state is kept in KafkaTopicPartitionState.
The checkpointed snapshot of that state is stored in a ListState.
FlinkKafkaConsumerBase implements CheckpointedFunction, so once checkpointing is enabled for the consumer, the CheckpointedFunction callbacks below are invoked:
public abstract class FlinkKafkaConsumerBase<T> extends RichParallelSourceFunction<T> implements
        CheckpointListener,
        ResultTypeQueryable<T>,
        CheckpointedFunction {
CheckpointedFunction declares the following two methods:
/**
 * This method is called when a snapshot for a checkpoint is requested. This acts as a hook to the function to
 * ensure that all state is exposed by means previously offered through {@link FunctionInitializationContext} when
 * the Function was initialized, or offered now by {@link FunctionSnapshotContext} itself.
 *
 * @param context the context for drawing a snapshot of the operator
 * @throws Exception
 */
void snapshotState(FunctionSnapshotContext context) throws Exception;

/**
 * This method is called when the parallel function instance is created during distributed
 * execution. Functions typically set up their state storing data structures in this method.
 *
 * @param context the context for initializing the operator
 * @throws Exception
 */
void initializeState(FunctionInitializationContext context) throws Exception;
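To make the contract concrete, here is a minimal CheckpointedFunction that mirrors the consumer's offsets-in-list-state pattern. All names are illustrative, not Flink's actual consumer code:

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;

public class OffsetStateSketch implements CheckpointedFunction {

    // in-memory offsets, keyed by an illustrative "topic-partition" string
    private final Map<String, Long> currentOffsets = new HashMap<>();

    private transient ListState<Tuple2<String, Long>> offsetState;

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        // register (or re-obtain) the operator list state backing the offsets
        offsetState = context.getOperatorStateStore().getUnionListState(
                new ListStateDescriptor<>(
                        "offsets",
                        TypeInformation.of(new TypeHint<Tuple2<String, Long>>() {})));

        if (context.isRestored()) {
            for (Tuple2<String, Long> entry : offsetState.get()) {
                currentOffsets.put(entry.f0, entry.f1);
            }
        }
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // rewrite the list state from the current in-memory offsets
        offsetState.clear();
        for (Map.Entry<String, Long> entry : currentOffsets.entrySet()) {
            offsetState.add(Tuple2.of(entry.getKey(), entry.getValue()));
        }
    }
}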
snapshot
The snapshot call chain: CheckpointingOperation in StreamTask drives the snapshot.
snapshotState in AbstractStreamOperator calls
    snapshotInProgress.setKeyedStateRawFuture(snapshotContext.getKeyedStateStreamFuture());
snapshotState in AbstractUdfStreamOperator calls snapshotFunctionState in StreamingFunctionUtils,
snapshotFunctionState calls trySnapshotFunctionState,
and trySnapshotFunctionState finally invokes the user function's snapshotState:

if (userFunction instanceof CheckpointedFunction) {
    ((CheckpointedFunction) userFunction).snapshotState(context);

    return true;
}
That snapshotState is FlinkKafkaConsumerBase#snapshotState. Its job:
record the offsets so that, after a failure, consumption can resume from them
HashMap<KafkaTopicPartition, Long> currentOffsets = fetcher.snapshotCurrentState();
This stores the current offsets; the key is a KafkaTopicPartition.
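For reference, KafkaTopicPartition is essentially a serializable (topic, partition) pair; an abbreviated sketch of the Flink class:

import java.io.Serializable;

// equals/hashCode are defined on both fields, so it can be used as a map key
public final class KafkaTopicPartition implements Serializable {
    private final String topic;
    private final int partition;
    // constructor, getters, equals/hashCode omitted
}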
The offsets pending commit:

    pendingOffsetsToCommit.put(context.getCheckpointId(), currentOffsets);

pendingOffsetsToCommit maps each checkpoint id to the offsets captured for that checkpoint, so the right offsets can later be committed back to Kafka and used to restore state.
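A simplified sketch of that bookkeeping (Flink's actual field is an ordered LinkedMap; the names and types here are illustrative): snapshotState parks the offsets under the checkpoint id, and notifyCheckpointComplete later removes and commits exactly that entry.

import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class PendingCommitsSketch {

    // checkpointId -> offsets captured when that checkpoint's snapshot was taken
    private final TreeMap<Long, Map<String, Long>> pendingOffsetsToCommit = new TreeMap<>();

    // called from snapshotState(...)
    void onSnapshot(long checkpointId, Map<String, Long> currentOffsets) {
        pendingOffsetsToCommit.put(checkpointId, new HashMap<>(currentOffsets));
    }

    // called from notifyCheckpointComplete(...)
    Map<String, Long> onCheckpointComplete(long checkpointId) {
        Map<String, Long> offsets = pendingOffsetsToCommit.remove(checkpointId);
        // older pending checkpoints were subsumed by this one; drop them
        pendingOffsetsToCommit.headMap(checkpointId).clear();
        return offsets; // hand these to commitAsync in the real consumer
    }
}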
initializeState
In initializeState of FlinkKafkaConsumerBase, the state-initialization phase tries to load state usable for recovery from the state backend:
OperatorStateStore stateStore = context.getOperatorStateStore();

ListState<Tuple2<KafkaTopicPartition, Long>> oldRoundRobinListState =
        stateStore.getSerializableListState(DefaultOperatorStateBackend.DEFAULT_OPERATOR_STATE_NAME);
Back in the open method of FlinkKafkaConsumerBase:
this.offsetCommitMode = OffsetCommitModes.fromConfiguration(
        getIsAutoCommitEnabled(),
        enableCommitOnCheckpoints,
        ((StreamingRuntimeContext) getRuntimeContext()).isCheckpointingEnabled());
getIsAutoCommitEnabled() dispatches to the concrete connector version, here getIsAutoCommitEnabled() in FlinkKafkaConsumer09:
protected boolean getIsAutoCommitEnabled() {
    return getBoolean(properties, ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true) &&
            PropertiesUtil.getLong(properties, ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 5000) > 0;
}