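This analysis walks through the WordCount example in the org.apache.flink.streaming.examples.wordcount package and uses the debugger to trace how the StreamGraph is built.
Breakpoints are set at the relevant operations.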

Execution first reaches the flatMap method of DataStream.

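Stepping further into doTransform, the code fetches the current execution environment and adds the current Transformation (named StreamTransformation in older Flink versions) to the List<Transformation<?>> transformations field that StreamExecutionEnvironment keeps internally. This list retains the transformations that produce the DataStream.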

Next comes the keyBy operation, which turns the DataStream into a KeyedStream.

Then the sum aggregation is executed.

As before, the current execution environment is fetched and the corresponding Transformation is added to the List<Transformation<?>> transformations field inside StreamExecutionEnvironment.

Finally, the sink operation does the same: it fetches the current execution environment and adds its Transformation to the List<Transformation<?>> transformations field inside StreamExecutionEnvironment.

The program then reaches the execute method of StreamExecutionEnvironment:
public JobExecutionResult execute(String jobName) throws Exception {
    // All transformations recorded for the DataStream program
    final List<Transformation<?>> originalTransformations = new ArrayList<>(transformations);
    // Build the StreamGraph
    StreamGraph streamGraph = getStreamGraph();
    if (jobName != null) {
        streamGraph.setJobName(jobName);
    }
    try {
        return execute(streamGraph);
    } catch (Throwable t) {
        Optional<ClusterDatasetCorruptedException> clusterDatasetCorruptedException =
                ExceptionUtils.findThrowable(t, ClusterDatasetCorruptedException.class);
        if (!clusterDatasetCorruptedException.isPresent()) {
            throw t;
        }
        // Retry without cache if it is caused by corrupted cluster dataset.
        invalidateCacheTransformations(originalTransformations);
        streamGraph = getStreamGraph(originalTransformations);
        return execute(streamGraph);
    }
}
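Summary:
- The flatMap transformation wraps the user-defined FlatMapFunction in the StreamFlatMap operator
- StreamFlatMap is then wrapped in a OneInputTransformation
- That transformation is finally stored in env
- When env.execute is called, the list of transformations is traversed to build the StreamGraph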
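The wrapping chain summarized above can be modeled with a small sketch in plain Java. It does not use Flink: MiniEnv, MiniStream, and MiniTransformation are hypothetical stand-ins for StreamExecutionEnvironment, DataStream, and Transformation. The sketch also assumes that the source and keyBy are not registered in the environment's list and are reachable only through input links, which is why the later graph generation has to recurse into inputs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Transformation: a name plus a link to its input.
final class MiniTransformation {
    final String name;
    final MiniTransformation input; // null for a source

    MiniTransformation(String name, MiniTransformation input) {
        this.name = name;
        this.input = input;
    }
}

// Hypothetical stand-in for StreamExecutionEnvironment.
final class MiniEnv {
    final List<MiniTransformation> transformations = new ArrayList<>();
}

// Hypothetical stand-in for DataStream.
final class MiniStream {
    final MiniEnv env;
    final MiniTransformation transformation;

    MiniStream(MiniEnv env, MiniTransformation transformation) {
        this.env = env;
        this.transformation = transformation;
    }

    // Plays the role of doTransform: wrap, register with the environment, return a new stream.
    MiniStream transform(String operatorName) {
        MiniTransformation next = new MiniTransformation(operatorName, transformation);
        env.transformations.add(next);
        return new MiniStream(env, next);
    }

    // Plays the role of keyBy. Assumption of this sketch: nothing is registered here.
    MiniStream keyBy() {
        return new MiniStream(env, new MiniTransformation("keyBy", transformation));
    }
}

final class TransformationListSketch {
    static List<String> demo() {
        MiniEnv env = new MiniEnv();
        MiniStream source = new MiniStream(env, new MiniTransformation("source", null));
        source.transform("flatMap").keyBy().transform("sum").transform("sink");

        List<String> names = new ArrayList<>();
        for (MiniTransformation t : env.transformations) {
            names.add(t.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [flatMap, sum, sink]
    }
}
```

Only three entries end up in the list, yet each one still points back to its input, so the whole chain can be recovered from them.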
Stepping further, execution reaches the generate method of StreamGraphGenerator:
public StreamGraph generate() {
    streamGraph = new StreamGraph(executionConfig, checkpointConfig, savepointRestoreSettings);
    shouldExecuteInBatchMode = shouldExecuteInBatchMode();
    configureStreamGraph(streamGraph);
    alreadyTransformed = new IdentityHashMap<>();
    // Iterate over the DataStream's transformations; StreamGraphGenerator#transform recurses into their inputs
    for (Transformation<?> transformation : transformations) {
        transform(transformation);
    }
    streamGraph.setSlotSharingGroupResource(slotSharingGroupResources);
    setFineGrainedGlobalStreamExchangeMode(streamGraph);
    for (StreamNode node : streamGraph.getStreamNodes()) {
        if (node.getInEdges().stream().anyMatch(this::shouldDisableUnalignedCheckpointing)) {
            for (StreamEdge edge : node.getInEdges()) {
                edge.setSupportsUnalignedCheckpoints(false);
            }
        }
    }
    final StreamGraph builtStreamGraph = streamGraph;
    alreadyTransformed.clear();
    alreadyTransformed = null;
    streamGraph = null;
    return builtStreamGraph;
}
The transform method of StreamGraphGenerator:
private Collection<Integer> transform(Transformation<?> transform) {
    if (alreadyTransformed.containsKey(transform)) {
        return alreadyTransformed.get(transform);
    }
    LOG.debug("Transforming " + transform);
    if (transform.getMaxParallelism() <= 0) {
        // if the max parallelism hasn't been set, then first use the job wide max parallelism
        // from the ExecutionConfig.
        int globalMaxParallelismFromConfig = executionConfig.getMaxParallelism();
        if (globalMaxParallelismFromConfig > 0) {
            transform.setMaxParallelism(globalMaxParallelismFromConfig);
        }
    }
    transform
            .getSlotSharingGroup()
            .ifPresent(
                    slotSharingGroup -> {
                        final ResourceSpec resourceSpec =
                                SlotSharingGroupUtils.extractResourceSpec(slotSharingGroup);
                        if (!resourceSpec.equals(ResourceSpec.UNKNOWN)) {
                            slotSharingGroupResources.compute(
                                    slotSharingGroup.getName(),
                                    (name, profile) -> {
                                        if (profile == null) {
                                            return ResourceProfile.fromResourceSpec(
                                                    resourceSpec, MemorySize.ZERO);
                                        } else if (!ResourceProfile.fromResourceSpec(
                                                        resourceSpec, MemorySize.ZERO)
                                                .equals(profile)) {
                                            throw new IllegalArgumentException(
                                                    "The slot sharing group "
                                                            + slotSharingGroup.getName()
                                                            + " has been configured with two different resource spec.");
                                        } else {
                                            return profile;
                                        }
                                    });
                        }
                    });
    // call at least once to trigger exceptions about MissingTypeInfo
    transform.getOutputType();
    @SuppressWarnings("unchecked")
    final TransformationTranslator<?, Transformation<?>> translator =
            (TransformationTranslator<?, Transformation<?>>)
                    translatorMap.get(transform.getClass());
    Collection<Integer> transformedIds;
    if (translator != null) {
        transformedIds = translate(translator, transform);
    } else {
        transformedIds = legacyTransform(transform);
    }
    // need this check because the iterate transformation adds itself before
    // transforming the feedback edges
    if (!alreadyTransformed.containsKey(transform)) {
        alreadyTransformed.put(transform, transformedIds);
    }
    return transformedIds;
}
The translate method of StreamGraphGenerator:
private Collection<Integer> translate(
        final TransformationTranslator<?, Transformation<?>> translator,
        final Transformation<?> transform) {
    checkNotNull(translator);
    checkNotNull(transform);
    // First make sure the upstream nodes have been transformed (the recursion happens here)
    final List<Collection<Integer>> allInputIds = getParentInputIds(transform.getInputs());
    // the recursive call might have already transformed this
    if (alreadyTransformed.containsKey(transform)) {
        return alreadyTransformed.get(transform);
    }
    // Determine the slot sharing group; when the user specifies none, it is derived
    // from the inputs or falls back to "default"
    final String slotSharingGroup =
            determineSlotSharingGroup(
                    transform.getSlotSharingGroup().isPresent()
                            ? transform.getSlotSharingGroup().get().getName()
                            : null,
                    allInputIds.stream()
                            .flatMap(Collection::stream)
                            .collect(Collectors.toList()));
    final TransformationTranslator.Context context =
            new ContextImpl(this, streamGraph, slotSharingGroup, configuration);
    // Pick the translation according to the execution mode (batch or streaming)
    return shouldExecuteInBatchMode
            ? translator.translateForBatch(transform, context)
            : translator.translateForStreaming(transform, context);
}
The getParentInputIds method of StreamGraphGenerator:
private List<Collection<Integer>> getParentInputIds(
        @Nullable final Collection<Transformation<?>> parentTransformations) {
    final List<Collection<Integer>> allInputIds = new ArrayList<>();
    if (parentTransformations == null) {
        return allInputIds;
    }
    for (Transformation<?> transformation : parentTransformations) {
        // Recursive call back into transform
        allInputIds.add(transform(transformation));
    }
    return allInputIds;
}
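The interplay of transform, translate, and getParentInputIds amounts to a memoized post-order traversal: parents are always handled before the node itself, and the identity map prevents handling any node twice. A minimal sketch of that idea in plain Java follows. The Node class, the integer IDs, and the order list are hypothetical simplifications and not Flink types; the starting list mirrors the assumption that only flatMap, sum, and sink are stored in the environment.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class RecursiveTransformSketch {
    // Hypothetical node type: a name and its parent transformations.
    static final class Node {
        final String name;
        final List<Node> inputs;

        Node(String name, Node... inputs) {
            this.name = name;
            this.inputs = List.of(inputs);
        }
    }

    // Identity-based memo, playing the role of alreadyTransformed.
    private final Map<Node, Integer> alreadyTransformed = new IdentityHashMap<>();
    // Records the order in which nodes are actually translated.
    final List<String> order = new ArrayList<>();

    int transform(Node node) {
        Integer known = alreadyTransformed.get(node);
        if (known != null) {
            return known; // already handled, possibly by an earlier recursive call
        }
        for (Node input : node.inputs) {
            transform(input); // parents first, as in getParentInputIds
        }
        int id = order.size();
        order.add(node.name);
        alreadyTransformed.put(node, id);
        return id;
    }

    static List<String> demo() {
        Node source = new Node("source");
        Node flatMap = new Node("flatMap", source);
        Node keyBy = new Node("keyBy", flatMap);
        Node sum = new Node("sum", keyBy);
        Node sink = new Node("sink", sum);

        RecursiveTransformSketch generator = new RecursiveTransformSketch();
        // Only these three are in the starting list; source and keyBy are reached via inputs.
        for (Node n : List.of(flatMap, sum, sink)) {
            generator.transform(n);
        }
        return generator.order;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [source, flatMap, keyBy, sum, sink]
    }
}
```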
The translateForStreaming method of SimpleTransformationTranslator:
@Override
public final Collection<Integer> translateForStreaming(
        final T transformation, final Context context) {
    checkNotNull(transformation);
    checkNotNull(context);
    final Collection<Integer> transformedIds =
            translateForStreamingInternal(transformation, context);
    configure(transformation, context);
    return transformedIds;
}
For one-input operators such as flatMap, the work ends up in the translateInternal method of AbstractOneInputTransformationTranslator. This is where a Transformation is turned into a StreamNode in the StreamGraph and a StreamEdge is added between the upstream and downstream nodes:
protected Collection<Integer> translateInternal(
        final Transformation<OUT> transformation,
        final StreamOperatorFactory<OUT> operatorFactory,
        final TypeInformation<IN> inputType,
        @Nullable final KeySelector<IN, ?> stateKeySelector,
        @Nullable final TypeInformation<?> stateKeyType,
        final Context context) {
    checkNotNull(transformation);
    checkNotNull(operatorFactory);
    checkNotNull(inputType);
    checkNotNull(context);
    final StreamGraph streamGraph = context.getStreamGraph();
    final String slotSharingGroup = context.getSlotSharingGroup();
    final int transformationId = transformation.getId();
    final ExecutionConfig executionConfig = streamGraph.getExecutionConfig();
    // Add the operator to the StreamGraph; this step creates the corresponding StreamNode
    streamGraph.addOperator(
            transformationId,
            slotSharingGroup,
            transformation.getCoLocationGroupKey(),
            operatorFactory,
            inputType,
            transformation.getOutputType(),
            transformation.getName());
    if (stateKeySelector != null) {
        TypeSerializer<?> keySerializer = stateKeyType.createSerializer(executionConfig);
        streamGraph.setOneInputStateKey(transformationId, stateKeySelector, keySerializer);
    }
    int parallelism =
            transformation.getParallelism() != ExecutionConfig.PARALLELISM_DEFAULT
                    ? transformation.getParallelism()
                    : executionConfig.getParallelism();
    streamGraph.setParallelism(
            transformationId, parallelism, transformation.isParallelismConfigured());
    streamGraph.setMaxParallelism(transformationId, transformation.getMaxParallelism());
    final List<Transformation<?>> parentTransformations = transformation.getInputs();
    checkState(
            parentTransformations.size() == 1,
            "Expected exactly one input transformation but found "
                    + parentTransformations.size());
    // Connect to each upstream node in turn, creating a StreamEdge
    for (Integer inputId : context.getStreamNodeIds(parentTransformations.get(0))) {
        streamGraph.addEdge(inputId, transformationId, 0);
    }
    if (transformation instanceof PhysicalTransformation) {
        streamGraph.setSupportsConcurrentExecutionAttempts(
                transformationId,
                ((PhysicalTransformation<OUT>) transformation)
                        .isSupportsConcurrentExecutionAttempts());
    }
    return Collections.singleton(transformationId);
}
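One detail in the method above that is easy to miss is the parallelism fallback: an operator-level setting wins, otherwise the job-wide value from the ExecutionConfig is used. The sketch below isolates that rule in plain Java. The class and method names are hypothetical, and the sentinel value -1 is an assumption meant to mirror ExecutionConfig.PARALLELISM_DEFAULT.

```java
final class ParallelismFallbackSketch {
    // Assumed sentinel for "not set", mirroring ExecutionConfig.PARALLELISM_DEFAULT.
    static final int PARALLELISM_DEFAULT = -1;

    // Returns the operator-level parallelism if set, otherwise the job-wide one.
    static int effectiveParallelism(int transformationParallelism, int jobParallelism) {
        return transformationParallelism != PARALLELISM_DEFAULT
                ? transformationParallelism
                : jobParallelism;
    }

    public static void main(String[] args) {
        System.out.println(effectiveParallelism(PARALLELISM_DEFAULT, 4)); // prints 4
        System.out.println(effectiveParallelism(2, 4)); // prints 2
    }
}
```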
Next, look at the corresponding StreamGraph methods that add nodes and edges:
protected StreamNode addNode(
        Integer vertexID,
        @Nullable String slotSharingGroup,
        @Nullable String coLocationGroup,
        Class<? extends TaskInvokable> vertexClass,
        StreamOperatorFactory<?> operatorFactory,
        String operatorName) {
    if (streamNodes.containsKey(vertexID)) {
        throw new RuntimeException("Duplicate vertexID " + vertexID);
    }
    StreamNode vertex =
            new StreamNode(
                    vertexID,
                    slotSharingGroup,
                    coLocationGroup,
                    operatorFactory,
                    operatorName,
                    vertexClass);
    // Register the new StreamNode, which holds the operator factory and vertexClass
    streamNodes.put(vertexID, vertex);
    return vertex;
}

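A StreamNode stores the operator that came from the Transformation (as a StreamOperatorFactory in the code above), and it also introduces the field jobVertexClass to record the actual task type the node will have when it runs in a TaskManager.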
private final Class<? extends TaskInvokable> jobVertexClass;
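TaskInvokable is the interface for all tasks that can run in a TaskManager, covering both streaming and batch tasks. StreamTask is the base class of all streaming tasks; its concrete subclasses include SourceStreamTask, OneInputStreamTask, and TwoInputStreamTask.
Transformations that involve no physical operation, such as partitioning, side output/select, and union, do not produce a StreamNode. For partitioning and side outputs, a virtual node carrying specific properties is recorded instead. When an edge from a virtual node to a downstream node is added, the physical node upstream of the virtual node is looked up, the edge is added between the two physical nodes, and the properties of the virtual transformation are attached to that edge.
Take PartitionTransformation as an example. It is the transformation behind KeyedStream, and it is handled by the translateInternal method of PartitionTransformationTranslator: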
private Collection<Integer> translateInternal(
        final PartitionTransformation<OUT> transformation,
        final Context context,
        boolean supportsBatchExchange) {
    checkNotNull(transformation);
    checkNotNull(context);
    final StreamGraph streamGraph = context.getStreamGraph();
    final List<Transformation<?>> parentTransformations = transformation.getInputs();
    checkState(
            parentTransformations.size() == 1,
            "Expected exactly one input transformation but found "
                    + parentTransformations.size());
    final Transformation<?> input = parentTransformations.get(0);
    List<Integer> resultIds = new ArrayList<>();
    StreamExchangeMode exchangeMode = transformation.getExchangeMode();
    // StreamExchangeMode#BATCH has no effect in streaming mode so we can safely reset it to
    // UNDEFINED and let Flink decide on the best exchange mode.
    if (!supportsBatchExchange && exchangeMode == StreamExchangeMode.BATCH) {
        exchangeMode = StreamExchangeMode.UNDEFINED;
    }
    for (Integer inputId : context.getStreamNodeIds(input)) {
        final int virtualId = Transformation.getNewNodeId();
        // Add a virtual partition node
        streamGraph.addVirtualPartitionNode(
                inputId, virtualId, transformation.getPartitioner(), exchangeMode);
        resultIds.add(virtualId);
    }
    return resultIds;
}

The addVirtualPartitionNode method of StreamGraph:
public void addVirtualPartitionNode(
        Integer originalId,
        Integer virtualId,
        StreamPartitioner<?> partitioner,
        StreamExchangeMode exchangeMode) {
    if (virtualPartitionNodes.containsKey(virtualId)) {
        throw new IllegalStateException(
                "Already has virtual partition node with id " + virtualId);
    }
    // Register a virtual node; when edges are added later, they are connected to the actual physical node
    virtualPartitionNodes.put(virtualId, new Tuple3<>(originalId, partitioner, exchangeMode));
}
When each physical node is translated, StreamGraph#addEdge is called to connect the input node with the current node:
public void addEdge(Integer upStreamVertexID, Integer downStreamVertexID, int typeNumber) {
    addEdge(upStreamVertexID, downStreamVertexID, typeNumber, null);
}
public void addEdge(
        Integer upStreamVertexID,
        Integer downStreamVertexID,
        int typeNumber,
        IntermediateDataSetID intermediateDataSetId) {
    addEdgeInternal(
            upStreamVertexID,
            downStreamVertexID,
            typeNumber,
            null,
            new ArrayList<String>(),
            null,
            null,
            intermediateDataSetId);
}
private void addEdgeInternal(
        Integer upStreamVertexID,
        Integer downStreamVertexID,
        int typeNumber,
        StreamPartitioner<?> partitioner,
        List<String> outputNames,
        OutputTag outputTag,
        StreamExchangeMode exchangeMode,
        IntermediateDataSetID intermediateDataSetId) {
    // First check whether the edge starts at a virtual node. If so, find the physical node
    // upstream of it, add the edge between the two physical nodes, and attach the extra
    // information (StreamPartitioner, OutputTag, etc.) to the StreamEdge
    if (virtualSideOutputNodes.containsKey(upStreamVertexID)) {
        int virtualId = upStreamVertexID;
        upStreamVertexID = virtualSideOutputNodes.get(virtualId).f0;
        if (outputTag == null) {
            outputTag = virtualSideOutputNodes.get(virtualId).f1;
        }
        addEdgeInternal(
                upStreamVertexID,
                downStreamVertexID,
                typeNumber,
                partitioner,
                null,
                outputTag,
                exchangeMode,
                intermediateDataSetId);
    } else if (virtualPartitionNodes.containsKey(upStreamVertexID)) {
        int virtualId = upStreamVertexID;
        upStreamVertexID = virtualPartitionNodes.get(virtualId).f0;
        if (partitioner == null) {
            partitioner = virtualPartitionNodes.get(virtualId).f1;
        }
        exchangeMode = virtualPartitionNodes.get(virtualId).f2;
        addEdgeInternal(
                upStreamVertexID,
                downStreamVertexID,
                typeNumber,
                partitioner,
                outputNames,
                outputTag,
                exchangeMode,
                intermediateDataSetId);
    } else {
        createActualEdge(
                upStreamVertexID,
                downStreamVertexID,
                typeNumber,
                partitioner,
                outputTag,
                exchangeMode,
                intermediateDataSetId);
    }
}
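The way addEdgeInternal resolves virtual nodes can be reduced to a few lines. The sketch below is plain Java without Flink; it keeps only virtual partition nodes, represents partitioners as strings, and stores edges as text, all of which are simplifications chosen for illustration. The class name and the node IDs in the demo are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class VirtualNodeSketch {
    // virtualId -> id of the node upstream of the virtual node
    private final Map<Integer, Integer> virtualUpstream = new HashMap<>();
    // virtualId -> partitioner carried by the virtual node
    private final Map<Integer, String> virtualPartitioner = new HashMap<>();
    // Edges between physical nodes, rendered as "up->down [partitioner]"
    final List<String> edges = new ArrayList<>();

    void addVirtualPartitionNode(int originalId, int virtualId, String partitioner) {
        if (virtualUpstream.containsKey(virtualId)) {
            throw new IllegalStateException(
                    "Already has virtual partition node with id " + virtualId);
        }
        virtualUpstream.put(virtualId, originalId);
        virtualPartitioner.put(virtualId, partitioner);
    }

    void addEdge(int upstreamId, int downstreamId) {
        addEdgeInternal(upstreamId, downstreamId, null);
    }

    private void addEdgeInternal(int upstreamId, int downstreamId, String partitioner) {
        if (virtualUpstream.containsKey(upstreamId)) {
            // Virtual node: hop to its upstream and carry its partitioner along,
            // unless a partitioner was already picked up closer to the downstream node.
            String resolved =
                    partitioner != null ? partitioner : virtualPartitioner.get(upstreamId);
            addEdgeInternal(virtualUpstream.get(upstreamId), downstreamId, resolved);
        } else {
            String label = partitioner == null ? "unspecified" : partitioner;
            edges.add(upstreamId + "->" + downstreamId + " [" + label + "]");
        }
    }

    static List<String> demo() {
        VirtualNodeSketch graph = new VirtualNodeSketch();
        graph.addEdge(1, 2);                            // source(1) -> flatMap(2)
        graph.addVirtualPartitionNode(2, 3, "HASH");    // keyBy: virtual node 3 after flatMap(2)
        graph.addEdge(3, 4);                            // sum(4) reads from virtual node 3
        return graph.edges;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [1->2 [unspecified], 2->4 [HASH]]
    }
}
```

The virtual node 3 never shows up in the edge list; only its partitioner survives, attached to the edge between the two physical nodes.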
The createActualEdge method of StreamGraph:
private void createActualEdge(
        Integer upStreamVertexID,
        Integer downStreamVertexID,
        int typeNumber,
        StreamPartitioner<?> partitioner,
        OutputTag outputTag,
        StreamExchangeMode exchangeMode,
        IntermediateDataSetID intermediateDataSetId) {
    // The two physical nodes
    StreamNode upstreamNode = getStreamNode(upStreamVertexID);
    StreamNode downstreamNode = getStreamNode(downStreamVertexID);
    // If no partitioner was specified and the parallelism of upstream and downstream
    // operator matches use forward partitioning, use rebalance otherwise.
    if (partitioner == null
            && upstreamNode.getParallelism() == downstreamNode.getParallelism()) {
        partitioner =
                dynamic ? new ForwardForUnspecifiedPartitioner<>() : new ForwardPartitioner<>();
    } else if (partitioner == null) {
        partitioner = new RebalancePartitioner<Object>();
    }
    if (partitioner instanceof ForwardPartitioner) {
        if (upstreamNode.getParallelism() != downstreamNode.getParallelism()) {
            if (partitioner instanceof ForwardForConsecutiveHashPartitioner) {
                partitioner =
                        ((ForwardForConsecutiveHashPartitioner<?>) partitioner)
                                .getHashPartitioner();
            } else {
                throw new UnsupportedOperationException(
                        "Forward partitioning does not allow "
                                + "change of parallelism. Upstream operation: "
                                + upstreamNode
                                + " parallelism: "
                                + upstreamNode.getParallelism()
                                + ", downstream operation: "
                                + downstreamNode
                                + " parallelism: "
                                + downstreamNode.getParallelism()
                                + " You must use another partitioning strategy, such as broadcast, rebalance, shuffle or global.");
            }
        }
    }
    if (exchangeMode == null) {
        exchangeMode = StreamExchangeMode.UNDEFINED;
    }
    /**
     * Just make sure that {@link StreamEdge} connecting same nodes (for example as a result of
     * self unioning a {@link DataStream}) are distinct and unique. Otherwise it would be
     * difficult on the {@link StreamTask} to assign {@link RecordWriter}s to correct {@link
     * StreamEdge}.
     */
    int uniqueId = getStreamEdges(upstreamNode.getId(), downstreamNode.getId()).size();
    // Create the StreamEdge, which keeps properties such as the StreamPartitioner
    StreamEdge edge =
            new StreamEdge(
                    upstreamNode,
                    downstreamNode,
                    typeNumber,
                    partitioner,
                    outputTag,
                    exchangeMode,
                    uniqueId,
                    intermediateDataSetId);
    // Add the StreamEdge to both the upstream and the downstream node
    getStreamNode(edge.getSourceId()).addOutEdge(edge);
    getStreamNode(edge.getTargetId()).addInEdge(edge);
}
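The partitioner selection at the top of createActualEdge boils down to a small decision rule. The following sketch restates it in plain Java with partitioners represented as strings. It deliberately leaves out the dynamic-graph variant and the ForwardForConsecutiveHashPartitioner fallback, and the class and method names are hypothetical.

```java
final class PartitionerChoiceSketch {
    // requested == null means the user did not specify a partitioner.
    static String choose(String requested, int upstreamParallelism, int downstreamParallelism) {
        String partitioner = requested;
        if (partitioner == null) {
            // Equal parallelism allows a one-to-one forward; otherwise records are rebalanced.
            partitioner =
                    upstreamParallelism == downstreamParallelism ? "FORWARD" : "REBALANCE";
        }
        if (partitioner.equals("FORWARD") && upstreamParallelism != downstreamParallelism) {
            // An explicitly requested forward cannot bridge different parallelism.
            throw new UnsupportedOperationException(
                    "Forward partitioning does not allow change of parallelism");
        }
        return partitioner;
    }

    public static void main(String[] args) {
        System.out.println(choose(null, 4, 4));   // prints FORWARD
        System.out.println(choose(null, 4, 2));   // prints REBALANCE
        System.out.println(choose("HASH", 4, 2)); // prints HASH
    }
}
```

Calling choose("FORWARD", 4, 2) throws, which corresponds to the UnsupportedOperationException branch in the method above.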
With StreamNode and StreamEdge in place, all nodes and edges of the DAG and the connections between them are known, and the topology is established.
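This article traced how the StreamGraph is built while a Flink program runs: starting from flatMap, it showed how doTransform, keyBy, sum, and the other operations produce Transformations, and how calling execute turns them into the StreamGraph, a DAG that fixes the roles of and the connections between operators, StreamNodes, and StreamEdges.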