Flink1.12源码解读——StreamGraph执行图构建过程

最新推荐文章于 2024-02-21 09:36:13 发布

按时吃早饭ABC

最新推荐文章于 2024-02-21 09:36:13 发布

阅读量523

点赞数

分类专栏：大数据文章标签：大数据

本文链接：https://blog.csdn.net/ws0owws0ow/article/details/113888353

版权

大数据专栏收录该内容

11 篇文章 1 订阅

订阅专栏

Apache Flink作为国内最火的大数据计算引擎之一，自身支持高吞吐，低延迟，exactly-once语义，有状态流等特性，阅读源码有助加深对框架的理解和认知。

Flink分为四种执行图，本章主要解析StreamGraph的生成计划

本章及后续源码解读环境以生产为主：运行模式：OnYarn，HA模式：ZK，Mode：Streaming，关键逻辑解释我会备注到代码上（灰色字体）勿忽视。

StreamGraph作为Flink最上层的逻辑封装可以理解为用户API的转化的逻辑层，它同JobGraph一样最初是在Client端生成（集群模式）但不一样的是不会像JobGraph那样有链等加工操作，主要是把用户编写的Transformations转换成StreamNode并生成指向上下游的StreamEdge并装载进StreamGraph，过程并不复杂。

以OnYarn为例，当我们在集群环境提交submit脚本入口函数：

主要流程为flink获取用户jar包，反射并调用用户main函数

/**
	 * Submits the job based on the arguments.
	 */
	public static void main(final String[] args) {
		........

		try {
			final CliFrontend cli = new CliFrontend(
				configuration,
				customCommandLines);

			SecurityUtils.install(new SecurityConfiguration(cli.configuration));
			int retCode = SecurityUtils.getInstalledContext()
				//根据参数匹配对应的执行方式
					.runSecured(() -> cli.parseAndRun(args));
			System.exit(retCode);
		}

public static void executeProgram(
			PipelineExecutorServiceLoader executorServiceLoader,
			Configuration configuration,
			PackagedProgram program,
			boolean enforceSingleJobExecution,
			boolean suppressSysout) throws ProgramInvocationException {
		...

			try {//执行主函数
				program.invokeInteractiveModeForExecution();

private static void callMainMethod(Class<?> entryClass, String[] args) throws ProgramInvocationException {
		。。。。

		try {//执行主函数
			mainMethod.invoke(null, (Object) args);

当上述代码执行到mainMethod.inoke后就会进入到用户的主程序中开始运行用户main函数的逻辑

跟Spark一样，Flink也是懒执行，用户逻辑代码会在Flink封装并执行完所有流程图后才开始运行。

以wc为例，各个operator会生成对应的Transformation（比如Map对应的OneInputTransformation）等Flink封装的逻辑实例，但目前并没有真正运行用户代码，而是直到运行到StreamExecutionEnvironment.execute()后，才开始懒执行

Map函数：这里看得出核心逻辑都封装成了OneInputTransformation，包括处理用户Map逻辑的代码StreamTask（这里的processElement主要是output到input的数据交互逻辑，后续章节会展开说），添加到StreamExecutionEnvironment的List<Transformation<?>> transformations中的Transformation会依次生成对应的StreamNode

public <R> SingleOutputStreamOperator<R> map(MapFunction<T, R> mapper, TypeInformation<R> outputType) {
        //生成StreamMap
		return transform("Map", outputType, new StreamMap<>(clean(mapper)));
	}

public class StreamMap<IN, OUT>
		extends AbstractUdfStreamOperator<OUT, MapFunction<IN, OUT>>
		implements OneInputStreamOperator<IN, OUT> {

	private static final long serialVersionUID = 1L;

	public StreamMap(MapFunction<IN, OUT> mapper) {
		super(mapper);
		chainingStrategy = ChainingStrategy.ALWAYS;
	}

	@Override
    //处理逻辑
	public void processElement(StreamRecord<IN> element) throws Exception {
		output.collect(element.replace(userFunction.map(element.getValue())));
	}
}

protected <R> SingleOutputStreamOperator<R> doTransform(
			String operatorName,
			TypeInformation<R> outTypeInfo,
			StreamOperatorFactory<R> operatorFactory) {

		// read the output type of the input Transform to coax out errors about MissingTypeInfo
		transformation.getOutputType();

		//operatorFactory 列：常用的用户自定义的算子（map，filter等）生成SimpleUdfStreamOperatorFactory
		OneInputTransformation<T, R> resultTransform = new OneInputTransformation<>(
				this.transformation,
				operatorName,
				operatorFactory,
				outTypeInfo,
				environment.getParallelism());

		@SuppressWarnings({"unchecked", "rawtypes"})
		SingleOutputStreamOperator<R> returnStream = new SingleOutputStreamOperator(environment, resultTransform);

		getExecutionEnvironment().addOperator(resultTransform);

		return returnStream;
	}

当执行到StreamExecutionEnvironment.execute()后，开始触发Flink的执行程序

public JobExecutionResult execute(String jobName) throws Exception {
		Preconditions.checkNotNull(jobName, "Streaming Job name should not be null.");
		//生成StreamGraph
		return execute(getStreamGraph(jobName));
	}

public StreamGraph generate() {
		//生成StreamGraph实例
		streamGraph = new StreamGraph(executionConfig, checkpointConfig, savepointRestoreSettings);
		//判断执行模式
		shouldExecuteInBatchMode = shouldExecuteInBatchMode(runtimeExecutionMode);
		configureStreamGraph(streamGraph);

		alreadyTransformed = new HashMap<>();

		for (Transformation<?> transformation: transformations) {
			//生成streamNode 和 streamEdge
			transform(transformation);
		}

private Collection<Integer> translateInternal(
			final OneInputTransformation<IN, OUT> transformation,
			final Context context) {
		checkNotNull(transformation);
		checkNotNull(context);

		final StreamGraph streamGraph = context.getStreamGraph();
		final String slotSharingGroup = context.getSlotSharingGroup();
		final int transformationId = transformation.getId();
		final ExecutionConfig executionConfig = streamGraph.getExecutionConfig();

		//生成StreamNode，并添加到StreamGraph的streamNodesMap中
		streamGraph.addOperator(
				transformationId,
				slotSharingGroup,
				transformation.getCoLocationGroupKey(),
				transformation.getOperatorFactory(),
				transformation.getInputType(),
				transformation.getOutputType(),
				transformation.getName());

        .......

        for (Integer inputId: context.getStreamNodeIds(parentTransformations.get(0))) {
			//生成Edge并把该edge添加到自己的上下游streamNode中
			streamGraph.addEdge(inputId, transformationId, 0);
		}

生成StreamNode

public <IN, OUT> void addOperator(
		Integer vertexID,
		@Nullable String slotSharingGroup,
		@Nullable String coLocationGroup,
		StreamOperatorFactory<OUT> operatorFactory,
		TypeInformation<IN> inTypeInfo,
		TypeInformation<OUT> outTypeInfo,
		String operatorName) {
		//后面在生产Task时是通过该Class来反射调用带参构造函数来初始化Task
		//比如Map函数对应的OneInputStreamTask.class
		Class<? extends AbstractInvokable> invokableClass =
			operatorFactory.isStreamSource() ? SourceStreamTask.class : OneInputStreamTask.class;
		addOperator(vertexID, slotSharingGroup, coLocationGroup, operatorFactory, inTypeInfo,
			outTypeInfo, operatorName, invokableClass);
	}

protected StreamNode addNode(
		Integer vertexID,
		@Nullable String slotSharingGroup,
		@Nullable String coLocationGroup,
		Class<? extends AbstractInvokable> vertexClass,
		StreamOperatorFactory<?> operatorFactory,
		String operatorName) {

		if (streamNodes.containsKey(vertexID)) {
			throw new RuntimeException("Duplicate vertexID " + vertexID);
		}

		//生成StreamNode 核心数据：slotSharingGroup,operatorFactory(常用的用户自义定算子SimpleUdfStreamOperatorFactory等，
		// 里面封装了用户的userFunction)
		StreamNode vertex = new StreamNode(
			vertexID,
			slotSharingGroup,
			coLocationGroup,
			operatorFactory,
			operatorName,
			vertexClass);

		streamNodes.put(vertexID, vertex);

生成Edge

private void addEdgeInternal(Integer upStreamVertexID,
								 Integer downStreamVertexID,
								 int typeNumber,
								 StreamPartitioner<?> partitioner,
								 List<String> outputNames,
								 OutputTag outputTag,
								 ShuffleMode shuffleMode) {

		//如果是sideout类型的transformation，使用上游的transformationId继续调用addEdgeInternal
		if (virtualSideOutputNodes.containsKey(upStreamVertexID)) {
			int virtualId = upStreamVertexID;
			upStreamVertexID = virtualSideOutputNodes.get(virtualId).f0;
			//outputTag标识一个sideout流
			if (outputTag == null) {
				outputTag = virtualSideOutputNodes.get(virtualId).f1;
			}
			addEdgeInternal(upStreamVertexID, downStreamVertexID, typeNumber, partitioner, null, outputTag, shuffleMode);
			//partition类型的transformation同上
		} else if (virtualPartitionNodes.containsKey(upStreamVertexID)) {
			int virtualId = upStreamVertexID;
			upStreamVertexID = virtualPartitionNodes.get(virtualId).f0;
			if (partitioner == null) {
				partitioner = virtualPartitionNodes.get(virtualId).f1;
			}
			shuffleMode = virtualPartitionNodes.get(virtualId).f2;
			addEdgeInternal(upStreamVertexID, downStreamVertexID, typeNumber, partitioner, outputNames, outputTag, shuffleMode);
		} else {
			StreamNode upstreamNode = getStreamNode(upStreamVertexID);
			StreamNode downstreamNode = getStreamNode(downStreamVertexID);

			// If no partitioner was specified and the parallelism of upstream and downstream
			// operator matches use forward partitioning, use rebalance otherwise.
			// 分区器由上下游的并行度是否一致决定
			// 这里ForwardPartitioner与RebalancePartitioner等的区别主要体现在selectChannel，
			// 前者直接返会当前channel的index 0 后者为当前Channel个数取随机+1 再对Channel个数取余（另外几个partitioner也实现不同的selectChannel)
			if (partitioner == null && upstreamNode.getParallelism() == downstreamNode.getParallelism()) {
				partitioner = new ForwardPartitioner<Object>();
			} else if (partitioner == null) {
				partitioner = new RebalancePartitioner<Object>();
			}

			if (partitioner instanceof ForwardPartitioner) {
				if (upstreamNode.getParallelism() != downstreamNode.getParallelism()) {
					throw new UnsupportedOperationException("Forward partitioning does not allow " +
						"change of parallelism. Upstream operation: " + upstreamNode + " parallelism: " + upstreamNode.getParallelism() +
						", downstream operation: " + downstreamNode + " parallelism: " + downstreamNode.getParallelism() +
						" You must use another partitioning strategy, such as broadcast, rebalance, shuffle or global.");
				}
			}

			//决定Operator是否可chain（！=batch）以及ResultPartitionType的类型
			//通常transformation的shuffleMode = UNDEFINED（包括partition类型的transformation）
			//此时ResultPartitionType的类型将由GlobalDataExchangeMode决定（非batch模式下=ALL_EDGES_PIPELINED->ResultPartitionType=PIPELINED_BOUNDED）
			if (shuffleMode == null) {
				shuffleMode = ShuffleMode.UNDEFINED;
			}

			//生成StreamEdge 核心属性为上下游节点和分区器及shuffleMode
			StreamEdge edge = new StreamEdge(upstreamNode, downstreamNode, typeNumber,
				partitioner, outputTag, shuffleMode);

			//把该edge添加到自己的上下游streamNode中
			getStreamNode(edge.getSourceId()).addOutEdge(edge);
			getStreamNode(edge.getTargetId()).addInEdge(edge);
		}
	}

至此StreamGraph构建完毕，看得出StreamGraph整体构造过程比较清晰不复杂，仅仅是Flink对用户API以及部分配置的描述并存储在自己封装的数据结构中而已，这些数据结构会在后面构建JobGraph时取出调用，比如遍历出缓存的StreamEdge 符合条件的Operator会Chian在一起合并成JobVertex等等，而JobGraph所存储封装的数据结构又会在构造ExecutionGraph时被取出调用直到构造出物理执行图，从Client端到Cluster端整个过程会开始慢慢变得复杂，中间也会涉及到不同的外部依赖也会生成负责调度，监控，物理执行的组件等等，以上整个过程会放在后面章节解析。

欢迎任何技术上的交流，Thanks..

按时吃早饭ABC

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
Flink1.12源码解读——StreamGraph执行图构建过程

Apache Flink作为国内最火的大数据计算引擎之一，自身支持高吞吐，低延迟，exactly-once语义，有状态流等特性，阅读源码有助加深对框架的理解和认知。
复制链接

扫一扫