Flink Source Code Analysis: flink-streaming-java and the JobGraph

Matty_Blog

Category: Flink

Article link: https://blog.csdn.net/a1240466196/article/details/105921784

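This article digs into the Flink source code, focusing on how a StreamGraph is converted into a JobGraph. By tracing the call chain, it explains the execution flow of StreamExecutionEnvironment.execute() and touches on key concepts such as JobVertex, JobEdge, IntermediateDataSet, and StreamConfig. The bundled WordCount example is used to illustrate the conversion.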

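This article focuses on the flink-streaming-java module of the Flink source code and walks through how a StreamGraph is converted into a JobGraph.

Both the StreamGraph and the JobGraph are generated on the client side, so their construction can be observed by setting breakpoints and debugging in an IDE.
The StreamGraph is only the logical execution plan of a Flink job. Flink transforms it further into another execution plan, the JobGraph.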

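1. Call chain

A program written with the DataStream API ends by calling StreamExecutionEnvironment.execute(). That method first calls getStreamGraph() to build the StreamGraph and then converts the StreamGraph into a JobGraph. The call chain is as follows:

First, the executeAsync() method of StreamExecutionEnvironment is called. It obtains a PipelineExecutorFactory and a PipelineExecutor based on the Configuration.

Figure 1: Sequence diagram for obtaining the PipelineExecutorFactory and PipelineExecutor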

@Public
public class StreamExecutionEnvironment {

	/**
	 * Loads the PipelineExecutorFactory that matches the execution.target setting, obtains the
	 * factory's PipelineExecutor, and calls its execute() method.
	 * execute() mainly converts the StreamGraph into a JobGraph and creates the corresponding
	 * ClusterClient to submit the job.
	 */
	@Internal
	public JobClient executeAsync(StreamGraph streamGraph) throws Exception {
		checkNotNull(streamGraph, "StreamGraph cannot be null.");
		checkNotNull(configuration.get(DeploymentOptions.TARGET), "No execution.target specified in your configuration file.");

		// SPI mechanism:
		// load the PipelineExecutorFactory according to "execution.target" in the Flink Configuration.
		// The PipelineExecutorFactory implementations live in the flink-clients or flink-yarn module,
		// so the corresponding dependency has to be added to pom.xml.
		final PipelineExecutorFactory executorFactory =
			executorServiceLoader.getExecutorFactory(configuration);

		// The loaded PipelineExecutorFactory must not be null
		checkNotNull(
			executorFactory,
			"Cannot find compatible factory for specified execution.target (=%s)",
			configuration.get(DeploymentOptions.TARGET));

		// Obtain the PipelineExecutor from the loaded factory and call its execute() method,
		// which converts the StreamGraph into a JobGraph
		CompletableFuture<JobClient> jobClientFuture = executorFactory
			.getExecutor(configuration)
			.execute(streamGraph, configuration);

		// Result of the asynchronous call
		// ...
	}
}

PipelineExecutorFactory implementations are loaded through the SPI ServiceLoader. Here is the service file under META-INF/services in the flink-clients module:

Figure 2: The META-INF file of the flink-clients module
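
To make the lookup concrete, here is a minimal, Flink-free sketch of the same idea: iterate over the candidates that a java.util.ServiceLoader would discover and keep the one whose compatibility check matches execution.target. Every type in it (SketchExecutorFactory, FixedTargetFactory) is hypothetical and only stands in for the real interfaces. Rejecting multiple matches and returning null when nothing matches are assumptions of this sketch, chosen to fit the checkNotNull call in executeAsync().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

final class ExecutorFactoryDiscoverySketch {

    /** Hypothetical stand-in for PipelineExecutorFactory; only the compatibility check is modelled. */
    interface SketchExecutorFactory {
        String name();

        boolean isCompatibleWith(Map<String, String> configuration);
    }

    /** Hypothetical factory that matches one fixed execution.target value, ignoring case. */
    static final class FixedTargetFactory implements SketchExecutorFactory {
        private final String target;

        FixedTargetFactory(String target) {
            this.target = target;
        }

        @Override
        public String name() {
            return target;
        }

        @Override
        public boolean isCompatibleWith(Map<String, String> configuration) {
            // equalsIgnoreCase(null) is false, so a missing key simply does not match
            return target.equalsIgnoreCase(configuration.get("execution.target"));
        }
    }

    /** Returns the single compatible factory, or null if none matches (assumed behaviour). */
    static SketchExecutorFactory select(
            Iterable<? extends SketchExecutorFactory> candidates,
            Map<String, String> configuration) {
        List<SketchExecutorFactory> compatible = new ArrayList<>();
        for (SketchExecutorFactory factory : candidates) {
            if (factory.isCompatibleWith(configuration)) {
                compatible.add(factory);
            }
        }
        if (compatible.size() > 1) {
            throw new IllegalStateException("Multiple compatible factories found: " + compatible.size());
        }
        return compatible.isEmpty() ? null : compatible.get(0);
    }

    public static void main(String[] args) {
        Map<String, String> conf = Collections.singletonMap("execution.target", "LOCAL");

        // A ServiceLoader is an Iterable over the providers registered under META-INF/services.
        // No provider file is registered for this sketch, so nothing is discovered here.
        SketchExecutorFactory viaSpi = select(ServiceLoader.load(SketchExecutorFactory.class), conf);
        System.out.println("discovered via ServiceLoader: " + viaSpi);

        // The same selection logic with explicitly supplied candidates.
        SketchExecutorFactory chosen = select(
                Arrays.asList(new FixedTargetFactory("local"), new FixedTargetFactory("remote")), conf);
        System.out.println("chosen factory: " + chosen.name());
    }
}
```

Because select() accepts any Iterable, the discovery step (ServiceLoader) and the matching step (isCompatibleWith) stay separate, which is also what makes the matching logic easy to exercise without any service file on the classpath.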

Each PipelineExecutorFactory implementation corresponds to one Flink deployment mode, such as local, standalone, YARN, or Kubernetes:

Figure 3: PipelineExecutorFactory subclasses

Here we only look at the LocalExecutorFactory implementation:

@Internal
public class LocalExecutorFactory implements PipelineExecutorFactory {

	/**
	 * Compatible when the execution.target option is set to "local"
	 */
	@Override
	public boolean isCompatibleWith(final Configuration configuration) {
		return LocalExecutor.NAME.equalsIgnoreCase(configuration.get(DeploymentOptions.TARGET));
	}

	/**
	 * Simply creates a new LocalExecutor and returns it
	 */
	@Override
	public PipelineExecutor getExecutor(final Configuration configuration) {
		return new LocalExecutor();
	}
}

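PipelineExecutor implementations correspond one-to-one with the PipelineExecutorFactory factory classes. Each executor converts the StreamGraph into a JobGraph and creates a ClusterClient to submit the job:

Figure 4: PipelineExecutor subclasses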
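
The two responsibilities of an executor (translate on the client, then submit) can be sketched roughly as follows. This is a toy model rather than Flink code: SketchPipeline, SketchJobGraph, SketchJobClient, and SketchExecutor are hypothetical names, and the "cluster" is simply an already-completed future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class ExecutorFlowSketch {

    /** Hypothetical stand-in for a logical plan such as a StreamGraph. */
    static final class SketchPipeline {
        final String name;

        SketchPipeline(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for the translated plan (the JobGraph). */
    static final class SketchJobGraph {
        final String jobName;

        SketchJobGraph(String jobName) {
            this.jobName = jobName;
        }
    }

    /** Hypothetical handle returned to the caller once the job has been submitted. */
    static final class SketchJobClient {
        final String jobName;

        SketchJobClient(String jobName) {
            this.jobName = jobName;
        }
    }

    /** Hypothetical executor: step 1 translates the plan, step 2 hands it to a submitter. */
    static final class SketchExecutor {
        private final Function<SketchPipeline, SketchJobGraph> translator;
        private final Function<SketchJobGraph, CompletableFuture<SketchJobClient>> submitter;

        SketchExecutor(
                Function<SketchPipeline, SketchJobGraph> translator,
                Function<SketchJobGraph, CompletableFuture<SketchJobClient>> submitter) {
            this.translator = translator;
            this.submitter = submitter;
        }

        CompletableFuture<SketchJobClient> execute(SketchPipeline pipeline) {
            SketchJobGraph jobGraph = translator.apply(pipeline); // happens on the client side
            return submitter.apply(jobGraph);                      // asynchronous submission
        }
    }

    /** Builds an executor whose submission step completes immediately. */
    static SketchExecutor localLikeExecutor() {
        return new SketchExecutor(
                pipeline -> new SketchJobGraph(pipeline.name),
                jobGraph -> CompletableFuture.completedFuture(new SketchJobClient(jobGraph.jobName)));
    }

    public static void main(String[] args) {
        SketchJobClient client = localLikeExecutor().execute(new SketchPipeline("WordCount")).join();
        System.out.println("submitted job: " + client.jobName);
    }
}
```

Keeping translation and submission as separate functions mirrors why one executor per deployment mode is enough: the translation step is shared, and only the submission step differs between targets.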

Next, the getJobGraph() method of LocalExecutor is called. It obtains a StreamGraphTranslator via reflection and calls its translateToJobGraph() method.

Figure 5: Sequence diagram of LocalExecutor's getJobGraph() method

@Internal
public class LocalExecutor implements PipelineExecutor {

	// ...
	private JobGraph getJobGraph(Pipeline pipeline, Configuration configuration) {
		// ...

		// Delegate to FlinkPipelineTranslationUtil.getJobGraph()
		return FlinkPipelineTranslationUtil.getJobGraph(pipeline, configuration, 1);
	}
}

FlinkPipelineTranslationUtil obtains a FlinkPipelineTranslator, namely StreamGraphTranslator, via reflection:

public class FlinkPipelineTranslationUtil {

	public static JobGraph getJobGraph(
			Pipeline pipeline,
			Configuration optimizerConfiguration,
			int defaultParallelism) {

		// Obtain the FlinkPipelineTranslator via reflection
		FlinkPipelineTranslator pipelineTranslator = getPipelineTranslator(pipeline);

		return pipelineTranslator.translateToJobGraph(pipeline,
				optimizerConfiguration,
				defaultParallelism);
	}

	private static FlinkPipelineTranslator getPipelineTranslator(Pipeline pipeline) {
		PlanTranslator planToJobGraphTransmogrifier = new PlanTranslator();

		if (planToJobGraphTransmogrifier.canTranslate(pipeline)) {
			return planToJobGraphTransmogrifier;
		}

		FlinkPipelineTranslator streamGraphTranslator = reflectStreamGraphTranslator();

		// This simply checks whether the given Pipeline instance is a StreamGraph
		if (!streamGraphTranslator.canTranslate(pipeline)) {
			throw new RuntimeException("Translator " + streamGraphTranslator + " cannot translate "
					+ "the given pipeline " + pipeline + ".");
		}
		return streamGraphTranslator;
	}

	private static FlinkPipelineTranslator reflectStreamGraphTranslator() {

		Class<?> streamGraphTranslatorClass;
		try {
			streamGraphTranslatorClass = Class.forName(
					// This class lives in the flink-streaming-java module, while FlinkPipelineTranslationUtil
					// lives in flink-clients. flink-clients does not depend on flink-streaming-java,
					// so the class can only be obtained via reflection.
					"org.apache.flink.streaming.api.graph.StreamGraphTranslator",
					true,
					FlinkPipelineTranslationUtil.class.getClassLoader());
		} catch (ClassNotFoundException e) {
			throw new RuntimeException("Could not load StreamGraphTranslator.", e);
		}

		FlinkPipelineTranslator streamGraphTranslator;
		try {
			streamGraphTranslator =
					(FlinkPipelineTranslator) streamGraphTranslatorClass.newInstance();
		} catch (InstantiationException | IllegalAccessException e) {
			throw new RuntimeException("Could not instantiate StreamGraphTranslator.", e);
		}
		return streamGraphTranslator;
	}
}
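
The reflective lookup in reflectStreamGraphTranslator() is a general way to use a class without a compile-time dependency on the module that contains it. A minimal standalone sketch of that technique is shown below. It loads java.util.ArrayList by name purely for illustration, the helper name instantiate is hypothetical, and it uses getDeclaredConstructor().newInstance() where the excerpt above calls Class.newInstance() (deprecated since Java 9).

```java
import java.util.List;

final class ReflectiveLoadingSketch {

    /**
     * Loads the named class, creates an instance through its no-arg constructor,
     * and checks that the instance has the expected type.
     */
    static <T> T instantiate(String className, Class<T> expectedType) {
        try {
            Class<?> clazz = Class.forName(
                    className,
                    true, // initialize the class on load
                    ReflectiveLoadingSketch.class.getClassLoader());
            return expectedType.cast(clazz.getDeclaredConstructor().newInstance());
        } catch (ClassNotFoundException e) {
            // The class is not on the classpath, e.g. the module was not added as a dependency
            throw new RuntimeException("Could not load " + className + ".", e);
        } catch (ReflectiveOperationException e) {
            // No accessible no-arg constructor, abstract class, or the constructor threw
            throw new RuntimeException("Could not instantiate " + className + ".", e);
        }
    }

    public static void main(String[] args) {
        // Only the List interface is referenced at compile time; the implementation is named by a string.
        List<?> list = instantiate("java.util.ArrayList", List.class);
        System.out.println("loaded implementation: " + list.getClass().getName());
    }
}
```

The trade-off is that a missing dependency only surfaces at runtime, as a failure inside Class.forName, instead of at compile time. That matches the earlier note that the module containing the implementation must be on the classpath.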

Finally, the translateToJobGraph() method of StreamGraphTranslator is called, which eventually reaches the getJobGraph() method of the StreamGraph class itself.

Figure 6: Sequence diagram of StreamGraphTranslator's translateToJobGraph() method

public class StreamGraphTranslator implements FlinkPipelineTranslator {

	/**
	 * Simply calls StreamGraph's own getJobGraph() method to produce the JobGraph
	 */
	@Override
	public JobGraph translateToJobGraph(
			Pipeline pipeline,
			Configuration optimizerConfiguration,
			int defaultParallelism) {
		checkArgument(pipeline instanceof StreamGraph,
				"Given pipeline is not a DataStream StreamGraph.");

		StreamGraph streamGraph = (StreamGraph) pipeline;
		return streamGraph.getJobGraph