In "Flink 1.10 Job Submission Flow Analysis (Part 1)" we traced the process from `flink run` up to the point where the job is handed off to the cluster. Flink uses a different PipelineExecutor for each submission mode; this article analyzes the flow of submitting a job to a YARN cluster in yarn-per-job mode. (Note: the analysis is based on Flink 1.10.1.)
YarnJobClusterExecutor
Continuing from the previous article: the job is ultimately submitted by calling a PipelineExecutor's execute method. Which executor is chosen depends on the submission mode, i.e. the execution.target configuration option; for yarn-per-job, a YarnJobClusterExecutor is selected.
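The selection works by matching the configured execution.target against the name each executor factory registers. The following is a minimal, self-contained sketch of that lookup pattern, not Flink's actual code: the interfaces and classes below are simplified stand-ins for Flink's PipelineExecutorFactory SPI, and the method signatures are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class ExecutorSelection {

    interface PipelineExecutor { }

    // Simplified stand-in for Flink's PipelineExecutorFactory SPI:
    // each factory advertises a name and says whether it matches a target.
    interface PipelineExecutorFactory {
        String getName();
        boolean isCompatibleWith(String executionTarget);
        PipelineExecutor getExecutor();
    }

    static class YarnJobClusterExecutor implements PipelineExecutor { }
    static class LocalExecutor implements PipelineExecutor { }

    static class YarnJobClusterExecutorFactory implements PipelineExecutorFactory {
        public String getName() { return "yarn-per-job"; }
        public boolean isCompatibleWith(String target) { return getName().equals(target); }
        public PipelineExecutor getExecutor() { return new YarnJobClusterExecutor(); }
    }

    static class LocalExecutorFactory implements PipelineExecutorFactory {
        public String getName() { return "local"; }
        public boolean isCompatibleWith(String target) { return getName().equals(target); }
        public PipelineExecutor getExecutor() { return new LocalExecutor(); }
    }

    // Pick the first factory whose name matches execution.target.
    static PipelineExecutor select(String executionTarget) {
        List<PipelineExecutorFactory> factories = Arrays.asList(
                new LocalExecutorFactory(), new YarnJobClusterExecutorFactory());
        return factories.stream()
                .filter(f -> f.isCompatibleWith(executionTarget))
                .findFirst()
                .map(PipelineExecutorFactory::getExecutor)
                .orElseThrow(() ->
                        new IllegalArgumentException("No executor for " + executionTarget));
    }

    public static void main(String[] args) {
        // With execution.target = yarn-per-job, the YARN per-job executor is chosen.
        System.out.println(select("yarn-per-job").getClass().getSimpleName());
    }
}
```

In Flink itself the factories are discovered via Java's ServiceLoader mechanism rather than a hard-coded list, but the matching idea is the same.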
public class YarnJobClusterExecutor extends AbstractJobClusterExecutor<ApplicationId, YarnClusterClientFactory> {

    public static final String NAME = "yarn-per-job";

    public YarnJobClusterExecutor() {
        super(new YarnClusterClientFactory());
    }
}
Its implementation is quite simple; the important part is the YarnClusterClientFactory passed to its constructor, which is used to create a YarnClusterDescriptor holding everything needed for a YARN submission: the YarnClient, the YARN configuration, the target queue, and so on. YarnJobClusterExecutor extends the abstract AbstractJobClusterExecutor, which also provides the execute implementation:
public class AbstractJobClusterExecutor<ClusterID, ClientFactory extends ClusterClientFactory<ClusterID>> implements PipelineExecutor {

    private static final Logger LOG = LoggerFactory.getLogger(AbstractJobClusterExecutor.class);

    // here this is the YarnClusterClientFactory
    private final ClientFactory clusterClientFactory;

    public AbstractJobClusterExecutor(@Nonnull final ClientFactory clusterClientFactory) {
        this.clusterClientFactory = checkNotNull(clusterClientFactory);
    }

    // performs the job submission
    // pipeline represents the StreamGraph
    public CompletableFuture<JobClient> execute(@Nonnull final Pipeline pipeline, @Nonnull final Configuration configuration) throws Exception {
        // convert the StreamGraph into a JobGraph
        final JobGraph jobGraph = ExecutorUtils.getJobGraph(pipeline, configuration);
        // create the submission-side information: YarnClusterDescriptor
        try (final ClusterDescriptor<ClusterID> clusterDescriptor = clusterClientFactory.createClusterDescriptor(configuration)) {
            // wrap the configuration in an ExecutionConfigAccessor
            final ExecutionConfigAccessor configAccessor = ExecutionC