Flink Source Code Analysis 3: Formation and Physical Execution of the ExecutionGraph

Article link: https://blog.csdn.net/m0_37139189/article/details/102898298

After Flink builds the JobGraph on the client side, the JobGraph is submitted to the JobMaster, and the ExecutionGraph is built there.

The JobMaster constructor contains this statement:

this.executionGraph = this.createAndRestoreExecutionGraph(this.jobManagerJobMetricGroup);

Following this call all the way down leads to ExecutionGraphBuilder#buildGraph.

This method contains one particularly important statement:

  // Generate the execution graph from the list of JobVertex instances
        executionGraph.attachJobGraph(sortedTopology);

Most of the logic for generating the ExecutionGraph from the JobGraph lives in this method, ExecutionGraph#attachJobGraph:

for (JobVertex jobVertex : topologiallySorted) {

   if (jobVertex.isInputVertex() && !jobVertex.isStoppable()) {
      this.isStoppable = false;
   }

   // create the execution job vertex and attach it to the graph
   ExecutionJobVertex ejv = new ExecutionJobVertex(
      this,
      jobVertex,
      1,
      rpcTimeout,
      globalModVersion,
      createTimestamp);

   ejv.connectToPredecessors(this.intermediateResults);

   ExecutionJobVertex previousTask = this.tasks.putIfAbsent(jobVertex.getID(), ejv);
   if (previousTask != null) {
      throw new JobException(String.format("Encountered two job vertices with ID %s : previous=[%s] / new=[%s]",
            jobVertex.getID(), ejv, previousTask));
   }

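The logic of the code above is as follows.

The loop iterates over every JobVertex and creates one ExecutionJobVertex for each of them, rejecting any vertex whose ID is already registered in the tasks map. The key work happens in the ExecutionJobVertex constructor.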
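The registration pattern in that loop can be tried out in isolation. The following is a minimal sketch, not Flink code: AttachSketch, SketchJobVertex and SketchExecutionJobVertex are hypothetical stand-ins, and IllegalStateException is used in place of Flink's JobException. It only illustrates how Map#putIfAbsent detects a duplicate vertex ID.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; none of these are Flink classes.
class AttachSketch {

    /** Simplified stand-in for JobVertex: only an ID and a name. */
    static final class SketchJobVertex {
        final String id;
        final String name;

        SketchJobVertex(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Simplified stand-in for ExecutionJobVertex: remembers its source vertex. */
    static final class SketchExecutionJobVertex {
        final SketchJobVertex jobVertex;

        SketchExecutionJobVertex(SketchJobVertex jobVertex) {
            this.jobVertex = jobVertex;
        }
    }

    // Plays the role of the graph's ID-to-vertex map.
    private final Map<String, SketchExecutionJobVertex> tasks = new LinkedHashMap<>();

    /** Creates one execution-level vertex per job vertex and rejects duplicate IDs. */
    void attach(List<SketchJobVertex> sortedVertices) {
        for (SketchJobVertex jobVertex : sortedVertices) {
            SketchExecutionJobVertex ejv = new SketchExecutionJobVertex(jobVertex);
            // putIfAbsent returns the existing mapping, or null if the key was free.
            SketchExecutionJobVertex previous = tasks.putIfAbsent(jobVertex.id, ejv);
            if (previous != null) {
                throw new IllegalStateException(
                        "Encountered two job vertices with ID " + jobVertex.id);
            }
        }
    }

    int size() {
        return tasks.size();
    }

    public static void main(String[] args) {
        AttachSketch graph = new AttachSketch();
        graph.attach(List.of(
                new SketchJobVertex("v1", "source"),
                new SketchJobVertex("v2", "map")));
        System.out.println("attached vertices: " + graph.size());
    }
}
```

Because putIfAbsent never overwrites an existing entry, the vertex registered first stays in the map when a duplicate is rejected.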

The important code fragments are:

this.producedDataSets = new IntermediateResult[jobVertex.getNumberOfProducedIntermediateDataSets()];
for (int i = 0; i < jobVertex.getProducedDataSets().size(); i++) {
   final IntermediateDataSet result = jobVertex.getProducedDataSets().get(i);

   this.producedDataSets[i] = new IntermediateResult(
         result.getId(),
         this,
         numTaskVertices,
         result.getResultType());
}

This first allocates the producedDataSets array, then fills it with one IntermediateResult for each IntermediateDataSet in the JobVertex's produced data sets.
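To make the shape of that array concrete, here is a small sketch under simplifying assumptions. SketchDataSet and SketchResult are hypothetical stand-ins for IntermediateDataSet and IntermediateResult, and the numPartitions field only records the numTaskVertices argument that the excerpt passes to each IntermediateResult.

```java
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are Flink classes.
class ProducedResultsSketch {

    /** Stand-in for IntermediateDataSet: one logical output of a job vertex. */
    static final class SketchDataSet {
        final String id;

        SketchDataSet(String id) {
            this.id = id;
        }
    }

    /** Stand-in for IntermediateResult: keeps the data set ID and the producer count. */
    static final class SketchResult {
        final String id;
        final int numPartitions;

        SketchResult(String id, int numPartitions) {
            this.id = id;
            this.numPartitions = numPartitions;
        }
    }

    /** Builds one result per produced data set, each sized by the parallelism. */
    static SketchResult[] buildProducedResults(List<SketchDataSet> dataSets, int numTaskVertices) {
        SketchResult[] produced = new SketchResult[dataSets.size()];
        for (int i = 0; i < dataSets.size(); i++) {
            produced[i] = new SketchResult(dataSets.get(i).id, numTaskVertices);
        }
        return produced;
    }

    public static void main(String[] args) {
        SketchResult[] results = buildProducedResults(
                List.of(new SketchDataSet("ds-a"), new SketchDataSet("ds-b")), 4);
        for (SketchResult r : results) {
            System.out.println(r.id + " with " + r.numPartitions + " partitions");
        }
    }
}
```

The array length follows the number of produced data sets, while the parallelism only affects each result internally.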

for (int i = 0; i < numTaskVertices; i++) {
   ExecutionVertex vertex = new ExecutionVertex(
         this,
         i,
         producedDataSets,
         timeout,
         initialGlobalModVersion,
         createTimestamp,
         maxPriorAttemptsHistoryLength);

   this.taskVertices[i] = vertex;
}

This loop creates the ExecutionVertex instances according to the parallelism: each parallel subtask gets its own ExecutionVertex, which is stored in the taskVertices array.
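The expansion from one job-level vertex into per-subtask vertices can be sketched as follows. This is an illustrative model only: SubtaskSketch and SketchExecutionVertex are hypothetical names, and the label format in describe() is an assumption chosen for readability, not taken from the source above.

```java
// Hypothetical stand-ins for illustration; none of these are Flink classes.
class SubtaskSketch {

    /** Stand-in for ExecutionVertex: one parallel subtask of a job-level vertex. */
    static final class SketchExecutionVertex {
        final String vertexName;
        final int subTaskIndex;
        final int parallelism;

        SketchExecutionVertex(String vertexName, int subTaskIndex, int parallelism) {
            this.vertexName = vertexName;
            this.subTaskIndex = subTaskIndex;
            this.parallelism = parallelism;
        }

        /** Human-readable label; the format is an assumption made for this sketch. */
        String describe() {
            return vertexName + " (" + (subTaskIndex + 1) + "/" + parallelism + ")";
        }
    }

    /** Creates one vertex per subtask index, from 0 to parallelism - 1. */
    static SketchExecutionVertex[] expand(String vertexName, int parallelism) {
        SketchExecutionVertex[] taskVertices = new SketchExecutionVertex[parallelism];
        for (int i = 0; i < parallelism; i++) {
            taskVertices[i] = new SketchExecutionVertex(vertexName, i, parallelism);
        }
        return taskVertices;
    }

    public static void main(String[] args) {
        for (SketchExecutionVertex v : expand("map", 3)) {
            System.out.println(v.describe());
        }
    }
}
```

With a parallelism of 3, the array holds three vertices whose subtask indices are 0, 1 and 2.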

Once the ExecutionJobVertex has been created, execution proceeds to:

ejv.connectToPredecessors(this.intermediateResults);

Following this call leads to ExecutionVertex#connectSource:

public void connectSource(int inputNumber, IntermediateResult source, JobEdge edge, int consumerNumber) {

   final DistributionPattern pattern = edge.getDistributionPattern();
   final IntermediateResultPartition[] sourcePartitions = source.getPartitions();

   ExecutionEdge[] edges;

   switch (pattern) {
      case POINTWISE:
         edges =