Spark source code: task submission flow (4) - launching executors in containers

1.Overview

In the previous article of this series, spark源码-任务提交流程之ApplicationMaster, we saw that the ApplicationMaster (AM) does three things:

1. Creates and starts the driver thread, which runs the main method of the user class;

2. Registers the AM with the ResourceManager (RM);

3. Requests resources from the RM, selects usable containers according to the matching rules, and launches executors in the allocated containers.

This article analyzes how executors are launched inside those containers.

2.Entry point

This code is invoked while the AM is requesting resources from the RM.

The AM asks the RM for resources, the RM returns a list of containers, and the allocator calls this code to filter the containers and launch executors. See spark源码-任务提交流程之ApplicationMaster (the section on the AM requesting resources from the RM) for details.

private[yarn] class YarnAllocator(
    driverUrl: String,
    driverRef: RpcEndpointRef,
    conf: YarnConfiguration,
    sparkConf: SparkConf,
    amClient: AMRMClient[ContainerRequest],
    appAttemptId: ApplicationAttemptId,
    securityMgr: SecurityManager,
    localResources: Map[String, LocalResource],
    resolver: SparkRackResolver,
    clock: Clock = new SystemClock)
  extends Logging {
  
  def handleAllocatedContainers(allocatedContainers: Seq[Container]): Unit = {
    // list of usable containers
    val containersToUse = new ArrayBuffer[Container](allocatedContainers.size)

    // ... unrelated code omitted (containers are matched to requests by host, rack, then any host)

    // launch executors in the allocated containers
    runAllocatedContainers(containersToUse)

    // ... unrelated code omitted
  }
}
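
For context, handleAllocatedContainers is called from YarnAllocator.allocateResources, which wraps the AMRMClient heartbeat. A simplified sketch of that call site (abridged from memory; blacklist handling, logging, and completed-container processing are omitted, and details vary slightly across Spark versions):

// Simplified sketch of the call site inside YarnAllocator
def allocateResources(): Unit = synchronized {
  updateResourceRequests()                        // sync outstanding container requests with the RM
  val allocateResponse = amClient.allocate(0.1f)  // heartbeat to the RM; returns newly allocated containers
  val allocatedContainers = allocateResponse.getAllocatedContainers()
  if (allocatedContainers.size > 0) {
    // filter the containers and launch executors on the usable ones
    handleAllocatedContainers(allocatedContainers.asScala)
  }
}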

3.runAllocatedContainers

This method launches executors in the allocated containers.

It iterates over the usable containers and, for each one, submits a task to a launcher thread pool; that task starts the executor (a sketch of the launcher pool follows the code below).

private[yarn] class YarnAllocator(
    driverUrl: String,
    driverRef: RpcEndpointRef,
    conf: YarnConfiguration,
    sparkConf: SparkConf,
    amClient: AMRMClient[ContainerRequest],
    appAttemptId: ApplicationAttemptId,
    securityMgr: SecurityManager,
    localResources: Map[String, LocalResource],
    resolver: SparkRackResolver,
    clock: Clock = new SystemClock)
  extends Logging {
  
  private def runAllocatedContainers(containersToUse: ArrayBuffer[Container]): Unit = {
    // iterate over the usable containers
    for (container <- containersToUse) {
      executorIdCounter += 1
      val executorHostname = container.getNodeId.getHost
      val containerId = container.getId
      val executorId = executorIdCounter.toString
      assert(container.getResource.getMemory >= resource.getMemory)
      logInfo(s"Launching container $containerId on host $executorHostname " +
        s"for executor with ID $executorId")

      def updateInternalState(): Unit = synchronized {
        runningExecutors.add(executorId)
        numExecutorsStarting.decrementAndGet()
        executorIdToContainer(executorId) = container
        containerIdToExecutorId(container.getId) = executorId

        val containerSet = allocatedHostToContainersMap.getOrElseUpdate(executorHostname,
          new HashSet[ContainerId])
        containerSet += containerId
        allocatedContainerToHostMap.put(containerId, executorHostname)
      }

      // only launch a new executor if the number of running executors is below the target
      if (runningExecutors.size() < targetNumExecutors) {
        // track the number of executors currently being started
        numExecutorsStarting.incrementAndGet()
        if (launchContainers) {
          // submit a task to the launcher thread pool; its run method launches the executor
          launcherPool.execute(new Runnable {
            override def run(): Unit = {
              try {
                // build an ExecutorRunnable and call its run method, which starts the executor in the container
                new ExecutorRunnable(
                  Some(container),
                  conf,
                  sparkConf,
                  driverUrl,
                  executorId,
                  executorHostname,
                  executorMemory,
                  executorCores,
                  appAttemptId.getApplicationId.toString,
                  securityMgr,
                  localResources
                ).run()
                // update internal bookkeeping state
                updateInternalState()
              } catch {
                case e: Throwable =>
                  numExecutorsStarting.decrementAndGet()
                  if (NonFatal(e)) {
                    logError(s"Failed to launch executor $executorId on container $containerId", e)
                    // Assigned container should be released immediately
                    // to avoid unnecessary resource occupation.
                    amClient.releaseAssignedContainer(containerId)
                  } else {
                    throw e
                  }
              }
            }
          })
        } else {
          // For test only
          updateInternalState()
        }
      } else {
        logInfo(("Skip launching executorRunnable as running executors count: %d " +
          "reached target executors count: %d.").format(
          runningExecutors.size, targetNumExecutors))
      }
    }
  }
}
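
The launcherPool used above is a daemon cached thread pool defined in YarnAllocator. A sketch of roughly how it is created (the config key spark.yarn.containerLauncherMaxThreads and its default of 25 are quoted from memory; the real code reads it through a typed config entry rather than a raw string key):

// Sketch of the launcher pool inside YarnAllocator
private val launcherPool = ThreadUtils.newDaemonCachedThreadPool(
  "ContainerLauncher", sparkConf.getInt("spark.yarn.containerLauncherMaxThreads", 25))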

4.ExecutorRunnable.run

private[yarn] class ExecutorRunnable(
    container: Option[Container],
    conf: YarnConfiguration,
    sparkConf: SparkConf,
    masterAddress: String,
    executorId: String,
    hostname: String,
    executorMemory: Int,
    executorCores: Int,
    appId: String,
    securityMgr: SecurityManager,
    localResources: Map[String, LocalResource]) extends Logging {

  var rpc: YarnRPC = YarnRPC.create(conf)
  var nmClient: NMClient = _

  def run(): Unit = {
    logDebug("Starting Executor Container")
    // create, initialize, and start the NodeManager client (NMClient) in one go
    nmClient = NMClient.createNMClient()
    nmClient.init(conf)
    nmClient.start()
    // start the container
    startContainer()
  }
}
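
NMClient follows the standard Hadoop service lifecycle (init, start, stop). A minimal standalone sketch of that lifecycle, independent of Spark, for readers unfamiliar with the YARN client API:

import org.apache.hadoop.yarn.client.api.NMClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

// Minimal sketch of the NMClient lifecycle (illustration only, not Spark code)
object NMClientLifecycleSketch {
  def main(args: Array[String]): Unit = {
    val yarnConf = new YarnConfiguration()
    val nmClient = NMClient.createNMClient()
    nmClient.init(yarnConf)   // load configuration
    nmClient.start()          // the client can now talk to NodeManagers
    // nmClient.startContainer(container, launchContext) would start a container on its NodeManager
    nmClient.stop()           // release client resources
  }
}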

4.1.startContainer: start the container

This method does three main things:

1. Prepares the environment for launching the executor in the container;

2. Assembles the command that launches the executor in the container;

3. Asks the NodeManager (NM) to start the container, which launches the executor.

private[yarn] class ExecutorRunnable(
    container: Option[Container],
    conf: YarnConfiguration,
    sparkConf: SparkConf,
    masterAddress: String,
    executorId: String,
    hostname: String,
    executorMemory: Int,
    executorCores: Int,
    appId: String,
    securityMgr: SecurityManager,
    localResources: Map[String, LocalResource]) extends Logging {

  def startContainer(): java.util.Map[String, ByteBuffer] = {
    // container launch context
    val ctx = Records.newRecord(classOf[ContainerLaunchContext])
      .asInstanceOf[ContainerLaunchContext]
    // prepare the environment
    val env = prepareEnvironment().asJava

    ctx.setLocalResources(localResources.asJava)
    ctx.setEnvironment(env)

    val credentials = UserGroupInformation.getCurrentUser().getCredentials()
    val dob = new DataOutputBuffer()
    credentials.writeTokenStorageToStream(dob)
    ctx.setTokens(ByteBuffer.wrap(dob.getData()))

    // assemble the launch command
    val commands = prepareCommand()

    ctx.setCommands(commands.asJava)
    ctx.setApplicationACLs(
      YarnSparkHadoopUtil.getApplicationAclsForYarn(securityMgr).asJava)

    // If external shuffle service is enabled, register with the Yarn shuffle service already
    // started on the NodeManager and, if authentication is enabled, provide it with our secret
    // key for fetching shuffle files later
    if (sparkConf.get(SHUFFLE_SERVICE_ENABLED)) {
      val secretString = securityMgr.getSecretKey()
      val secretBytes =
        if (secretString != null) {
          // This conversion must match how the YarnShuffleService decodes our secret
          JavaUtils.stringToBytes(secretString)
        } else {
          // Authentication is not enabled, so just provide dummy metadata
          ByteBuffer.allocate(0)
        }
      ctx.setServiceData(Collections.singletonMap("spark_shuffle", secretBytes))
    }

    // Send the start request to the ContainerManager
    try {
      // ask the NodeManager to start the container
      nmClient.startContainer(container.get, ctx)
    } catch {
      case ex: Exception =>
        throw new SparkException(s"Exception while starting container ${container.get.getId}" +
          s" on host $hostname", ex)
    }
  }
}
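
The spark_shuffle service data above only matters when the external shuffle service is enabled. A hedged sketch of the setup that makes this branch active (the yarn-site.xml keys and the YarnShuffleService class name follow the standard Spark-on-YARN documentation):

import org.apache.spark.SparkConf

object ShuffleServiceSetupSketch {
  // 1) Every NodeManager registers the auxiliary service in yarn-site.xml:
  //      yarn.nodemanager.aux-services                     = spark_shuffle
  //      yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
  // 2) The application enables the external shuffle service, which is what makes
  //    startContainer() call ctx.setServiceData("spark_shuffle", secretBytes):
  val sparkConf = new SparkConf().set("spark.shuffle.service.enabled", "true")
}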

4.1.1.prepareEnvironment

This method collects, into a HashMap, the environment variables whose names start with SPARK, the executor environment variables from sparkConf (spark.executorEnv.*), and the executor log URL variables.

private[yarn] class ExecutorRunnable(
    container: Option[Container],
    conf: YarnConfiguration,
    sparkConf: SparkConf,
    masterAddress: String,
    executorId: String,
    hostname: String,
    executorMemory: Int,
    executorCores: Int,
    appId: String,
    securityMgr: SecurityManager,
    localResources: Map[String, LocalResource]) extends Logging {

  private def prepareEnvironment(): HashMap[String, String] = {
    val env = new HashMap[String, String]()
    Client.populateClasspath(null, conf, sparkConf, env, sparkConf.get(EXECUTOR_CLASS_PATH))

    // determine the HTTP scheme from the YARN HTTP policy
    val yarnHttpPolicy = conf.get(
      YarnConfiguration.YARN_HTTP_POLICY_KEY,
      YarnConfiguration.YARN_HTTP_POLICY_DEFAULT
    )
    val httpScheme = if (yarnHttpPolicy == "HTTPS_ONLY") "https://" else "http://"

    // copy environment variables whose names start with SPARK into the env map
    System.getenv().asScala.filterKeys(_.startsWith("SPARK"))
      .foreach { case (k, v) => env(k) = v }

    // copy the executor environment variables (spark.executorEnv.*) from sparkConf into the env map
    sparkConf.getExecutorEnv.foreach { case (key, value) =>
      if (key == Environment.CLASSPATH.name()) {
        // If the key of env variable is CLASSPATH, we assume it is a path and append it.
        // This is kept for backward compatibility and consistency with hadoop
        YarnSparkHadoopUtil.addPathToEnvironment(env, key, value)
      } else {
        // For other env variables, simply overwrite the value.
        env(key) = value
      }
    }

    // set the executor log URLs (stdout/stderr on the NodeManager web UI) into the env map
    container.foreach { c =>
      sys.env.get("SPARK_USER").foreach { user =>
        val containerId = ConverterUtils.toString(c.getId)
        val address = c.getNodeHttpAddress
        val baseUrl = s"$httpScheme$address/node/containerlogs/$containerId/$user"

        env("SPARK_LOG_URL_STDERR") = s"$baseUrl/stderr?start=-4096"
        env("SPARK_LOG_URL_STDOUT") = s"$baseUrl/stdout?start=-4096"
      }
    }

    env
  }
}
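
sparkConf.getExecutorEnv above returns every spark.executorEnv.* entry, so executor environment variables can be injected from the application configuration. A small sketch (MY_ENV_VAR is a hypothetical name used only for illustration):

import org.apache.spark.SparkConf

// Sketch: how executor environment variables reach prepareEnvironment()
object ExecutorEnvSketch {
  val sparkConf = new SparkConf()
    .set("spark.executorEnv.MY_ENV_VAR", "some-value")    // surfaces via sparkConf.getExecutorEnv
    .set("spark.executorEnv.CLASSPATH", "/extra/jars/*")  // CLASSPATH entries are appended, not overwritten
  // prepareEnvironment() copies these into the ContainerLaunchContext environment,
  // so the executor JVM sees MY_ENV_VAR=some-value.
}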

4.1.2.prepareCommand: assemble the launch command

This method assembles the command that the container uses to launch the executor.

The command covers the executor memory setting, extra Java options and library paths, the container temp directory, the log directory, the backend class, and the driver and executor details.

As the arguments show, the backend that executes tasks after the executor starts is org.apache.spark.executor.CoarseGrainedExecutorBackend, launched as a separate process via $JAVA_HOME/bin/java (an example of the rendered command is shown after the code below).

private[yarn] class ExecutorRunnable(
    container: Option[Container],
    conf: YarnConfiguration,
    sparkConf: SparkConf,
    masterAddress: String,
    executorId: String,
    hostname: String,
    executorMemory: Int,
    executorCores: Int,
    appId: String,
    securityMgr: SecurityManager,
    localResources: Map[String, LocalResource]) extends Logging {

  private def prepareCommand(): List[String] = {
    // buffer of JVM options
    val javaOpts = ListBuffer[String]()

    // set the executor heap size (-Xmx)
    val executorMemoryString = executorMemory + "m"
    javaOpts += "-Xmx" + executorMemoryString

    // resolve spark.executor.extraJavaOptions, substitute the app/executor IDs, and add the options
    sparkConf.get(EXECUTOR_JAVA_OPTIONS).foreach { opts =>
      val subsOpt = Utils.substituteAppNExecIds(opts, appId, executorId)
      javaOpts ++= Utils.splitCommandString(subsOpt).map(YarnSparkHadoopUtil.escapeForShell)
    }

    // build the command prefix from spark.executor.extraLibraryPath, if set
    val prefixEnv = sparkConf.get(EXECUTOR_LIBRARY_PATH).map { libPath =>
      Client.createLibraryPathPrefix(libPath, sparkConf)
    }

    // set the container temp directory
    javaOpts += "-Djava.io.tmpdir=" +
      new Path(Environment.PWD.$$(), YarnConfiguration.DEFAULT_CONTAINER_TEMP_DIR)

    // Certain configs need to be passed here because they are needed before the Executor
    // registers with the Scheduler and transfers the spark configs. Since the Executor backend
    // uses RPC to connect to the scheduler, the RPC settings are needed as well as the
    // authentication settings.
    sparkConf.getAll
      .filter { case (k, v) => SparkConf.isExecutorStartupConf(k) }
      .foreach { case (k, v) => javaOpts += YarnSparkHadoopUtil.escapeForShell(s"-D$k=$v") }

    

    // pass the container log directory to the executor
    javaOpts += ("-Dspark.yarn.app.container.log.dir=" + ApplicationConstants.LOG_DIR_EXPANSION_VAR)

    // build the user class path entries
    val userClassPath = Client.getUserClasspath(sparkConf).flatMap { uri =>
      val absPath =
        if (new File(uri.getPath()).isAbsolute()) {
          Client.getClusterPath(sparkConf, uri.getPath())
        } else {
          Client.buildPath(Environment.PWD.$(), uri.getPath())
        }
      Seq("--user-class-path", "file:" + absPath)
    }.toSeq

    YarnSparkHadoopUtil.addOutOfMemoryErrorArgument(javaOpts)
      
    // assemble the final command
    val commands = prefixEnv ++
      Seq(Environment.JAVA_HOME.$$() + "/bin/java", "-server") ++
      javaOpts ++
      Seq("org.apache.spark.executor.CoarseGrainedExecutorBackend",
        "--driver-url", masterAddress,
        "--executor-id", executorId,
        "--hostname", hostname,
        "--cores", executorCores.toString,
        "--app-id", appId) ++
      userClassPath ++
      Seq(
        s"1>${ApplicationConstants.LOG_DIR_EXPANSION_VAR}/stdout",
        s"2>${ApplicationConstants.LOG_DIR_EXPANSION_VAR}/stderr")

    // TODO: it would be nicer to just make sure there are no null commands here
    commands.map(s => if (s == null) "null" else s).toList
  }
  
}
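
The assembled command handed to the NodeManager looks roughly like the following ({{VAR}} and <LOG_DIR> are placeholders that YARN expands at launch time; the host, port, memory size, IDs, and jar name are made-up values, and the exact set of -D options depends on the application configuration):

{{JAVA_HOME}}/bin/java -server -Xmx4096m \
  -Djava.io.tmpdir={{PWD}}/tmp \
  -Dspark.driver.port=43303 \
  -Dspark.yarn.app.container.log.dir=<LOG_DIR> \
  org.apache.spark.executor.CoarseGrainedExecutorBackend \
  --driver-url spark://CoarseGrainedScheduler@192.168.1.10:43303 \
  --executor-id 1 \
  --hostname node-01 \
  --cores 2 \
  --app-id application_1600000000000_0001 \
  --user-class-path file:$PWD/__app__.jar \
  1><LOG_DIR>/stdout 2><LOG_DIR>/stderr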

4.1.3.nmClient.startContainer: start the container

The actual container start goes through Hadoop's NMClientImpl, which talks to the NodeManager over the ContainerManagementProtocol:

public class NMClientImpl extends NMClient {
    protected ConcurrentMap<ContainerId, NMClientImpl.StartedContainer> startedContainers = new ConcurrentHashMap();
    
    public Map<String, ByteBuffer> startContainer(Container container, ContainerLaunchContext containerLaunchContext) throws YarnException, IOException {
        // record the container that is about to be started
        NMClientImpl.StartedContainer startingContainer = new NMClientImpl.StartedContainer(container.getId(), container.getNodeId());
        
        synchronized(startingContainer) {
            // track this container in the startedContainers map
            this.addStartingContainer(startingContainer);
            ContainerManagementProtocolProxyData proxy = null;

            Map allServiceResponse;
            try {
                // get a ContainerManagementProtocol proxy for the target NodeManager; the proxy performs the actual start
                proxy = this.cmProxy.getProxy(container.getNodeId().toString(), container.getId());
                // wrap the launch context and container token into a StartContainersRequest
                StartContainerRequest scRequest = StartContainerRequest.newInstance(containerLaunchContext, container.getContainerToken());
                List<StartContainerRequest> list = new ArrayList();
                list.add(scRequest);
                StartContainersRequest allRequests = StartContainersRequest.newInstance(list);
                // ask the NodeManager to start the container, which runs the executor launch command
                StartContainersResponse response = proxy.getContainerManagementProtocol().startContainers(allRequests);
                if (response.getFailedRequests() != null && response.getFailedRequests().containsKey(container.getId())) {
                    Throwable t = ((SerializedException)response.getFailedRequests().get(container.getId())).deSerialize();
                    this.parseAndThrowException(t);
                }

                allServiceResponse = response.getAllServicesMetaData();
                startingContainer.state = ContainerState.RUNNING;
            } catch (YarnException var19) {
                startingContainer.state = ContainerState.COMPLETE;
                this.startedContainers.remove(startingContainer.containerId);
                throw var19;
            } catch (IOException var20) {
                startingContainer.state = ContainerState.COMPLETE;
                this.startedContainers.remove(startingContainer.containerId);
                throw var20;
            } catch (Throwable var21) {
                startingContainer.state = ContainerState.COMPLETE;
                this.startedContainers.remove(startingContainer.containerId);
                throw RPCUtil.getRemoteException(var21);
            } finally {
                if (proxy != null) {
                    this.cmProxy.mayBeCloseProxy(proxy);
                }

            }

            return allServiceResponse;
        }
    }
}
4.1.3.1.The StartedContainer class

This is a static nested class of NMClientImpl;

It holds the identity and state of a container started through the client.

protected static class StartedContainer {
        private ContainerId containerId;
        private NodeId nodeId;
        private ContainerState state;

        public StartedContainer(ContainerId containerId, NodeId nodeId) {
            this.containerId = containerId;
            this.nodeId = nodeId;
            this.state = ContainerState.NEW;
        }

        public ContainerId getContainerId() {
            return this.containerId;
        }

        public NodeId getNodeId() {
            return this.nodeId;
        }
    }

5.Summary

The AM requests resources from the RM; once the RM returns them, the AM's allocator (YarnAllocator) launches executors on the allocated containers.

During executor launch:

The environment is prepared by collecting the environment variables whose names start with SPARK, the spark.executorEnv.* settings from sparkConf, and the log URL variables into a HashMap;

The launch command is built from the executor memory setting, extra Java options and library paths, the container temp directory, the log directory, and the driver and executor details; it starts an org.apache.spark.executor.CoarseGrainedExecutorBackend process via $JAVA_HOME/bin/java, and that process later executes the tasks;

Finally, the NodeManager starts the container and thereby the executor.

6.References

spark源码-任务提交流程之ApplicationMaster

Spark源码——Spark on YARN Executor执行Task的过程
