Spark内核源码流程之Yarn cluster提交运行流程

最新推荐文章于 2024-05-17 17:27:38 发布

liguanghai12

最新推荐文章于 2024-05-17 17:27:38 发布

阅读量375

点赞数 1

本文链接：https://blog.csdn.net/weixin_44563670/article/details/113373028

版权

大数据同时被 3 个专栏收录

28 篇文章 1 订阅

订阅专栏

Spark

18 篇文章 0 订阅

订阅专栏

scala

16 篇文章 1 订阅

订阅专栏

当sparksubmit提交模式为Yarn Cluster模式时的启动流程
在这里插入图片描述
1.通过submit 在本地机器上启动submit 提交job JVM进程
2.submit中会判断是否是cluster环境，是，通过反射执行YarnClusterApplication 的main方法
2.YarnClusterApplication 会启动yarn ResourceManager client，连接yarn RM
3.连接成功后，会发送指令，在指定的NM上启动ApplicationMaster JVM 进程，AM启动成功后，使用一个线程执行命令行中传入的–class类main方法，也就是Driver线程
4.此时，Driver会阻塞，ApplicatonMaster会像RM申请资源，RM向AM返回可用资源，AM拿到资源后把这些资源逐一放入一个线程中，线程会启动命令，来连接对应的NM，发送java/bin 来启动一个Excetor jvm进程
5.进程首先会启动 CoarseGrainedExecutorBackend，通信环境后台，向ApplicatonMaster注册并收到注册成功的消息后，再创建Excetor计算对象
6.此后，Driver中的RDD计算逻辑按照Spark的任务划分，切分任务
7.Driver把任务发送给注册的Excetor中，并监控执行

部分源码：
当使用Spark 提交job时命令：

bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--deploymode cluster \   表示yarn的集群模式
./examples/jars/spark-examples_2.12-2.4.5.jar \

此时会在类中 SparkSubmit中调用

 def doSubmit(args: Array[String]): Unit = {
    // Initialize logging if it hasn't been done yet. Keep track of whether logging needs to
    // be reset before the application starts.
    val uninitLog = initializeLogIfNecessary(true, silent = true)

    val appArgs = parseArguments(args)
    if (appArgs.verbose) {
      logInfo(appArgs.toString)
    }
    appArgs.action match {
      case SparkSubmitAction.SUBMIT => submit(appArgs, uninitLog)
      case SparkSubmitAction.KILL => kill(appArgs)
      case SparkSubmitAction.REQUEST_STATUS => requestStatus(appArgs)
      case SparkSubmitAction.PRINT_VERSION => printVersion()
    }
  }

Action动作是submit

@tailrec
  private def submit(args: SparkSubmitArguments, uninitLog: Boolean): Unit = {

    def doRunMain(): Unit = {
      if (args.proxyUser != null) {
        val proxyUser = UserGroupInformation.createProxyUser(args.proxyUser,
          UserGroupInformation.getCurrentUser())
        try {
          proxyUser.doAs(new PrivilegedExceptionAction[Unit]() {
            override def run(): Unit = {
              runMain(args, uninitLog)
            }
          })
        } catch {
          case e: Exception =>
            // Hadoop's AuthorizationException suppresses the exception's stack trace, which
            // makes the message printed to the output by the JVM not very helpful. Instead,
            // detect exceptions with empty stack traces here, and treat them differently.
            if (e.getStackTrace().length == 0) {
              error(s"ERROR: ${e.getClass().getName()}: ${e.getMessage()}")
            } else {
              throw e
            }
        }
      } else {
        runMain(args, uninitLog)
      }
    }

    // In standalone cluster mode, there are two submission gateways:
    //   (1) The traditional RPC gateway using o.a.s.deploy.Client as a wrapper
    //   (2) The new REST-based gateway introduced in Spark 1.3
    // The latter is the default behavior as of Spark 1.3, but Spark submit will fail over
    // to use the legacy gateway if the master endpoint turns out to be not a REST server.
    if (args.isStandaloneCluster && args.useRest) {
      try {
        logInfo("Running Spark using the REST application submission protocol.")
        doRunMain()
      } catch {
        // Fail over to use the legacy submission gateway
        case e: SubmitRestConnectionException =>
          logWarning(s"Master endpoint ${args.master} was not a REST server. " +
            "Falling back to legacy submission gateway instead.")
          args.useRest = false
          submit(args, false)
      }
    // In all other modes, just run the main class as prepared
    } else {
      doRunMain()
    }
  }

最后调用doRunMain

而后这段代码中会根据环境来获取是否是cluster调用
创建一个主类YarnClusterApplication来执行main方法

// Following constants are visible for testing.
  private[deploy] val YARN_CLUSTER_SUBMIT_CLASS =
    "org.apache.spark.deploy.yarn.YarnClusterApplication"

此处yarn client 执行run方法

private[spark] class YarnClusterApplication extends SparkApplication {

  override def start(args: Array[String], conf: SparkConf): Unit = {
    // SparkSubmit would use yarn cache to distribute files & jars in yarn mode,
    // so remove them from sparkConf here for yarn mode.
    conf.remove(JARS)
    conf.remove(FILES)

    new Client(new ClientArguments(args), conf, null).run()
  }

}

yarnClient.submitApplication(appContext)
会向yarn中的ResourceManager 提交执行命令

def submitApplication(): ApplicationId = {
    ResourceRequestHelper.validateResources(sparkConf)

    var appId: ApplicationId = null
    try {
      launcherBackend.connect()
      yarnClient.init(hadoopConf)
      yarnClient.start()

      logInfo("Requesting a new application from cluster with %d NodeManagers"
        .format(yarnClient.getYarnClusterMetrics.getNumNodeManagers))

      // Get a new application from our RM
      val newApp = yarnClient.createApplication()
      val newAppResponse = newApp.getNewApplicationResponse()
      appId = newAppResponse.getApplicationId()

      // The app staging dir based on the STAGING_DIR configuration if configured
      // otherwise based on the users home directory.
      val appStagingBaseDir = sparkConf.get(STAGING_DIR)
        .map { new Path(_, UserGroupInformation.getCurrentUser.getShortUserName) }
        .getOrElse(FileSystem.get(hadoopConf).getHomeDirectory())
      stagingDirPath = new Path(appStagingBaseDir, getAppStagingDir(appId))

      new CallerContext("CLIENT", sparkConf.get(APP_CALLER_CONTEXT),
        Option(appId.toString)).setCurrentContext()

      // Verify whether the cluster has enough resources for our AM
      verifyClusterResources(newAppResponse)

      // Set up the appropriate contexts to launch our AM
      val containerContext = createContainerLaunchContext(newAppResponse)
      val appContext = createApplicationSubmissionContext(newApp, containerContext)

      // Finally, submit and monitor the application
      logInfo(s"Submitting application $appId to ResourceManager")
      yarnClient.submitApplication(appContext)
      launcherBackend.setAppId(appId.toString)
      reportLauncherState(SparkAppHandle.State.SUBMITTED)

      appId
    } catch {
      case e: Throwable =>
        if (stagingDirPath != null) {
          cleanupStagingDir()
        }
        throw e
    }
  }

指定指令来启动application jvm进程

private def createContainerLaunchContext(newAppResponse: GetNewApplicationResponse)
    : ContainerLaunchContext = {
	...........
    val javaOpts = ListBuffer[String]()
	............   
    // Add Xmx for AM memory
    javaOpts += "-Xmx" + amMemory + "m"
    .............
    javaOpts += "-Djava.io.tmpdir=" + tmpDir
	..............    
      // In our expts, using (default) throughput collector has severe perf ramifications in
      // multi-tenant machines
      javaOpts += "-XX:+UseConcMarkSweepGC"
      javaOpts += "-XX:MaxTenuringThreshold=31"
      javaOpts += "-XX:SurvivorRatio=8"
      javaOpts += "-XX:+CMSIncrementalMode"
      javaOpts += "-XX:+CMSIncrementalPacing"
      javaOpts += "-XX:CMSIncrementalDutyCycleMin=0"
      javaOpts += "-XX:CMSIncrementalDutyCycle=10"
    }
	....................                                  
    // TODO: it would be nicer to just make sure there are no null commands here
    val printableCommands = commands.map(s => if (s == null) "null" else s).toList
    amContainer.setCommands(printableCommands.asJava)

此处就会启动applicationMaster

liguanghai12

关注

1
点赞
踩
2

收藏

觉得还不错? 一键收藏
0
评论
Spark内核源码流程之Yarn cluster提交运行流程

当sparksubmit提交模式为Yarn Cluster模式时的启动流程1.通过submit 在本地机器上启动submit 提交job JVM进程2.submit中会判断是否是cluster环境，是，通过反射执行YarnClusterApplication 的main方法2.YarnClusterApplication 会启动yarn ResourceManager client，连接yarn RM3.连接成功后，会发送指令，在指定的NM上启动ApplicationMaster JVM 进程，A
复制链接

扫一扫