Spark on YARN Cluster Submission Flow Analysis (Part 2)
Picking up from the previous article: before the cluster is involved at all, a SparkSubmit process is started locally, and inside that process the main method of the Client companion object is executed. This time, let us start from what that Client actually is.

Client
```scala
def main(argStrings: Array[String]) {
  if (!sys.props.contains("SPARK_SUBMIT")) {
    logWarning("WARNING: This client is deprecated and will be removed in a " +
      "future version of Spark. Use ./bin/spark-submit with \"--master yarn\"")
  }
  System.setProperty("SPARK_YARN_MODE", "true")
  val sparkConf = new SparkConf
  // SparkSubmit would use yarn cache to distribute files & jars in yarn mode,
  // so remove them from sparkConf here for yarn mode.
  sparkConf.remove("spark.jars")
  sparkConf.remove("spark.files")
  val args = new ClientArguments(argStrings)
  new Client(args, sparkConf).run()
}
```
1. On entering main, the method first works through some configuration. All we need to know is that, because we submitted via spark-submit, the SPARK_SUBMIT system property is already set, so the deprecation-warning branch is skipped.

2. The statements that follow only adjust a few configuration entries, and they are easy to understand from the code comments.
3. The key code is:

```scala
// Wrap the arguments passed in from SparkSubmit
val args = new ClientArguments(argStrings)
// Create a Client object and run it
new Client(args, sparkConf).run()
```
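The argument-wrapping step above can be pictured with a tiny, self-contained sketch. Note that SimpleClientArguments and its fields are hypothetical simplifications for illustration only, not Spark's actual ClientArguments parser:

```scala
// A minimal, hypothetical stand-in for ClientArguments: it only pairs up
// "--flag value" tokens, loosely mimicking how the real parser walks argStrings.
case class SimpleClientArguments(userClass: Option[String], userJar: Option[String])

object SimpleClientArguments {
  def parse(argStrings: Array[String]): SimpleClientArguments = {
    // Walk the arguments two at a time, collecting flag/value pairs.
    val pairs = argStrings.grouped(2).collect {
      case Array(flag, value) => flag -> value
    }.toMap
    SimpleClientArguments(pairs.get("--class"), pairs.get("--jar"))
  }
}

val args = SimpleClientArguments.parse(Array("--class", "com.example.Main", "--jar", "app.jar"))
```

The real ClientArguments does proper validation and supports many more flags; the point here is just that the raw string array from SparkSubmit gets wrapped into a typed object before Client is constructed.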
4. Step into the primary constructor of the Client class:

```scala
private[spark] class Client(
    val args: ClientArguments,
    val hadoopConf: Configuration,
    val sparkConf: SparkConf)
  extends Logging {

  import Client._
  import YarnSparkHadoopUtil._

  def this(clientArgs: ClientArguments, spConf: SparkConf) =
    this(clientArgs, SparkHadoopUtil.get.newConfiguration(spConf), spConf)

  // Holds a YarnClient
  private val yarnClient = YarnClient.createYarnClient
  private val yarnConf = new YarnConfiguration(hadoopConf)

  private val isClusterMode = sparkConf.get("spark.submit.deployMode", "client") == "cluster"
```
5. It is easy to see that Client holds a YarnClient object; a reasonable guess is that this is the object that communicates with YARN.

6. Back in main, besides constructing the object, the Client's run() method is also invoked.
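The isClusterMode check in the constructor boils down to a single config lookup with a default. A minimal sketch, using a plain Map in place of SparkConf:

```scala
// SparkConf stand-in: a plain key/value map with a get-with-default,
// mirroring sparkConf.get("spark.submit.deployMode", "client") == "cluster".
def isClusterMode(conf: Map[String, String]): Boolean =
  conf.getOrElse("spark.submit.deployMode", "client") == "cluster"

val clusterConf = Map("spark.submit.deployMode" -> "cluster")
val defaultConf = Map.empty[String, String]
```

So if spark.submit.deployMode is absent, the mode defaults to "client"; only an explicit "cluster" value turns on cluster mode, which later decides which AM class gets launched.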
run()

```scala
def run(): Unit = {
  this.appId = submitApplication()
  if (!launcherBackend.isConnected() && fireAndForget) {
    val report = getApplicationReport(appId)
    val state = report.getYarnApplicationState
    logInfo(s"Application report for $appId (state: $state)")
    logInfo(formatReportDetails(report))
    if (state == YarnApplicationState.FAILED || state == YarnApplicationState.KILLED) {
      throw new SparkException(s"Application $appId finished with status: $state")
    }
  } else {
    val (yarnApplicationState, finalApplicationStatus) = monitorApplication(appId)
    if (yarnApplicationState == YarnApplicationState.FAILED ||
        finalApplicationStatus == FinalApplicationStatus.FAILED) {
      throw new SparkException(s"Application $appId finished with failed status")
    }
    if (yarnApplicationState == YarnApplicationState.KILLED ||
        finalApplicationStatus == FinalApplicationStatus.KILLED) {
      throw new SparkException(s"Application $appId is killed")
    }
    if (finalApplicationStatus == FinalApplicationStatus.UNDEFINED) {
      throw new SparkException(s"The final status of application $appId is undefined")
    }
  }
}
```
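The success/failure checks in run() can be condensed into one decision function. The enumeration below is a simplified stand-in for YARN's YarnApplicationState and FinalApplicationStatus, purely to show the branching:

```scala
// Simplified stand-ins for YARN's terminal application states.
sealed trait AppState
case object Finished extends AppState
case object Failed extends AppState
case object Killed extends AppState

// Mirrors the logic in run(): FAILED or KILLED turns into an error,
// anything that finished cleanly is reported as success.
def checkFinalState(appId: String, state: AppState): Either[String, String] = state match {
  case Failed   => Left(s"Application $appId finished with failed status")
  case Killed   => Left(s"Application $appId is killed")
  case Finished => Right(s"Application $appId finished successfully")
}
```

In the real run(), a Left would correspond to throwing a SparkException, which is what makes spark-submit exit with a failure when the YARN application does not succeed.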
1. The key line in the code above is the second one:

```scala
this.appId = submitApplication()
```

2. The method is named submitApplication, so it must be related to submitting the application. Step into it:
```scala
/**
 * Submit an application running our ApplicationMaster to the ResourceManager.
 *
 * The stable Yarn API provides a convenience method (YarnClient#createApplication) for
 * creating applications and setting up the application submission context. This was not
 * available in the alpha API.
 */
def submitApplication(): ApplicationId = {
  var appId: ApplicationId = null
  try {
    launcherBackend.connect()
    // Setup the credentials before doing anything else,
    // so we have don't have issues at any point.
    setupCredentials()
    yarnClient.init(yarnConf)
    yarnClient.start()

    logInfo("Requesting a new application from cluster with %d NodeManagers"
      .format(yarnClient.getYarnClusterMetrics.getNumNodeManagers))

    // Get a new application from our RM
    val newApp = yarnClient.createApplication()
    val newAppResponse = newApp.getNewApplicationResponse()
    appId = newAppResponse.getApplicationId()

    reportLauncherState(SparkAppHandle.State.SUBMITTED)
    launcherBackend.setAppId(appId.toString)

    new CallerContext("CLIENT", Option(appId.toString)).setCurrentContext()

    // Verify whether the cluster has enough resources for our AM
    verifyClusterResources(newAppResponse)

    // Set up the appropriate contexts to launch our AM
    val containerContext = createContainerLaunchContext(newAppResponse)
    val appContext = createApplicationSubmissionContext(newApp, containerContext)

    // Finally, submit and monitor the application
    logInfo(s"Submitting application $appId to ResourceManager")
    yarnClient.submitApplication(appContext)
    appId
  } catch {
    case e: Throwable =>
      if (appId != null) {
        cleanupStagingDir(appId)
      }
      throw e
  }
}
```
3. The doc comment above says that YARN provides the YarnClient#createApplication API for creating an application and setting up the application submission context, and that the application is created on the RM. From this we can see that YarnClient's createApplication() interacts with the RM to request the application, and the response carries back an ApplicationId.
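The client-to-RM protocol inside submitApplication is: start the client, ask the RM for a new application (which yields the ApplicationId), then submit the prepared launch context. The trait and stub below are hypothetical simplifications, written only to make that call order explicit; they are not the real Hadoop YarnClient API:

```scala
// Hypothetical abstraction over the handful of YarnClient calls used above.
trait RmClient {
  def start(): Unit
  def createApplication(): String                               // returns an application id
  def submitApplication(appId: String, context: String): Unit
}

// A stub RM that records the order of calls, mimicking the real round trips.
class StubRm extends RmClient {
  val calls = scala.collection.mutable.ListBuffer[String]()
  def start(): Unit = calls += "start"
  def createApplication(): String = { calls += "createApplication"; "application_1_0001" }
  def submitApplication(appId: String, context: String): Unit = calls += s"submit:$appId"
}

// The skeleton of submitApplication: request an id first, submit the context second.
def submit(rm: RmClient, containerContext: String): String = {
  rm.start()
  val appId = rm.createApplication()            // first Client -> RM round trip
  rm.submitApplication(appId, containerContext) // second round trip, with the AM launch context
  appId
}
```

The important structural point this captures is that there are two distinct Client-to-RM interactions: one to obtain the ApplicationId, and a later one to hand over the fully built submission context.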
Summary of step ①:
- 1. This corresponds to the first connection from Client to RM in the diagram.
- 2. The application is created through the createApplication method.
- 3. This confirms that the YarnClient held by Client is the object that connects to YARN.
createContainerLaunchContext(newAppResponse)

1. Reading further down in submitApplication, we find this line:

```scala
// Set up the appropriate contexts to launch our AM
val containerContext = createContainerLaunchContext(newAppResponse)
```
2. The comment means: set up the appropriate context to launch the AM. The AM (ApplicationMaster) is the manager of a Spark application, which is why it is the first daemon process to be started.

3. That makes this method well worth stepping into:
```scala
/**
 * Set up a ContainerLaunchContext to launch our ApplicationMaster container.
 * This sets up the launch environment, java options, and the command for launching the AM.
 */
private def createContainerLaunchContext(newAppResponse: GetNewApplicationResponse)
  : ContainerLaunchContext = {
  logInfo("Setting up container launch context for our AM")
  val appId = newAppResponse.getApplicationId
  val appStagingDirPath = new Path(appStagingBaseDir, getAppStagingDir(appId))
  val pySparkArchives =
    if (sparkConf.get(IS_PYTHON_APP)) {
      findPySparkArchives()
    } else {
      Nil
    }

  val launchEnv = setupLaunchEnv(appStagingDirPath, pySparkArchives)
  val localResources = prepareLocalResources(appStagingDirPath, pySparkArchives)

  val amContainer = Records.newRecord(classOf[ContainerLaunchContext])
  amContainer.setLocalResources(localResources.asJava)
  amContainer.setEnvironment(launchEnv.asJava)

  val javaOpts = ListBuffer[String]()

  // Set the environment variable through a command prefix
  // to append to the existing value of the variable
  var prefixEnv: Option[String] = None

  // Add Xmx for AM memory
  javaOpts += "-Xmx" + amMemory + "m"

  ........

  val amClass =
    if (isClusterMode) {
      Utils.classForName("org.apache.spark.deploy.yarn.ApplicationMaster").getName
    } else {
      Utils.classForName("org.apache.spark.deploy.yarn.ExecutorLauncher").getName
    }

  ....

  val amArgs =
    Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ primaryRFile ++ userArgs ++
    Seq("--properties-file", buildPath(YarnSparkHadoopUtil.expandEnvironment(Environment.PWD),
      LOCALIZED_CONF_DIR, SPARK_CONF_FILE))

  // Command for the ApplicationMaster
  val commands = prefixEnv ++ Seq(
      YarnSparkHadoopUtil.expandEnvironment(Environment.JAVA_HOME) + "/bin/java", "-server"
    ) ++
    javaOpts ++ amArgs ++
    Seq(
      "1>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout",
      "2>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")

  // TODO: it would be nicer to just make sure there are no null commands here
  val printableCommands = commands.map(s => if (s == null) "null" else s).toList
  amContainer.setCommands(printableCommands.asJava)

  ......

  // send the acl settings into YARN to control who has access via YARN interfaces
  val securityManager = new SecurityManager(sparkConf)
  amContainer.setApplicationACLs(
    YarnSparkHadoopUtil.getApplicationAclsForYarn(securityManager).asJava)
  setupSecurityToken(amContainer)
  amContainer
}
```
4. The main idea here is to assemble a java command and send it to the cluster for execution. To see exactly what that command looks like, keep reading.

5. In the code we find the comment `// Command for the ApplicationMaster`: this marks the command that creates the AM.

6. Looking upward, the key variable is amClass. Since we are running in cluster mode:

amClass = org.apache.spark.deploy.yarn.ApplicationMaster
7. After all the pieces are concatenated, the resulting command comes out roughly as:

```
${JAVA_HOME}/bin/java org.apache.spark.deploy.yarn.ApplicationMaster
```
8. This command is then sent to the cluster to execute. In the cluster, the RM selects an NM node on which to run the main() of ApplicationMaster; in other words, an AM process is created on one of the nodes.
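The command assembly at the end of createContainerLaunchContext can be reproduced in miniature. The concrete values below (memory size, log dir placeholder) are made-up examples, but the Seq-concatenation structure matches the source above:

```scala
// Mimics the Seq concatenation that builds the AM container launch command.
def buildAmCommand(javaHome: String, amMemoryMb: Int, isClusterMode: Boolean,
                   logDir: String): String = {
  val javaOpts = Seq(s"-Xmx${amMemoryMb}m")
  // In cluster mode the AM class is ApplicationMaster; in client mode, ExecutorLauncher.
  val amClass =
    if (isClusterMode) "org.apache.spark.deploy.yarn.ApplicationMaster"
    else "org.apache.spark.deploy.yarn.ExecutorLauncher"
  val commands = Seq(javaHome + "/bin/java", "-server") ++ javaOpts ++ Seq(amClass) ++
    Seq("1>", logDir + "/stdout", "2>", logDir + "/stderr")
  commands.mkString(" ")
}

val cmd = buildAmCommand("${JAVA_HOME}", 1024, isClusterMode = true, "<LOG_DIR>")
```

Note that YARN, not the client, expands ${JAVA_HOME} and the log dir variable on the chosen NM node; the client only ships the command string inside the ContainerLaunchContext.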
Summary of step ②:
- 1. After the application is first submitted to the RM, the same submitApplication method also performs the work of creating the AM.
- 2. Concretely, a java command is assembled locally and sent to the cluster to run.
- 3. Once it reaches the cluster, the RM picks an NM node, the command runs ApplicationMaster's main(), and an AM process is created on that node.
To find out what happens next, stay tuned for the next installment!
Continue to Part 3: https://blog.csdn.net/long_World/article/details/114984490