This article starts directly from SparkSubmit; the execution flow of the spark-submit shell script itself was covered in the earlier article "The spark-submit Script Execution Process".
I. Overview of the Main Steps
1. The main method of org.apache.spark.deploy.SparkSubmit is executed to submit the application.
2. The run method of the YARN client (Client) is invoked.
3. The client submits the application to the ResourceManager, requesting a container in which to run the ApplicationMaster.
4. The ApplicationMaster's main method runs, starting the driver program and registering the AM.
5. The user program begins to run; when it reaches an action, job scheduling starts.
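For reference, a command that kicks off this whole chain in yarn-cluster mode might look like the following sketch (the class name, jar path, and resource sizes are placeholders, not taken from this article):

```shell
# Submitting in yarn-cluster mode launches org.apache.spark.deploy.SparkSubmit,
# which then walks through steps 1-5 above.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  --num-executors 4 \
  --executor-memory 2g \
  /path/to/my-app.jar arg1 arg2
```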
II. Source Code Analysis
First, the SparkSubmit entry point:
```scala
override def main(args: Array[String]): Unit = {
  val submit = new SparkSubmit() {
    self =>

    override protected def parseArguments(args: Array[String]): SparkSubmitArguments = {
      new SparkSubmitArguments(args) {
        override protected def logInfo(msg: => String): Unit = self.logInfo(msg)

        override protected def logWarning(msg: => String): Unit = self.logWarning(msg)
      }
    }

    override protected def logInfo(msg: => String): Unit = printMessage(msg)

    override protected def logWarning(msg: => String): Unit = printMessage(s"Warning: $msg")

    override def doSubmit(args: Array[String]): Unit = {
      try {
        super.doSubmit(args)
      } catch {
        case e: SparkUserAppException =>
          exitFn(e.exitCode)
      }
    }
  }

  submit.doSubmit(args)
}
```
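The anonymous subclass above exists only to redirect the logging hooks to printMessage and to turn a SparkUserAppException into a clean exit code. The same override-at-the-call-site pattern can be sketched in isolation (all names below are illustrative, not Spark's):

```scala
object OverrideDemo {
  class Submitter {
    def logInfo(msg: => String): Unit = println(s"INFO: $msg")
    def doSubmit(args: Array[String]): Unit = logInfo(s"submitting ${args.mkString(" ")}")
  }

  // Build a Submitter whose logging is redirected at the call site,
  // much as SparkSubmit.main redirects logInfo/logWarning to printMessage,
  // and return what it logged.
  def run(): String = {
    val captured = scala.collection.mutable.Buffer.empty[String]
    val submit = new Submitter {
      override def logInfo(msg: => String): Unit = captured += msg
    }
    submit.doSubmit(Array("--class", "Main"))
    captured.head
  }

  def main(args: Array[String]): Unit =
    println(run())  // prints: submitting --class Main
}
```

The benefit is that the base class stays untouched; the customized behaviour lives entirely at the point where the instance is created.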
Next, doSubmit is invoked:
```scala
def doSubmit(args: Array[String]): Unit = {
  // Initialize logging if it hasn't been done yet. Keep track of whether logging needs to
  // be reset before the application starts.
  val uninitLog = initializeLogIfNecessary(true, silent = true)

  val appArgs = parseArguments(args)
  if (appArgs.verbose) {
    logInfo(appArgs.toString)
  }
  appArgs.action match {
    // For a normal submission, this branch matches
    case SparkSubmitAction.SUBMIT => submit(appArgs, uninitLog)
    case SparkSubmitAction.KILL => kill(appArgs)
    case SparkSubmitAction.REQUEST_STATUS => requestStatus(appArgs)
    case SparkSubmitAction.PRINT_VERSION => printVersion()
  }
}
```
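The match above dispatches on the action parsed from the command line; with ordinary submit arguments, SparkSubmitAction.SUBMIT is selected. A stand-in sketch of this sealed-enumeration dispatch style (SparkSubmitAction itself is internal to Spark, so the names below are made up):

```scala
object ActionDemo {
  // Stand-in for Spark's internal SparkSubmitAction enumeration.
  sealed trait Action
  case object Submit extends Action
  case object Kill extends Action
  case object RequestStatus extends Action
  case object PrintVersion extends Action

  // Because Action is sealed, the compiler can warn when a case is
  // missing, which is the safety this dispatch style buys.
  def dispatch(action: Action): String = action match {
    case Submit        => "submit"
    case Kill          => "kill"
    case RequestStatus => "status"
    case PrintVersion  => "version"
  }

  def main(args: Array[String]): Unit =
    println(dispatch(Submit))  // prints: submit
}
```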
The actual submit runs in two steps:
- Step 1: prepare the launch environment by setting up the classpath, system properties, and application arguments, so that the child main class determined by the cluster manager and deploy mode can be run.
- Step 2: invoke the main method of that child main class.
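The second step boils down to loading the child main class and invoking its main method reflectively. A rough, self-contained sketch of that idea, assuming made-up class and argument names (in Spark the real logic lives in prepareSubmitEnvironment and runMain):

```scala
object ReflectDemo {
  val log = scala.collection.mutable.Buffer.empty[String]

  // Stand-in for the child main class that SparkSubmit would resolve.
  object ChildApp {
    def main(args: Array[String]): Unit = log += s"child ran with ${args.mkString(" ")}"
  }

  def main(args: Array[String]): Unit = {
    // Step 1 (sketch): SparkSubmit computes the classpath, system
    // properties, child arguments and child main class name; here we
    // hardcode stand-ins (a Scala object compiles to a `...$` class
    // with a static MODULE$ field).
    val childMainClass = "ReflectDemo$ChildApp$"
    val childArgs = Array("--input", "data.txt")

    // Step 2 (sketch): load the class and invoke its main method.
    val clazz = Class.forName(childMainClass)
    val module = clazz.getField("MODULE$").get(null)
    val mainMethod = clazz.getMethod("main", classOf[Array[String]])
    mainMethod.invoke(module, childArgs)

    println(log.head)  // prints: child ran with --input data.txt
  }
}
```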
The code is as follows: