Spark Source Code Reading (5): The spark-submit Job Submission Flow

The logic between job submission and execution:

Client side:

1. The spark-submit script submits the job and invokes the main method of the class we submitted via reflection (a minimal example follows this list).

2. Our own code then executes new SparkContext, which:

    2.1. Creates the actorSystem

    2.2. Creates TaskSchedulerImpl, the class that dispatches tasks

    2.3. Creates SparkDeploySchedulerBackend, which handles task scheduling

    2.4. Creates DAGScheduler, which splits the job into stages and tasks, and starts a thread that drains a blocking task/event queue

    2.5. Creates the clientActor, which registers the application (the job jar) with the master

    2.6. Creates the driverActor
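
To make step 1 concrete, here is a minimal, hypothetical user application (a standard WordCount, not taken from the Spark source). spark-submit loads the class named by --class and invokes its main method via reflection, and the new SparkContext line is what triggers steps 2.1 through 2.6:

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount")
    // Constructing the SparkContext creates the actorSystem, the
    // schedulers, and the client/driver actors described above.
    val sc = new SparkContext(conf)
    sc.textFile(args(0))
      .flatMap(_.split("\\s+"))
      .map((_, 1))
      .reduceByKey(_ + _)
      .saveAsTextFile(args(1))
    sc.stop()
  }
}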

Master side:

1. The master receives the application's registration info and saves it.

2. Through worker registration and heartbeats, the master knows how many resources the cluster has.

3. It compares the resources the application needs against what the cluster has, and allocates resources accordingly (either spread out across many workers or consolidated onto as few as possible; see the sketch after this list).

4. It notifies the chosen workers to start executors.
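
To illustrate step 3, here is a simplified sketch of the spread-out allocation strategy. This is not the real Master code (the actual logic lives in Master.schedule, controlled by spark.deploy.spreadOut), and WorkerInfo below is a stand-in for the real class:

// Hypothetical stand-in for the master's per-worker bookkeeping.
case class WorkerInfo(id: String, coresFree: Int)

// Assign cores one at a time, round-robin across workers with free cores,
// so the application is spread over as many workers as possible.
def spreadOut(workers: Seq[WorkerInfo], coresNeeded: Int): Map[String, Int] = {
  val usable = workers.filter(_.coresFree > 0).sortBy(-_.coresFree)
  val assigned = Array.fill(usable.size)(0)
  var toAssign = math.min(coresNeeded, usable.map(_.coresFree).sum)
  var pos = 0
  while (toAssign > 0) {
    if (usable(pos).coresFree - assigned(pos) > 0) {
      assigned(pos) += 1
      toAssign -= 1
    }
    pos = (pos + 1) % usable.size
  }
  usable.map(_.id).zip(assigned).filter(_._2 > 0).toMap
}

For example, spreadOut(Seq(WorkerInfo("w1", 4), WorkerInfo("w2", 2)), 5) yields Map(w1 -> 3, w2 -> 2); the consolidated strategy would instead fill w1 completely before touching w2.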

Worker side:

1. Starts the executor.

2. The executor talks to the driverActor, listening for and receiving tasks (an illustrative sketch follows).
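
The receive loop on the executor side can be sketched with an Akka actor. This is illustrative only, modeled loosely on CoarseGrainedExecutorBackend in Spark 1.x; the message types below are simplified stand-ins, not the real CoarseGrainedClusterMessages:

import akka.actor.Actor

// Hypothetical, simplified message types.
case object RegisteredExecutor
case class LaunchTask(serializedTask: Array[Byte])

class ExecutorBackendSketch extends Actor {
  def receive = {
    case RegisteredExecutor =>
      // The driverActor acknowledged our registration; the real backend
      // creates an Executor instance at this point.
      println("Registered with driver, ready for tasks")
    case LaunchTask(bytes) =>
      // The real backend deserializes the task description and hands it
      // to the executor's thread pool.
      println(s"Launching task (${bytes.length} bytes)")
  }
}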


spark-submit: the scheduling flow after a job is submitted to the cluster

The spark-submit script launches the SparkSubmit class, which calls SparkSubmit's main method.

Let's take a look at SparkSubmit's main method:

def main(args: Array[String]): Unit = {
  val appArgs = new SparkSubmitArguments(args)
  if (appArgs.verbose) {
    printStream.println(appArgs)
  }
  appArgs.action match {
    case SparkSubmitAction.SUBMIT => submit(appArgs)
    case SparkSubmitAction.KILL => kill(appArgs)
    case SparkSubmitAction.REQUEST_STATUS => requestStatus(appArgs)
  }
}

The action matches SUBMIT, so let's look at the submit method:

private[spark] def submit(args: SparkSubmitArguments): Unit = {
  val (childArgs, childClasspath, sysProps, childMainClass) = prepareSubmitEnvironment(args)

  def doRunMain(): Unit = {
    if (args.proxyUser != null) {
      val proxyUser = UserGroupInformation.createProxyUser(args.proxyUser,
        UserGroupInformation.getCurrentUser())
      try {
        proxyUser.doAs(new PrivilegedExceptionAction[Unit]() {
          override def run(): Unit = {
            runMain(childArgs, childClasspath, sysProps, childMainClass, args.verbose)
          }
        })
      } catch {
        case e: Exception =>
          // Hadoop's AuthorizationException suppresses the exception's stack trace, which
          // makes the message printed to the output by the JVM not very helpful. Instead,
          // detect exceptions with empty stack traces here, and treat them differently.
          if (e.getStackTrace().length == 0) {
            printStream.println(s"ERROR: ${e.getClass().getName()}: ${e.getMessage()}")
            exitFn()
          } else {
            throw e
          }
      }
    } else {
      runMain(childArgs, childClasspath, sysProps, childMainClass, args.verbose)
    }
  }

  // In standalone cluster mode, there are two submission gateways:
  //   (1) The traditional Akka gateway using o.a.s.deploy.Client as a wrapper
  //   (2) The new REST-based gateway introduced in Spark 1.3
  // The latter is the default behavior as of Spark 1.3, but Spark submit will fail over
  // to use the legacy gateway if the master endpoint turns out to be not a REST server.
  if (args.isStandaloneCluster && args.useRest) {
    try {
      printStream.println("Running Spark using the REST application submission protocol.")
      doRunMain()
    } catch {
      // Fail over to use the legacy submission gateway
      case e: SubmitRestConnectionException =>
        printWarning(s"Master endpoint ${args.master} was not a REST server. " +
          "Falling back to legacy submission gateway instead.")
        args.useRest = false
        submit(args)
    }
  // In all other modes, just run the main class as prepared
  } else {
    doRunMain()
  }
}

Note the failover path: if the master endpoint turns out not to be a REST server, useRest is flipped to false and submit calls itself again, this time taking the legacy Akka gateway (the one that wraps the application in o.a.s.deploy.Client).
        
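Whichever gateway is taken, execution ends up in runMain. Condensed from the Spark 1.x source (childMainClass and childArgs are the values prepared by prepareSubmitEnvironment above), it loads the child main class (our own class in client mode, or a gateway wrapper such as o.a.s.deploy.Client in legacy standalone cluster mode) and invokes its main method via reflection:

val loader = Thread.currentThread().getContextClassLoader
val mainClass = Class.forName(childMainClass, true, loader)
val mainMethod = mainClass.getMethod("main", classOf[Array[String]])
// null receiver because main is static; in client mode this is where
// our own code starts running.
mainMethod.invoke(null, childArgs.toArray)

This closes the loop with the client-side steps at the top: the reflective invoke is exactly what runs our main method and reaches new SparkContext.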