Spark on YARN Source Code Analysis

Overview
First, let's clarify the difference between the ApplicationMaster (AppMaster) and the Driver. Every application running on YARN needs an AppMaster, whereas the Driver is specific to a Spark job. The Driver creates the SparkContext object, which internally maintains the DAGScheduler and TaskScheduler used for splitting and executing work; the DAGScheduler splits a job into stages and hands them to the TaskScheduler, which dispatches the resulting tasks to Executors. (Note: the Driver does not only create the SparkContext; it also runs the rest of the user application's logic.)
So the Driver and the AppMaster are two entirely different things. The Driver is responsible for scheduling and assigning the Spark job's tasks, while the AppMaster is responsible for allocating resources on YARN. When Spark runs on YARN the two have to interact: the Driver communicates with the AppMaster, resource requests go through the AppMaster, task scheduling is done by the Driver, and the Driver talks to the Executors to hand them tasks.
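To make the Driver's role concrete, here is a minimal driver-program sketch (the app name and the computation are illustrative placeholders, not taken from the code analyzed below): creating the SparkContext is what instantiates the DAGScheduler and TaskScheduler described above.

import org.apache.spark.{SparkConf, SparkContext}

object MinimalDriverApp {
  def main(args: Array[String]): Unit = {
    // Creating the SparkContext wires up the DAGScheduler and TaskScheduler;
    // with --master yarn, the deploy mode decides where this code runs.
    val conf = new SparkConf().setAppName("minimal-driver-demo")
    val sc = new SparkContext(conf)

    // A trivial job: the DAGScheduler splits it into stages and the
    // TaskScheduler ships the resulting tasks to Executors.
    val sum = sc.parallelize(1 to 100).map(_ * 2).reduce(_ + _)
    println(s"sum = $sum")

    sc.stop()
  }
}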

yarn-client vs. yarn-cluster

The difference between the two is that in client mode the Driver runs locally in the submitting process, while in cluster mode the Driver runs inside the container that hosts the AppMaster.

In yarn-client mode the Driver starts first — it is the user-written main class. While initializing the SparkContext it acts as a client and asks YARN for resources to run the AppMaster; the AppMaster registers with YARN and requests Executor resources, while the Driver remains in charge of the actual task scheduling. The AppMaster also monitors the Driver: if it detects that the Driver has failed, it releases all of the application's YARN resources. So in this mode, if the client exits, the whole job exits.

In yarn-cluster mode, the client first asks YARN for resources to run the AppMaster, and the AppMaster then runs the user-defined main class via reflection. Once the SparkContext is up, the application registers itself with YARN and requests Executor resources. In this mode the Driver and the AppMaster are two threads of the same process, so the client exiting does not affect the job, because the Driver runs remotely; once the Driver finishes, the AppMaster releases all of the application's resources.
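As a side note, the deploy mode can also be chosen programmatically through the SparkLauncher API. Below is a hedged sketch (the jar path, main class, and memory setting are hypothetical placeholders); switching setDeployMode between "client" and "cluster" selects between the two behaviors described above.

import org.apache.spark.launcher.SparkLauncher

object SubmitToYarn {
  def main(args: Array[String]): Unit = {
    // "client" keeps the Driver on the submitting machine;
    // "cluster" puts it in the AppMaster's container on YARN.
    val handle = new SparkLauncher()
      .setAppResource("/path/to/my-app.jar")        // hypothetical jar
      .setMainClass("com.example.MinimalDriverApp") // hypothetical main class
      .setMaster("yarn")
      .setDeployMode("cluster")                     // or "client"
      .setConf(SparkLauncher.EXECUTOR_MEMORY, "2g")
      .startApplication()                           // this launch path is what the LauncherBackend (seen later) reports back to

    // Wait until the application reaches a terminal state.
    while (!handle.getState.isFinal) Thread.sleep(1000)
    println(s"Final state: ${handle.getState}")
  }
}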

Spark on YARN source code walkthrough
1. Entry point
Running the spark-submit command actually executes the spark-class script, and spark-class is passed SparkSubmit as its first argument, so the real entry point of the program is still the SparkSubmit.scala class.

exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"

Analysis of the spark-class script:

Only part of the script is shown here.

The following function invokes org.apache.spark.launcher.Main to build the final SparkSubmit command:

build_command() {
  "$RUNNER" -Xmx128m -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@"
  printf "%d\0" $?
}

A closer look at org.apache.spark.launcher.Main:

 public static void main(String[] argsArray) throws Exception {
    checkArgument(argsArray.length > 0, "Not enough arguments: missing class name.");

    List<String> args = new ArrayList<>(Arrays.asList(argsArray));
    String className = args.remove(0);

    boolean printLaunchCommand = !isEmpty(System.getenv("SPARK_PRINT_LAUNCH_COMMAND"));
    AbstractCommandBuilder builder;
    //If the incoming class is SparkSubmit (recall that the spark-submit script invokes spark-class with SparkSubmit as its first argument)
    if (className.equals("org.apache.spark.deploy.SparkSubmit")) {
      try {
        //Start building the command.
        //This detects whether the current command is spark-shell, pyspark-shell, etc. and sets the corresponding flags,
        //then calls the OptionParser class to parse all of the arguments passed in.
        builder = new SparkSubmitCommandBuilder(args);
      } catch (IllegalArgumentException e) {
        printLaunchCommand = false;
        System.err.println("Error: " + e.getMessage());
        System.err.println();

        MainClassOptionParser parser = new MainClassOptionParser();
        try {
          parser.parse(args);
        } catch (Exception ignored) {
          // Ignore parsing exceptions.
        }

        List<String> help = new ArrayList<>();
        if (parser.className != null) {
          help.add(parser.CLASS);
          help.add(parser.className);
        }
        help.add(parser.USAGE_ERROR);
        builder = new SparkSubmitCommandBuilder(help);
      }
    } else {
      builder = new SparkClassCommandBuilder(className, args);
    }

    Map<String, String> env = new HashMap<>();
    //Assemble the full java command and set the JVM options, etc.  ===> java -cp xxxx.jar ... SparkSubmit ...
    List<String> cmd = builder.buildCommand(env);
    //Debug output: print the command information when the following is set:
    // export  SPARK_PRINT_LAUNCH_COMMAND=true
    if (printLaunchCommand) {
      System.err.println("Spark Command: " + join(" ", cmd));
      System.err.println("========================================");
    }

    if (isWindows()) {
      System.out.println(prepareWindowsCommand(cmd, env));
    } else {
      // In bash, use NULL as the arg separator since it cannot be used in an argument.
      //Merge the env entries into the command.
      List<String> bashCmd = prepareBashCommand(cmd, env);
      //Print every argument.
      for (String c : bashCmd) {
        System.out.print(c);
        //This writes the NUL character; spark-class reads it back as the argument separator (read -d '') on Linux.
        System.out.print('\0');
      }
    }
  }

Back in spark-class:

# Reassemble the command printed above
CMD=()
while IFS= read -d '' -r ARG; do
  CMD+=("$ARG")
done < <(build_command "$@")
# Finally, exec the fully assembled command
# e.g. exec env LD_LIBRARY_PATH=:xxxxx/lib/native /usr/lib/java/jdk1.8.0/bin/java -cp xxxxx.jar -Xmx1g org.apache.spark.deploy.SparkSubmit <the arguments you passed in>
CMD=("${CMD[@]:0:$LAST}")
exec "${CMD[@]}"

Next comes the SparkSubmit.scala class.
Entering its main method:

def main(args: Array[String]): Unit = {
    // scalastyle:on println
    //Parse the incoming arguments:
    //validate that each option is legal and supported,
    //then take each option and its value and assign them to the corresponding fields.
    val appArgs = new SparkSubmitArguments(args)
    //verbose: print debug information
    if (appArgs.verbose) {
      // scalastyle:off println
      printStream.println(appArgs)
      // scalastyle:on println
    }
    //Dispatch on the action type of this submission
    //(the default is SUBMIT)
    appArgs.action match {
      //submit the application
      case SparkSubmitAction.SUBMIT => submit(appArgs)
      //kill the application
      case SparkSubmitAction.KILL => kill(appArgs)
      //query its status
      case SparkSubmitAction.REQUEST_STATUS => requestStatus(appArgs)
    }
  }

Entering the submit method:
//This is where the application is actually submitted.
//Note the prepareSubmitEnvironment method: it determines which main class should be run.
//In client mode that is the main of the user-written application; otherwise it is the main method of org.apache.spark.deploy.yarn.Client.

  @tailrec
  private def submit(args: SparkSubmitArguments): Unit = {
    //Work out the child arguments, classpath, system properties and main class.
    //On YARN this returns either the user-written class's main or org.apache.spark.deploy.yarn.Client.
    val (childArgs, childClasspath, sysProps, childMainClass) = prepareSubmitEnvironment(args)
    //Define the doRunMain method that is invoked further below.
    def doRunMain(): Unit = {
      if (args.proxyUser != null) {
        val proxyUser = UserGroupInformation.createProxyUser(args.proxyUser,
          UserGroupInformation.getCurrentUser())
        try {
          proxyUser.doAs(new PrivilegedExceptionAction[Unit]() {
            override def run(): Unit = {
              runMain(childArgs, childClasspath, sysProps, childMainClass, args.verbose)
            }
          })
        } catch {
          case e: Exception =>
            // Hadoop's AuthorizationException suppresses the exception's stack trace, which
            // makes the message printed to the output by the JVM not very helpful. Instead,
            // detect exceptions with empty stack traces here, and treat them differently.
            if (e.getStackTrace().length == 0) {
              // scalastyle:off println
              printStream.println(s"ERROR: ${e.getClass().getName()}: ${e.getMessage()}")
              // scalastyle:on println
              exitFn(1)
            } else {
              throw e
            }
        }
      } else {
        runMain(childArgs, childClasspath, sysProps, childMainClass, args.verbose)
      }
    }
     //Whichever branch is taken below, it eventually calls the doRunMain defined above.
     // In standalone cluster mode, there are two submission gateways:
     //   (1) The traditional RPC gateway using o.a.s.deploy.Client as a wrapper
     //   (2) The new REST-based gateway introduced in Spark 1.3
     // The latter is the default behavior as of Spark 1.3, but Spark submit will fail over
     // to use the legacy gateway if the master endpoint turns out to be not a REST server.
    if (args.isStandaloneCluster && args.useRest) {
      try {
        // scalastyle:off println
        printStream.println("Running Spark using the REST application submission protocol.")
        // scalastyle:on println
        doRunMain()
      } catch {
        // Fail over to use the legacy submission gateway
        case e: SubmitRestConnectionException =>
          printWarning(s"Master endpoint ${args.master} was not a REST server. " +
            "Falling back to legacy submission gateway instead.")
          args.useRest = false
          submit(args)
      }
    // In all other modes, just run the main class as prepared
    } else {
      doRunMain()
    }
  }

As shown above, this eventually calls the runMain method, which contains the following code:

val mainMethod = mainClass.getMethod("main", new Array[String](0).getClass)
......
......
//As analyzed above, a different class's main is executed depending on the deploy mode
mainMethod.invoke(null, childArgs.toArray)
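For reference, here is a small self-contained sketch of this "resolve a class by name and invoke its static main" pattern (the hard-coded class name is just an illustration):

// Mirrors what SparkSubmit.runMain does with childMainClass.
object ReflectiveMainRunner {
  def main(args: Array[String]): Unit = {
    val childMainClass = "org.apache.spark.deploy.yarn.Client" // or the user's own main class
    val loader = Thread.currentThread().getContextClassLoader
    val mainClass = Class.forName(childMainClass, true, loader)
    val mainMethod = mainClass.getMethod("main", classOf[Array[String]])
    // main is static, so the receiver is null; the single argument is the args array.
    mainMethod.invoke(null, args)
  }
}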

Next we analyze each mode in turn.
Yarn-cluster mode:

From the analysis above we know that the class being run is org.apache.spark.deploy.yarn.Client,
so let's go to that class's main method:

def main(argStrings: Array[String]) {
    if (!sys.props.contains("SPARK_SUBMIT")) {
      logWarning("WARNING: This client is deprecated and will be removed in a " +
        "future version of Spark. Use ./bin/spark-submit with \"--master yarn\"")
    }

    // Set an env variable indicating we are running in YARN mode.
    // Note that any env variable with the SPARK_ prefix gets propagated to all (remote) processes
    System.setProperty("SPARK_YARN_MODE", "true")
    val sparkConf = new SparkConf
    // SparkSubmit would use yarn cache to distribute files & jars in yarn mode,
    // so remove them from sparkConf here for yarn mode.
    sparkConf.remove("spark.jars")
    sparkConf.remove("spark.files")
    //Parse the arguments
    val args = new ClientArguments(argStrings)
    //Create the Client and call run() to drive the whole flow
    new Client(args, sparkConf).run()
  }

During the Client class's initialization you can see the following two pieces of state:

  //The YARN client ==> YarnClientImpl
 private val yarnClient = YarnClient.createYarnClient
 //Create the LauncherBackend used to send status back over a socket
 private val launcherBackend = new LauncherBackend() {
    override def onStopRequest(): Unit = {
      if (isClusterMode && appId != null) {
        yarnClient.killApplication(appId)
      } else {
        setState(SparkAppHandle.State.KILLED)
        stop()
      }
    }
  }

Now back to the run method:

 def run(): Unit = {
    //Submit the application and obtain its appId
    this.appId = submitApplication()
    if (!launcherBackend.isConnected() && fireAndForget) {
      val report = getApplicationReport(appId)
      val state = report.getYarnApplicationState
      logInfo(s"Application report for $appId (state: $state)")
      logInfo(formatReportDetails(report))
      if (state == YarnApplicationState.FAILED || state == YarnApplicationState.KILLED) {
        throw new SparkException(s"Application $appId finished with status: $state")
      }
    } else {
      //Monitor the application's execution status
      val (yarnApplicationState, finalApplicationStatus) = monitorApplication(appId)
      if (yarnApplicationState == YarnApplicationState.FAILED ||
        finalApplicationStatus == FinalApplicationStatus.FAILED) {
        throw new SparkException(s"Application $appId finished with failed status")
      }
      if (yarnApplicationState == YarnApplicationState.KILLED ||
        finalApplicationStatus == FinalApplicationStatus.KILLED) {
        throw new SparkException(s"Application $appId is killed")
      }
      if (finalApplicationStatus == FinalApplicationStatus.UNDEFINED) {
        throw new SparkException(s"The final status of application $appId is undefined")
      }
    }
  }
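monitorApplication boils down to polling the ResourceManager for the application report until a terminal state is reached. A minimal sketch of that loop with the Hadoop YarnClient API (the one-second interval is an assumption; the real client makes it configurable):

import org.apache.hadoop.yarn.api.records.{ApplicationId, YarnApplicationState}
import org.apache.hadoop.yarn.client.api.YarnClient

object AppMonitor {
  // Poll the ResourceManager until the application reaches a terminal state.
  def waitForCompletion(yarnClient: YarnClient, appId: ApplicationId): YarnApplicationState = {
    var state = yarnClient.getApplicationReport(appId).getYarnApplicationState
    while (state != YarnApplicationState.FINISHED &&
           state != YarnApplicationState.FAILED &&
           state != YarnApplicationState.KILLED) {
      Thread.sleep(1000)
      state = yarnClient.getApplicationReport(appId).getYarnApplicationState
    }
    state
  }
}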

Next, focus on the submitApplication method:

 def submitApplication(): ApplicationId = {
    var appId: ApplicationId = null
    try {
      //Open the connection to the launcher server.
      //This only matters for jobs submitted via SparkLauncher; a submission made any other way should not go through it.
      launcherBackend.connect()
      // Setup the credentials before doing anything else,
      // so we have don't have issues at any point.
      //Initialize the YARN-related configuration and start the client
      setupCredentials()
      yarnClient.init(yarnConf)
      yarnClient.start()

      logInfo("Requesting a new application from cluster with %d NodeManagers"
        .format(yarnClient.getYarnClusterMetrics.getNumNodeManagers))

      // Get a new application from our RM
      //Create a new application
      val newApp = yarnClient.createApplication()
      val newAppResponse = newApp.getNewApplicationResponse()
      appId = newAppResponse.getApplicationId()
      reportLauncherState(SparkAppHandle.State.SUBMITTED)
      launcherBackend.setAppId(appId.toString)

      //Set the caller context reported to Hadoop
      new CallerContext("CLIENT", Option(appId.toString)).setCurrentContext()

      // Verify whether the cluster has enough resources for our AM
      //Verify that the cluster has enough resources to run the AppMaster
      verifyClusterResources(newAppResponse)

      // Set up the appropriate contexts to launch our AM
      // Set the AppMaster's runtime environment variables and its full launch command.
      // The AppMaster class depends on the deploy mode:
      //  client mode:  org.apache.spark.deploy.yarn.ExecutorLauncher
      //  cluster mode: org.apache.spark.deploy.yarn.ApplicationMaster
      //  The mode also determines whether the user-written application class is wired into the launch.
      val containerContext = createContainerLaunchContext(newAppResponse)
      val appContext = createApplicationSubmissionContext(newApp, containerContext)

      //Submit the application to YARN
      // Finally, submit and monitor the application
      logInfo(s"Submitting application $appId to ResourceManager")
      yarnClient.submitApplication(appContext)
      appId
    } catch {
      case e: Throwable =>
        if (appId != null) {
          cleanupStagingDir(appId)
        }
        throw e
    }
  }
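To see what yarnClient is doing underneath, here is a stripped-down sketch of submitting an application with the raw Hadoop YarnClient API. The AM launch command, application name, and resource sizes are placeholder assumptions; Spark's createContainerLaunchContext builds a much richer ContainerLaunchContext (classpath, local resources, environment, the real AM class, and so on).

import org.apache.hadoop.yarn.api.records.{ContainerLaunchContext, Resource}
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration
import org.apache.hadoop.yarn.util.Records
import scala.collection.JavaConverters._

object MinimalYarnSubmit {
  def main(args: Array[String]): Unit = {
    val yarnClient = YarnClient.createYarnClient()
    yarnClient.init(new YarnConfiguration())
    yarnClient.start()

    // Ask the ResourceManager for a new application id.
    val app = yarnClient.createApplication()
    val appContext = app.getApplicationSubmissionContext
    appContext.setApplicationName("minimal-am-demo") // placeholder name

    // Describe the AM container: here just a dummy command; Spark sets the
    // java command that starts ApplicationMaster / ExecutorLauncher.
    val amContainer = Records.newRecord(classOf[ContainerLaunchContext])
    amContainer.setCommands(List("echo hello-from-am").asJava)
    appContext.setAMContainerSpec(amContainer)

    // Resources requested for the AM container (placeholder sizes).
    val capability = Records.newRecord(classOf[Resource])
    capability.setMemory(512) // MB; newer Hadoop versions prefer setMemorySize
    capability.setVirtualCores(1)
    appContext.setResource(capability)

    val appId = yarnClient.submitApplication(appContext)
    println(s"Submitted application $appId")
    yarnClient.stop()
  }
}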

Now let's look at the cluster-mode ApplicationMaster:

//The ApplicationMaster runs on YARN
  def main(args: Array[String]): Unit = {
    SignalUtils.registerLogger(log)
    // Parse the arguments
    val amArgs = new ApplicationMasterArguments(args)

    // Load the properties file with the Spark configuration and set entries as system properties,
    // so that user code run inside the AM also has access to them.
    // Note: we must do this before SparkHadoopUtil instantiated
    if (amArgs.propertiesFile != null) {
      Utils.getPropertiesFromFile(amArgs.propertiesFile).foreach { case (k, v) =>
        sys.props(k) = v
      }
    }
    SparkHadoopUtil.get.runAsSparkUser { () =>
      // Create the ApplicationMaster and call its run() method
      master = new ApplicationMaster(amArgs, new YarnRMClient)
      //run() drives the whole flow; exit with its return code
      System.exit(master.run())
    }
  }

The run method:

 final def run(): Int = {
      ....
      ....
      //The key part is this branch: a different method runs depending on the mode,
      //which confirms the flow analyzed earlier.
      //In cluster mode the AppMaster starts the Driver here.
      if (isClusterMode) {
        runDriver(securityMgr)
      } else {
        runExecutorLauncher(securityMgr)
      }
  }

The runDriver method:

 private def runDriver(securityMgr: SecurityManager): Unit = { 
    //This is where the user application class starts executing, creating the SparkContext object, etc.
    userClassThread = startUserApplication()
 }

The startUserApplication method:
it simply takes the user application class that was passed in and runs it via reflection:

 private def startUserApplication(): Thread = {
 val mainMethod = userClassLoader.loadClass(args.userClass)
      .getMethod("main", classOf[Array[String]])
 val userThread = new Thread {
      override def run() {
        try {
          mainMethod.invoke(null, userArgs.toArray)
          //The user application has finished; report success so the AM can exit and release resources
          finish(FinalApplicationStatus.SUCCEEDED, ApplicationMaster.EXIT_SUCCESS)
          logDebug("Done running users class")
        } catch {
           .....
        } finally {
         .......
        }
      }
    }
}
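A simplified, self-contained sketch of this pattern — run a user-supplied main on a dedicated thread and react when it returns (the class name and arguments are hypothetical placeholders, and the success/failure reporting is reduced to println):

object UserMainRunner {
  def runUserMain(userClass: String, userArgs: Array[String]): Thread = {
    val mainMethod = Thread.currentThread().getContextClassLoader
      .loadClass(userClass)
      .getMethod("main", classOf[Array[String]])

    val userThread = new Thread {
      override def run(): Unit = {
        try {
          mainMethod.invoke(null, userArgs) // static main, receiver is null
          println("user class finished, report SUCCEEDED") // the AM calls finish(...) here
        } catch {
          case t: Throwable => println(s"user class failed: $t") // the AM reports FAILED here
        }
      }
    }
    userThread.setName("Driver") // the AM also names this thread "Driver"
    userThread.start()
    userThread
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical user class; join() waits for it, much like the AM does before exiting.
    runUserMain("com.example.MinimalDriverApp", Array.empty[String]).join()
  }
}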

That concludes the whole execution flow in cluster mode.

Analysis of yarn-client mode:
From the earlier analysis we know that in client mode the user-written application class is what runs, and from the API we know the first thing it creates is the SparkContext object.
SparkContext:

{
  //Create the TaskScheduler
 val (sched, ts) = SparkContext.createTaskScheduler(this, master, deployMode)
  //Create the DAGScheduler
  _dagScheduler = new DAGScheduler(this)
 }

Entering the createTaskScheduler method:

//The implementation class is discovered via the ServiceLoader mechanism.
//Under YARN the manager returned is YarnClusterManager.
case masterUrl =>
        val cm = getClusterManager(masterUrl) match {
          case Some(clusterMgr) => clusterMgr
          case None => throw new SparkException("Could not parse Master URL: '" + master + "'")
        }
        try {
          /**
            * Different cluster managers create different TaskScheduler and SchedulerBackend pairs.
            * YARN:
            *   - cluster: YarnClusterScheduler and YarnClusterSchedulerBackend
            *   - client:  YarnScheduler and YarnClientSchedulerBackend
            *
            * Here the manager is YarnClusterManager, which then creates the implementation
            * matching the deploy mode.
            */
          val scheduler = cm.createTaskScheduler(sc, masterUrl)
          val backend = cm.createSchedulerBackend(sc, masterUrl, scheduler)
          cm.initialize(scheduler, backend)
          (backend, scheduler)

Obtaining the cluster manager:

  //url =yarn
  private def getClusterManager(url: String): Option[ExternalClusterManager] = {
    // Use the class loader to load every class that implements the ExternalClusterManager trait
    val loader = Utils.getContextOrSparkClassLoader
    //Loads the classes registered under META-INF/services/...; if several are found, they are filtered down to the one whose master URL is "yarn"
    val serviceLoaders = ServiceLoader.load(classOf[ExternalClusterManager], loader).asScala
      .filter(_.canCreate(url)) // filter by calling each implementation's canCreate() method
    //There must be exactly one; if more than one is found, an exception is thrown
    if (serviceLoaders.size > 1) {
      throw new SparkException(
        s"Multiple external cluster managers registered for the url $url: $serviceLoaders")
    }
    serviceLoaders.headOption
  }
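The lookup above is the standard java.util.ServiceLoader pattern: implementations are listed in a META-INF/services file and discovered at runtime. A minimal sketch with a hypothetical trait and provider (the names and the services file are illustrative, not Spark's):

import java.util.ServiceLoader
import scala.collection.JavaConverters._

// Hypothetical SPI, analogous to ExternalClusterManager.
trait GreeterPlugin {
  def canCreate(url: String): Boolean
  def greet(): String
}

// Hypothetical provider; it would be registered by putting its fully qualified
// class name into META-INF/services/<fully qualified trait name>.
class YarnGreeter extends GreeterPlugin {
  override def canCreate(url: String): Boolean = url == "yarn"
  override def greet(): String = "hello from yarn"
}

object PluginLookup {
  def find(url: String): Option[GreeterPlugin] = {
    val loader = Thread.currentThread().getContextClassLoader
    val candidates = ServiceLoader.load(classOf[GreeterPlugin], loader).asScala
      .filter(_.canCreate(url)) // keep only plugins that accept this master URL
    if (candidates.size > 1) sys.error(s"Multiple plugins registered for url $url")
    candidates.headOption
  }
}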

Creating the TaskScheduler and the SchedulerBackend:
The following two methods belong to YarnClusterManager.
createTaskScheduler:
As shown below, a different scheduler implementation is created depending on the current deploy mode:

override def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler = {
    sc.deployMode match {
      case "cluster" => new YarnClusterScheduler(sc)
      case "client" => new YarnScheduler(sc)
      case _ => throw new SparkException(s"Unknown deploy mode '${sc.deployMode}' for Yarn")
    }
  }

createSchedulerBackend:
This follows the same pattern as above:

  // Create the SchedulerBackend according to the deploy mode
  override def createSchedulerBackend(sc: SparkContext,
      masterURL: String,
      scheduler: TaskScheduler): SchedulerBackend = {
    sc.deployMode match {
      case "cluster" =>
        new YarnClusterSchedulerBackend(scheduler.asInstanceOf[TaskSchedulerImpl], sc)
      case "client" =>
        new YarnClientSchedulerBackend(scheduler.asInstanceOf[TaskSchedulerImpl], sc)
      case  _ =>
        throw new SparkException(s"Unknown deploy mode '${sc.deployMode}' for Yarn")
    }
  }

Now let's look at each of the implementation classes that get created.
TaskScheduler:
Below is the cluster-mode scheduler; note that it extends the client-mode implementation:

private[spark] class YarnClusterScheduler(sc: SparkContext) extends YarnScheduler(sc)

The client-mode implementation extends TaskSchedulerImpl, which contains the actual task-scheduling logic:

private[spark] class YarnScheduler(sc: SparkContext) extends TaskSchedulerImpl(sc)

SchedulerBackend:
client:
YarnClientSchedulerBackend extends YarnSchedulerBackend and defines several methods; the one to focus on here is this field:

It is of type org.apache.spark.deploy.yarn.Client — the same bootstrap class we analyzed above for cluster mode:

private var client: Client = null

Then, in the start() method (second snippet below), a Client is instantiated and its submitApplication method is called directly — the same method analyzed above.
In client mode the AppMaster class is ExecutorLauncher (first snippet below), whose main simply delegates to ApplicationMaster, which then runs the same flow.

object ExecutorLauncher {

  def main(args: Array[String]): Unit = {
    ApplicationMaster.main(args)
  }
}
override def start() {
    totalExpectedExecutors = YarnSparkHadoopUtil.getInitialTargetExecutorNumber(conf)
    client = new Client(args, conf)
    bindToYarn(client.submitApplication(), None)
  }

cluster:
YarnClusterSchedulerBackend also extends YarnSchedulerBackend; it only defines a start() method and a getDriverLogUrls() method, and not much else.

The analysis above confirms the conclusion stated at the beginning. In client mode, the user-written application class runs first, and YARN resources are only requested when the SchedulerBackend is created; in cluster mode, the YARN resources are requested first, and the AppMaster then runs the user-written application class via reflection and drives the rest of the flow.

One more note: in both cluster and client mode, when the AppMaster class runs it eventually executes the registerAM method. Inside that method, pay attention to
allocator.allocateResources(), which is where the Executors are launched.

Please credit the source when reposting: https://editor.csdn.net/md/?articleId=112577323
Reference: https://my.oschina.net/kavn/blog/1540548
