0. Preface
The previous chapter walked through the reduceByKey method and showed that a transformation only records the operation in the RDD without actually executing it; execution is deferred until Spark reaches an action. Inside an action, runJob is called to submit the job to the DAGScheduler for scheduling. This chapter reads through the runJob method in detail.
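To make this laziness concrete before diving in, here is a minimal local sketch (a hypothetical driver program, not Spark source; the app name and master setting are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object LazyEvalDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("lazy-demo").setMaster("local[2]"))
    val rdd = sc.parallelize(1 to 10, numSlices = 4)
    val doubled = rdd.map(_ * 2)    // transformation: only recorded in the lineage, nothing runs
    val sum = doubled.reduce(_ + _) // action: calls sc.runJob, which triggers the actual job
    println(s"sum = $sum")          // 110
    sc.stop()
  }
}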
After reading through this chapter's source, we will be able to verify two things:
1. What exactly is the relationship between the number of tasks and the number of partitions?
2. What exactly is the basis for dividing stages?
1. The Preparation Phase
Taking reduce, one of the RDD's actions, as the example, let's walk in detail through how an action submits its job. Source first:
/**
 * Reduces the elements of this RDD using the specified commutative and
 * associative binary operator.
 */
def reduce(f: (T, T) => T): T = withScope {
  val cleanF = sc.clean(f)
  val reducePartition: Iterator[T] => Option[T] = iter => {  // ------ 1)
    if (iter.hasNext) {
      Some(iter.reduceLeft(cleanF))
    } else {
      None
    }
  }
  var jobResult: Option[T] = None
  val mergeResult = (index: Int, taskResult: Option[T]) => {  // ------ 2)
    if (taskResult.isDefined) {
      jobResult = jobResult match {
        case Some(value) => Some(f(value, taskResult.get))
        case None => taskResult
      }
    }
  }
  sc.runJob(this, reducePartition, mergeResult)  // ------ 3)
  // Get the final result out of our Option, or throw an exception if the RDD was empty
  jobResult.getOrElse(throw new UnsupportedOperationException("empty collection"))
}
The purpose of reduce itself needs no repetition here. The method first defines two functions. reducePartition iterates over iter and applies cleanF pairwise; since it uses reduceLeft, the accumulator is always the first argument: B2 = cleanF(A1, A2), B3 = cleanF(B2, A3), B4 = cleanF(B3, A4), ..., BN = cleanF(B(N-1), AN). (In a reduce, pairwise merging is exactly this.) The other function, mergeResult, merges the results of individual tasks into the final job result. (None of this is the main point; just remember roughly what each function does, as both are used below.)
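To see this combining order in action, here is a plain-Scala simulation of how reducePartition and mergeResult cooperate (no Spark involved; the partition contents below are made up for illustration):

object ReduceSimulation {
  def main(args: Array[String]): Unit = {
    val cleanF: (Int, Int) => Int = _ + _

    // reducePartition: fold one partition's iterator with reduceLeft;
    // the accumulator is always the FIRST argument to cleanF.
    val reducePartition: Iterator[Int] => Option[Int] = iter =>
      if (iter.hasNext) Some(iter.reduceLeft(cleanF)) else None

    // mergeResult: fold each task's result into the running job result.
    var jobResult: Option[Int] = None
    val mergeResult = (index: Int, taskResult: Option[Int]) => {
      if (taskResult.isDefined) {
        jobResult = jobResult match {
          case Some(value) => Some(cleanF(value, taskResult.get))
          case None => taskResult
        }
      }
    }

    // Three fake "partitions", one of them empty.
    val partitions = Seq(Iterator(1, 2, 3), Iterator.empty, Iterator(4, 5))
    partitions.map(reducePartition).zipWithIndex.foreach {
      case (result, index) => mergeResult(index, result)
    }
    println(jobResult) // Some(15)
  }
}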
reduce then calls runJob on the SparkContext. Source:
/**
 * Run a function on a given set of partitions in an RDD and pass the results to the given
 * handler function. This is the main entry point for all actions in Spark.
 */
def runJob[T, U: ClassTag](
    rdd: RDD[T],
    func: (TaskContext, Iterator[T]) => U,
    partitions: Seq[Int],
    resultHandler: (Int, U) => Unit): Unit = {
  if (stopped.get()) {
    throw new IllegalStateException("SparkContext has been shutdown")
  }
  val callSite = getCallSite
  val cleanedFunc = clean(func)
  logInfo("Starting job: " + callSite.shortForm)
  if (conf.getBoolean("spark.logLineage", false)) {
    logInfo("RDD's recursive dependencies:\n" + rdd.toDebugString)
  }
  dagScheduler.runJob(rdd, cleanedFunc, partitions, callSite, resultHandler, localProperties.get)
  progressBar.foreach(_.finishAll())
  rdd.doCheckpoint()
}
SparkContext's runJob mostly does preparatory work: checking whether the user has already stopped the current SparkContext; capturing the user's call information (callSite holds the user's call-stack information; users can also set a function of their own to return something else); cleaning the closure passed in; and so on. It also handles some cleanup, such as finishing the stage progress bar and recording the checkpoint. SparkContext's runJob then delegates to dagScheduler.runJob, which is where the job actually gets submitted. (DAG here stands for directed acyclic graph: the dependencies between RDDs in Spark form a DAG, and the DAGScheduler analyzes this dependency graph to determine the execution order of stages; later articles will cover this in detail.)
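As an aside, the sc.runJob call inside reduce passed only three arguments and no partition list; in Spark 2.x it goes through an intermediate overload that expands the job to every partition of the RDD, which already foreshadows the answer to question 1 (one task per partition). Quoted from SparkContext (wording may differ slightly between versions):

/**
 * Run a job on all partitions in an RDD and pass the results to a handler function.
 */
def runJob[T, U: ClassTag](
    rdd: RDD[T],
    processPartition: Iterator[T] => U,
    resultHandler: (Int, U) => Unit): Unit = {
  val processFunc = (context: TaskContext, iter: Iterator[T]) => processPartition(iter)
  runJob[T, U](rdd, processFunc, 0 until rdd.partitions.length, resultHandler)
}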
Next, let's look at DAGScheduler's runJob method:
/**
 * Run an action job on the given RDD and pass all the results to the resultHandler function as
 * they arrive.
 *
 * @param rdd target RDD to run tasks on
 * @param func a function to run on each partition of the RDD
 * @param partitions set of partitions to run on; some jobs may not want to compute on all
 *   partitions of the target RDD, e.g. for operations like first()
 * @param callSite where in the user program this job was called
 * @param resultHandler callback to pass each result to
 * @param properties scheduler properties to attach to this job, e.g. fair scheduler pool name
 *
 * @throws Exception when the job fails
 */
def runJob[T, U](
    rdd: RDD[T],
    func: (TaskContext, Iterator[T]) => U,
    partitions: Seq[Int],
    callSite: CallSite,
    resultHandler: (Int, U) => Unit,
    properties: Properties): Unit = {
  val start = System.nanoTime
  val waiter = submitJob(rdd, func, partitions, callSite, resultHandler, properties)  // ------ 1)
  // Note: Do not call Await.ready(future) because that calls `scala.concurrent.blocking`,
  // which causes concurrent SQL executions to fail if a fork-join pool is used. Note that
  // due to idiosyncrasies in Scala, `awaitPermission` is not actually used anywhere so it's
  // safe to pass in null here. For more detail, see SPARK-13747.
  val awaitPermission = null.asInstanceOf[scala.concurrent.CanAwait]
  waiter.completionFuture.ready(Duration.Inf)(awaitPermission)
  waiter.completionFuture.value.get match {
    case scala.util.Success(_) =>
      logInfo("Job %d finished: %s, took %f s".format
        (waiter.jobId, callSite.shortForm, (System.nanoTime - start) / 1e9))
    case scala.util.Failure(exception) =>
      logInfo("Job %d failed: %s, took %f s".format
        (waiter.jobId, callSite.shortForm, (System.nanoTime - start) / 1e9))
      // SPARK-8644: Include user stack trace in exceptions coming from DAGScheduler.
      val callerStackTrace = Thread.currentThread().getStackTrace.tail
      exception.setStackTrace(exception.getStackTrace ++ callerStackTrace)
      throw exception
  }
}
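As a closing note, the partitions parameter documented above is reachable from the user API as well: a job can compute just a subset of partitions, which is how operations like first() and take() avoid scanning the whole RDD. A hypothetical usage snippet (assumes a live sc, as in the earlier sketch):

// Launch tasks for partition 0 only and grab the first three elements of it.
val rdd = sc.parallelize(1 to 100, numSlices = 10)
val results: Array[Array[Int]] =
  sc.runJob(rdd, (iter: Iterator[Int]) => iter.take(3).toArray, Seq(0))
println(results.head.mkString(", ")) // elements from partition 0 only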