Overview
In the previous posts we covered Driver startup and registration, as well as Application registration. The next step is the execution of Tasks.
1. Executing the user's code
Spark 任务调度之Register App described how the Driver initializes the SparkContext object and registers the application. Once SparkContext initialization is complete, the user's code is executed. We again take SparkPi as the example, shown below.
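Since the original screenshot is not reproduced here, the following is a minimal SparkPi-style sketch, adapted from the standard Spark example (the exact code shipped with your Spark version may differ slightly). The point is that the reduce at the end is the action that triggers job submission:

```scala
import scala.math.random
import org.apache.spark.sql.SparkSession

object SparkPi {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("Spark Pi").getOrCreate()
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt
    // map is lazy; the reduce below is the action that ends up calling SparkContext.runJob
    val count = spark.sparkContext.parallelize(1 until n, slices).map { _ =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y <= 1) 1 else 0
    }.reduce(_ + _)
    println(s"Pi is roughly ${4.0 * count / (n - 1)}")
    spark.stop()
  }
}
```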
As shown above, SparkPi invokes RDD.reduce, and reduce calls SparkContext.runJob to submit the job. SparkContext.runJob then calls DAGScheduler.runJob, as shown below: at line 2217 you can see runJob delegating to DAGScheduler.runJob, which generates the Tasks and submits them. Next, let's look at how the Tasks are generated.
2. DAGScheduler generates Tasks
In DAGScheduler, stages are generated from the RDD's Dependencies, and the Task type is determined by the stage type:
Stage type | Task type |
---|---|
ShuffleMapStage | ShuffleMapTask |
ResultStage | ResultTask |
There are two kinds of stages, ShuffleMapStage and ResultStage; the corresponding Task types, ShuffleMapTask and ResultTask, are generated according to the stage type, and finally TaskScheduler is called to submit the tasks, as sketched below.
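As a rough illustration, here is a self-contained toy model of how the stage type determines the task type. This is not the Spark source: the real ShuffleMapTask and ResultTask constructors used in DAGScheduler.submitMissingTasks take many more arguments (serialized task binary, locality preferences, properties, and so on).

```scala
// Toy model only: minimal stand-ins for Spark's Stage/Task hierarchy.
sealed trait Stage { def id: Int; def missingPartitions: Seq[Int] }
case class ShuffleMapStage(id: Int, missingPartitions: Seq[Int]) extends Stage
case class ResultStage(id: Int, missingPartitions: Seq[Int]) extends Stage

sealed trait Task { def stageId: Int; def partitionId: Int }
// A ShuffleMapTask writes shuffle output that child stages read.
case class ShuffleMapTask(stageId: Int, partitionId: Int) extends Task
// A ResultTask computes the final result of the action for one partition.
case class ResultTask(stageId: Int, partitionId: Int) extends Task

// One task per missing partition, with the task type chosen by the stage type.
def tasksFor(stage: Stage): Seq[Task] = stage match {
  case s: ShuffleMapStage => s.missingPartitions.map(p => ShuffleMapTask(s.id, p))
  case s: ResultStage     => s.missingPartitions.map(p => ResultTask(s.id, p))
}
```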
When all of a Stage's parent stages are available, its Tasks can be executed; this is done in submitMissingTasks, shown in the appendix.
Here we only follow the overall flow; the details of DAGScheduler are covered separately, see Spark DAG之SubmitTask.
3. TaskSchedulerImpl submits Tasks
Let's look at TaskSchedulerImpl. TaskScheduler uses a TaskSetManager to manage each TaskSet; its submitTasks method ultimately calls CoarseGrainedSchedulerBackend's launchTasks method to send the tasks to the Executors, as shown below.
Line 234: backend.reviveOffers() sends a ReviveOffers message, which is handled by DriverEndpoint.receive(); that in turn calls DriverEndpoint.makeOffers() and then DriverEndpoint.launchTasks() (see the appendix).
executorDataMap holds the connection information of every Executor; for how Executors get registered into executorDataMap, see Spark 任务调度之创建Executor.
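For orientation, here is a sketch of the per-executor bookkeeping kept in executorDataMap, limited to the fields that makeOffers() and launchTasks() in the appendix actually use. The real ExecutorData class in CoarseGrainedSchedulerBackend has additional fields and varies across Spark versions.

```scala
// Sketch only: the fields of ExecutorData that the scheduling code in the appendix relies on.
class ExecutorData(
    val executorEndpoint: RpcEndpointRef, // RPC handle used to send LaunchTask to the executor
    val executorAddress: RpcAddress,      // host:port of the executor's RPC endpoint
    val executorHost: String,             // host the executor runs on
    var freeCores: Int,                   // currently free cores; decremented when a task is launched
    val totalCores: Int)                  // total cores registered by the executor
```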
The Executor that runs a given Task is decided when the DAG creates that Task; a lock is held at that point so that an Executor cannot die while Task execution locations are being assigned. That decision is exactly the Task.executorId seen at line 320. See section 2.2, determining a Task's execution location, in Spark DAG之SubmitTask.
4. Executor receives Tasks
The CoarseGrainedExecutorBackend process on the Worker node receives the task sent by the Driver and hands it to the Executor object, as shown below.
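Roughly, the receiving side looks like the following (based on the Spark 2.x source of CoarseGrainedExecutorBackend.receive; details differ between versions), mirroring the TaskDescription.encode call made by launchTasks in the appendix:

```scala
// The Driver sent a LaunchTask message carrying a serialized TaskDescription.
case LaunchTask(data) =>
  if (executor == null) {
    exitExecutor(1, "Received LaunchTask command but executor was null")
  } else {
    // Decode the TaskDescription that the driver encoded in launchTasks
    val taskDesc = TaskDescription.decode(data.value)
    logInfo("Got assigned task " + taskDesc.taskId)
    executor.launchTask(this, taskDesc)
  }
```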
The Executor's launchTask method wraps the received TaskDescription in a TaskRunner object; TaskRunner implements Runnable, and the Executor schedules TaskRunners on its thread pool threadPool.
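Condensed, launchTask looks roughly like this (based on the Spark 2.x Executor source; signatures may differ in other versions):

```scala
// Wrap the task in a TaskRunner (a Runnable), track it, and hand it to the thread pool.
def launchTask(context: ExecutorBackend, taskDescription: TaskDescription): Unit = {
  val tr = new TaskRunner(context, taskDescription)
  runningTasks.put(taskDescription.taskId, tr)
  threadPool.execute(tr)
}
```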
For how the Executor itself is created, see Spark 任务调度之创建Executor.
This completes the flow from the RDD action to the Executor object receiving the task.
Summary
We walked through the flow from an RDD action to the Executor receiving the task, omitting the DAG-related parts, which will be covered separately. The overall flow is roughly as follows.
Appendix
---------------RDD.scala-------------
// Reduces the elements of this RDD using the specified commutative and associative binary operator.
def reduce(f: (T, T) => T): T = withScope {
  val cleanF = sc.clean(f)
  val reducePartition: Iterator[T] => Option[T] = iter => {
    if (iter.hasNext) {
      Some(iter.reduceLeft(cleanF))
    } else {
      None
    }
  }
  var jobResult: Option[T] = None
  val mergeResult = (index: Int, taskResult: Option[T]) => {
    if (taskResult.isDefined) {
      jobResult = jobResult match {
        case Some(value) => Some(f(value, taskResult.get))
        case None => taskResult
      }
    }
  }
  // Submit the job to Spark
  sc.runJob(this, reducePartition, mergeResult)
  // Get the final result out of our Option, or throw an exception if the RDD was empty
  jobResult.getOrElse(throw new UnsupportedOperationException("empty collection"))
}
----------------SparkContext.runJob---------------
// Run a job on all partitions in an RDD and pass the results to a handler function.
def runJob[T, U: ClassTag](
    rdd: RDD[T],
    processPartition: Iterator[T] => U,
    resultHandler: (Int, U) => Unit)
{
  val processFunc = (context: TaskContext, iter: Iterator[T]) => processPartition(iter)
  runJob[T, U](rdd, processFunc, 0 until rdd.partitions.length, resultHandler)
}
/**
 * Run a function on a given set of partitions in an RDD and pass the results to the given
 * handler function. This is the main entry point for all actions in Spark.
 *
 * @param rdd target RDD to run tasks on
 * @param func a function to run on each partition of the RDD
 * @param partitions set of partitions to run on; some jobs may not want to compute on all
 *   partitions of the target RDD, e.g. for operations like `first()`
 * @param resultHandler callback to pass each result to
 */
def runJob[T, U: ClassTag](
    rdd: RDD[T],
    func: (TaskContext, Iterator[T]) => U,
    partitions: Seq[Int],
    resultHandler: (Int, U) => Unit): Unit = {
  if (stopped.get()) {
    throw new IllegalStateException("SparkContext has been shutdown")
  }
  val callSite = getCallSite
  val cleanedFunc = clean(func)
  logInfo("Starting job: " + callSite.shortForm)
  if (conf.getBoolean("spark.logLineage", false)) {
    logInfo("RDD's recursive dependencies:\n" + rdd.toDebugString)
  }
  // Call DAGScheduler to generate the Tasks and submit them
  dagScheduler.runJob(rdd, cleanedFunc, partitions, callSite, resultHandler, localProperties.get)
  progressBar.foreach(_.finishAll())
  rdd.doCheckpoint()
}
--------------------DAGScheduler.scala submitMissingTasks()-----------------
if (tasks.size > 0) {
  logInfo(s"Submitting ${tasks.size} missing tasks from $stage (${stage.rdd}) (first 15 " +
    s"tasks are for partitions ${tasks.take(15).map(_.partitionId)})")
  taskScheduler.submitTasks(new TaskSet(
    tasks.toArray, stage.id, stage.latestInfo.attemptNumber, jobId, properties))
} else {
  // Because we posted SparkListenerStageSubmitted earlier, we should mark
  // the stage as completed here in case there are no tasks to run
  markStageAsFinished(stage, None)
  stage match {
    case stage: ShuffleMapStage =>
      logDebug(s"Stage ${stage} is actually done; " +
        s"(available: ${stage.isAvailable}," +
        s"available outputs: ${stage.numAvailableOutputs}," +
        s"partitions: ${stage.numPartitions})")
      markMapStageJobsAsFinished(stage)
    case stage: ResultStage =>
      logDebug(s"Stage ${stage} is actually done; (partitions: ${stage.numPartitions})")
  }
  submitWaitingChildStages(stage)
}
------------TaskSchedulerImpl.submitTasks()--------------------------
override def submitTasks(taskSet: TaskSet) {
  val tasks = taskSet.tasks
  logInfo("Adding task set " + taskSet.id + " with " + tasks.length + " tasks")
  this.synchronized {
    val manager = createTaskSetManager(taskSet, maxTaskFailures)
    val stage = taskSet.stageId
    val stageTaskSets =
      taskSetsByStageIdAndAttempt.getOrElseUpdate(stage, new HashMap[Int, TaskSetManager])
    // Mark all the existing TaskSetManagers of this stage as zombie, as we are adding a new one.
    // This is necessary to handle a corner case. Say a stage has 10 partitions and two
    // TaskSetManagers: TSM1 (zombie) and TSM2 (active). TSM1 has a running task for partition 10
    // and it completes. TSM2 finishes the tasks for partitions 1-9 and thinks it is still active
    // because partition 10 is not done yet. However, DAGScheduler gets task completion events
    // for all 10 partitions and thinks the stage is finished. If it is a shuffle stage and it
    // somehow has missing map outputs, DAGScheduler will resubmit it and create a TSM3 for it.
    // Since a stage cannot have more than one active TaskSetManager, we must mark TSM2 as
    // zombie (which it actually is).
    stageTaskSets.foreach { case (_, ts) =>
      ts.isZombie = true
    }
    stageTaskSets(taskSet.stageAttemptId) = manager
    schedulableBuilder.addTaskSetManager(manager, manager.taskSet.properties)
    hasReceivedTask = true
  }
  backend.reviveOffers()
}
--------------------------CoarseGrainedSchedulerBackend.reviveOffers()------------------
override def reviveOffers() {
  driverEndpoint.send(ReviveOffers)
}
![在这里插入图片描述](https://img-blog.csdnimg.cn/20190915111929492.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3ByZV90ZW5kZXI=,size_16,color_FFFFFF,t_70)
-------DriverEndpoint.receive(), makeOffers(), launchTasks()-----------
case ReviveOffers =>
  makeOffers()
----------------------------------
// Make fake resource offers on all executors
private def makeOffers() {
  // Take the lock to make sure no executor is killed while a task is being launched on it
  val taskDescs = withLock {
    // Filter out executors under killing
    val activeExecutors = executorDataMap.filterKeys(executorIsAlive)
    val workOffers = activeExecutors.map {
      case (id, executorData) =>
        new WorkerOffer(id, executorData.executorHost, executorData.freeCores,
          Some(executorData.executorAddress.hostPort))
    }.toIndexedSeq
    scheduler.resourceOffers(workOffers)
  }
  if (!taskDescs.isEmpty) {
    launchTasks(taskDescs)
  }
}
---------------
// Launch tasks returned by a set of resource offers
private def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
  for (task <- tasks.flatten) {
    val serializedTask = TaskDescription.encode(task)
    if (serializedTask.limit() >= maxRpcMessageSize) {
      Option(scheduler.taskIdToTaskSetManager.get(task.taskId)).foreach { taskSetMgr =>
        try {
          var msg = "Serialized task %s:%d was %d bytes, which exceeds max allowed: " +
            "spark.rpc.message.maxSize (%d bytes). Consider increasing " +
            "spark.rpc.message.maxSize or using broadcast variables for large values."
          msg = msg.format(task.taskId, task.index, serializedTask.limit(), maxRpcMessageSize)
          taskSetMgr.abort(msg)
        } catch {
          case e: Exception => logError("Exception in error callback", e)
        }
      }
    } else {
      val executorData = executorDataMap(task.executorId)
      executorData.freeCores -= scheduler.CPUS_PER_TASK
      executorData.executorEndpoint.send(LaunchTask(new SerializableBuffer(serializedTask)))
    }
  }
}