Spark 1.6 Source Code Walkthrough: TaskScheduler

TaskScheduler is one of the key members of SparkContext. It is responsible for submitting tasks and for asking the cluster manager to schedule them; it can be viewed as the client side of task scheduling.

SparkContext, line 522, creates the TaskScheduler:

    val (sched, ts) = SparkContext.createTaskScheduler(this, master)

SparkContext, line 2592, holds the concrete implementation, createTaskScheduler:

  private def createTaskScheduler(
      sc: SparkContext,
      master: String): (SchedulerBackend, TaskScheduler) = {
    import SparkMasterRegex._

    // When running locally, don't try to re-execute tasks on failure.
    val MAX_LOCAL_TASK_FAILURES = 1

    master match {
      case "local" =>
        val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
        val backend = new LocalBackend(sc.getConf, scheduler, 1)
        scheduler.initialize(backend)
        (backend, scheduler)

Its behavior depends on the master URL; this article uses local mode as the example, in which it creates a TaskSchedulerImpl and a LocalBackend.
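
For comparison, the neighbouring match arm for local[N] and local[*] (abridged from the same method, with a comment added) differs mainly in the thread count handed to LocalBackend:

      case LOCAL_N_REGEX(threads) =>
        def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
        // local[*] uses all available cores; local[N] uses exactly N threads
        val threadCount = if (threads == "*") localCpuCount else threads.toInt
        if (threadCount <= 0) {
          throw new SparkException(s"Asked to run locally with $threadCount threads")
        }
        val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
        val backend = new LocalBackend(sc.getConf, scheduler, threadCount)
        scheduler.initialize(backend)
        (backend, scheduler)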

Construction code, TaskSchedulerImpl line 102:

  var dagScheduler: DAGScheduler = null

  var backend: SchedulerBackend = null

  val mapOutputTracker = SparkEnv.get.mapOutputTracker

  var schedulableBuilder: SchedulableBuilder = null
  var rootPool: Pool = null
  // default scheduler is FIFO
  private val schedulingModeConf = conf.get("spark.scheduler.mode", "FIFO")
  val schedulingMode: SchedulingMode = try {
    SchedulingMode.withName(schedulingModeConf.toUpperCase)
  } catch {
    case e: java.util.NoSuchElementException =>
      throw new SparkException(s"Unrecognized spark.scheduler.mode: $schedulingModeConf")
  }

  // This is a var so that we can reset it for testing purposes.
  private[spark] var taskResultGetter = new TaskResultGetter(sc.env, this)

Analysis: (1) it reads configuration such as the scheduling mode (FIFO or FAIR); (2) it creates a TaskResultGetter, which uses a thread pool to process the task execution results sent back by the Executors on the Workers.
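
A minimal sketch of that thread-pool pattern (simplified names, not the real Spark API; the actual TaskResultGetter additionally deserializes the result bytes and calls back into TaskSchedulerImpl):

    import java.util.concurrent.Executors

    // Sketch: results from executors are handed to a fixed-size pool so
    // deserialization never blocks the scheduler's RPC threads.
    class ResultGetterSketch(numThreads: Int) {
      private val pool = Executors.newFixedThreadPool(numThreads)

      def enqueueSuccessfulTask(taskId: Long, serializedResult: Array[Byte]): Unit = {
        pool.execute(new Runnable {
          override def run(): Unit = {
            // here Spark would deserialize the bytes and notify the scheduler
            println(s"task $taskId finished, ${serializedResult.length} result bytes")
          }
        })
      }
    }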

TaskSchedulerImpl supports two scheduling modes, but the final scheduling of tasks always falls to a concrete SchedulerBackend implementation.
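
That contract is small; abridged from the SchedulerBackend trait in Spark 1.6, the core methods are:

    private[spark] trait SchedulerBackend {
      def start(): Unit
      def stop(): Unit
      // ask the backend to offer its free resources to the scheduler again
      def reviveOffers(): Unit
      def defaultParallelism(): Int
    }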

SparkContext, line 2603, creates the LocalBackend:

        val backend = new LocalBackend(sc.getConf, scheduler, 1)

A method of LocalBackend worth noting, line 123:

  override def start() {
    val rpcEnv = SparkEnv.get.rpcEnv
    val executorEndpoint = new LocalEndpoint(rpcEnv, userClassPath, scheduler, this, totalCores)
    localEndpoint = rpcEnv.setupEndpoint("LocalBackendEndpoint", executorEndpoint)
    listenerBus.post(SparkListenerExecutorAdded(
      System.currentTimeMillis,
      executorEndpoint.localExecutorId,
      new ExecutorInfo(executorEndpoint.localExecutorHostname, totalCores, Map.empty)))
    launcherBackend.setAppId(appId)
    launcherBackend.setState(SparkAppHandle.State.RUNNING)
  }

Analysis: it creates a LocalEndpoint, which shows that LocalBackend communicates by exchanging messages through this LocalEndpoint.
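
Abridged from LocalEndpoint in the same file, its receive method shows the messages exchanged: ReviveOffers triggers resource offers, and StatusUpdate reports task state back to the scheduler:

    override def receive: PartialFunction[Any, Unit] = {
      case ReviveOffers =>
        reviveOffers()

      case StatusUpdate(taskId, state, serializedData) =>
        scheduler.statusUpdate(taskId, state, serializedData)
        if (TaskState.isFinished(state)) {
          freeCores += scheduler.CPUS_PER_TASK   // reclaim the task's cores
          reviveOffers()
        }

      case KillTask(taskId, interruptThread) =>
        executor.killTask(taskId, interruptThread)
    }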

Once TaskSchedulerImpl and LocalBackend have been created, they are initialized.

SparkContext, line 2616, calls the initialization method:

    scheduler.initialize(backend)

This calls into TaskSchedulerImpl, line 126:

  def initialize(backend: SchedulerBackend) {
    // keep a reference to the SchedulerBackend (LocalBackend in local mode)
    this.backend = backend
    // temporarily set rootPool name to empty; this is the root scheduling pool
    rootPool = new Pool("", schedulingMode, 0, 0)
    // build the schedulable builder that manages the pool for the configured scheduling mode
    schedulableBuilder = {
      schedulingMode match {
        case SchedulingMode.FIFO =>
          new FIFOSchedulableBuilder(rootPool)
        case SchedulingMode.FAIR =>
          new FairSchedulableBuilder(rootPool, conf)
      }
    }
    schedulableBuilder.buildPools()
  }

With that, the TaskScheduler is fully created.
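
As a usage note, the scheduling mode chosen in initialize is driven entirely by configuration; a hypothetical user-side setup selecting FAIR mode (the file path is a placeholder) might look like the sketch below. With the default FIFO mode, buildPools() is essentially a no-op, while FAIR reads pool definitions from the allocation file:

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical example configuration, not from the source walked through above
    val conf = new SparkConf()
      .setMaster("local")
      .setAppName("scheduler-demo")
      .set("spark.scheduler.mode", "FAIR")
      .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")
    val sc = new SparkContext(conf)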
