Note: This article follows the previous post [[spark] Driver startup process in standalone cluster mode] and continues with how the Driver requests resources after it has started...
Overview:
In standalone mode, when the cluster starts, each Worker registers with the Master, so the Master is aware of and can manage the whole cluster;
With the help of ZooKeeper, the Master can achieve high availability in a fairly simple way;
The application interacts with the cluster through SparkContext; when the SparkContext is created, the Application is registered and the Master allocates Executors for it;
After the application creates RDDs and applies a series of transformations on them, an action is triggered; the DAGScheduler splits the DAG into Stages and converts each Stage into a TaskSet, which is handed to TaskSchedulerImpl;
TaskSchedulerImpl then goes through the scheduler backend's reviveOffers (SparkDeploySchedulerBackend, now StandaloneSchedulerBackend) and finally sends a LaunchTask message to the ExecutorBackend;
After receiving the message, the ExecutorBackend launches the Task, and the computation starts running on the cluster.
Steps:
After DriverWrapper is launched on a Worker node, its main method begins the actual work, so let's look directly at DriverWrapper#main().
// Main method of DriverWrapper
def main(args: Array[String]) {
  args.toList match {
    // mainClass below is the application class we actually submitted
    case workerUrl :: userJar :: mainClass :: extraArgs =>
      ......
      // SecurityManager: Spark's implementation of authentication and authorization
      val rpcEnv = RpcEnv.create("Driver", host, port, conf, new SecurityManager(conf))
      // WorkerWatcher monitors the worker process state
      rpcEnv.setupEndpoint("workerWatcher", new WorkerWatcher(rpcEnv, workerUrl))
      ......
      Thread.currentThread.setContextClassLoader(loader)
      setupDependencies(loader, userJar)
      // Load the main class of the submitted application
      val clazz = Utils.classForName(mainClass)
      // Get the main method of the submitted application
      val mainMethod = clazz.getMethod("main", classOf[Array[String]])
      /**
       * Invoke the main method of the submitted application.
       * Starting the application here first creates SparkConf and SparkContext;
       * the try block around line 362 of SparkContext creates the TaskScheduler (line 492).
       */
      mainMethod.invoke(null, extraArgs.toArray[String])
      rpcEnv.shutdown()
      ......
  }
}
The comments in main() matter, in particular "starting the application here first creates SparkConf and SparkContext": before the application (i.e. the client job) is started, SparkConf and SparkContext must be created. Here a simple WordCount serves as the client application.
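For reference, a minimal WordCount driver might look like the sketch below (the master URL reuses the spark://node1:7077 example from this article, and the input path is a placeholder); creating the SparkContext is the step that triggers everything described in the rest of this post.

import org.apache.spark.{SparkConf, SparkContext}

// Minimal client application, used only for illustration
object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount").setMaster("spark://node1:7077")
    val sc = new SparkContext(conf)        // registration with the Master happens during this constructor

    sc.textFile("hdfs:///tmp/input.txt")   // placeholder input path
      .flatMap(_.split("\\s+"))
      .map((_, 1))
      .reduceByKey(_ + _)
      .collect()
      .foreach(println)                    // the action that triggers job submission

    sc.stop()
  }
}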
In new SparkContext(conf) a series of initialization steps run; one of the core methods is createTaskScheduler(), which creates two objects: TaskSchedulerImpl and StandaloneSchedulerBackend.
/**
 * Create the scheduler. (sched, ts) correspond to the StandaloneSchedulerBackend and TaskSchedulerImpl objects respectively.
 * master is the URL used when submitting the job, e.g. spark://node1:7077
 */
val (sched, ts) = SparkContext.createTaskScheduler(this, master, deployMode)
private def createTaskScheduler(
    sc: SparkContext,
    master: String,
    deployMode: String): (SchedulerBackend, TaskScheduler) = {
  import SparkMasterRegex._
  // When running locally, don't try to re-execute tasks on failure.
  val MAX_LOCAL_TASK_FAILURES = 1
  master match {
    ......
    // Standalone jobs are submitted with a "spark://" master URL
    case SPARK_REGEX(sparkUrl) =>
      // scheduler is a TaskSchedulerImpl object
      val scheduler = new TaskSchedulerImpl(sc)
      val masterUrls = sparkUrl.split(",").map("spark://" + _)
      // backend here is of type StandaloneSchedulerBackend
      val backend = new StandaloneSchedulerBackend(scheduler, sc, masterUrls)
      // TaskSchedulerImpl.initialize stores the backend inside the scheduler; it will be used shortly
      scheduler.initialize(backend)
      // Return the StandaloneSchedulerBackend and TaskSchedulerImpl objects
      (backend, scheduler)
    ......
    case masterUrl =>
      val cm = getClusterManager(masterUrl) match {
        case Some(clusterMgr) => clusterMgr
        case None => throw new SparkException("Could not parse Master URL: '" + master + "'")
      }
      try {
        val scheduler = cm.createTaskScheduler(sc, masterUrl)
        val backend = cm.createSchedulerBackend(sc, masterUrl, scheduler)
        cm.initialize(scheduler, backend)
        (backend, scheduler)
      ......
  }
}
SparkContext holds these two objects: StandaloneSchedulerBackend and TaskSchedulerImpl. The parent class of StandaloneSchedulerBackend is CoarseGrainedSchedulerBackend (coarse-grained resource allocation), which is one of the differences between Spark and MapReduce, since MR uses fine-grained resource allocation. TaskSchedulerImpl is an implementation of TaskScheduler; the SparkContext also instantiates a DAGScheduler, which splits the job into stages and tasks.
Note that scheduler and backend reference each other. After they are instantiated, TaskSchedulerImpl's start() method is called, and internally it calls backend.start() (the backend was stored in the scheduler right after construction via scheduler.initialize(backend)). backend.start() first calls the start() method of its parent class CoarseGrainedSchedulerBackend, which registers a DriverEndpoint with the RPC environment.
SparkContext
scheduler.initialize(backend)
(backend, scheduler)
......
// start() of the TaskSchedulerImpl object
_taskScheduler.start()
......
TaskSchedulerImpl
/**
 * TaskScheduler start
 */
override def start() {
  // Start the StandaloneSchedulerBackend
  backend.start()
  ......
}
StandaloneSchedulerBackend
// around line 115: the description of the application to be submitted
override def start() {
  /**
   * super.start() creates the Driver's endpoint, i.e. a reference to the Driver.
   * Executors will later register themselves back with CoarseGrainedSchedulerBackend,
   * the parent class of StandaloneSchedulerBackend.
   */
  super.start()
  ......
When StandaloneSchedulerBackend.start() calls super.start(), the following code in the parent class CoarseGrainedSchedulerBackend is executed:
override def start() {
  val properties = new ArrayBuffer[(String, String)]
  for ((key, value) <- scheduler.sc.conf.getAll) {
    if (key.startsWith("spark.")) {
      properties += ((key, value))
    }
  }
  // TODO (prashant) send conf instead of properties
  /**
   * Create the DriverEndpoint, i.e. the Driver's communication endpoint, and register it with the RPC environment.
   * Executors will later register themselves back with this DriverEndpoint; its receiveAndReply method
   * keeps listening for and matching incoming messages [around line 167].
   */
  driverEndpoint = createDriverEndpointRef(properties)
}
protected def createDriverEndpointRef(
    properties: ArrayBuffer[(String, String)]): RpcEndpointRef = {
  rpcEnv.setupEndpoint(ENDPOINT_NAME, createDriverEndpoint(properties))
}
StandaloneSchedulerBackend
// around line 115: the description of the application to be submitted
override def start() {
  /**
   * super.start() creates the Driver's endpoint, i.e. a reference to the Driver.
   * Executors will later register themselves back with CoarseGrainedSchedulerBackend,
   * the parent class of StandaloneSchedulerBackend.
   */
  super.start()
  ......
  val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
  val webUrl = sc.ui.map(_.webUrl).getOrElse("")
  val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
  ......
  val appDesc: ApplicationDescription = ApplicationDescription(sc.appName, maxCores, sc.executorMemory, command,
    webUrl, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor, initialExecutorLimit)
  // Description of the application to be submitted
  // appDesc is wrapped here and passed into the StandaloneAppClient
  client = new StandaloneAppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
  // Start the StandaloneAppClient, which will then register the application with the Master
  client.start()
  launcherBackend.setState(SparkAppHandle.State.SUBMITTED)
  waitForRegistration()
  launcherBackend.setState(SparkAppHandle.State.RUNNING)
}
Reading the code carefully: the method first instantiates a Command object, passes it into an ApplicationDescription, then wraps that ApplicationDescription instance into a StandaloneAppClient, and finally calls start() on the StandaloneAppClient instance.
def start() {
  // Just launch an rpcEndpoint; it will call back into the listener.
  /**
   * This simply fills in the empty endpoint [AtomicReference].
   * The key point is that rpcEnv.setupEndpoint creates a ClientEndpoint,
   * and registering an endpoint always triggers ClientEndpoint.onStart.
   */
  endpoint.set(rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv)))
}
StandaloneAppClient.start() registers a ClientEndpoint, so ClientEndpoint.onStart() is guaranteed to be called; inside onStart() we find that it registers with the Master.
// onStart method
override def onStart(): Unit = {
  try {
    // Register the current application's information with the Master
    registerWithMaster(1)
  } catch {
    case e: Exception =>
      logWarning("Failed to connect to master", e)
      markDisconnected()
      stop()
  }
}
The registration method sends the Application's information to every Master; a Master that receives the RegisterApplication message handles it and replies.
// Register the Application with all Masters
private def tryRegisterAllMasters(): Array[JFuture[_]] = {
  // Iterate over all Master addresses
  for (masterAddress <- masterRpcAddresses) yield {
    registerMasterThreadPool.submit(new Runnable {
      override def run(): Unit = try {
        if (registered.get) {
          return
        }
        logInfo("Connecting to master " + masterAddress.toSparkURL + "...")
        // Get a reference to the Master's endpoint
        val masterRef = rpcEnv.setupEndpointRef(masterAddress, Master.ENDPOINT_NAME)
        // Register the application with the Master; the Master's receive method matches the RegisterApplication type
        masterRef.send(RegisterApplication(appDescription, self))
      } catch {
        case ie: InterruptedException => // Cancelled
        case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
      }
    })
  }
}
Next, let's look at how the Master handles a message of type RegisterApplication. Following the whole call chain, you will see that schedule() is the core scheduling method.
// The Application registration request submitted from the Driver side
case RegisterApplication(description, driver) =>
  // TODO Prevent repeated registrations from some driver
  // If this Master is in STANDBY state, ignore the request and do not submit the job
  if (state == RecoveryState.STANDBY) {
    // ignore, don't send response
  } else {
    logInfo("Registering app " + description.name)
    // Wrap the application information; note that if you step into createApplication you will see
    // that by default an application may use up to Int.MaxValue cores
    val app = createApplication(description, driver)
    // Register the app; this adds the current application to waitingApps
    registerApplication(app)
    logInfo("Registered app " + description.name + " with ID " + app.id)
    persistenceEngine.addApplication(app)
    driver.send(RegisteredApplication(app.id, self))
    // Finally the common schedule() method runs again
    schedule()
  }
schedule() then calls startExecutorsOnWorkers(); as the name suggests, this is where executor resources start to be allocated.
/**
 * schedule() is a common method.
 * It also runs when a Driver launch is requested; in that case waitingApps is empty when
 * startExecutorsOnWorkers is reached, so only the Driver is started.
 * When an application is submitted, this schedule() method runs as well; at that point there is no
 * Driver waiting to be launched, and startExecutorsOnWorkers runs directly to allocate resources
 * for the current application.
 */
private def schedule(): Unit = {
  ......
  startExecutorsOnWorkers()
}
private def startExecutorsOnWorkers(): Unit = {
  // Get the submitted apps from waitingApps
  for (app <- waitingApps) {
    // coresPerExecutor: how many cores one Executor uses for this application.
    // It can be set with --executor-cores; as shown below, if unspecified it defaults to 1.
    val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(1)
    // Check whether the application still needs cores; app.coresLeft decreases every time cores are assigned to it
    if (app.coresLeft >= coresPerExecutor) {
      // Filter out workers that don't have enough resources to launch an executor
      // Filter the usable workers
      val usableWorkers : Array[WorkerInfo]= workers.toArray.filter(_.state == WorkerState.ALIVE)
        .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
          worker.coresFree >= coresPerExecutor)
        .sortBy(_.coresFree).reverse
      // The next step decides how many cores each worker contributes and how many Executors to launch.
      // Note: spreadOutApps is true by default.
      // The returned assignedCores is the number of cores each worker node should give to the current application.
      val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)
      for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
        // Allocate the resources to Executors on this worker
        allocateWorkerResourceToExecutors(
          app, assignedCores(pos), app.desc.coresPerExecutor, usableWorkers(pos))
      }
    }
  }
}
What this method does: it matches the Workers that satisfy the resource requirements in the applicationDescription submitted by the client, and then works out the detailed per-Worker allocation (how many cores each worker contributes and how many executors it launches).
Note: spreadOutApps controls how executors are distributed.
For example: if there are 3 usable workers and the submitted application needs 4 executors, those 4 executors could all run on the same worker or be spread evenly across the usable workers; spreadOutApps decides this. It defaults to true, i.e. spread the resources horizontally.
The core of executor allocation is scheduleExecutorsOnWorkers. Its logic is fairly involved, so it is not walked through line by line; the full code is included below for anyone interested. The key variable is assignedCores, the number of cores assigned on each usable Worker; the whole method computes this array and returns it on its last line.
private def scheduleExecutorsOnWorkers(
    app: ApplicationInfo,
    usableWorkers: Array[WorkerInfo],
    spreadOutApps: Boolean): Array[Int] = {
  // How many cores one Executor uses; None if --executor-cores was not specified on submission
  val coresPerExecutor : Option[Int]= app.desc.coresPerExecutor
  // If --executor-cores was not specified, default to 1 core per Executor
  val minCoresPerExecutor = coresPerExecutor.getOrElse(1)
  // oneExecutorPerWorker is true in that case
  val oneExecutorPerWorker :Boolean= coresPerExecutor.isEmpty
  // By default an Executor uses 1024 MB of memory; this is set around line 464 of SparkContext.
  // If the submit command contains --executor-memory, that specified value is used instead.
  val memoryPerExecutor = app.desc.memoryPerExecutorMB
  // Number of usable workers
  val numUsable = usableWorkers.length
  // Two important arrays
  val assignedCores = new Array[Int](numUsable) // Number of cores to give to each worker
  val assignedExecutors = new Array[Int](numUsable) // Number of new executors on each worker
  /**
   * coresToAssign is the number of cores to assign to this Application: the minimum of app.coresLeft
   * and the total free cores across all usable workers.
   * If --total-executor-cores was specified on submission, app.coresLeft is that value.
   */
  var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
  /** Return whether the specified worker can launch an executor for this app. */
  // Decide whether the worker at position pos can still launch an Executor
  def canLaunchExecutor(pos: Int): Boolean = {
    // Are there still at least minCoresPerExecutor cores left to assign?
    val keepScheduling = coresToAssign >= minCoresPerExecutor
    // Does this worker still have enough free cores?
    val enoughCores = usableWorkers(pos).coresFree - assignedCores(pos) >= minCoresPerExecutor
    // If assignedExecutors(pos) == 0, launchingNewExecutor is true
    val launchingNewExecutor = !oneExecutorPerWorker || assignedExecutors(pos) == 0
    // Launching a new Executor
    if (launchingNewExecutor) {
      val assignedMemory = assignedExecutors(pos) * memoryPerExecutor
      // Is there enough memory?
      val enoughMemory = usableWorkers(pos).memoryFree - assignedMemory >= memoryPerExecutor
      // Safety check: the Executors about to be launched plus the Executors the application already has
      // must stay under the application's total Executor limit
      val underLimit = assignedExecutors.sum + app.executors.size < app.executorLimit
      keepScheduling && enoughCores && enoughMemory && underLimit
    } else {
      keepScheduling && enoughCores
    }
  }
  var freeWorkers = (0 until numUsable).filter(canLaunchExecutor)
  while (freeWorkers.nonEmpty) {
    freeWorkers.foreach { pos =>
      var keepScheduling = true
      while (keepScheduling && canLaunchExecutor(pos)) {
        coresToAssign -= minCoresPerExecutor
        assignedCores(pos) += minCoresPerExecutor
        if (oneExecutorPerWorker) {
          assignedExecutors(pos) = 1
        } else {
          assignedExecutors(pos) += 1
        }
        if (spreadOutApps) {
          keepScheduling = false
        }
      }
    }
    freeWorkers = freeWorkers.filter(canLaunchExecutor)
  }
  // Finally return how many cores each Worker gets
  assignedCores
}
After the Master has computed how many cores each Worker should contribute, it iterates over the usable workers and calls allocateWorkerResourceToExecutors() to allocate executors on each worker.
private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    assignedCores: Int,
    coresPerExecutor: Option[Int],
    worker: WorkerInfo): Unit = {
  // If the number of cores per executor is specified, we divide the cores assigned
  // to this worker evenly among the executors with no remainder.
  // Otherwise, we launch a single executor that grabs all the assignedCores on this worker.
  val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
  // How many cores each Executor gets
  val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
  for (i <- 1 to numExecutors) {
    val exec: ExecutorDesc = app.addExecutor(worker, coresToAssign)
    // Launch the Executor on the worker
    launchExecutor(worker, exec)
    app.state = ApplicationState.RUNNING
  }
}
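To make the arithmetic concrete (my own illustration, not Spark source): suppose this worker was assigned 4 cores.

// Toy arithmetic for the numExecutors / coresToAssign split above, assuming assignedCores = 4
val assignedCores = 4
val withExecutorCores = Some(2).map(assignedCores / _).getOrElse(1)                // --executor-cores 2 => 2 executors, 2 cores each
val withoutExecutorCores = (None: Option[Int]).map(assignedCores / _).getOrElse(1) // not specified      => 1 executor taking all 4 cores
println(s"$withExecutorCores vs $withoutExecutorCores executor(s) on this worker")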
allocateWorkerResourceToExecutors() gets a reference to the worker and sends it a message; the worker receives it, matches LaunchExecutor, and creates an ExecutorRunner to do the work. What it runs is the org.apache.spark.executor.CoarseGrainedExecutorBackend class from the Command.
// Create the ExecutorRunner
val manager = new ExecutorRunner(
  appId, execId,
  // appDesc contains Command("org.apache.spark.executor.CoarseGrainedExecutorBackend", ...);
  // its first argument is the Executor backend class
  appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
  cores_, memory_, self,
  workerId, host, webUi.boundPort,
  publicAddress, sparkHome, executorDir,
  workerUri, conf, appLocalDirs, ExecutorState.RUNNING)
executors(appId + "/" + execId) = manager
/**
 * Start the ExecutorRunner.
 * What it starts is the CoarseGrainedExecutorBackend class;
 * its main method registers back with the Driver [run method, around line 293].
 */
manager.start()
coresUsed += cores_
memoryUsed += memory_
sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None))
We are still not done at this point; the application's tasks have not actually been executed yet. Real execution happens in CoarseGrainedExecutorBackend: its main() method calls a run() method that registers the Executor. Only the core code is kept below.
private def run(
  ......
  val env = SparkEnv.createExecutorEnv(
    driverConf, executorId, hostname, cores, cfg.ioEncryptionKey, isLocal = false)
  // Register the Executor endpoint; this triggers CoarseGrainedExecutorBackend.onStart [around line 58]
  env.rpcEnv.setupEndpoint("Executor", new CoarseGrainedExecutorBackend(
    env.rpcEnv, driverUrl, executorId, hostname, cores, userClassPath, env))
  workerUrl.foreach { url =>
    env.rpcEnv.setupEndpoint("WorkerWatcher", new WorkerWatcher(env.rpcEnv, url))
  }
  env.rpcEnv.awaitTermination()
  ......
}
}
This method creates a SparkEnv and then registers the Executor endpoint, i.e. the CoarseGrainedExecutorBackend. So, as usual, we next look at CoarseGrainedExecutorBackend.onStart(); the comments say it all.
override def onStart() {
  logInfo("Connecting to driver: " + driverUrl)
  // Get the Driver's reference from the RPC environment and register this Executor back with the Driver
  rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
    // Got the Driver's reference
    driver = Some(ref)
    // Register this Executor's information back with the Driver, i.e. with the DriverEndpoint
    // created earlier in CoarseGrainedSchedulerBackend;
    // the DriverEndpoint's receiveAndReply method matches RegisterExecutor
    ref.ask[Boolean](RegisterExecutor(executorId, self, hostname, cores, extractLogUrls))
    ......
When the Driver receives a message of type RegisterExecutor, it puts the executor into a HashMap and sends a message back to the executorRef telling the Executor that it has been registered.
// The Executor registering back with the Driver
case RegisterExecutor(executorId, executorRef, hostname, cores, logUrls) =>
  ......
  val data = new ExecutorData(executorRef, executorAddress, hostname,
    cores, cores, logUrls)
  ......
  executorDataMap.put(executorId, data)
  // Get the Executor's endpoint reference and send it a message saying the Executor has been registered.
  // In CoarseGrainedExecutorBackend, the receive method keeps listening for this; once matched, the Executor is created.
  executorRef.send(RegisteredExecutor)
After receiving the registration-success message, the backend creates the real Executor; the Executor contains a thread pool in which tasks run.
Spark tasks run in the cluster as threads, not as separate JVM processes, which is why an Executor object is created here; what ultimately runs the tasks is the executor's thread pool.
// Matched the message from the Driver: the Executor registration has been accepted, so create the Executor
case RegisteredExecutor =>
  // Create the real Executor; the Executor contains a thread pool used to run tasks
  executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
// Launch a Task
case LaunchTask(data) =>
  if (executor == null) {
    exitExecutor(1, "Received LaunchTask command but executor was null")
  } else {
    val taskDesc = TaskDescription.decode(data.value)
    logInfo("Got assigned task " + taskDesc.taskId)
    // The Executor launches the Task
    executor.launchTask(this, taskDesc)
  }
Executor
// Run the task in the thread pool
def launchTask(context: ExecutorBackend, taskDescription: TaskDescription): Unit = {
  val tr = new TaskRunner(context, taskDescription)
  runningTasks.put(taskDescription.taskId, tr)
  threadPool.execute(tr)
}
Framework defaults:
1. Resources are spread out horizontally across workers.
2. Only one Executor is launched per worker, and each Executor requests 1 GB of memory by default.
3. Each Executor grabs all available cores on its worker.
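To override these defaults, you can set the corresponding properties when building the SparkConf (equivalent to --executor-cores, --executor-memory and --total-executor-cores on spark-submit); a minimal sketch, with example values only:

import org.apache.spark.SparkConf

// Overriding the standalone defaults discussed above
val conf = new SparkConf()
  .setAppName("WordCount")
  .setMaster("spark://node1:7077")    // matches the example master URL used in this article
  .set("spark.executor.cores", "2")   // cores per Executor (--executor-cores)
  .set("spark.executor.memory", "2g") // memory per Executor (--executor-memory)
  .set("spark.cores.max", "6")        // total cores for the application (--total-executor-cores)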
Summary:
1. Before the Driver runs the client application, SparkConf and SparkContext objects must be created first.
2. When new SparkContext(...) runs, the TaskSchedulerImpl, StandaloneSchedulerBackend and DAGScheduler objects are initialized and held by the SparkContext.
3. TaskSchedulerImpl#start() ==> StandaloneSchedulerBackend#start() ==>
3.1. The parent class CoarseGrainedSchedulerBackend#start() registers a DriverEndpoint.
3.2. StandaloneSchedulerBackend#start() instantiates a StandaloneAppClient (the application's client); the StandaloneAppClient wraps an ApplicationDescription, which in turn wraps a Command.
4. StandaloneAppClient registers a ClientEndpoint with its RPC environment to communicate with the Master and sends the application-registration message.
5. From there it is a matter of the Master matching the message type and handling it; this whole Driver-side process is the resource application.
6. When the Master handles a RegisterApplication message, it creates and registers the Application and then calls schedule() to allocate resources for it.
6.1. scheduleExecutorsOnWorkers computes and returns the number of cores to assign on each usable Worker.
6.2. allocateWorkerResourceToExecutors() allocates executors on each worker.
7. When a worker matches a LaunchExecutor message, it instantiates an ExecutorRunner and calls start(); the most important part of the ExecutorRunner is the org.apache.spark.executor.CoarseGrainedExecutorBackend class in its Command.
8. The main() method of CoarseGrainedExecutorBackend calls run(), which eventually registers an Executor back with the Driver.
9. When the Driver receives a RegisterExecutor message, it puts the executor into a HashMap and sends a message to the executorRef telling the Executor it has been registered.
10. After receiving the registration-success message, the backend creates the real Executor, which contains a thread pool used to run tasks.