The ClientEndpoint sends a RegisterApplication request, and the Master replies with a RegisteredApplication message indicating success; at this point application registration is complete. The next step is launching the Executors, and schedule() is the entry point for that:
private def schedule(): Unit = {
  if (state != RecoveryState.ALIVE) {
    return
  }
  // Randomly shuffle all alive workers so that drivers are not piled onto a
  // single worker. Note that a worker registers with the Master after it
  // starts; once registered, the Master holds the worker's endpoint ref and
  // can communicate with it.
  val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
  val numWorkersAlive = shuffledAliveWorkers.size
  var curPos = 0
  for (driver <- waitingDrivers.toList) { // iterate over a copy of waitingDrivers
    var launched = false
    var numWorkersVisited = 0
    while (numWorkersVisited < numWorkersAlive && !launched) {
      val worker = shuffledAliveWorkers(curPos)
      numWorkersVisited += 1
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver)
        waitingDrivers -= driver
        launched = true
      }
      curPos = (curPos + 1) % numWorkersAlive
    }
  }
  // Launch Executors on the workers
  startExecutorsOnWorkers()
}
The last line of schedule() is the concrete entry point for launching Executors, startExecutorsOnWorkers(). The call chain is
startExecutorsOnWorkers() -> allocateWorkerResourceToExecutors() -> launchExecutor(). We focus on launchExecutor() here; the earlier calls in the chain can be skipped for now.
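For context on the skipped methods: startExecutorsOnWorkers decides how many cores each usable worker contributes to an application, by default spreading the cores across workers round-robin (controlled by spark.deploy.spreadOut). The sketch below is an illustrative simplification of that allocation idea, not the actual Spark code; the function name and signature are made up for this example:

```scala
// Illustrative sketch only: round-robin "spread out" core allocation.
// Given the cores an app still needs and each usable worker's free cores,
// return how many cores to take from each worker.
def spreadOutCores(coresNeeded: Int, freeCores: Array[Int]): Array[Int] = {
  val assigned = Array.fill(freeCores.length)(0)
  var remaining = coresNeeded
  var pos = 0
  var visitedWithoutAssign = 0 // consecutive workers visited with nothing free
  while (remaining > 0 && visitedWithoutAssign < freeCores.length) {
    if (assigned(pos) < freeCores(pos)) {
      assigned(pos) += 1 // one core at a time, one worker at a time
      remaining -= 1
      visitedWithoutAssign = 0
    } else {
      visitedWithoutAssign += 1
    }
    pos = (pos + 1) % freeCores.length
  }
  assigned
}

// e.g. spreadOutCores(5, Array(4, 4, 4)) yields Array(2, 2, 1)
```

With spreadOut disabled, the allocation instead fills up each worker before moving to the next; spreading out tends to be better for data locality across the cluster.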
private def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc): Unit = {
  logInfo("Launching executor " + exec.fullId + " on worker " + worker.id)
  worker.addExecutor(exec)
  // Send a LaunchExecutor message to the worker
  worker.endpoint.send(LaunchExecutor(masterUrl,
    exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory))
  // Tell the driver (the ClientEndpoint) that an executor has been added
  exec.application.driver.send(
    ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory))
}
In launchExecutor, the Master first issues the request with worker.endpoint.send(LaunchExecutor(...)). When the worker receives it, it first creates a working directory for the executor:
val executorDir = new File(workDir, appId + "/" + execId)
It then creates an ExecutorRunner, calls its start() method, and sends ExecutorStateChanged messages to both the worker and the Master:
val manager = new ExecutorRunner(
  appId,
  execId,
  appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
  cores_,
  memory_,
  self,
  workerId,
  host,
  webUi.boundPort,
  publicAddress,
  sparkHome,
  executorDir,
  workerUri,
  conf,
  appLocalDirs,
  ExecutorState.RUNNING)
executors(appId + "/" + execId) = manager
// Start the Executor; inside start(), an ExecutorStateChanged message
// is sent to the worker
manager.start()
coresUsed += cores_
memoryUsed += memory_
// Send an ExecutorStateChanged message to the Master
sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None))
First look at ExecutorRunner.start(). The code is very simple: it constructs a thread whose run() method calls fetchAndRunExecutor():
workerThread = new Thread("ExecutorRunner for " + fullId) {
  override def run() { fetchAndRunExecutor() }
}
workerThread.start()
fetchAndRunExecutor mainly uses a ProcessBuilder to assemble the command line that launches the Executor process:
val builder = CommandUtils.buildProcessBuilder(appDesc.command, new SecurityManager(conf),
  memory, sparkHome.getAbsolutePath, substituteVariables)
process = builder.start()
// ... later, once the executor process has exited, the runner reports
// the final state back to the worker:
worker.send(ExecutorStateChanged(appId, execId, state, Some(message), Some(exitCode)))
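The ProcessBuilder pattern used here can be illustrated with a minimal, self-contained sketch; the command below is a stand-in for demonstration, not Spark's actual executor command:

```scala
import java.io.File

object ProcessBuilderSketch {
  def main(args: Array[String]): Unit = {
    // Assemble the command as a list of arguments, much like
    // CommandUtils.buildProcessBuilder assembles the executor JVM command
    val command = java.util.Arrays.asList("echo", "executor started")
    val builder = new ProcessBuilder(command)
    builder.directory(new File("."))  // working dir, analogous to executorDir
    builder.redirectErrorStream(true) // merge stderr into stdout
    val process = builder.start()     // spawn the child process
    val exitCode = process.waitFor()  // block until it exits
    println(s"child exited with code $exitCode")
  }
}
```

In the real ExecutorRunner this waiting happens on the dedicated workerThread, so the Worker's message loop is never blocked by a running executor.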
During executor startup, ExecutorStateChanged messages are thus sent to both the worker and the Master. When the worker receives the message, it simply forwards it on to the Master; see handleExecutorStateChanged in Worker:
sendToMaster(executorStateChanged)
The messages sent on the Executor's behalf all eventually reach the Master. On receiving one, the Master sends an ExecutorUpdated message to the Driver, which here means the ClientEndpoint.
When the ClientEndpoint receives the message, it logs the status information and, depending on the Executor's state, decides whether the Executor needs to be removed.
The Master likewise decides, based on the Executor's state, whether the Executor should be removed, and finally calls schedule() again. The code details can be traced by following the flow chart.
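That last hop can be condensed into a rough sketch of the Master's ExecutorStateChanged handler (simplified and not verbatim Spark code; field and message signatures vary across Spark versions):

```scala
// Rough sketch of Master handling ExecutorStateChanged (simplified):
case ExecutorStateChanged(appId, execId, state, message, exitStatus) =>
  val execOption = idToApp.get(appId).flatMap(app => app.executors.get(execId))
  execOption.foreach { exec =>
    exec.state = state
    // Forward the new state to the driver, i.e. the ClientEndpoint
    exec.application.driver.send(
      ExecutorUpdated(execId, state, message, exitStatus))
    if (ExecutorState.isFinished(state)) {
      // A finished executor is removed from its application,
      // and schedule() runs again to reuse the freed resources
      exec.application.removeExecutor(exec)
      schedule()
    }
  }
```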