Spark Source Code Analysis: How a Worker Node Launches the Driver and Executor

I. Launching the driver

1. Everything starts with the schedule() method in Master.scala, whose two key calls, launchDriver() and launchExecutor(), start drivers and executors respectively. schedule() runs on the master whenever the available resources change or a new application is submitted.

2. Drivers are scheduled first, in strict FIFO order over the waiting list. A driver is started via launchDriver(worker, driver) only on a worker whose free memory covers the driver's requested memory and whose free cores cover the driver's requested cores; once launched, the driver is removed from the waiting list (an ArrayBuffer) and its launched flag is set to true.
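A simplified, self-contained sketch of that scheduling pass (the class and field names below are reduced stand-ins for WorkerInfo/DriverInfo in Master.scala; the real loop also shuffles the alive workers and tracks a launched flag):

import scala.collection.mutable.ArrayBuffer

case class DriverInfo(id: String, mem: Int, cores: Int)
class WorkerInfo(val id: String, var memoryFree: Int, var coresFree: Int)

object DriverSchedulingSketch extends App {
  val waitingDrivers = ArrayBuffer(DriverInfo("driver-001", mem = 512, cores = 1))
  val workers = Seq(new WorkerInfo("worker-A", memoryFree = 1024, coresFree = 4))

  def launchDriver(worker: WorkerInfo, driver: DriverInfo): Unit = {
    // the real Master also sends the LaunchDriver message to the worker's actor here
    worker.memoryFree -= driver.mem
    worker.coresFree -= driver.cores
    println(s"Launching ${driver.id} on ${worker.id}")
  }

  // FIFO pass over the waiting list: launch only where both resource checks pass
  for (driver <- waitingDrivers.toList) {
    workers.find(w => w.memoryFree >= driver.mem && w.coresFree >= driver.cores) match {
      case Some(worker) =>
        launchDriver(worker, driver)
        waitingDrivers -= driver // removed from the waiting ArrayBuffer once launched
      case None => // stays queued until resources free up
    }
  }
}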

def launchDriver(worker: WorkerInfo, driver: DriverInfo) {
    logInfo("Launching driver " + driver.id + " on worker " + worker.id)
    worker.addDriver(driver)
    driver.worker = Some(worker)
    worker.actor ! LaunchDriver(driver.id, driver.desc) // send the LaunchDriver case class to the worker as an Akka actor message
    driver.state = DriverState.RUNNING
  }
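For reference, LaunchDriver and DriverStateChanged are plain case classes defined in DeployMessages.scala. The sketch below approximates their shape; DriverDescription and DriverState are stubbed here (with made-up fields) so the snippet stands alone:

case class DriverDescription(jarUrl: String, mem: Int, cores: Int) // stub; the real type also carries the launch Command
object DriverState extends Enumeration { val SUBMITTED, RUNNING, FINISHED, FAILED, KILLED, ERROR = Value }

sealed trait DeployMessage extends Serializable

// Master -> Worker: ask the worker to fork a driver process
case class LaunchDriver(driverId: String, driverDesc: DriverDescription) extends DeployMessage

// Worker -> Master: report the driver's terminal state
case class DriverStateChanged(
    driverId: String,
    state: DriverState.Value,
    exception: Option[Exception]) extends DeployMessage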

3. The worker receives the LaunchDriver message sent by the master (in Worker.scala):

case LaunchDriver(driverId, driverDesc) => {
      logInfo(s"Asked to launch driver $driverId")
      val driver = new DriverRunner(
        conf,
        driverId,
        workDir,
        sparkHome,
        driverDesc.copy(command = Worker.maybeUpdateSSLSettings(driverDesc.command, conf)),
        self,
        akkaUrl)
      drivers(driverId) = driver
      driver.start()

      coresUsed += driverDesc.cores
      memoryUsed += driverDesc.mem
    }

Inside the LaunchDriver handler, the worker creates a DriverRunner object (which wraps a thread internally), adds it to the worker's internal driver map (a HashMap from driverId to DriverRunner), calls the DriverRunner's start() method, and updates the worker's used-core and used-memory counters.
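A toy model of that bookkeeping, with DriverRunner stubbed so the snippet stands alone:

import scala.collection.mutable.HashMap

class DriverRunnerStub(val driverId: String) { // stand-in for the real DriverRunner
  def start(): Unit = println(s"$driverId launch thread started")
}

object WorkerBookkeeping {
  val drivers = new HashMap[String, DriverRunnerStub] // driverId -> runner, as in Worker.scala
  var coresUsed = 0
  var memoryUsed = 0

  def onLaunchDriver(driverId: String, cores: Int, mem: Int): Unit = {
    val runner = new DriverRunnerStub(driverId)
    drivers(driverId) = runner
    runner.start()
    coresUsed += cores // mirrors coresUsed += driverDesc.cores in the handler above
    memoryUsed += mem
  }
}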

4. A look at driver.start() (DriverRunner.scala):

def start() = {
    new Thread("DriverRunner for " + driverId) {
      override def run() {
        try {
          val driverDir = createWorkingDirectory()
          val localJarFilename = downloadUserJar(driverDir)

          def substituteVariables(argument: String): String = argument match {
            case "{{WORKER_URL}}" => workerUrl
            case "{{USER_JAR}}" => localJarFilename
            case other => other
          }

          // TODO: If we add ability to submit multiple jars they should also be added here
          /*
            The assembled launch command is similar in form to:
            storm jar <jar-path> <classpath> <parameters>
           */
          val builder = CommandUtils.buildProcessBuilder(driverDesc.command, driverDesc.mem,
            sparkHome.getAbsolutePath, substituteVariables)
          launchDriver(builder, driverDir, driverDesc.supervise)
        }
        catch {
          case e: Exception => finalException = Some(e)
        }
        // determine the driver's final state (KILLED / ERROR / FINISHED / FAILED)
        val state =
          if (killed) {
            DriverState.KILLED
          } else if (finalException.isDefined) {
            DriverState.ERROR
          } else {
            finalExitCode match {
              case Some(0) => DriverState.FINISHED
              case _ => DriverState.FAILED
            }
          }

        finalState = Some(state)

        worker ! DriverStateChanged(driverId, state, finalException)
      }
    }.start()
  }
This method does three things: (1) it creates the driver's working directory; (2) it downloads the user jar to execute from the master (moving the computation rather than the data); (3) it calls the internal launchDriver(builder, driverDir, supervise) method, which uses Java's ProcessBuilder API to fork and run the driver as a separate process. This is where the driver actually starts.
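A minimal, self-contained sketch of the fork-and-wait pattern launchDriver() builds on (the real code assembles the command via CommandUtils.buildProcessBuilder and redirects the streams by hand; the directory and command below are made up, and the Java 7 redirect API is used for brevity):

import java.io.File

object ProcessLaunchSketch {
  def main(args: Array[String]): Unit = {
    val driverDir = new File("/tmp/driver-sketch") // made-up working directory
    driverDir.mkdirs()

    val builder = new ProcessBuilder("java", "-version") // stand-in for the real driver command
    builder.directory(driverDir) // run inside the driver's working directory
    builder.redirectOutput(new File(driverDir, "stdout")) // capture output, as DriverRunner does
    builder.redirectError(new File(driverDir, "stderr"))

    val process = builder.start()
    val exitCode = process.waitFor() // DriverRunner maps 0 -> FINISHED, nonzero -> FAILED
    println(s"driver process exited with code $exitCode")
  }
}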

The runner thread then sends a DriverStateChanged(driverId, state, finalException) message to the worker; on receiving it, the worker forwards the identical DriverStateChanged message to the master.
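A toy model of that relay (Akka replaced by direct method calls, the resource-release bookkeeping elided, and DriverStateChanged redefined with a simplified String state so the snippet stands alone):

case class DriverStateChanged(driverId: String, state: String, exception: Option[Exception])

class MasterStub {
  def receive(msg: DriverStateChanged): Unit = println(s"master sees: $msg")
}

class WorkerStub(master: MasterStub) {
  // in Worker.scala this is a `case DriverStateChanged(...)` branch in receive
  def receive(msg: DriverStateChanged): Unit = master.receive(msg) // relay upstream unchanged
}

object RelayDemo extends App {
  new WorkerStub(new MasterStub).receive(DriverStateChanged("driver-001", "FINISHED", None))
}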

II. Launching the executor

1. This mirrors the driver path: when schedule() finds the conditions met (e.g., sufficient free resources), it calls launchExecutor() (Master.scala):

def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc) {
    logInfo("Launching executor " + exec.fullId + " on worker " + worker.id)
    worker.addExecutor(exec)
    worker.actor ! LaunchExecutor(masterUrl,
      exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory)
    // after the executor is launched, tell the application's driver that it has been added
    exec.application.driver ! ExecutorAdded(
      exec.id, worker.id, worker.hostPort, exec.cores, exec.memory)
  }
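Both messages sent above are case classes from DeployMessages.scala; the sketch below approximates their shape, with ApplicationDescription stubbed (made-up fields) so it compiles on its own:

case class ApplicationDescription(name: String, memoryPerExecutor: Int) // stub; the real type also carries the executor command

// Master -> Worker: fork an executor for this application
case class LaunchExecutor(masterUrl: String, appId: String, execId: Int,
    appDesc: ApplicationDescription, cores: Int, memory: Int)

// Master -> application driver: an executor has been allocated to the application
case class ExecutorAdded(id: Int, workerId: String, hostPort: String, cores: Int, memory: Int)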
2. On the worker, an ExecutorRunner object is created; it holds a thread internally and encapsulates the information needed to start the executor, and its start() method is then called:

case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
......
    val manager = new ExecutorRunner(...)
    manager.start()
    coresUsed += cores_
    memoryUsed += memory_
    master ! ExecutorStateChanged(appId, execId, manager.state, None, None)
......
3. The manager.start() method from step 2 (ExecutorRunner.scala):

def start() {
    workerThread = new Thread("ExecutorRunner for " + fullId) {
      override def run() { fetchAndRunExecutor() }
    }
    workerThread.start()
    // Shutdown hook that kills actors on shutdown.
    shutdownHook = new Thread() {
      override def run() {
        killProcess(Some("Worker shutting down"))
      }
    }
    Runtime.getRuntime.addShutdownHook(shutdownHook)
  }
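The shutdown hook registered above matters because the executor runs as a separate OS process: if the worker JVM goes down, the hook still fires and kills the child, preventing orphaned executors. A minimal, self-contained demonstration of the mechanism:

object ShutdownHookDemo extends App {
  val hook = new Thread() {
    override def run(): Unit = println("shutdown hook: this is where the child process would be killed")
  }
  Runtime.getRuntime.addShutdownHook(hook)
  println("main finished; the hook fires as the JVM exits")
}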
The thread's body, fetchAndRunExecutor(), then assembles the launch command, forks the executor process, and supervises it until it exits:
def fetchAndRunExecutor() {
    try {
      // Launch the process
      val builder = CommandUtils.buildProcessBuilder(appDesc.command, memory,
        sparkHome.getAbsolutePath, substituteVariables)
      val command = builder.command()
      logInfo("Launch command: " + command.mkString("\"", "\" \"", "\""))

      builder.directory(executorDir)
      builder.environment.put("SPARK_LOCAL_DIRS", appLocalDirs.mkString(","))
      // In case we are running this from within the Spark Shell, avoid creating a "scala"
      // parent process for the executor command
      builder.environment.put("SPARK_LAUNCH_WITH_SCALA", "0")

      // Add webUI log urls
      val baseUrl =
        s"http://$publicAddress:$webUiPort/logPage/?appId=$appId&executorId=$execId&logType="
      builder.environment.put("SPARK_LOG_URL_STDERR", s"${baseUrl}stderr")
      builder.environment.put("SPARK_LOG_URL_STDOUT", s"${baseUrl}stdout")

      process = builder.start()
      val header = "Spark Executor Command: %s\n%s\n\n".format(
        command.mkString("\"", "\" \"", "\""), "=" * 40)

      // Redirect its stdout and stderr to files
      val stdout = new File(executorDir, "stdout")
      stdoutAppender = FileAppender(process.getInputStream, stdout, conf)

      val stderr = new File(executorDir, "stderr")
      Files.write(header, stderr, UTF_8)
      stderrAppender = FileAppender(process.getErrorStream, stderr, conf)

      // Wait for it to exit; executor may exit with code 0 (when driver instructs it to shutdown)
      // or with nonzero exit code
      val exitCode = process.waitFor()
      state = ExecutorState.EXITED
      val message = "Command exited with code " + exitCode
      worker ! ExecutorStateChanged(appId, execId, state, Some(message), Some(exitCode))
    } catch {
      case interrupted: InterruptedException => {
        logInfo("Runner thread for executor " + fullId + " interrupted")
        state = ExecutorState.KILLED
        killProcess(None)
      }
      case e: Exception => {
        logError("Error running executor", e)
        state = ExecutorState.FAILED
        killProcess(Some(e.toString))
      }
    }
  }
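The FileAppender calls above drain the child's stdout/stderr into files; if nothing consumed those pipes, the executor could block once the OS buffer filled. A minimal sketch of that drain-to-file pattern (Spark's real FileAppender adds rolling and configuration on top; the output path below is made up):

import java.io.{File, FileOutputStream, InputStream}

object AppenderSketch {
  // drain `in` into `file` on a background thread so the child process
  // never blocks on a full stdout/stderr pipe
  def appendToFile(in: InputStream, file: File): Thread = {
    val t = new Thread("stream-appender") {
      override def run(): Unit = {
        val out = new FileOutputStream(file, true)
        try {
          val buf = new Array[Byte](4096)
          var n = in.read(buf)
          while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
        } finally out.close()
      }
    }
    t.setDaemon(true)
    t.start()
    t
  }

  def main(args: Array[String]): Unit = {
    val process = new ProcessBuilder("java", "-version").start() // java -version prints to stderr
    val drainer = appendToFile(process.getErrorStream, new File("/tmp/executor-sketch-stderr"))
    process.waitFor()
    drainer.join() // make sure the drain finishes before the JVM exits
  }
}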





Its main job mirrors the driver path: a working directory is created for the executor and the application jar is fetched; fetchAndRunExecutor() then forks the executor process for the application, waits for it to exit, and sends an ExecutorStateChanged message to the worker.

4. The worker forwards the received ExecutorStateChanged message on to the master, mirroring the driver-side relay sketched earlier.






