Spark supports several deployment modes. In each of them, the cluster manager allocates resources for the application, launches Executors on those resources, and the Executors execute the tasks and report task status back to the Driver.
Taking standalone mode as an example, let us analyze what happens when an Executor fails. The runtime structure is shown in the figure below, where dashed lines denote the message-communication paths during normal operation and solid lines denote the exception-handling steps.
(1) In standalone mode, after an application is submitted, the Master in the cluster allocates resources to it and starts an ExecutorRunner on a Worker. The ExecutorRunner then launches a CoarseGrainedExecutorBackend process according to the current deployment mode. Once started, this process sends a RegisterExecutor message to the Driver; if registration succeeds, the CoarseGrainedExecutorBackend creates an Executor internally. The Executor is managed by the ExecutorRunner: when the Executor fails (for example, its hosting CoarseGrainedExecutorBackend process exits abnormally), the ExecutorRunner catches the failure and sends an ExecutorStateChanged message to the Worker.
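The failure-notification path described above can be sketched with a simplified, self-contained model. The ExecutorStateChanged case class below mirrors the field layout of Spark's message of the same name, but FakeWorker and runAndReport are hypothetical illustrations, not Spark APIs:

```scala
import scala.collection.mutable

// Simplified executor states, modeled after Spark's ExecutorState enum.
object ExecutorState extends Enumeration {
  val LAUNCHING, RUNNING, FAILED, EXITED = Value
}

// Message the runner sends to the Worker when an executor's state changes
// (same field layout as Spark's ExecutorStateChanged message).
case class ExecutorStateChanged(
    appId: String,
    execId: Int,
    state: ExecutorState.Value,
    message: Option[String],
    exitStatus: Option[Int])

// Hypothetical stand-in for the Worker endpoint: it just records messages.
class FakeWorker {
  val received = mutable.Buffer[ExecutorStateChanged]()
  def send(msg: ExecutorStateChanged): Unit = received += msg
}

// Sketch of the runner's role: run the executor body, and if it exits
// abnormally, catch the failure and report FAILED to the Worker.
def runAndReport(worker: FakeWorker, appId: String, execId: Int)(body: => Unit): Unit = {
  try {
    body
    worker.send(ExecutorStateChanged(appId, execId, ExecutorState.EXITED, None, Some(0)))
  } catch {
    case e: Exception =>
      worker.send(ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
        Some(e.getMessage), Some(1)))
  }
}
```

For example, `runAndReport(worker, "app-1", 0) { throw new RuntimeException("backend exited") }` leaves a FAILED state change in the worker's inbox, which is the shape of the notification the real ExecutorRunner delivers.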
Worker # launchExecutor:
case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
  ...
  val manager = new ExecutorRunner(
    appId,
    execId,
    appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
    cores_,
    memory_,
    self,
    workerId,
    host,
    webUi.boundPort,
    publicAddress,
    sparkHome,
    executorDir,
    workerUri,
    conf,
    appLocalDirs, ExecutorState.RUNNING)
  executors(appId + "/" + execId) = manager
  manager.start()
  ...
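Note how the Worker keys its executors map by the composite string appId + "/" + execId, so executors from different applications can be tracked in a single map. A minimal sketch of this bookkeeping pattern (the RunnerHandle type and sample ids are illustrative, not Spark's):

```scala
import scala.collection.mutable

// Illustrative stand-in for the ExecutorRunner entries the Worker tracks.
case class RunnerHandle(appId: String, execId: Int)

val executors = mutable.HashMap[String, RunnerHandle]()

// Same composite key the Worker builds: appId + "/" + execId.
def fullId(appId: String, execId: Int): String = appId + "/" + execId

// Register two executors for the same application.
executors(fullId("app-20240101", 0)) = RunnerHandle("app-20240101", 0)
executors(fullId("app-20240101", 1)) = RunnerHandle("app-20240101", 1)

// When an executor fails, the same composite key locates and removes it.
val failed = executors.remove(fullId("app-20240101", 1))
```

The composite key lets the Worker resolve an ExecutorStateChanged message straight to the affected runner without a per-application lookup table.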
ExecutorRunner # start:
private[worker] def start() {
  workerThread = new Thread("ExecutorRunner for " + fullId) {
    override def run() { fetchAndRunExecutor() }
  }
  workerThread.start()
  // Shutdown hook that kills actors on shutdown.
  shutdownHook = ShutdownHookManager.addShutdownHook { () =>
    // It's possible that we arrive here before calling `fetchAndRunExecutor`, then `state` will
    // be `ExecutorState.RUNNING`. In this case, we should set &