Spark Source Code: Task Submission Flow, Part 5: CoarseGrainedExecutorBackend

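1. Overview

In 4-spark源码-任务提交流程之container中启动executor (part 4 of this series, on launching the executor inside a container), we saw that once the AM obtains resources from the RM, it iterates over the allocated containers and asks the NM to start them; in each container, a /bin/java command launches an org.apache.spark.executor.CoarseGrainedExecutorBackend process.

This article walks through the execution flow inside org.apache.spark.executor.CoarseGrainedExecutorBackend.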

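2. Entry point

After the CoarseGrainedExecutorBackend process is created via /bin/java, execution begins at its main method.

main parses the command-line arguments. If a required argument is missing, it terminates the JVM; otherwise it calls run to continue.

The CoarseGrainedExecutorBackend class mixes in the ThreadSafeRpcEndpoint trait, which in turn extends the RpcEndpoint trait.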

private[spark] object CoarseGrainedExecutorBackend extends Logging {
  def main(args: Array[String]) {
    var driverUrl: String = null
    var executorId: String = null
    var hostname: String = null
    var cores: Int = 0
    var appId: String = null
    var workerUrl: Option[String] = None
    val userClassPath = new mutable.ListBuffer[URL]()
    // Parse the arguments
    var argv = args.toList
    while (!argv.isEmpty) {
      argv match {
        case ("--driver-url") :: value :: tail =>
          driverUrl = value
          argv = tail
        case ("--executor-id") :: value :: tail =>
          executorId = value
          argv = tail
        case ("--hostname") :: value :: tail =>
          hostname = value
          argv = tail
        case ("--cores") :: value :: tail =>
          cores = value.toInt
          argv = tail
        case ("--app-id") :: value :: tail =>
          appId = value
          argv = tail
        case ("--worker-url") :: value :: tail =>
          // Worker url is used in spark standalone mode to enforce fate-sharing with worker
          workerUrl = Some(value)
          argv = tail
        case ("--user-class-path") :: value :: tail =>
          userClassPath += new URL(value)
          argv = tail
        case Nil =>
        case tail =>
          // scalastyle:off println
          System.err.println(s"Unrecognized options: ${tail.mkString(" ")}")
          // scalastyle:on println
          printUsageAndExit()
      }
    }

    if (hostname == null) {
      hostname = Utils.localHostName()
      log.info(s"Executor hostname is not provided, will use '$hostname' to advertise itself")
    }
    
    // A required argument is missing: print usage and terminate the JVM
    if (driverUrl == null || executorId == null || cores <= 0 || appId == null) {
      printUsageAndExit()
    }
    // Call the run method
    run(driverUrl, executorId, hostname, cores, appId, workerUrl, userClassPath)
    System.exit(0)
  }
}
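
The argument loop in main can be reduced to a small recursive function. The following is a minimal sketch of the same list pattern-matching idea, not Spark's code: ArgParserSketch, BackendArgs, and the use of Either in place of printUsageAndExit() are assumptions made for illustration, and only three of the flags are handled.

```scala
object ArgParserSketch {
  // Hypothetical holder for the parsed values; not a Spark class.
  final case class BackendArgs(
      driverUrl: Option[String] = None,
      executorId: Option[String] = None,
      cores: Int = 0)

  // Consume the argument list one flag/value pair at a time, as main does with `argv`.
  @annotation.tailrec
  def parse(argv: List[String], acc: BackendArgs = BackendArgs()): Either[String, BackendArgs] =
    argv match {
      case "--driver-url" :: value :: tail => parse(tail, acc.copy(driverUrl = Some(value)))
      case "--executor-id" :: value :: tail => parse(tail, acc.copy(executorId = Some(value)))
      case "--cores" :: value :: tail => parse(tail, acc.copy(cores = value.toInt))
      case Nil => validate(acc)
      case unknown => Left(s"Unrecognized options: ${unknown.mkString(" ")}")
    }

  // Mirrors the required-argument check that precedes the call to run.
  private def validate(a: BackendArgs): Either[String, BackendArgs] =
    if (a.driverUrl.isEmpty || a.executorId.isEmpty || a.cores <= 0) Left("Missing required arguments")
    else Right(a)
}
```

Returning Left instead of exiting keeps the sketch testable; the real main terminates the JVM in both failure cases.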

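3. run

run assembles the arguments, builds the SparkEnv, registers the CoarseGrainedExecutorBackend instance with the RPC environment, and then blocks the main thread so the process stays alive while the dispatcher delivers messages to that instance.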

private[spark] object CoarseGrainedExecutorBackend extends Logging {

  private def run(
      driverUrl: String,
      executorId: String,
      hostname: String,
      cores: Int,
      appId: String,
      workerUrl: Option[String],
      userClassPath: Seq[URL]) {

    // Initialize the daemon process
    Utils.initDaemon(log)

    // Run as the Spark user
    SparkHadoopUtil.get.runAsSparkUser { () =>
      // Validate the hostname format
      Utils.checkHost(hostname)

      // Create the executor-side SparkConf used while fetching the driver's properties
      val executorConf = new SparkConf
      // Create a temporary RPC environment
      val fetcher = RpcEnv.create(
        "driverPropsFetcher",
        hostname,
        -1,
        executorConf,
        new SecurityManager(executorConf),
        clientMode = true)
      // Using the --driver-url argument, obtain an RPC reference to the driver endpoint
      val driver = fetcher.setupEndpointRefByURI(driverUrl)
      // Ask the driver for its SparkAppConfig
      val cfg = driver.askSync[SparkAppConfig](RetrieveSparkAppConfig)
      // Take the Spark properties from the SparkAppConfig and append the application id
      val props = cfg.sparkProperties ++ Seq[(String, String)](("spark.app.id", appId))
      // Shut down the temporary RPC environment
      fetcher.shutdown()

      // Copy the Spark properties obtained from the driver into a SparkConf
      val driverConf = new SparkConf()
      for ((key, value) <- props) {
        // this is required for SSL in standalone mode
        if (SparkConf.isExecutorStartupConf(key)) {
          driverConf.setIfMissing(key, value)
        } else {
          driverConf.set(key, value)
        }
      }
      // Add the delegation tokens obtained from the driver
      cfg.hadoopDelegationCreds.foreach { tokens =>
        SparkHadoopUtil.get.addDelegationTokens(tokens, driverConf)
      }

      // Create the executor's SparkEnv from the settings in the SparkConf
      // This also assigns the env's rpcEnv field, which receives a NettyRpcEnv instance
      val env = SparkEnv.createExecutorEnv(
        driverConf, executorId, hostname, cores, cfg.ioEncryptionKey, isLocal = false)

      // Build a CoarseGrainedExecutorBackend instance and register it with the dispatcher under the name "Executor"
      env.rpcEnv.setupEndpoint("Executor", new CoarseGrainedExecutorBackend(
        env.rpcEnv, driverUrl, executorId, hostname, cores, userClassPath, env))
      workerUrl.foreach { url =>
        env.rpcEnv.setupEndpoint("WorkerWatcher", new WorkerWatcher(env.rpcEnv, url))
      }
      // Block the main thread until the rpcEnv terminates; the wait is tied to the state of the dispatcher's thread pool and keeps main from returning
      env.rpcEnv.awaitTermination()
    }
  }
}

3.1. Registering the backend with the dispatcher

NettyRpcEnv is the implementation of RpcEnv.

Its setupEndpoint method delegates to the dispatcher's registerRpcEndpoint method.

private[netty] class NettyRpcEnv(
    val conf: SparkConf,
    javaSerializerInstance: JavaSerializerInstance,
    host: String,
    securityManager: SecurityManager,
    numUsableCores: Int) extends RpcEnv(conf) with Logging {
  
  private val dispatcher: Dispatcher = new Dispatcher(this, numUsableCores)
  
  override def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef = {
    dispatcher.registerRpcEndpoint(name, endpoint)
  }
}

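3.1.1. Registering an RPC endpoint in the dispatcher

The RPC endpoint here is a CoarseGrainedExecutorBackend instance.

EndpointData wraps the endpoint information: the name, the endpoint, the endpoint reference, and the bound inbox.

The dispatcher registers the endpoint by caching the name -> EndpointData mapping in a ConcurrentHashMap.

The EndpointData is also placed on a LinkedBlockingQueue, which serves as the queue of receivers with pending messages; think of it as the recipient list of a mail system.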

private[netty] class Dispatcher(nettyEnv: NettyRpcEnv, numUsableCores: Int) extends Logging {
  def registerRpcEndpoint(name: String, endpoint: RpcEndpoint): NettyRpcEndpointRef = {
    // Build the endpoint's address identifier: nettyEnv.address is the endpoint's address (host and port); name is the endpoint's name
    val addr = RpcEndpointAddress(nettyEnv.address, name)
    // Reference to the RPC endpoint
    val endpointRef = new NettyRpcEndpointRef(nettyEnv.conf, addr, nettyEnv)
    synchronized {
      if (stopped) {
        throw new IllegalStateException("RpcEnv has been stopped")
      }
      // Register the RPC endpoint (the CoarseGrainedExecutorBackend instance) with the dispatcher:
      // the inner class EndpointData wraps the endpoint information, and a ConcurrentHashMap caches it by name
      if (endpoints.putIfAbsent(name, new EndpointData(name, endpoint, endpointRef)) != null) {
        throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
      }
      // Cache the mapping from endpoint to endpoint reference
      val data = endpoints.get(name)
      endpointRefs.put(data.endpoint, data.ref)
      // Put the EndpointData on the LinkedBlockingQueue of receivers so that a message loop picks it up
      receivers.offer(data)  // for the OnStart message
    }
    endpointRef
  }
}
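
The registration step combines two JDK collections: a ConcurrentHashMap that rejects duplicate names and a LinkedBlockingQueue that schedules the new entry for processing. A minimal sketch of that combination follows; RegistrySketch and Entry are hypothetical names, and the real EndpointData also carries the endpoint, its reference, and its inbox.

```scala
import java.util.concurrent.{ConcurrentHashMap, LinkedBlockingQueue}

object RegistrySketch {
  // Hypothetical stand-in for EndpointData.
  final case class Entry(name: String)

  private val endpoints = new ConcurrentHashMap[String, Entry]()
  private val receivers = new LinkedBlockingQueue[Entry]()

  def register(name: String): Entry = {
    val entry = Entry(name)
    // putIfAbsent returns the previous value, so a non-null result means the name is taken.
    if (endpoints.putIfAbsent(name, entry) != null) {
      throw new IllegalArgumentException(s"There is already an endpoint called $name")
    }
    receivers.offer(entry) // schedule the entry so its first message gets processed
    entry
  }

  def pendingReceivers: Int = receivers.size()
}
```

A second register call with the same name throws and leaves the queue untouched, which is the behavior the putIfAbsent check in registerRpcEndpoint provides.
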
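3.1.1.1 EndpointData: the endpoint information wrapper

It wraps the endpoint information: the name, the endpoint, the endpoint reference, and the bound inbox.

inbox is the endpoint's mailbox: it stores messages for an RpcEndpoint and delivers them to it in a thread-safe manner.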

private[netty] class Dispatcher(nettyEnv: NettyRpcEnv, numUsableCores: Int) extends Logging {

  private class EndpointData(
      val name: String,
      val endpoint: RpcEndpoint,
      val ref: NettyRpcEndpointRef) {
    // Bind an inbox to the endpoint
    val inbox = new Inbox(ref, endpoint)
  }
}

3.2. The rpcEnv blocking code

NettyRpcEnv#awaitTermination delegates to the dispatcher's awaitTermination method.

private[netty] class NettyRpcEnv(
    val conf: SparkConf,
    javaSerializerInstance: JavaSerializerInstance,
    host: String,
    securityManager: SecurityManager,
    numUsableCores: Int) extends RpcEnv(conf) with Logging {
  // The dispatcher field is initialized, that is, the Dispatcher is instantiated, when NettyRpcEnv is instantiated
  private val dispatcher: Dispatcher = new Dispatcher(this, numUsableCores)
  
  override def awaitTermination(): Unit = {
    dispatcher.awaitTermination()
  }
}

3.2.1. The blocking code in the dispatcher

It relies on the thread pool's awaitTermination to block.

private[netty] class Dispatcher(nettyEnv: NettyRpcEnv, numUsableCores: Int) extends Logging {
  // Thread pool
  private val threadpool: ThreadPoolExecutor = {
    //.......
  }

  def awaitTermination(): Unit = {
    // Block until the thread pool terminates
    threadpool.awaitTermination(Long.MaxValue, TimeUnit.MILLISECONDS)
  }
}
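3.2.1.1 Thread pool initialization

While the dispatcher's thread pool is being instantiated, one message-loop thread is started per pool thread to dispatch messages.

The pool is instantiated after the CoarseGrainedExecutorBackend process starts, during run, while the executor's SparkEnv is being created from the SparkConf.

In detail: the process starts and executes run; run creates the executor's SparkEnv; creating the SparkEnv initializes its rpcEnv field with a new NettyRpcEnv; instantiating NettyRpcEnv initializes its dispatcher field via new Dispatcher; and instantiating the Dispatcher initializes its threadpool field. At that point the dispatcher's thread pool is ready.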

private[netty] class Dispatcher(nettyEnv: NettyRpcEnv, numUsableCores: Int) extends Logging {
  // Thread pool used to dispatch messages
  // It is initialized when the Dispatcher is instantiated; from that moment this code runs and the pool starts working
  private val threadpool: ThreadPoolExecutor = {
    // Determine the number of threads in the pool
    val availableCores =
      if (numUsableCores > 0) numUsableCores else Runtime.getRuntime.availableProcessors()
    val numThreads = nettyEnv.conf.getInt("spark.rpc.netty.dispatcher.numThreads",
      math.max(2, availableCores))
    // Create the thread pool
    val pool = ThreadUtils.newDaemonFixedThreadPool(numThreads, "dispatcher-event-loop")

    // Start one message loop per thread to dispatch messages
    for (i <- 0 until numThreads) {
      pool.execute(new MessageLoop)
    }
    pool
  }
  
}
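3.2.1.1.1. MessageLoop: the message-loop thread

MessageLoop is an inner class of the dispatcher.

Each loop repeatedly takes an endpoint from the receivers queue and processes the messages waiting in that endpoint's inbox.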

private[netty] class Dispatcher(nettyEnv: NettyRpcEnv, numUsableCores: Int) extends Logging {
  private class MessageLoop extends Runnable {
    override def run(): Unit = {
      try {
        while (true) {
          try {
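            // Take the next receiver (an endpoint with pending messages) from the queue, blocking while it is empty
            val data = receivers.take()
            // PoisonPill signals that the dispatcher is stopping: exit the loop
            if (data == PoisonPill) {
              // Put PoisonPill back so that other MessageLoops can see it.
              receivers.offer(PoisonPill)
              return
            }
            // Call process on the endpoint's inbox to handle its pending messages; the first is the OnStart message added when the inbox was created
            data.inbox.process(Dispatcher.this)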
          } catch {
            case NonFatal(e) => logError(e.getMessage, e)
          }
        }
      } catch {
        case _: InterruptedException => // exit
        case t: Throwable =>
          try {
            // Re-submit a MessageLoop so that Dispatcher will still work if
            // UncaughtExceptionHandler decides to not kill JVM.
            threadpool.execute(new MessageLoop)
          } finally {
            throw t
          }
      }
    }
  }
  
  // Sentinel endpoint (a poison pill) that tells a MessageLoop to exit its loop
  private val PoisonPill = new EndpointData(null, null, null)
}
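
The shutdown mechanism in MessageLoop is the poison-pill pattern: a sentinel is placed on the shared queue, and every consumer that sees it puts it back before exiting so the remaining consumers stop as well. Here is a minimal sketch of the pattern under simplified assumptions; PoisonPillSketch and runLoops are hypothetical names, plain strings replace EndpointData, and a counter replaces inbox processing.

```scala
import java.util.concurrent.{CountDownLatch, Executors, LinkedBlockingQueue, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object PoisonPillSketch {
  // Sentinel compared by reference; hypothetical stand-in for Dispatcher's PoisonPill.
  private val PoisonPill = new String("poison")

  /** Runs `numLoops` consumers over `items` and returns how many items were processed. */
  def runLoops(items: Seq[String], numLoops: Int): Int = {
    val queue = new LinkedBlockingQueue[String]()
    val processed = new AtomicInteger(0)
    val done = new CountDownLatch(numLoops)
    val pool = Executors.newFixedThreadPool(numLoops)

    for (_ <- 0 until numLoops) {
      pool.execute(new Runnable {
        override def run(): Unit = {
          var running = true
          while (running) {
            val item = queue.take() // blocks until something is available
            if (item eq PoisonPill) {
              queue.offer(PoisonPill) // put it back so the other loops also stop
              running = false
            } else {
              processed.incrementAndGet()
            }
          }
          done.countDown()
        }
      })
    }

    items.foreach(i => queue.offer(i))
    queue.offer(PoisonPill) // a single pill is enough to stop every loop
    done.await(10, TimeUnit.SECONDS)
    pool.shutdown()
    processed.get()
  }
}
```

Because the pill is enqueued after the items and the queue is FIFO, every item is taken before any loop sees the pill.
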
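3.2.1.1.2 Inbox.process: message handling inside the inbox

Inbox instantiation: when EndpointData wraps the endpoint information, it creates an Inbox and assigns it to its inbox field.

When an Inbox is instantiated, it creates a message queue for buffering messages and immediately adds an OnStart message to it.

When the dispatcher is instantiated and its thread pool is initialized, one message-loop thread is started per pool thread, and each runs its run method. Inside run, the endpoint inbox's process method is invoked. The first message it handles is the first one added to the inbox, namely OnStart.

Handling OnStart invokes CoarseGrainedExecutorBackend's onStart method. Concurrent message processing is switched on only for endpoints that are not a ThreadSafeRpcEndpoint; since CoarseGrainedExecutorBackend is one, its messages continue to be processed by one thread at a time.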

private[netty] case object OnStart extends InboxMessage

private[netty] class Inbox(
    val endpointRef: NettyRpcEndpointRef,
    val endpoint: RpcEndpoint)
  extends Logging {

  inbox =>  // Give this an alias so we can use it more clearly in closures.

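  // Message queue: buffers incoming messages
  @GuardedBy("this")
  protected val messages = new java.util.LinkedList[InboxMessage]()

  // Whether multiple threads may process messages at the same time
  @GuardedBy("this")
  private var enableConcurrent = false

  // Number of threads currently processing messages for this inbox
  @GuardedBy("this")
  private var numActiveThreads = 0


  // OnStart is the first message added and the first processed; this block runs when the Inbox is instantiated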
  inbox.synchronized {
    messages.add(OnStart)
  }

  /**
   * Process stored messages.
   */
  def process(dispatcher: Dispatcher): Unit = {
    // The message to process
    var message: InboxMessage = null
    inbox.synchronized {
      if (!enableConcurrent && numActiveThreads != 0) {
        return
      }
      // Take a message from the queue
      message = messages.poll()
      if (message != null) {
        // One more thread is processing messages
        numActiveThreads += 1
      } else {
        return
      }
    }
    // Keep processing until the message queue is empty
    while (true) {
      // As shown earlier, endpoint here is a CoarseGrainedExecutorBackend instance
      safelyCall(endpoint) {
        message match {
          case RpcMessage(_sender, content, context) =>
            try {
              // Invoke CoarseGrainedExecutorBackend's receiveAndReply method
              endpoint.receiveAndReply(context).applyOrElse[Any, Unit](content, { msg =>
                throw new SparkException(s"Unsupported message $message from ${_sender}")
              })
            } catch {
              case e: Throwable =>
                context.sendFailure(e)
                // Throw the exception -- this exception will be caught by the safelyCall function.
                // The endpoint's onError function will be called.
                throw e
            }

          case OneWayMessage(_sender, content) =>
            // Invoke CoarseGrainedExecutorBackend's receive method
            endpoint.receive.applyOrElse[Any, Unit](content, { msg =>
              throw new SparkException(s"Unsupported message $message from ${_sender}")
            })
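          // The first message handled by process is OnStart, so this case runs before any other message is handled
          case OnStart =>
            // Invoke CoarseGrainedExecutorBackend's onStart method
            endpoint.onStart()
            if (!endpoint.isInstanceOf[ThreadSafeRpcEndpoint]) {
              inbox.synchronized {
                if (!stopped) {
                  // Allow multiple threads to process messages (only for endpoints that are not a ThreadSafeRpcEndpoint)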
                  enableConcurrent = true
                }
              }
            }

          case OnStop =>
            val activeThreads = inbox.synchronized { inbox.numActiveThreads }
            assert(activeThreads == 1,
              s"There should be only a single active thread but found $activeThreads threads.")
            // Remove the endpoint's registration from the dispatcher
            dispatcher.removeRpcEndpointRef(endpoint)
            // Invoke CoarseGrainedExecutorBackend's onStop method
            endpoint.onStop()
            assert(isEmpty, "OnStop should be the last message")

          case RemoteProcessConnected(remoteAddress) =>
            endpoint.onConnected(remoteAddress)

          case RemoteProcessDisconnected(remoteAddress) =>
            endpoint.onDisconnected(remoteAddress)

          case RemoteProcessConnectionError(cause, remoteAddress) =>
            endpoint.onNetworkError(cause, remoteAddress)
        }
      }

      inbox.synchronized {
        // "enableConcurrent" will be set to false after `onStop` is called, so we should check it
        // every time.
        if (!enableConcurrent && numActiveThreads != 1) {
          // If we are not the only one worker, exit
          numActiveThreads -= 1
          return
        }
        message = messages.poll()
        if (message == null) {
          numActiveThreads -= 1
          return
        }
      }
    }
  }
}
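
The ordering guarantee described above (OnStart is always handled before any other message) follows from enqueueing OnStart in the constructor. The sketch below is a deliberately simplified, single-consumer version that shows only this ordering; InboxSketch, its Msg types, and demo are hypothetical, and the enableConcurrent and numActiveThreads bookkeeping of the real Inbox is left out.

```scala
import scala.collection.mutable

object InboxSketch {
  sealed trait Msg
  case object OnStart extends Msg
  final case class OneWay(content: String) extends Msg

  // Hypothetical simplification of Inbox: one consumer, no concurrency flags.
  final class Inbox(handler: Msg => Unit) {
    private val messages = mutable.Queue[Msg](OnStart) // OnStart is enqueued at construction

    def post(m: Msg): Unit = synchronized { messages.enqueue(m) }

    private def poll(): Option[Msg] =
      synchronized { if (messages.isEmpty) None else Some(messages.dequeue()) }

    // Drain the queue, handling messages in arrival order.
    def process(): Unit = {
      var next = poll()
      while (next.isDefined) {
        handler(next.get)
        next = poll()
      }
    }
  }

  def demo(): List[String] = {
    val seen = mutable.ListBuffer[String]()
    val inbox = new Inbox({
      case OnStart => seen += "onStart"
      case OneWay(c) => seen += s"receive:$c"
    })
    inbox.post(OneWay("LaunchTask"))
    inbox.process()
    seen.toList
  }
}
```

Even though LaunchTask is posted before process runs, the handler sees onStart first.
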
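3.2.1.2. Thread pool blocking logic

awaitTermination repeatedly checks whether runState has reached the final state TERMINATED. If it has, the method returns true immediately. Otherwise it calls termination.awaitNanos(nanos) to block; on waking it checks again, returning true once the state is TERMINATED and false once the timeout has been used up.

See ThreadPoolExecutor源码解读(三)——如何优雅的关闭线程池(shutdown、shutdownNow、awaitTermination) (a walkthrough of shutting down a ThreadPoolExecutor gracefully).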

public class ThreadPoolExecutor extends AbstractExecutorService {
  
  public boolean awaitTermination(long timeout, TimeUnit unit)
        throws InterruptedException {
        long nanos = unit.toNanos(timeout);
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            for (;;) {
                if (runStateAtLeast(ctl.get(), TERMINATED))
                    return true;
                if (nanos <= 0)
                    return false;
                nanos = termination.awaitNanos(nanos);
            }
        } finally {
            mainLock.unlock();
        }
    }
}
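
The two return values of awaitTermination can be observed directly with a JDK pool. This is a small sketch, with AwaitTerminationSketch and demo as hypothetical names; it uses short timeouts instead of the Long.MaxValue that the dispatcher passes.

```scala
import java.util.concurrent.{Executors, TimeUnit}

object AwaitTerminationSketch {
  /** Returns (result before shutdown, result after shutdown). */
  def demo(): (Boolean, Boolean) = {
    val pool = Executors.newFixedThreadPool(1)
    // The pool is still running, so the wait times out and returns false.
    val before = pool.awaitTermination(50, TimeUnit.MILLISECONDS)
    pool.shutdown()
    // No tasks are queued, so the pool reaches TERMINATED and the call returns true.
    val after = pool.awaitTermination(5, TimeUnit.SECONDS)
    (before, after)
  }
}
```

With a timeout of Long.MaxValue milliseconds, the first case effectively never returns, which is what keeps the backend's main thread parked until the dispatcher shuts its pool down.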

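4. Registering the executor

4.1. The backend sends a registration message to the driver

In the backend's onStart method, the backend sends a message to the driver to register the executor.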

private[spark] class CoarseGrainedExecutorBackend(
    override val rpcEnv: RpcEnv,
    driverUrl: String,
    executorId: String,
    hostname: String,
    cores: Int,
    userClassPath: Seq[URL],
    env: SparkEnv)
  extends ThreadSafeRpcEndpoint with ExecutorBackend with Logging {
  
  override def onStart() {
    logInfo("Connecting to driver: " + driverUrl)
    // Asynchronously obtain the driver endpoint reference from driverUrl
    rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      driver = Some(ref)
      // Use the driver endpoint reference to send the driver a message registering the executor
      ref.ask[Boolean](RegisterExecutor(executorId, self, hostname, cores, extractLogUrls))
    }(ThreadUtils.sameThread).onComplete {
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      case Success(msg) =>
        // Always receive `true`. Just ignore it
      case Failure(e) =>
        exitExecutor(1, s"Cannot register with driver: $driverUrl", e, notifyDriver = false)
    }(ThreadUtils.sameThread)
  }
}
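
onStart chains two asynchronous steps with flatMap and reacts to the combined outcome in onComplete. The sketch below reproduces that shape with plain scala.concurrent futures. It is an illustration under stated assumptions: lookupDriver and askRegister are hypothetical stand-ins for asyncSetupEndpointRefByURI and ref.ask, the global execution context replaces ThreadUtils.sameThread, and a Promise carries the outcome back to the caller instead of calling exitExecutor.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object RegisterSketch {
  // Hypothetical stand-in for resolving the driver endpoint reference.
  def lookupDriver(url: String): Future[String] =
    if (url.startsWith("spark://")) Future.successful(url)
    else Future.failed(new IllegalArgumentException(url))

  // Hypothetical stand-in for ref.ask[Boolean](RegisterExecutor(...)).
  def askRegister(ref: String, executorId: String): Future[Boolean] =
    Future.successful(true)

  def register(url: String, executorId: String): String = {
    val outcome = Promise[String]()
    lookupDriver(url)
      .flatMap(ref => askRegister(ref, executorId)) // second step runs only if the lookup succeeds
      .onComplete {
        case Success(_) => outcome.success("registered")
        case Failure(_) => outcome.success(s"Cannot register with driver: $url")
      }
    Await.result(outcome.future, 5.seconds)
  }
}
```

A failure in either step lands in the single Failure branch, which is why the original needs only one error path.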

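4.2. The driver handles the backend's executor registration message

4.2.1. Driver endpoint registration

After spark-submit submits the application, a driver thread is started, and that thread registers the driver endpoint with the rpcEnv.

In more detail, spark-submit triggers a series of steps, one of which starts a driver thread (see spark源码-任务提交流程之ApplicationMaster). The driver thread invokes the main method of the user class and runs the rest of the application from there.

Early in that run the SparkContext is instantiated, which involves initializing its fields, including the _schedulerBackend variable (see Spark源码-sparkContext初始化).

For how _schedulerBackend is initialized, see Spark源码-sparkContext初始化之TaskScheduler任务调度器. That article shows _schedulerBackend as an instance of StandaloneSchedulerBackend, the standalone-mode backend; the source shows that StandaloneSchedulerBackend is a subclass of CoarseGrainedSchedulerBackend.

After sparkContext#_schedulerBackend and sparkContext#_taskScheduler are initialized, _taskScheduler.start() is called to start the task scheduler. The code is as follows: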

private[spark] class TaskSchedulerImpl(
    val sc: SparkContext,
    val maxTaskFailures: Int,
    isLocal: Boolean = false)
  extends TaskScheduler with Logging {
  override def start() {
    // Call the backend's start method
    backend.start()

    if (!isLocal && conf.getBoolean("spark.speculation", false)) {
      logInfo("Starting speculative execution thread")
      speculationScheduler.scheduleWithFixedDelay(new Runnable {
        override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
          checkSpeculatableTasks()
        }
      }, SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
    }
  }
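}

Inside the task scheduler's start method, backend.start() resolves to StandaloneSchedulerBackend#start() when the backend is the standalone one shown here. In yarn-cluster mode the backend is YarnClusterSchedulerBackend instead, which also descends from CoarseGrainedSchedulerBackend, so the parent-class start() logic analyzed further down is shared: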

private[spark] class StandaloneSchedulerBackend(
    scheduler: TaskSchedulerImpl,
    sc: SparkContext,
    masters: Array[String])
  extends CoarseGrainedSchedulerBackend(scheduler, sc.env.rpcEnv)
  with StandaloneAppClientListener
  with Logging {

  override def start() {
    // Call the parent class's start method
    super.start()

    // ...... other code
  }
}

The first thing StandaloneSchedulerBackend#start() does is call the parent-class method CoarseGrainedSchedulerBackend#start().

private[spark]
class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: RpcEnv)
  extends ExecutorAllocationClient with SchedulerBackend with Logging {
  var driverEndpoint: RpcEndpointRef = null
    
  override def start() {
    val properties = new ArrayBuffer[(String, String)]
    for ((key, value) <- scheduler.sc.conf.getAll) {
      if (key.startsWith("spark.")) {
        properties += ((key, value))
      }
    }

    // Create the driver endpoint and register it with the rpcEnv
    driverEndpoint = createDriverEndpointRef(properties)
  }

  protected def createDriverEndpointRef(
      properties: ArrayBuffer[(String, String)]): RpcEndpointRef = {
    // Register the endpoint with the rpcEnv
    rpcEnv.setupEndpoint(ENDPOINT_NAME, createDriverEndpoint(properties))
  }

  // Build the driver endpoint
  protected def createDriverEndpoint(properties: Seq[(String, String)]): DriverEndpoint = {
    new DriverEndpoint(rpcEnv, properties)
  }
}

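Once the driver endpoint is registered with the rpcEnv, a message-loop thread in the dispatcher's thread pool calls the process() method of the inbox bound to the driver endpoint, which in turn calls the onStart() method of DriverEndpoint: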

class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: RpcEnv)
  extends ExecutorAllocationClient with SchedulerBackend with Logging {

  // Single-threaded scheduled executor on the driver
  private val reviveThread =
    ThreadUtils.newDaemonSingleThreadScheduledExecutor("driver-revive-thread")

  class DriverEndpoint(override val rpcEnv: RpcEnv, sparkProperties: Seq[(String, String)])
    extends ThreadSafeRpcEndpoint with Logging {
    // Schedule a recurring job on reviveThread that periodically triggers task assignment to executors
    override def onStart() {
      // Periodically revive offers to allow delay scheduling to work
      val reviveIntervalMs = conf.getTimeAsMs("spark.scheduler.revive.interval", "1s")

      reviveThread.scheduleAtFixedRate(new Runnable {
        override def run(): Unit = Utils.tryLogNonFatalError {
          // Send ReviveOffers to the driver endpoint itself, which then assigns tasks to executors
          Option(self).foreach(_.send(ReviveOffers))
        }
      }, 0, reviveIntervalMs, TimeUnit.MILLISECONDS)
    } 
  }
}

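4.2.2 Handling the backend's executor registration message

When the driver endpoint DriverEndpoint receives the backend's ask message to register an executor, DriverEndpoint#receiveAndReply handles it.

Executors that are already registered or that run on a blacklisted node are rejected: the driver uses send to deliver a RegisterExecutorFailed message, of type OneWayMessage, to the executor endpoint. Otherwise, once registration completes, it uses send to deliver a RegisteredExecutor message, also a OneWayMessage, to the executor endpoint.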

class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: RpcEnv)
  extends ExecutorAllocationClient with SchedulerBackend with Logging {

  private val executorDataMap = new HashMap[String, ExecutorData]
    
  class DriverEndpoint(override val rpcEnv: RpcEnv, sparkProperties: Seq[(String, String)])
    extends ThreadSafeRpcEndpoint with Logging {
     
    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      // Handle RegisterExecutor messages
      case RegisterExecutor(executorId, executorRef, hostname, cores, logUrls) =>
        // An already registered executor is not registered again; send RegisterExecutorFailed to the executor endpoint
        if (executorDataMap.contains(executorId)) {
          executorRef.send(RegisterExecutorFailed("Duplicate executor ID: " + executorId))
          context.reply(true)
        }
        // An executor on a blacklisted node is not registered; send RegisterExecutorFailed to the executor endpoint
        else if (scheduler.nodeBlacklist.contains(hostname)) {
          logInfo(s"Rejecting $executorId as it has been blacklisted.")
          executorRef.send(RegisterExecutorFailed(s"Executor is blacklisted: $executorId"))
          context.reply(true)
        } else {
          // If the executor's rpc env is not listening for incoming connections, `hostPort`
          // will be null, and the client connection should be used to contact the executor.
          val executorAddress = if (executorRef.address != null) {
              executorRef.address
            } else {
              context.senderAddress
            }
          logInfo(s"Registered executor $executorRef ($executorAddress) with ID $executorId")
          // Cache the executor information
          addressToExecutorId(executorAddress) = executorId
          totalCoreCount.addAndGet(cores)
          totalRegisteredExecutors.addAndGet(1)
          val data = new ExecutorData(executorRef, executorAddress, hostname,
            cores, cores, logUrls)
          // This must be synchronized because variables mutated
          // in this block are read when requesting executors
          CoarseGrainedSchedulerBackend.this.synchronized {
            // Store the executor information in the HashMap; this completes the executor's registration on the driver
            executorDataMap.put(executorId, data)
            if (currentExecutorIdCounter < executorId.toInt) {
              currentExecutorIdCounter = executorId.toInt
            }
            if (numPendingExecutors > 0) {
              numPendingExecutors -= 1
              logDebug(s"Decremented number of pending executors ($numPendingExecutors left)")
            }
          }
          // After registration, send RegisteredExecutor to the executor endpoint
          executorRef.send(RegisteredExecutor)
          // Note: some tests expect the reply to come after we put the executor in the map
          context.reply(true)
          listenerBus.post(
            SparkListenerExecutorAdded(System.currentTimeMillis(), executorId, data))
          // The driver assigns tasks to executors
          makeOffers()
        }

      // ...... other code
    }
  }
}

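4.2.3. The driver assigns tasks to executors

As soon as the driver finishes registering the executor, it calls DriverEndpoint#makeOffers to assign tasks to executors.

makeOffers offers the free resources of the live executors to the scheduler, which returns the list of tasks to launch, each already bound to an executor. launchTasks then iterates over that list, looks up the executor each task was assigned to, and sends the task to it.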

class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: RpcEnv)
  extends ExecutorAllocationClient with SchedulerBackend with Logging {

  private val executorDataMap = new HashMap[String, ExecutorData]
    
  class DriverEndpoint(override val rpcEnv: RpcEnv, sparkProperties: Seq[(String, String)])
    extends ThreadSafeRpcEndpoint with Logging {
     
    private def makeOffers() {
      // Make sure no executor is killed while some task is launching on it
      val taskDescs = withLock {
        // Filter out executors under killing
        val activeExecutors = executorDataMap.filterKeys(executorIsAlive)
        val workOffers = activeExecutors.map {
          case (id, executorData) =>
            new WorkerOffer(id, executorData.executorHost, executorData.freeCores,
              Some(executorData.executorAddress.hostPort))
        }.toIndexedSeq
        // Ask the scheduler which tasks to launch on the offered resources
        scheduler.resourceOffers(workOffers)
      }
      if (!taskDescs.isEmpty) {
        // Launch the tasks
        launchTasks(taskDescs)
      }
    }
     
    private def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
      // Iterate over the task list
      for (task <- tasks.flatten) {
        // Serialize the task
        val serializedTask = TaskDescription.encode(task)
        // A serialized task must be smaller than the maximum RPC message size; otherwise its task set is aborted
        if (serializedTask.limit() >= maxRpcMessageSize) {
          Option(scheduler.taskIdToTaskSetManager.get(task.taskId)).foreach { taskSetMgr =>
            try {
              var msg = "Serialized task %s:%d was %d bytes, which exceeds max allowed: " +
                "spark.rpc.message.maxSize (%d bytes). Consider increasing " +
                "spark.rpc.message.maxSize or using broadcast variables for large values."
              msg = msg.format(task.taskId, task.index, serializedTask.limit(), maxRpcMessageSize)
              taskSetMgr.abort(msg)
            } catch {
              case e: Exception => logError("Exception in error callback", e)
            }
          }
        }
        else {
          // Look up the executor the task was assigned to
          val executorData = executorDataMap(task.executorId)
          // Launching a task reserves CPUS_PER_TASK cores on that executor (one core per task by default)
          executorData.freeCores -= scheduler.CPUS_PER_TASK

          logDebug(s"Launching task ${task.taskId} on executor id: ${task.executorId} hostname: " +
            s"${executorData.executorHost}.")
          // Assign the task to the executor: send it a LaunchTask message of type OneWayMessage
          executorData.executorEndpoint.send(LaunchTask(new SerializableBuffer(serializedTask)))
        }
      }
    }
  }
}
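
The two decisions inside launchTasks, rejecting oversized tasks and reserving cores for accepted ones, can be isolated from the RPC machinery. The following sketch is an assumption-laden reduction: LaunchSketch, ExecutorSlot, Task, and launch are hypothetical names, a raw byte array stands in for the encoded TaskDescription, and skipping a task stands in for aborting its task set.

```scala
import java.nio.ByteBuffer
import scala.collection.mutable

object LaunchSketch {
  // Hypothetical simplified types; the real ExecutorData and TaskDescription carry more fields.
  final class ExecutorSlot(var freeCores: Int)
  final case class Task(taskId: Long, executorId: String, payload: Array[Byte])

  val cpusPerTask = 1 // mirrors the default of scheduler.CPUS_PER_TASK

  /** Returns the ids of the tasks that would be sent; oversized tasks are skipped. */
  def launch(
      tasks: Seq[Task],
      executors: mutable.Map[String, ExecutorSlot],
      maxSize: Int): Seq[Long] =
    tasks.flatMap { task =>
      val serialized = ByteBuffer.wrap(task.payload)
      if (serialized.limit() >= maxSize) {
        None // the real code aborts the owning task set here
      } else {
        // Reserve cores on the executor the scheduler already chose for this task.
        executors(task.executorId).freeCores -= cpusPerTask
        Some(task.taskId)
      }
    }
}
```

Only accepted tasks reduce freeCores, so a rejected task leaves the executor's capacity untouched.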

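4.3. The backend receives the driver's reply to the registration

After handling the executor's registration, the driver sends a OneWayMessage to the executor endpoint; OneWayMessage messages are handled by CoarseGrainedExecutorBackend#receive().

Once the driver has registered the executor successfully, the backend endpoint constructs an Executor.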

private[spark] class CoarseGrainedExecutorBackend(
    override val rpcEnv: RpcEnv,
    driverUrl: String,
    executorId: String,
    hostname: String,
    cores: Int,
    userClassPath: Seq[URL],
    env: SparkEnv)
  extends ThreadSafeRpcEndpoint with ExecutorBackend with Logging {
  
  var executor: Executor = null
    
  override def receive: PartialFunction[Any, Unit] = {
    case RegisteredExecutor =>
      logInfo("Successfully registered with driver")
      try {
        // Registration with the driver succeeded: construct an Executor
        executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
      } catch {
        case NonFatal(e) =>
          exitExecutor(1, "Unable to create executor due to " + e.getMessage, e)
      }

    case RegisterExecutorFailed(message) =>
      exitExecutor(1, "Slave registration failed: " + message)
    // ...... handling of other messages
  }
}

5. Task handling

5.1 Launching a task

LaunchTask messages are handled by CoarseGrainedExecutorBackend#receive(), which matches them in the following case.

private[spark] class CoarseGrainedExecutorBackend(
    override val rpcEnv: RpcEnv,
    driverUrl: String,
    executorId: String,
    hostname: String,
    cores: Int,
    userClassPath: Seq[URL],
    env: SparkEnv)
  extends ThreadSafeRpcEndpoint with ExecutorBackend with Logging {
  
  var executor: Executor = null
    
  override def receive: PartialFunction[Any, Unit] = {
    case LaunchTask(data) =>
      if (executor == null) {
        exitExecutor(1, "Received LaunchTask command but executor was null")
      } else {
        val taskDesc = TaskDescription.decode(data.value)
        logInfo("Got assigned task " + taskDesc.taskId)
        // Launch the task on the executor
        executor.launchTask(this, taskDesc)
      }
    // ...... handling of other messages
  }
}

5.2 Running a task

The executor wraps the task in a TaskRunner, records it in its map of running tasks, and hands it to the executor's thread pool for execution.

private[spark] class Executor(
    executorId: String,
    executorHostname: String,
    env: SparkEnv,
    userClassPath: Seq[URL] = Nil,
    isLocal: Boolean = false,
    uncaughtExceptionHandler: UncaughtExceptionHandler = new SparkUncaughtExceptionHandler)
  extends Logging {
    
  // Map of currently running tasks
  private val runningTasks = new ConcurrentHashMap[Long, TaskRunner]

  // Thread pool that runs tasks
  private val threadPool = {
    val threadFactory = new ThreadFactoryBuilder()
      .setDaemon(true)
      .setNameFormat("Executor task launch worker-%d")
      .setThreadFactory(new ThreadFactory {
        override def newThread(r: Runnable): Thread =
          // Use UninterruptibleThread to run tasks so that we can allow running codes without being
          // interrupted by `Thread.interrupt()`. Some issues, such as KAFKA-1894, HADOOP-10622,
          // will hang forever if some methods are interrupted.
          new UninterruptibleThread(r, "unused") // thread name will be set by ThreadFactoryBuilder
      })
      .build()
    Executors.newCachedThreadPool(threadFactory).asInstanceOf[ThreadPoolExecutor]
  }
    
  def launchTask(context: ExecutorBackend, taskDescription: TaskDescription): Unit = {
    // Create a TaskRunner (a Runnable) for the task
    val tr = new TaskRunner(context, taskDescription)
    // Add it to the executor's map of running tasks
    runningTasks.put(taskDescription.taskId, tr)
    // Have the thread pool run the TaskRunner
    threadPool.execute(tr)
  }
}
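
The track-then-execute bookkeeping in launchTask can be tried out with JDK classes alone. This is a minimal sketch, assuming a plain Runnable in place of TaskRunner; ExecutorSketch, runningCount, and demo are hypothetical names, and removing the entry when the task finishes is included here to make the bookkeeping visible.

```scala
import java.util.concurrent.{ConcurrentHashMap, CountDownLatch, Executors, TimeUnit}

object ExecutorSketch {
  private val runningTasks = new ConcurrentHashMap[Long, Runnable]()
  private val threadPool = Executors.newCachedThreadPool()

  /** Tracks the task, runs it on the pool, and untracks it when it finishes. */
  def launchTask(taskId: Long, body: () => Unit): Unit = {
    val runner = new Runnable {
      override def run(): Unit =
        try body() finally runningTasks.remove(taskId)
    }
    runningTasks.put(taskId, runner) // tracked before it is handed to the pool
    threadPool.execute(runner)
  }

  def runningCount: Int = runningTasks.size()

  /** Returns (tracked tasks while running, tracked tasks after completion). */
  def demo(): (Int, Int) = {
    val release = new CountDownLatch(1)
    launchTask(1L, () => release.await())
    val during = runningCount // the task is parked on the latch, so it is still tracked
    release.countDown()
    threadPool.shutdown() // demo only: the sketch's pool cannot be reused afterwards
    threadPool.awaitTermination(5, TimeUnit.SECONDS)
    (during, runningCount)
  }
}
```

Putting the entry into the map before calling execute guarantees that a running task is always visible in the map, which is what allows a task to be looked up by id while it runs.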

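6. Summary

The backend and the driver communicate over the RPC mechanism.

After the backend process starts:

First, it registers the backend endpoint with the rpcEnv.

Next, it registers the executor with the driver. Once registration succeeds, the driver sends a success message back to the backend and starts assigning tasks to the executor.

On receiving the success message, the backend constructs an Executor instance.

The backend then handles the tasks assigned by the driver:

It calls the Executor, which wraps each task in a TaskRunner, records it in its map of running tasks, and hands it to the executor's thread pool for execution.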

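7. References

4-spark源码-任务提交流程之container中启动executor (launching the executor inside a container)

Spark内核之YARN Cluster模式源码详解(Submit详解) (Spark internals: YARN cluster mode source walkthrough, submit in detail)

Spark源码——Spark on YARN Executor执行Task的过程 (how a Spark on YARN executor runs a task)

ThreadPoolExecutor源码解读(三)——如何优雅的关闭线程池(shutdown、shutdownNow、awaitTermination) (shutting down a thread pool gracefully)

Spark2.0.2源码分析——RPC 通信机制(消息处理) (Spark 2.0.2 RPC communication mechanism, message handling)

Spark程序Yarn集群提交Executor反向注册在分析 (executor reverse registration when submitting to a YARN cluster)

Spark基本架构及消息通信原理 (Spark basic architecture and messaging principles)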
