How the Executor Works: Principles and Source Code Analysis

Version: Spark 2.1.0

1. The executor that a Worker launches for an application is actually a CoarseGrainedExecutorBackend process

private[spark] object CoarseGrainedExecutorBackend extends Logging {
  private def run(
      driverUrl: String,
      executorId: String,
      hostname: String,
      cores: Int,
      appId: String,
      workerUrl: Option[String],
      userClassPath: Seq[URL])
2. Obtain the driver's RPC endpoint reference (the "actor" in the older Akka-based versions) and send a RegisterExecutor message to the driver

override def onStart() {
    logInfo("Connecting to driver: " + driverUrl)
    rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
      // keep the driver's endpoint reference
      driver = Some(ref)
      // send the RegisterExecutor message to the driver
      ref.ask[Boolean](RegisterExecutor(executorId, self, hostname, cores, extractLogUrls))
    }(ThreadUtils.sameThread).onComplete {
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      case Success(msg) =>
        // Always receive `true`. Just ignore it
      case Failure(e) =>
        exitExecutor(1, s"Cannot register with driver: $driverUrl", e, notifyDriver = false)
    }(ThreadUtils.sameThread)
  }	
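
The asynchronous handshake above can be sketched with plain Scala futures. This is a minimal sketch, not Spark's RPC layer: `RegistrationSketch`, `DriverRef`, `setupEndpointRef`, and the simplified `RegisterExecutor` fields are hypothetical stand-ins, and the "driver" here always answers `true`.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object RegistrationSketch {
  // Simplified registration message (hypothetical; the real one carries more fields).
  final case class RegisterExecutor(executorId: String, hostname: String, cores: Int)

  // Hypothetical stand-in for the driver's endpoint reference.
  final class DriverRef {
    // The stand-in driver accepts every registration.
    def ask(msg: RegisterExecutor): Future[Boolean] = Future.successful(true)
  }

  // Hypothetical asynchronous lookup of the driver by URL.
  def setupEndpointRef(driverUrl: String): Future[DriverRef] =
    if (driverUrl.startsWith("spark://")) Future.successful(new DriverRef)
    else Future.failed(new IllegalArgumentException(s"Invalid driver URL: $driverUrl"))

  // Resolve the driver first, then ask it to register this executor.
  def register(driverUrl: String, executorId: String): Future[Boolean] =
    setupEndpointRef(driverUrl).flatMap { ref =>
      ref.ask(RegisterExecutor(executorId, "localhost", cores = 2))
    }

  def main(args: Array[String]): Unit = {
    val attempt = register("spark://driver-host:7077", "exec-1")
    Await.ready(attempt, 5.seconds)
    attempt.value.get match {
      case Success(_) => println("registered with driver")
      case Failure(e) => println(s"cannot register with driver: ${e.getMessage}")
    }
  }
}
```

The point of the `flatMap` is that the ask is only issued once the lookup has succeeded; a failed lookup and a failed ask both end up in the same `Failure` branch, which is where the real code exits the executor.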
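3. Once the driver has registered the executor successfully, it sends back a RegisteredExecutor message

At this point CoarseGrainedExecutorBackend creates the Executor instance; most of the functionality is implemented by Executor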

override def receive: PartialFunction[Any, Unit] = {
    case RegisteredExecutor =>
      logInfo("Successfully registered with driver")
      try {
        executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
      } catch {
        case NonFatal(e) =>
          exitExecutor(1, "Unable to create executor due to " + e.getMessage, e)
      }

    case RegisterExecutorFailed(message) =>
      exitExecutor(1, "Slave registration failed: " + message)  
4. Launch a task: deserialize the task, then call the Executor's launchTask() to start it

// launch a task
 case LaunchTask(data) =>
      if (executor == null) {
        exitExecutor(1, "Received LaunchTask command but executor was null")
      } else {
        // deserialize the task description
        val taskDesc = ser.deserialize[TaskDescription](data.value)
        logInfo("Got assigned task " + taskDesc.taskId)
        // call the Executor's launchTask() to start the task
        executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
          taskDesc.name, taskDesc.serializedTask)
      }
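
The deserialization step can be illustrated with a round trip through a ByteBuffer. This is a minimal sketch that assumes plain Java serialization; `DeserializeSketch` and the three-field `TaskDesc` are hypothetical and much simpler than Spark's TaskDescription and its configured serializer.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.ByteBuffer

object DeserializeSketch {
  // Hypothetical, simplified task description (case classes are Serializable).
  final case class TaskDesc(taskId: Long, attemptNumber: Int, name: String)

  // What the sending side would do: turn the description into bytes.
  def serialize(desc: TaskDesc): ByteBuffer = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(desc) finally out.close()
    ByteBuffer.wrap(bytes.toByteArray)
  }

  // What the executor backend would do on receiving the message.
  def deserialize(buf: ByteBuffer): TaskDesc = {
    val arr = new Array[Byte](buf.remaining())
    buf.duplicate().get(arr) // duplicate() leaves the caller's buffer position untouched
    val in = new ObjectInputStream(new ByteArrayInputStream(arr))
    try in.readObject().asInstanceOf[TaskDesc] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val original = TaskDesc(taskId = 7L, attemptNumber = 0, name = "task 7.0 in stage 1.0")
    val restored = deserialize(serialize(original))
    println(s"Got assigned task ${restored.taskId}")
  }
}
```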

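5. A TaskRunner is created for every task and stored in an in-memory map

The task is wrapped in a Runnable (the TaskRunner), which is handed to a thread pool for execution

The thread pool itself decides when each submitted TaskRunner runs (in Spark 2.1.0 it is a cached daemon thread pool, so a task gets a new worker thread when no idle one is available rather than waiting in a queue)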

def launchTask(
      context: ExecutorBackend,
      taskId: Long,
      attemptNumber: Int,
      taskName: String,
      serializedTask: ByteBuffer): Unit = {
    // create one TaskRunner per task
    val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
      serializedTask)
    // keep the TaskRunner in the in-memory map of running tasks
    runningTasks.put(taskId, tr)
    // the TaskRunner is a Runnable; hand it to the thread pool,
    // which assigns it to a worker thread
    threadPool.execute(tr)
  }	  
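
The same pattern (one Runnable per task, a concurrent map of running tasks, a thread pool that executes them) can be sketched in isolation. This is a minimal sketch under stated assumptions: `MiniExecutor`, its `results` map, and `shutdownAndWait` are hypothetical names, the task body is an ordinary function instead of a serialized task, and `Executors.newCachedThreadPool()` is used here as a plain JDK stand-in for Spark's own pool.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}

// Hypothetical miniature of the launchTask pattern.
final class MiniExecutor {
  private val runningTasks = new ConcurrentHashMap[Long, TaskRunner]()
  private val threadPool = Executors.newCachedThreadPool()

  // Collected results, keyed by task id (only for this sketch).
  val results = new ConcurrentHashMap[Long, String]()

  final class TaskRunner(val taskId: Long, body: () => String) extends Runnable {
    override def run(): Unit =
      try results.put(taskId, body())
      finally runningTasks.remove(taskId) // the task is no longer running
  }

  def launchTask(taskId: Long)(body: => String): Unit = {
    val tr = new TaskRunner(taskId, () => body) // one TaskRunner per task
    runningTasks.put(taskId, tr)                // remember it while it runs
    threadPool.execute(tr)                      // the pool picks a worker thread
  }

  def runningCount: Int = runningTasks.size()

  // Stop accepting tasks and wait for the submitted ones to finish.
  def shutdownAndWait(): Boolean = {
    threadPool.shutdown()
    threadPool.awaitTermination(5, TimeUnit.SECONDS)
  }
}

object MiniExecutorDemo {
  def main(args: Array[String]): Unit = {
    val executor = new MiniExecutor
    (1L to 3L).foreach(id => executor.launchTask(id)(s"result of task $id"))
    executor.shutdownAndWait()
    println(executor.results)
  }
}
```

Keeping the runner in a map while it executes is what lets the surrounding code look a task up by id later, for example to kill it.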

The next section analyzes how a task is actually executed


