Spark-executor
@(spark)[executor]
ExecutorExitCode
```scala
/**
 * These are exit codes that executors should use to provide the master with information about
 * executor failures assuming that cluster management framework can capture the exit codes (but
 * perhaps not log files). The exit code constants here are chosen to be unlikely to conflict
 * with "natural" exit statuses that may be caused by the JVM or user code. In particular,
 * exit codes 128+ arise on some Unix-likes as a result of signals, and it appears that the
 * OpenJDK JVM may use exit code 1 in some of its own "last chance" code.
 */
private[spark]
object ExecutorExitCode {
```
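The convention described in that comment can be sketched as follows. This is a minimal illustration, not the actual Spark constants: the specific values and the `explain` helper are hypothetical; only the idea of picking codes that avoid 0/1 (JVM) and 128+ (signal statuses) comes from the source.

```scala
// A hedged sketch of the exit-code convention: constants chosen to avoid
// "natural" statuses (0/1 from the JVM, 128+ from Unix signals), so a cluster
// manager that captures the exit status can map it back to a failure reason.
// The values and helper below are hypothetical, for illustration only.
object ExitCodesSketch {
  val UncaughtException = 50 // hypothetical value
  val OutOfMemory       = 52 // hypothetical value

  // What a cluster manager could log when it sees a captured exit status.
  def explain(status: Int): String = status match {
    case UncaughtException => "uncaught exception"
    case OutOfMemory       => "executor ran out of memory"
    case s if s >= 128     => s"killed by signal ${s - 128}" // Unix convention
    case other             => s"unknown exit status $other"
  }
}
```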
ExecutorSource
Mainly just a collection of metrics.
CoarseGrainedExecutorBackend
class CoarseGrainedExecutorBackend is actually an Actor, and it has a main function that:
1. starts an actorSystem called fetcher to fetch the SparkConf from the driver
2. shuts down fetcher
3. calls createExecutorEnv to create the SparkEnv
4. starts the CoarseGrainedExecutorBackend actor, which
   - registers itself with the driver
   - waits for incoming messages
```scala
override def receiveWithLogging = {
  case RegisteredExecutor =>
    logInfo("Successfully registered with driver")
    val (hostname, _) = Utils.parseHostPort(hostPort)
    executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)

  case RegisterExecutorFailed(message) =>
    logError("Slave registration failed: " + message)
    System.exit(1)

  case LaunchTask(data) =>
    if (executor == null) {
      logError("Received LaunchTask command but executor was null")
      System.exit(1)
    } else {
      val ser = env.closureSerializer.newInstance()
      val taskDesc = ser.deserialize[TaskDescription](data.value)
      logInfo("Got assigned task " + taskDesc.taskId)
      executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
        taskDesc.name, taskDesc.serializedTask)
    }

  case KillTask(taskId, _, interruptThread) =>
    if (executor == null) {
      logError("Received KillTask command but executor was null")
      System.exit(1)
    } else {
      executor.killTask(taskId, interruptThread)
    }

  case x: DisassociatedEvent =>
    if (x.remoteAddress == driver.anchorPath.address) {
      logError(s"Driver $x disassociated! Shutting down.")
      System.exit(1)
    } else {
      logWarning(s"Received irrelevant DisassociatedEvent $x")
    }

  case StopExecutor =>
    logInfo("Driver commanded a shutdown")
    executor.stop()
    context.stop(self)
    context.system.shutdown()
}
```
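Stripped of the Akka machinery, the handling above reduces to a small state machine: nothing may run until registration succeeds, and fatal cases end the process. The sketch below mirrors that shape with plain case classes; the class, return strings, and `running` set are hypothetical stand-ins (the real code exits the JVM instead of returning a string).

```scala
// A runnable sketch of the backend's message protocol. Message names mirror
// the ones in the excerpt above; everything else is illustrative only.
sealed trait BackendMsg
case object RegisteredExecutor extends BackendMsg
case class RegisterExecutorFailed(message: String) extends BackendMsg
case class LaunchTask(taskId: Long) extends BackendMsg
case class KillTask(taskId: Long) extends BackendMsg
case object StopExecutor extends BackendMsg

class BackendSketch {
  private var executorReady = false            // becomes true only after registration
  val running = scala.collection.mutable.Set.empty[Long]

  // Returns a description of the action taken; the real actor logs and may exit.
  def receive(msg: BackendMsg): String = msg match {
    case RegisteredExecutor =>
      executorReady = true; "registered"
    case RegisterExecutorFailed(m) =>
      s"fatal: $m"                             // real code: System.exit(1)
    case LaunchTask(_) if !executorReady =>
      "fatal: executor was null"               // real code: System.exit(1)
    case LaunchTask(id) =>
      running += id; s"launched $id"
    case KillTask(id) =>
      running -= id; s"killed $id"
    case StopExecutor =>
      "shutdown"
  }
}
```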
The message worth paying attention to is LaunchTask, which calls executor.launchTask.
Executor
```scala
/**
 * Spark executor used with Mesos, YARN, and the standalone scheduler.
 * In coarse-grained mode, an existing actor system is provided.
 */
private[spark] class Executor(
    executorId: String,
    executorHostname: String,
    env: SparkEnv,
    userClassPath: Seq[URL] = Nil,
    isLocal: Boolean = false)
  extends Logging
```
Its most important function is:
```scala
def launchTask(
    context: ExecutorBackend,
    taskId: Long,
    attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer) {
  val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
    serializedTask)
  runningTasks.put(taskId, tr)
  threadPool.execute(tr)
}
```
Clearly this is thread-pool-based asynchronous execution. The logic of TaskRunner's run function is:
1. call Task.deserializeWithDependencies to obtain the files and jars the task depends on
2. decide, based on the cache state, whether to fetch each file or use the cached copy; whether a file is out of date is determined jointly by its filename and timestamp
3. deserialize the payload to obtain the actual Task
4. call task.run to actually execute the task
5. depending on the result size, either send the result back directly or write it into the blockManager
6. return the result to the driver via the ExecutorBackend's statusUpdate
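The thread-pool pattern above (track the task, run it asynchronously, pick a result path by size, report status) can be sketched without any Spark dependency. Everything here is hypothetical: `ExecutorSketch`, the size threshold, and the `report` callback are illustrative stand-ins for TaskRunner, the Akka frame-size limit, and statusUpdate.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors}

// A minimal sketch of the launchTask pattern: a thread pool runs the task
// body asynchronously, a concurrent map tracks running tasks, and the result
// path ("direct" vs. "blockManager") depends on a size threshold, like step 5.
object ExecutorSketch {
  val maxDirectResultBytes = 1024 * 1024 // hypothetical threshold
  val runningTasks = new ConcurrentHashMap[Long, Runnable]()
  private val threadPool = Executors.newCachedThreadPool()

  def launchTask(taskId: Long, body: () => Array[Byte])(report: (Long, String) => Unit): Unit = {
    val runner: Runnable = () => {
      val result = body()                 // step 4: actually execute the task
      val how =                           // step 5: direct result vs. block store
        if (result.length <= maxDirectResultBytes) "direct" else "blockManager"
      runningTasks.remove(taskId)
      report(taskId, how)                 // step 6: status back to the driver
    }
    runningTasks.put(taskId, runner)      // track, then hand off to the pool
    threadPool.execute(runner)
  }
}
```

The map-then-execute ordering matters: the task must be registered in `runningTasks` before the pool can start it, so a concurrent kill request can always find it.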
The flow above shows that the following kinds of data move over the network during the whole process:
1. the URLs of the taskFiles and taskJars
2. the taskFiles and taskJars themselves (these transfers can be skipped when cached copies are up to date)
3. the serialized bytes of the Task
4. the result (either a block address or a relatively small direct result)
In other words, a single task does not generate much network traffic.
The actual execution of a task, and how tasks are scheduled, are covered in the scheduler.