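Contents:
1. How the Spark Executor works
2. ExecutorBackend registration
3. Executor instantiation
4. The Executor workflow in detail
I. How the Spark Executor works
1. Executor registration revisited
(1).The Master instructs the Worker to launch an Executor
(2).The Worker receives the instruction and uses ExecutorRunner to start a separate process that runs the Executor
(3).The Executor process starts CoarseGrainedExecutorBackend
The worker thread newly started by ExecutorRunner uses the Command and the downloaded jars to locate the CoarseGrainedExecutorBackend class, loads it, and runs its entry point, main: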
def main(args: Array[String]) {
……
run(driverUrl, executorId, hostname, cores, appId, workerUrl, userClassPath)
}
private def run(
……) {
……
// Create the runtime environment
val env = SparkEnv.createExecutorEnv(
driverConf, executorId, hostname, port, cores, isLocal = false)
// Instantiate CoarseGrainedExecutorBackend and register it as an RPC endpoint;
// the Netty-based RpcEnv then delivers the OnStart message to it
env.rpcEnv.setupEndpoint("Executor", new CoarseGrainedExecutorBackend(
env.rpcEnv, driverUrl, executorId, sparkHostPort, cores, userClassPath, env))
workerUrl.foreach { url =>
env.rpcEnv.setupEndpoint("WorkerWatcher", new WorkerWatcher(env.rpcEnv, url))
}
}
}
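(4).CoarseGrainedExecutorBackend registers with the Driver by sending RegisterExecutor
Note: when CoarseGrainedExecutorBackend starts and "registers the Executor" with the Driver, what is actually registered is the ExecutorBackend instance. The Executor instance is not directly involved.
// From CoarseGrainedExecutorBackend.scala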
override def onStart() {
logInfo("Connecting to driver: " + driverUrl)
rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
// This is a very fast action so we can use "ThreadUtils.sameThread"
driver = Some(ref)
ref.ask[RegisterExecutorResponse](
// Register the Executor with the Driver
RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
}(ThreadUtils.sameThread).onComplete {
// This is a very fast action so we can use "ThreadUtils.sameThread"
case Success(msg) => Utils.tryLogNonFatalError {
Option(self).foreach(_.send(msg)) // msg must be RegisterExecutorResponse
}
case Failure(e) => {
logError(s"Cannot register with driver: $driverUrl", e)
System.exit(1)
}
}(ThreadUtils.sameThread)
}
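Notes:
1) CoarseGrainedExecutorBackend is the name of the process in which the Executor runs. The Executor is the object that actually processes Tasks, and it does so internally through a thread pool.
2) CoarseGrainedExecutorBackend and Executor are in one-to-one correspondence.
3) CoarseGrainedExecutorBackend is a message endpoint: it extends ThreadSafeRpcEndpoint, so it can send and receive messages. On startup it sends a message to the Driver, and afterwards it receives messages from the Driver, for example the request to launch a Task.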
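(5).The Driver receives and processes the message
The Driver receives the registration message, wraps the ExecutorBackend's information in an ExecutorData object, and stores it in executorDataMap, an in-memory data structure on the Driver.
First, note the two most important Endpoints (background message endpoints) in the Driver process: ClientEndpoint and DriverEndpoint.
1) ClientEndpoint: mainly responsible for registering the current application with the Master; it is an inner member of AppClient.
2) DriverEndpoint: the driver of the running application; it is a member of CoarseGrainedSchedulerBackend and is created during SparkContext initialization by SparkDeploySchedulerBackend, a subclass of CoarseGrainedSchedulerBackend.
Note: for how driverEndpoint and clientEndpoint are created, see the reference document. The driverUrl that CoarseGrainedExecutorBackend talks to points at DriverEndpoint, not ClientEndpoint.
// From CoarseGrainedSchedulerBackend -> class DriverEndpoint ->
// the receiveAndReply method
// DriverEndpoint receives the RegisterExecutor message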
case RegisterExecutor(executorId, executorRef, hostPort, cores, logUrls) =>
// executorDataMap is a HashMap that stores ExecutorData:
// private val executorDataMap = new HashMap[String, ExecutorData]
// (line 60 of CoarseGrainedSchedulerBackend)
// The ExecutorData structure is shown after this handler.
if (executorDataMap.contains(executorId)) {
context.reply(RegisterExecutorFailed("Duplicate executor ID: " + executorId))
} else {
// If the executor's rpc env is not listening for incoming connections, `hostPort`
// will be null, and the client connection should be used to contact the executor.
val executorAddress = if (executorRef.address != null) {
executorRef.address
} else {
context.senderAddress
}
logInfo(s"Registered executor $executorRef ($executorAddress) with ID $executorId")
addressToExecutorId(executorAddress) = executorId
totalCoreCount.addAndGet(cores)
totalRegisteredExecutors.addAndGet(1)
val data = new ExecutorData(executorRef, executorRef.address, executorAddress.host,
cores, cores, logUrls)
// This must be synchronized because variables mutated
// in this block are read when requesting executors
CoarseGrainedSchedulerBackend.this.synchronized { //1
executorDataMap.put(executorId, data)
if (numPendingExecutors > 0) {
numPendingExecutors -= 1
logDebug(s"Decremented number of pending executors ($numPendingExecutors left)")
}
}
// Note: some tests expect the reply to come after we put the executor in the map
context.reply(RegisteredExecutor(executorAddress.host)) //2
listenerBus.post(
SparkListenerExecutorAdded(System.currentTimeMillis(), executorId, data))
makeOffers()
}
}
/**
* Grouping of data for an executor used by CoarseGrainedSchedulerBackend.
*
* @param executorEndpoint The RpcEndpointRef representing this executor
* @param executorAddress The network address of this executor
* @param executorHost The hostname that this executor is running on
* @param freeCores The current number of cores available for work on the executor
* @param totalCores The total number of cores available to the executor
*/
private[cluster] class ExecutorData(
val executorEndpoint: RpcEndpointRef,
val executorAddress: RpcAddress,
override val executorHost: String,
var freeCores: Int,
override val totalCores: Int,
override val logUrlMap: Map[String, String]
) extends ExecutorInfo(executorHost, totalCores, logUrlMap)
The source shows that the Executor (ExecutorBackend) information ends up in executorDataMap. Because this data structure is a member of CoarseGrainedSchedulerBackend, the Executor (ExecutorBackend) is in effect registered with CoarseGrainedSchedulerBackend.
Summary: at run time, DriverEndpoint writes the registration into executorDataMap, so CoarseGrainedSchedulerBackend holds the information for every ExecutorBackend process allocated to the current application. Inside each ExecutorBackend process, an Executor object is responsible for actually running Tasks. Writes to executorDataMap are guarded by the synchronized keyword so that concurrent updates are safe; see marker 1.
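The bookkeeping summarized above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not Spark's code: the class `ExecutorRegistrySketch`, its `Entry` type, and the string replies are hypothetical stand-ins for `CoarseGrainedSchedulerBackend`, `ExecutorData`, and the `RegisteredExecutor` / `RegisterExecutorFailed` messages. For simplicity the whole method is synchronized, whereas the handler shown earlier only synchronizes the map update.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the driver-side bookkeeping; not Spark's real classes.
final class ExecutorRegistrySketch {
    // Simplified counterpart of ExecutorData: host plus core counts.
    static final class Entry {
        final String host;
        final int totalCores;
        int freeCores;

        Entry(String host, int cores) {
            this.host = host;
            this.totalCores = cores;
            this.freeCores = cores; // all cores are free right after registration
        }
    }

    private final Map<String, Entry> executorDataMap = new HashMap<>();
    private int totalCoreCount = 0;

    // Returns a string that stands in for the reply message sent back to the backend.
    public synchronized String register(String executorId, String host, int cores) {
        if (executorDataMap.containsKey(executorId)) {
            return "RegisterExecutorFailed(Duplicate executor ID: " + executorId + ")";
        }
        executorDataMap.put(executorId, new Entry(host, cores));
        totalCoreCount += cores;
        return "RegisteredExecutor(" + host + ")";
    }

    public synchronized int totalCores() {
        return totalCoreCount;
    }

    public synchronized int registeredExecutors() {
        return executorDataMap.size();
    }

    public static void main(String[] args) {
        ExecutorRegistrySketch registry = new ExecutorRegistrySketch();
        System.out.println(registry.register("0", "worker-1", 4));
        System.out.println(registry.register("0", "worker-1", 4)); // same ID a second time
        System.out.println(registry.totalCores());
    }
}
```

The second call with the same ID is rejected, which mirrors the duplicate-ID check at the top of the RegisterExecutor handler.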
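(6). After the Executor registers successfully, the Driver returns a RegisteredExecutor message to CoarseGrainedExecutorBackend
See marker 2 in the code above.
(7). CoarseGrainedExecutorBackend receives the RegisteredExecutor message
When CoarseGrainedExecutorBackend receives the RegisteredExecutor message from DriverEndpoint, it creates the Executor instance, which is the object that actually performs Task computation. On instantiation, the Executor creates a thread pool in which Tasks will be computed.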
override def receive: PartialFunction[Any, Unit] = {
case RegisteredExecutor(hostname) =>
logInfo("Successfully registered with driver")
executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
private[spark] class Executor(
executorId: String,
executorHostname: String,
env: SparkEnv,
userClassPath: Seq[URL] = Nil,
isLocal: Boolean = false)
extends Logging { // some code omitted
// Start worker thread pool
private val threadPool = ThreadUtils.newDaemonCachedThreadPool("Executor task launch worker") // pool of daemon threads; this compute pool lives as long as the Executor
private val executorSource = new ExecutorSource(threadPool, executorId)
/**
* Wrapper over newCachedThreadPool. Thread names are formatted as prefix-ID, where ID
*is a unique, sequentially assigned integer.
*/
def newDaemonCachedThreadPool(prefix: String): ThreadPoolExecutor = {
val threadFactory = namedThreadFactory(prefix)
Executors.newCachedThreadPool(threadFactory).asInstanceOf[ThreadPoolExecutor]
}
/**
* Create a thread factory that names threads with a prefix and also sets the threads to
*daemon.
*/
def namedThreadFactory(prefix: String): ThreadFactory = {
new ThreadFactoryBuilder().setDaemon(true).setNameFormat(prefix + "-%d").build() // thread factory that is then passed to Java's thread pool
}
/** Creates a thread pool that creates new threads as needed, but
* will reuse previously constructed threads when they are
* available, and uses the provided
* ThreadFactory to create new threads when needed.
*/
public static ExecutorService newCachedThreadPool(ThreadFactory threadFactory) { // creates threads as needed; from Executors.java
return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
60L, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>(),
threadFactory);
}
At this point all the preparation on the worker side is done, and the worker waits for the Driver to send Tasks to execute.
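To make the thread pool setup concrete, here is a minimal Java sketch that builds a cached pool of named daemon threads using only the JDK. It is an approximation, not Spark's code: the excerpt above uses Guava's ThreadFactoryBuilder, while the hand-written factory and the names `DaemonPoolSketch`, `namedDaemonFactory`, and `describeWorkerThread` are assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical JDK-only approximation of a "daemon cached thread pool".
final class DaemonPoolSketch {
    // Names threads "<prefix>-<n>" (n starts at 0) and marks them as daemon threads.
    public static ThreadFactory namedDaemonFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger(0);
        return runnable -> {
            Thread thread = new Thread(runnable, prefix + "-" + counter.getAndIncrement());
            thread.setDaemon(true);
            return thread;
        };
    }

    // Runs one job on a fresh cached pool and reports "<thread name>|<isDaemon>".
    public static String describeWorkerThread(String prefix) {
        ExecutorService pool = Executors.newCachedThreadPool(namedDaemonFactory(prefix));
        Callable<String> job = () ->
                Thread.currentThread().getName() + "|" + Thread.currentThread().isDaemon();
        try {
            return pool.submit(job).get();
        } catch (Exception e) {
            throw new IllegalStateException("job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeWorkerThread("Executor task launch worker"));
    }
}
```

Because the threads are daemon threads, an idle pool does not keep the JVM alive, and a cached pool starts with no threads and creates them only when work arrives.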
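2.How the Executor works in detail
(1)Preparing for the Driver to send Tasks
First, the Driver does not send a Task directly to the Executor. The Executor is not a message endpoint and cannot receive messages. The component that actually receives messages is CoarseGrainedExecutorBackend, which the source clearly shows to be a message endpoint (it extends ThreadSafeRpcEndpoint, a kind of RpcEndpoint; see marker 3). So the Driver sends Tasks to the CoarseGrainedExecutorBackend on the Worker side.
private[spark] class CoarseGrainedExecutorBackend(...) // parameters omitted; marker 3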
extends ThreadSafeRpcEndpoint with ExecutorBackend with Logging {
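After CoarseGrainedSchedulerBackend sends RegisteredExecutor to the ExecutorBackend, it calls makeOffers to assign concrete Tasks to the Executors on the workers.
(2)Assigning Executors and sending Tasks
The Driver assigns Executors and ships each Task inside LaunchTask, a case class that carries the serialized Task.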
// Make fake resource offers on all executors
private def makeOffers() {
// Filter out executors under killing
val activeExecutors = executorDataMap.filterKeys(executorIsAlive)
val workOffers = activeExecutors.map { case (id, executorData) =>
new WorkerOffer(id, executorData.executorHost, executorData.freeCores)
}.toSeq
launchTasks(scheduler.resourceOffers(workOffers))
}
// Launch tasks returned by a set of resource offers
private def launchTasks(tasks: Seq[Seq[TaskDescription]]) { // settles which Task goes to which Executor
// some code omitted
executorData.executorEndpoint.send(LaunchTask(new SerializableBuffer(serializedTask)))
}
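The idea behind makeOffers, turning each executor's free cores into an offer and matching pending tasks against those offers, can be sketched as follows. This is a deliberately simplified Java illustration under assumptions of its own: one core per task, round-robin assignment, no locality preferences. The names `ResourceOfferSketch` and `assign` are hypothetical, and this is not how scheduler.resourceOffers is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of matching pending tasks to free executor cores.
final class ResourceOfferSketch {
    // freeCores: executor ID -> free cores. Returns executor ID -> assigned task names.
    public static Map<String, List<String>> assign(Map<String, Integer> freeCores,
                                                   List<String> pendingTasks) {
        Map<String, Integer> remaining = new LinkedHashMap<>(freeCores); // copy, keep order
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        for (String executorId : remaining.keySet()) {
            assignment.put(executorId, new ArrayList<>());
        }
        int next = 0;
        boolean progressed = true;
        // Go round the executors, giving one task per free core, until tasks or cores run out.
        while (next < pendingTasks.size() && progressed) {
            progressed = false;
            for (Map.Entry<String, Integer> offer : remaining.entrySet()) {
                if (next < pendingTasks.size() && offer.getValue() > 0) {
                    assignment.get(offer.getKey()).add(pendingTasks.get(next++));
                    offer.setValue(offer.getValue() - 1);
                    progressed = true;
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        Map<String, Integer> freeCores = new LinkedHashMap<>();
        freeCores.put("exec-0", 2);
        freeCores.put("exec-1", 1);
        System.out.println(assign(freeCores, Arrays.asList("t0", "t1", "t2", "t3")));
    }
}
```

With three free cores and four pending tasks, one task stays unassigned until a later round of offers, which matches the document's point that offers are made again whenever resources change.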
(3)Receiving the Task
case LaunchTask(data) => // the ExecutorBackend side receives the message
if (executor == null) {
logError("Received LaunchTask command but executor was null")
// executor is null, so the process exits
System.exit(1)
} else {
val taskDesc = ser.deserialize[TaskDescription](data.value)
logInfo("Got assigned task " + taskDesc.taskId)
executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
taskDesc.name, taskDesc.serializedTask)
}
(4) Invoking the Executor to run the task
After receiving the message from the Driver, ExecutorBackend hands the task to the Executor by calling launchTask.
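The division of labour between the two objects, with the backend receiving messages and the Executor only being called directly, can be sketched in Java as follows. All names here (`BackendDispatchSketch`, its nested message classes, the string results) are hypothetical stand-ins, and the sketch returns an error string where the real handler logs an error and exits the process.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "the endpoint receives, the executor executes".
final class BackendDispatchSketch {
    // Stand-ins for the RegisteredExecutor and LaunchTask messages.
    static final class RegisteredExecutor {
        final String hostname;
        RegisteredExecutor(String hostname) { this.hostname = hostname; }
    }

    static final class LaunchTask {
        final long taskId;
        LaunchTask(long taskId) { this.taskId = taskId; }
    }

    // A plain object, not an endpoint: it has no way to receive messages itself.
    static final class Executor {
        final List<Long> launched = new ArrayList<>();
        void launchTask(long taskId) { launched.add(taskId); }
    }

    // The endpoint: the only component that messages are delivered to.
    static final class Backend {
        Executor executor; // stays null until registration has succeeded

        String receive(Object message) {
            if (message instanceof RegisteredExecutor) {
                executor = new Executor(); // created only after the driver confirms
                return "registered";
            }
            if (message instanceof LaunchTask) {
                if (executor == null) {
                    return "error: executor was null";
                }
                executor.launchTask(((LaunchTask) message).taskId);
                return "launched";
            }
            return "ignored";
        }
    }

    public static void main(String[] args) {
        Backend backend = new Backend();
        System.out.println(backend.receive(new LaunchTask(1L)));
        System.out.println(backend.receive(new RegisteredExecutor("worker-1")));
        System.out.println(backend.receive(new LaunchTask(7L)));
    }
}
```

A LaunchTask that arrives before registration finds no Executor to delegate to, which is the case the null check in the handler above guards against.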
(5) Running the task
First, the Task is wrapped in a TaskRunner, which is then handed to a thread in the thread pool for execution.
def launchTask(
context: ExecutorBackend,
taskId: Long,
attemptNumber: Int,
taskName: String,
serializedTask: ByteBuffer): Unit = {
// TaskRunner is a concrete implementation of the Runnable interface. It is handed to a
// thread in the pool, which calls its run method to execute the Task
val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,serializedTask)
runningTasks.put(taskId, tr)
threadPool.execute(tr)
}
override def run(): Unit = { // some code omitted
val res = task.run(
taskAttemptId = taskId,
attemptNumber = attemptNumber,
metricsSystem = env.metricsSystem)
}
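When TaskRunner's run method is invoked, it calls the Task's run method, which in turn calls runTask. The concrete Task is either a ShuffleMapTask or a ResultTask, and each has its own execution logic. (Details are covered in later chapters.)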
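The call chain described above (a Runnable wrapper on a pool thread calls run, which calls the subclass's runTask) can be sketched in Java as follows. The class names `TaskRunnerSketch`, `MapSideTask`, and `ResultSideTask`, the `launchAll` helper, and the returned strings are all hypothetical; real Tasks carry serialized closures and report status back through the ExecutorBackend, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical model of wrapping each task in a Runnable and running it on a pool.
final class TaskRunnerSketch {
    // run() is fixed; subclasses supply only runTask().
    abstract static class Task {
        final String run() { return runTask(); }
        protected abstract String runTask();
    }

    static final class MapSideTask extends Task {
        @Override protected String runTask() { return "map-side output written"; }
    }

    static final class ResultSideTask extends Task {
        @Override protected String runTask() { return "result returned to driver"; }
    }

    // Launches every task on a cached pool and waits for all of them to finish.
    public static Map<Long, String> launchAll(List<Task> tasks) {
        ExecutorService threadPool = Executors.newCachedThreadPool();
        Map<Long, Runnable> runningTasks = new ConcurrentHashMap<>();
        Map<Long, String> results = new ConcurrentHashMap<>();
        CountDownLatch done = new CountDownLatch(tasks.size());
        for (int i = 0; i < tasks.size(); i++) {
            final long taskId = i;
            final Task task = tasks.get(i);
            Runnable runner = () -> {
                try {
                    results.put(taskId, task.run());
                } finally {
                    runningTasks.remove(taskId); // the task is no longer running
                    done.countDown();
                }
            };
            runningTasks.put(taskId, runner); // track before handing to the pool
            threadPool.execute(runner);
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } finally {
            threadPool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(launchAll(Arrays.asList(new MapSideTask(), new ResultSideTask())));
    }
}
```

Keeping run final and leaving only runTask to subclasses is the template-method arrangement the paragraph describes: the common wrapper logic lives in one place and each task type supplies its own body.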
II. Spark Executor workflow diagram