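Overview
This post is the last part of the detailed walkthrough in the Spark task scheduling overview. It covers how an Executor executes a task and returns the result to the Driver.
The previous post covered how the Driver creates Tasks and sends them to Executors, and it ended with the Executor receiving a LaunchTask message. This post walks through what happens next: the Executor receives the message, runs the Task, and returns the result.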
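1. Receiving the LaunchTask message
Continuing from the previous post: the Driver sends a LaunchTask message to the CoarseGrainedExecutorBackend on the Worker. CoarseGrainedExecutorBackend receives the message and hands it to the executor it created (see CoarseGrainedExecutorBackend.scala receive() in the appendix).
Executor's launchTask method wraps the received information in a TaskRunner object. TaskRunner implements Runnable, and Executor schedules it on its thread pool, threadPool.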
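The wrap-and-submit pattern described above can be sketched in isolation. This is a minimal model and not Spark's code: MiniExecutor, MiniTaskRunner and MiniTaskDescription are hypothetical names, and the "task" only records which thread ran it.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.collection.concurrent.TrieMap

object LaunchTaskSketch {
  // Hypothetical stand-in for TaskDescription; only an id and a name are modeled.
  final case class MiniTaskDescription(taskId: Long, name: String)

  // Hypothetical stand-in for Executor: owns a thread pool and tracks running tasks.
  class MiniExecutor {
    private val threadPool = Executors.newCachedThreadPool()
    private val runningTasks = TrieMap.empty[Long, Runnable]
    val finished = TrieMap.empty[Long, String]

    // One Runnable per task, mirroring the role of TaskRunner.
    private class MiniTaskRunner(desc: MiniTaskDescription) extends Runnable {
      override def run(): Unit = {
        try {
          finished.put(desc.taskId, s"${desc.name} ran on ${Thread.currentThread.getName}")
        } finally {
          // The task is no longer running, whether it succeeded or threw.
          runningTasks.remove(desc.taskId)
        }
      }
    }

    // Wrap the description in a runner, register it, and hand it to the pool.
    def launchTask(desc: MiniTaskDescription): Unit = {
      val runner = new MiniTaskRunner(desc)
      runningTasks.put(desc.taskId, runner)
      threadPool.execute(runner)
    }

    def runningCount: Int = runningTasks.size

    // Stop accepting tasks and wait for the submitted ones to complete.
    def shutdownAndWait(): Boolean = {
      threadPool.shutdown()
      threadPool.awaitTermination(10, TimeUnit.SECONDS)
    }
  }

  def main(args: Array[String]): Unit = {
    val executor = new MiniExecutor
    (1L to 3L).foreach(id => executor.launchTask(MiniTaskDescription(id, s"task-$id")))
    executor.shutdownAndWait()
    println(s"finished tasks: ${executor.finished.size}")
  }
}
```

Registering the runner before submitting it means the bookkeeping map already knows about the task by the time a pool thread starts running it.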
Next, look at the logic in TaskRunner's run method, which I split into three parts: deserialize task, run task, and send back result.
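2. TaskRunner runs the Task
TaskDescription (its definition is in the appendix):
Deserialization happens in two parts:
- First, a Task object is deserialized, which includes the Partition.
- Second, Task.run() deserializes the RDD and the function to run.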
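2.1 TaskRunner runs the Task: deserialize task
The serializedTask field of the TaskDescription is deserialized to obtain the Task object. Any files and JARs the task depends on are fetched first (see Executor.scala updateDependencies() in the appendix).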
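The two-part deserialization can be illustrated with plain Java serialization. This is only a sketch under assumptions: MiniTask and SumPartition are hypothetical types, and Spark uses its own serializer instances rather than the raw ObjectInputStream calls shown here.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object TwoPhaseDeserializeSketch {
  // Hypothetical function shipped with the task; a named class keeps serialization simple.
  class SumPartition extends (Iterator[Int] => Int) with Serializable {
    override def apply(iter: Iterator[Int]): Int = iter.sum
  }

  // Hypothetical stand-in for a Task: the partition id plus the still-serialized function.
  final case class MiniTask(partitionId: Int, taskBinary: Array[Byte]) {
    // Part 2: the function is only deserialized when the task actually runs.
    def run(data: Iterator[Int]): Int = {
      val func = deserialize[Iterator[Int] => Int](taskBinary)
      func(data)
    }
  }

  def serialize(obj: AnyRef): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    bytes.toByteArray
  }

  def deserialize[T](data: Array[Byte]): T = {
    val in = new ObjectInputStream(new ByteArrayInputStream(data))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // "Driver" side: serialize the function, then the task that carries it.
    val wire = serialize(MiniTask(partitionId = 0, taskBinary = serialize(new SumPartition)))
    // "Executor" side, part 1: recover the task object (partition id included).
    val task = deserialize[MiniTask](wire)
    // "Executor" side, part 2: run() deserializes the function and applies it.
    println(task.run(Iterator(1, 2, 3)))
  }
}
```

Keeping the function as opaque bytes inside the task lets the receiver set up its environment (for example, fetch dependencies) before the second deserialization happens.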
2.2 TaskRunner运行Task之run task
如上图注释,调用Task的run方法执行计算,Task是抽象类
,其实现类有两个,ShuffleMapTask
和ResultTask
,分别对应shuffle
和非shuffle
任务。
<TODO: fill in Task.scala, the run method, etc.>
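Task's run method (skipped here) calls runTask to execute the task. Take the subclass ResultTask as an example. ShuffleMapTask has one extra step compared with ResultTask: it uses a ShuffleWriter to write its output to local storage. Both runTask implementations are listed in the appendix.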
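To make the difference between the two subclasses concrete, here is a simplified model. All names are hypothetical (MiniTask, MiniResultTask, MiniShuffleMapTask, MiniMapStatus), the "shuffle output" is an in-memory map rather than local files, and bucketing by hash code is an assumption made for the sketch.

```scala
import scala.collection.mutable

object RunTaskSketch {
  // Hypothetical counterpart of Task: run() delegates to the subclass's runTask().
  abstract class MiniTask[T](val partitionId: Int) {
    def runTask(): T
    final def run(): T = runTask()
  }

  // Result-style task: apply func to the partition's records and return the value.
  class MiniResultTask[T, U](pid: Int, data: () => Iterator[T], func: Iterator[T] => U)
      extends MiniTask[U](pid) {
    override def runTask(): U = func(data())
  }

  // Stand-in for MapStatus: how many records ended up in each reduce bucket.
  final case class MiniMapStatus(recordsPerBucket: Map[Int, Int])

  // Shuffle-style task: the extra step is writing bucketed output for later stages.
  class MiniShuffleMapTask[K, V](
      pid: Int,
      data: () => Iterator[(K, V)],
      numReducers: Int,
      shuffleStore: mutable.Map[(Int, Int), Vector[(K, V)]])
      extends MiniTask[MiniMapStatus](pid) {
    override def runTask(): MiniMapStatus = {
      // Assign each record to a bucket by the hash of its key.
      val buckets = data().toVector.groupBy { case (k, _) => Math.floorMod(k.hashCode, numReducers) }
      // "Write" each bucket, keyed by (map partition, reduce bucket).
      buckets.foreach { case (bucket, records) => shuffleStore((pid, bucket)) = records }
      MiniMapStatus(buckets.map { case (bucket, records) => bucket -> records.size })
    }
  }

  def main(args: Array[String]): Unit = {
    val result = new MiniResultTask[Int, Int](0, () => Iterator(1, 2, 3), _.sum).run()
    println(s"result task returned $result")

    val store = mutable.Map.empty[(Int, Int), Vector[(String, Int)]]
    val status = new MiniShuffleMapTask[String, Int](
      0, () => Iterator(("a", 1), ("b", 2), ("a", 3)), 2, store).run()
    println(s"shuffle task wrote ${status.recordsPerBucket.values.sum} records")
  }
}
```

The result-style task returns the computed value itself, while the shuffle-style task returns only metadata about where its output went.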
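To see what the func in ResultTask's runTask ends up consuming, take RDD's map method as an example: the function passed to map is not executed when map is called. It is applied lazily, element by element, when the task consumes rdd.iterator(partition, context).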
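The relationship between a map function and func can be modeled with a tiny iterator pipeline. This is a sketch, not Spark's RDD: MiniRDD, SeqRDD, MappedRDD and runResultTask are hypothetical names, and TaskContext is left out.

```scala
object MapFuncSketch {
  // Hypothetical miniature of an RDD: each partition is computed lazily as an Iterator.
  abstract class MiniRDD[T] {
    def numPartitions: Int
    def compute(partition: Int): Iterator[T]
    def iterator(partition: Int): Iterator[T] = compute(partition)
    // map only records the function in a new MiniRDD; nothing is computed yet.
    def map[U](f: T => U): MiniRDD[U] = new MappedRDD(this, f)
  }

  // Source data, already split into partitions.
  class SeqRDD[T](partitions: Vector[Vector[T]]) extends MiniRDD[T] {
    def numPartitions: Int = partitions.size
    def compute(partition: Int): Iterator[T] = partitions(partition).iterator
  }

  // The mapped RDD pulls from its parent's iterator and applies f element by element.
  class MappedRDD[T, U](parent: MiniRDD[T], f: T => U) extends MiniRDD[U] {
    def numPartitions: Int = parent.numPartitions
    def compute(partition: Int): Iterator[U] = parent.iterator(partition).map(f)
  }

  // What a result-style task does for one partition: func consumes the iterator.
  def runResultTask[T, U](rdd: MiniRDD[T], partition: Int, func: Iterator[T] => U): U =
    func(rdd.iterator(partition))

  def main(args: Array[String]): Unit = {
    val rdd = new SeqRDD(Vector(Vector(1, 2), Vector(3, 4))).map(_ * 10)
    // Sum each partition separately, as one task per partition would.
    val sums = (0 until rdd.numPartitions).map { p =>
      runResultTask[Int, Int](rdd, p, (it: Iterator[Int]) => it.sum)
    }
    println(sums)
  }
}
```

In this model func is supplied by whoever asks for a result, while the function given to map lives inside the RDD chain and runs only as func pulls elements through the iterator.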
At this point the task's computation is complete, and the task's run method returns the result.
2.3 TaskRunner runs the Task: send back result
The result is serialized and then handled according to its size. Finally, CoarseGrainedExecutorBackend's statusUpdate method is called to return the result to the Driver.
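The size-based handling can be sketched as a three-way decision. As far as I understand the Spark 2.x sources, a result above the driver's maximum result size is dropped, a result too large to send directly is stored as a block and only a reference is sent, and anything smaller is sent inline. The sketch below is a simplified model of that idea: the types, the two threshold parameters and the blockStore map are all hypothetical.

```scala
import scala.collection.mutable

object ResultSizeSketch {
  // Hypothetical result envelopes; Spark's own result classes are not modeled exactly.
  sealed trait MiniTaskResult
  final case class Direct(bytes: Array[Byte]) extends MiniTaskResult
  final case class Indirect(blockId: String, size: Int) extends MiniTaskResult
  final case class Dropped(size: Int) extends MiniTaskResult

  // Decide how a serialized result travels back, based only on its size.
  // maxResultSize and maxDirectResultSize are placeholder thresholds for this sketch.
  def chooseResult(
      taskId: Long,
      serialized: Array[Byte],
      maxResultSize: Int,
      maxDirectResultSize: Int,
      blockStore: mutable.Map[String, Array[Byte]]): MiniTaskResult = {
    val size = serialized.length
    if (maxResultSize > 0 && size > maxResultSize) {
      // Too large to return at all: report the size only.
      Dropped(size)
    } else if (size > maxDirectResultSize) {
      // Too large for one message: store it locally and send a reference.
      val blockId = s"taskresult_$taskId"
      blockStore(blockId) = serialized
      Indirect(blockId, size)
    } else {
      // Small enough to travel inside the status update itself.
      Direct(serialized)
    }
  }

  def main(args: Array[String]): Unit = {
    val store = mutable.Map.empty[String, Array[Byte]]
    Seq(5, 50, 500).zipWithIndex.foreach { case (size, id) =>
      val kind = chooseResult(id.toLong, new Array[Byte](size), 100, 10, store) match {
        case Direct(_)      => "direct"
        case Indirect(b, _) => s"indirect via $b"
        case Dropped(s)     => s"dropped ($s bytes)"
      }
      println(s"$size bytes -> $kind")
    }
  }
}
```

Whichever branch is taken, the chosen envelope is what gets serialized and passed to statusUpdate.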
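When the Driver receives the StatusUpdate message (see CoarseGrainedSchedulerBackend.scala receive() in the appendix), it does the following:
- Updates the related state in TaskSchedulerImpl.
- If the Task has reached a finished state (TaskState.isFinished), the cores it occupied are free again, so the Driver keeps scheduling Tasks on this Executor:
- 2.1 Update the free resources on this Executor.
- 2.2 Make a fake resource offer on it, so that other Tasks can run on this Executor. Note:
The earlier call, makeOffers(), makes fake resource offers on all Executors. That happens at startup, when every Executor is idle.
Here the call is makeOffers(executorId), because this Executor has just finished a task and has free cores, so it alone is offered more Tasks.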
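The free-then-reoffer cycle can be modeled in a few lines. This is a sketch with hypothetical names (MiniSchedulerBackend, ExecutorData, a simple queue of pending task names), not the scheduler's real data structures, and the "finished" flag stands in for TaskState.isFinished.

```scala
import scala.collection.mutable

object StatusUpdateSketch {
  // Hypothetical per-executor bookkeeping: only free cores are tracked.
  final class ExecutorData(var freeCores: Int)

  class MiniSchedulerBackend(cpusPerTask: Int) {
    val executorDataMap = mutable.Map.empty[String, ExecutorData]
    val pendingTasks = mutable.Queue.empty[String]
    val launched = mutable.ArrayBuffer.empty[(String, String)]

    // Offer one executor's free cores and launch as many pending tasks as fit.
    def makeOffers(executorId: String): Unit = {
      executorDataMap.get(executorId).foreach { exec =>
        while (exec.freeCores >= cpusPerTask && pendingTasks.nonEmpty) {
          exec.freeCores -= cpusPerTask
          launched += ((executorId, pendingTasks.dequeue()))
        }
      }
    }

    // Offer every known executor, as happens when all of them are idle at startup.
    def makeOffers(): Unit = executorDataMap.keys.toList.foreach(id => makeOffers(id))

    // A finished task frees its cores, then only that executor is re-offered.
    def statusUpdate(executorId: String, finished: Boolean): Unit = {
      if (finished) {
        executorDataMap.get(executorId) match {
          case Some(exec) =>
            exec.freeCores += cpusPerTask
            makeOffers(executorId)
          case None => () // unknown executor: ignore the update
        }
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val backend = new MiniSchedulerBackend(cpusPerTask = 1)
    backend.executorDataMap("exec-1") = new ExecutorData(freeCores = 1)
    backend.pendingTasks ++= Seq("t1", "t2")
    backend.makeOffers()                                // launches t1, no cores left
    backend.statusUpdate("exec-1", finished = true)     // t1 done, so t2 is launched
    println(backend.launched)
  }
}
```

Re-offering only the executor that reported back avoids rescanning executors whose resources have not changed.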
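Summary
The flow from the Executor receiving a task to sending the result back to the Driver is as follows:
- Path ①: the Executor executes the task.
- Path ②: the result is returned to the Driver. The Driver then calls TaskScheduler to process the returned result, which is not covered here.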
Appendix
---------------------CoarseGrainedExecutorBackend.scala receive()-------------
case LaunchTask(data) =>
  if (executor == null) {
    exitExecutor(1, "Received LaunchTask command but executor was null")
  } else {
    val taskDesc = TaskDescription.decode(data.value)
    logInfo("Got assigned task " + taskDesc.taskId)
    executor.launchTask(this, taskDesc)
  }
--------------------TaskDescription--------------------------------------------
private[spark] class TaskDescription(
    val taskId: Long,
    val attemptNumber: Int,
    val executorId: String,
    val name: String,
    val index: Int, // Index within this task's TaskSet
    val partitionId: Int,
    val addedFiles: Map[String, Long],
    val addedJars: Map[String, Long],
    val properties: Properties,
    val serializedTask: ByteBuffer) {
  override def toString: String = "TaskDescription(TID=%d, index=%d)".format(taskId, index)
}
----------------------------Executor.scala updateDependencies()-------------------------
// If we receive a new set of files and JARs from the SparkContext, download any missing
// dependencies, and add any new JARs we fetch to the class loader.
private def updateDependencies(newFiles: Map[String, Long], newJars: Map[String, Long]) {
  lazy val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
  synchronized {
    // Fetch missing dependencies
    for ((name, timestamp) <- newFiles if currentFiles.getOrElse(name, -1L) < timestamp) {
      logInfo("Fetching " + name + " with timestamp " + timestamp)
      // Fetch file with useCache mode, close cache for local mode.
      Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
      currentFiles(name) = timestamp
    }
    for ((name, timestamp) <- newJars) {
      val localName = new URI(name).getPath.split("/").last
      val currentTimeStamp = currentJars.get(name)
        .orElse(currentJars.get(localName))
        .getOrElse(-1L)
      if (currentTimeStamp < timestamp) {
        logInfo("Fetching " + name + " with timestamp " + timestamp)
        // Fetch file with useCache mode, close cache for local mode.
        Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
          env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
        currentJars(name) = timestamp
        // Add it to our class loader
        val url = new File(SparkFiles.getRootDirectory(), localName).toURI.toURL
        if (!urlClassLoader.getURLs().contains(url)) {
          logInfo("Adding " + url + " to class loader")
          urlClassLoader.addURL(url)
        }
      }
    }
  }
}
--------------------ResultTask.scala runTask()-------------------------------
override def runTask(context: TaskContext): U = {
  val threadMXBean = ManagementFactory.getThreadMXBean
  val deserializeStartTime = System.currentTimeMillis()
  val deserializeStartCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
    threadMXBean.getCurrentThreadCpuTime
  } else 0L
  // Deserialize taskBinary (a broadcast variable) to obtain the RDD and func.
  // For func, see the map example in section 2.2.
  val ser = SparkEnv.get.closureSerializer.newInstance()
  val (rdd, func) = ser.deserialize[(RDD[T], (TaskContext, Iterator[T]) => U)](
    ByteBuffer.wrap(taskBinary.value), Thread.currentThread.getContextClassLoader)
  _executorDeserializeTime = System.currentTimeMillis() - deserializeStartTime
  _executorDeserializeCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
    threadMXBean.getCurrentThreadCpuTime - deserializeStartCpuTime
  } else 0L
  func(context, rdd.iterator(partition, context))
}
-------------------ShuffleMapTask.scala runTask()--------------------------
override def runTask(context: TaskContext): MapStatus = {
  // Deserialize the RDD using the broadcast variable.
  val threadMXBean = ManagementFactory.getThreadMXBean
  val deserializeStartTime = System.currentTimeMillis()
  val deserializeStartCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
    threadMXBean.getCurrentThreadCpuTime
  } else 0L
  val ser = SparkEnv.get.closureSerializer.newInstance()
  val (rdd, dep) = ser.deserialize[(RDD[_], ShuffleDependency[_, _, _])](
    ByteBuffer.wrap(taskBinary.value), Thread.currentThread.getContextClassLoader)
  _executorDeserializeTime = System.currentTimeMillis() - deserializeStartTime
  _executorDeserializeCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
    threadMXBean.getCurrentThreadCpuTime - deserializeStartCpuTime
  } else 0L
  var writer: ShuffleWriter[Any, Any] = null
  try {
    val manager = SparkEnv.get.shuffleManager
    writer = manager.getWriter[Any, Any](dep.shuffleHandle, partitionId, context)
    writer.write(rdd.iterator(partition, context).asInstanceOf[Iterator[_ <: Product2[Any, Any]]])
    writer.stop(success = true).get
  } catch {
    case e: Exception =>
      try {
        if (writer != null) {
          writer.stop(success = false)
        }
      } catch {
        case e: Exception =>
          log.debug("Could not stop writer", e)
      }
      throw e
  }
}
---------------------CoarseGrainedExecutorBackend.scala statusUpdate()------------------
override def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer) {
  val msg = StatusUpdate(executorId, taskId, state, data)
  driver match {
    case Some(driverRef) => driverRef.send(msg)
    case None => logWarning(s"Drop $msg because has not yet connected to driver")
  }
}
----------------------CoarseGrainedSchedulerBackend.scala receive()---------------
override def receive: PartialFunction[Any, Unit] = {
  // Message sent back by the Executor after it runs a task; data is the task's result data
  case StatusUpdate(executorId, taskId, state, data) =>
    scheduler.statusUpdate(taskId, state, data.value)
    if (TaskState.isFinished(state)) {
      executorDataMap.get(executorId) match {
        case Some(executorInfo) =>
          // Update the free resources on this Executor
          executorInfo.freeCores += scheduler.CPUS_PER_TASK
          makeOffers(executorId) // make a fake resource offer on it
        case None =>
          // Ignoring the update since we don't know about the executor.
          logWarning(s"Ignored task status update ($taskId state $state) " +
            s"from unknown executor with ID $executorId")
      }
    }