Preface
The previous post analyzed how the Task Scheduler splits work into Tasks: the tasks of each stage are wrapped into TaskSets in TaskSchedulerImpl, then serialized by CoarseGrainedSchedulerBackend, whose DriverEndpoint sends them to the Executors for launch. This post analyzes how an Executor executes a task.
Note: the Spark source version used here is 2.3.0 and the IDE is IntelliJ IDEA 2019. Readers interested in the source can click here to download the source package; just unzip it and open it directly in IDEA.
Main Content
1. The Executor receives the LaunchTask message
After CoarseGrainedSchedulerBackend sends a LaunchTask message (executorData.executorEndpoint.send(LaunchTask(new SerializableBuffer(serializedTask)))), the executor-side endpoint (CoarseGrainedExecutorBackend) receives the message and handles the LaunchTask case:
case LaunchTask(data) =>
  if (executor == null) {
    exitExecutor(1, "Received LaunchTask command but executor was null")
  } else {
    val taskDesc = TaskDescription.decode(data.value)
    logInfo("Got assigned task " + taskDesc.taskId)
    executor.launchTask(this, taskDesc)
  }
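For reference, the LaunchTask message itself is only a thin wrapper around the serialized task bytes. In Spark 2.3.0 it is defined in CoarseGrainedClusterMessages roughly as follows:

// The message carries just the serialized TaskDescription, wrapped in a
// SerializableBuffer so the ByteBuffer survives RPC serialization.
case class LaunchTask(data: SerializableBuffer) extends CoarseGrainedClusterMessage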
2. The Executor creates a TaskRunner
executor.launchTask(this, taskDesc) hands the task description to the Executor, which wraps it in a TaskRunner (val tr = new TaskRunner(context, taskDescription)) and submits the runner to a thread pool, as sketched below. The TaskRunner deserializes the Task, and its run() method eventually calls task.run() to compute the result.
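A minimal sketch of launchTask, abridged from Executor.scala in Spark 2.3.0 (runningTasks is the Executor's map of in-flight tasks, threadPool its task thread pool):

// Executor.launchTask (abridged): wrap the task in a TaskRunner, record it
// as running, and hand it to the executor's thread pool for execution.
def launchTask(context: ExecutorBackend, taskDescription: TaskDescription): Unit = {
  val tr = new TaskRunner(context, taskDescription)
  runningTasks.put(taskDescription.taskId, tr)
  threadPool.execute(tr)
}

Inside TaskRunner.run(), the Task is then deserialized from the TaskDescription: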
task = ser.deserialize[Task[Any]](
  taskDescription.serializedTask, Thread.currentThread.getContextClassLoader)
task.localProperties = taskDescription.properties
task.setTaskMemoryManager(taskMemoryManager)
It then executes the task.run() method:
val res = task.run(
  taskAttemptId = taskId,
  attemptNumber = taskDescription.attemptNumber,
  metricsSystem = env.metricsSystem)
threwException = false
res
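Before moving on, it is worth a quick look at what task.run itself does: it registers the task with the BlockManager, builds a TaskContextImpl for this attempt, and then delegates to runTask. A simplified sketch of Task.run from Spark 2.3.0, with the error handling and metrics bookkeeping omitted:

// Task.run (abridged): set up the TaskContext for this attempt, then
// delegate the actual work to the subclass's runTask implementation.
final def run(taskAttemptId: Long, attemptNumber: Int, metricsSystem: MetricsSystem): T = {
  SparkEnv.get.blockManager.registerTask(taskAttemptId)
  context = new TaskContextImpl(
    stageId, stageAttemptId, partitionId, taskAttemptId, attemptNumber,
    taskMemoryManager, localProperties, metricsSystem, metrics)
  TaskContext.setTaskContext(context)
  try {
    runTask(context)
  } finally {
    // failure callbacks and memory cleanup elided
    context.markTaskCompleted(None)
  }
}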
3. Executing the runTask() method
As the sketch above shows, calling task.run() ends up calling runTask(). runTask() has no implementation in the Task base class, because there are two kinds of Task, ShuffleMapTask and ResultTask, and each provides its own runTask(). Let's first look at ShuffleMapTask's runTask() method.
override def runTask(context: TaskContext): MapStatus = {
  // Deserialize the RDD using the broadcast variable.
  val threadMXBean = ManagementFactory.getThreadMXBean
  val deserializeStartTime = System.currentTimeMillis()
  val deserializeStartCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
    threadMXBean.getCurrentThreadCpuTime
  } else 0L
  val ser = SparkEnv.get.closureSerializer.newInstance()
  val (rdd, dep) = ser.deserialize[(RDD[_], ShuffleDependency[_, _, _])](
    ByteBuffer.wrap(taskBinary.value), Thread.currentThread.getContextClassLoader)
  _executorDeserializeTime = System.currentTimeMillis() - deserializeStartTime
  _executorDeserializeCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
    threadMXBean.getCurrentThreadCpuTime - deserializeStartCpuTime
  } else 0L
  var writer: ShuffleWriter[Any, Any] = null
  try {
    val manager = SparkEnv.get.shuffleManager
    writer = manager.getWriter[Any, Any](dep.shuffleHandle, partitionId, context)
    writer.write(rdd.iterator(partition, context).asInstanceOf[Iterator[_ <: Product2[Any, Any]]])
    writer.stop(success = true).get
  } catch {
    case e: Exception =>
      try {
        if (writer != null) {
          writer.stop(success = false)
        }
      } catch {
        case e: Exception =>
          log.debug("Could not stop writer", e)
      }
      throw e
  }
}
The line val (rdd, dep) = ser.deserialize[(RDD[_], ShuffleDependency[_, _, _])](...) recovers the stage's last RDD and its shuffle dependency from the deserialized task binary. The method then gets the ShuffleManager from SparkEnv and asks it for a ShuffleWriter. The ShuffleWriter is the component that writes the task's computed results to local disk, which makes it very important. There are three kinds of ShuffleWriter, which won't be detailed here (the sketch below shows how one is chosen); on the read side there is also a ShuffleReader responsible for fetching the data the ShuffleWriters wrote, which will be covered in detail later. Finally, writer.write() iterates over the partition's records and writes them to disk.
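To make the "three kinds" concrete: in Spark 2.3.0 the default SortShuffleManager chooses the writer based on the shuffle handle that was registered for the dependency. An abridged sketch of SortShuffleManager.getWriter (the constructor arguments of the first two writers are elided here):

// SortShuffleManager.getWriter (Spark 2.3.0, abridged): the handle type,
// decided when the shuffle was registered, selects the writer implementation.
override def getWriter[K, V](
    handle: ShuffleHandle,
    mapId: Int,
    context: TaskContext): ShuffleWriter[K, V] = {
  handle match {
    case unsafeShuffleHandle: SerializedShuffleHandle[K @unchecked, V @unchecked] =>
      new UnsafeShuffleWriter(...)          // writes serialized records directly
    case bypassMergeSortHandle: BypassMergeSortShuffleHandle[K @unchecked, V @unchecked] =>
      new BypassMergeSortShuffleWriter(...) // one file per reduce partition, no map-side sort
    case other: BaseShuffleHandle[K @unchecked, V @unchecked, _] =>
      new SortShuffleWriter(shuffleBlockResolver, other, mapId, context) // general sort-based path
  }
}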
Summary
The walkthrough of how an Executor runs a Task above is fairly brief, but it touches on quite a few deeper topics, such as SparkEnv and the ShuffleWriter. The next posts will introduce these topics one by one.