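This post explains how a downstream stage obtains the data written by its upstream stage, that is, how shuffle output is tracked. The component responsible for this is MapOutputTracker. The upstream stage writes information into MapOutputTracker: each upstream task produces a MapStatus that records where its output lives and how large each partition's data is in the output file, so this post is really about how a MapStatus ends up in MapOutputTracker. The downstream stage then queries MapOutputTracker to find the data it has to process.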
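To make the register-then-look-up idea concrete, here is a minimal sketch of a tracker. It is not Spark's MapOutputTracker: `ToyMapStatus`, `ToyMapOutputTracker` and all of their methods are hypothetical names invented for illustration, and the real component also handles serialization, epochs and fetch failures.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for a MapStatus:
// where the map output lives and how many bytes it holds per reduce partition.
final case class ToyMapStatus(location: String, partitionSizes: Array[Long])

// Hypothetical tracker: upstream tasks register, downstream tasks look up.
final class ToyMapOutputTracker {
  // shuffleId -> one slot per map task, empty until that task reports
  private val statuses = mutable.Map.empty[Int, Array[Option[ToyMapStatus]]]

  def registerShuffle(shuffleId: Int, numMaps: Int): Unit =
    statuses(shuffleId) = Array.fill[Option[ToyMapStatus]](numMaps)(None)

  def registerMapOutput(shuffleId: Int, mapId: Int, status: ToyMapStatus): Unit =
    statuses(shuffleId)(mapId) = Some(status)

  // For one reduce partition, list (location, mapId, size) for every map output registered so far.
  def outputsFor(shuffleId: Int, reduceId: Int): Seq[(String, Int, Long)] =
    statuses(shuffleId).toSeq.zipWithIndex.collect {
      case (Some(s), mapId) => (s.location, mapId, s.partitionSizes(reduceId))
    }
}
```

A downstream task for reduce partition 1 would call `outputsFor(shuffleId, 1)` and then fetch each listed block from its location.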
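The story starts when a task is submitted: the Executor wraps the task description it receives in a TaskRunner object and adds it to a thread pool. When the task executes, the TaskRunner's run() method is invoked.
- TaskRunner executes run(), which contains two important pieces of code.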
val value = Utils.tryWithSafeFinally { // 1
  val res = task.run( // 1
    taskAttemptId = taskId,
    attemptNumber = taskDescription.attemptNumber,
    metricsSystem = env.metricsSystem)
  threwException = false
  res
}
// ... several intermediate steps wrap value into serializedResult.
// Send the result to the Driver
execBackend.statusUpdate(taskId, TaskState.FINISHED, serializedResult) // 2
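The two marked steps boil down to "run the task, serialize whatever it returned, report it". The sketch below shows that shape only. `TaskRunnerSketch`, `Backend`, `serialize` and `runAndReport` are hypothetical names, plain Java serialization stands in for Spark's serializer, and the task state is a plain string rather than Spark's TaskState.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import java.nio.ByteBuffer

object TaskRunnerSketch {
  // Hypothetical stand-in for the executor backend that talks to the driver.
  trait Backend {
    def statusUpdate(taskId: Long, state: String, data: ByteBuffer): Unit
  }

  // Assumption: plain Java serialization in place of Spark's own serializer.
  def serialize(value: java.io.Serializable): ByteBuffer = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(value) finally out.close()
    ByteBuffer.wrap(bytes.toByteArray)
  }

  def runAndReport(taskId: Long, backend: Backend)(body: => java.io.Serializable): Unit = {
    val value = body                                            // step 1: run the task
    val serializedResult = serialize(value)                     // wrap the result as bytes
    backend.statusUpdate(taskId, "FINISHED", serializedResult)  // step 2: report to the driver
  }
}
```

For a ShuffleMapTask, the value produced in step 1 is the MapStatus, which is why it travels to the driver inside the status update.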
How the MapStatus is obtained
- Task's run() method calls ShuffleMapTask's runTask().
writer.write(rdd.iterator(partition, context).asInstanceOf[Iterator[_ <: Product2[Any, Any]]])
writer.stop(success = true).get
The write method of SortShuffleWriter:
// Returns a status object containing the shuffle server ID and the length of each partition's data in the file
mapStatus = MapStatus(blockManager.shuffleServerId, partitionLengths)
So ShuffleMapTask's runTask() returns a MapStatus object to TaskRunner's run().
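Since the MapStatus is built from `partitionLengths`, it helps to see how per-partition lengths relate to positions in a single output file. The sketch below is an illustration under assumptions: `SimpleMapStatus`, `sizeForBlock` and `offsets` are hypothetical names, and the location is a plain string rather than Spark's block manager identifier.

```scala
object MapStatusSketch {
  // Hypothetical simplified MapStatus: where the output lives, and bytes per reduce partition.
  final case class SimpleMapStatus(location: String, partitionLengths: Array[Long]) {
    def sizeForBlock(reduceId: Int): Long = partitionLengths(reduceId)
  }

  // Cumulative offsets derived from the lengths:
  // partition i occupies the byte range [offsets(i), offsets(i + 1)) in the file.
  def offsets(partitionLengths: Array[Long]): Array[Long] =
    partitionLengths.scanLeft(0L)(_ + _)
}
```

With lengths 100, 0 and 250, the derived offsets are 0, 100, 100 and 350, so an empty partition simply has equal start and end offsets.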
How the MapStatus is sent to the Driver
- Call the statusUpdate method
// CoarseGrainedExecutorBackend.scala
override def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer) {
  val msg = StatusUpdate(executorId, taskId, state, data)
  driver match {
    // Send the message to the Driver
    case Some(driverRef) => driverRef.send(msg) // 1
    case None => logWarning(s"Drop $msg because has not yet connected to driver")
  }
}
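The pattern here is a fire-and-forget send guarded by an Option: if the driver reference has not been set yet, the update is dropped with a warning instead of failing. A minimal sketch of that pattern, using hypothetical `StatusUpdate` and `DriverRef` types rather than Spark's RPC classes:

```scala
object ExecutorBackendSketch {
  // Hypothetical message and endpoint types; Spark's real ones are richer.
  final case class StatusUpdate(executorId: String, taskId: Long, state: String, data: Array[Byte])
  trait DriverRef { def send(msg: Any): Unit }

  // Returns true if the message was handed to the driver ref, false if it was dropped.
  def statusUpdate(driver: Option[DriverRef], msg: StatusUpdate): Boolean =
    driver match {
      case Some(ref) =>
        ref.send(msg) // one-way send, no reply expected
        true
      case None =>
        println(s"Dropping update for task ${msg.taskId}: not yet connected to driver")
        false
    }
}
```

The Boolean return value exists only to make the sketch easy to check; the method in the post returns nothing.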
- The Driver receives the message
// CoarseGrainedSchedulerBackend.scala
override def receive: PartialFunction[Any, Unit] = {
  case StatusUpdate(executorId, taskId, state, data) =>
    scheduler.statusUpdate(taskId, state, data.value)
    if (TaskState.isFinished(state)) {
      executorDataMap.get(executorId) match {
        case Some(executorInfo) =>
          executorInfo.freeCores += scheduler.CPUS_PER_TASK
          makeOffers(executorId)
        case None =>
          // Ignoring the update since we don't know about the executor.
          logWarning(s"Ignored task status update ($taskId state $state) " +
            s"from unknown executor with ID $executorId")
      }
    }
}
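Besides forwarding the update to the scheduler, the driver does some bookkeeping: once a task reaches a finished state, the cores it held are returned to its executor so new tasks can be offered. A small sketch of that accounting follows. `ExecutorData`, `cpusPerTask`, `finishedStates` and `onStatusUpdate` are hypothetical, states are plain strings, and the set of finished states is my assumption about what `TaskState.isFinished` covers.

```scala
import scala.collection.mutable

object DriverAccountingSketch {
  // Hypothetical per-executor record; only the free core count matters here.
  final class ExecutorData(var freeCores: Int)

  val cpusPerTask = 1
  // Assumption: the states treated as terminal.
  val finishedStates = Set("FINISHED", "FAILED", "KILLED", "LOST")

  // Returns true if cores were released, false if the update was ignored.
  def onStatusUpdate(
      executors: mutable.Map[String, ExecutorData],
      executorId: String,
      state: String): Boolean =
    if (finishedStates.contains(state)) {
      executors.get(executorId) match {
        case Some(info) =>
          info.freeCores += cpusPerTask // the finished task gives its cores back
          true
        case None =>
          false // unknown executor: ignore the update
      }
    } else {
      false // task still running: nothing to release
    }
}
```

In the real code the release is followed by `makeOffers(executorId)`, which is where the freed cores get handed to waiting tasks.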
- TaskSchedulerImpl's statusUpdate() method is called
// TaskSchedulerImpl.scala
def statusUpdate(tid: Long, state: TaskState, serializedData: ByteBuffer) {
  var failedExecutor: Option[String] = None
  var reason: Option[ExecutorLossReason] = None
  synchronized {
    try {
      Option(taskIdToTaskSetManager.get(tid)) match {
        case Some(taskSet) =>
          if (state == TaskState.LOST) {
            // TaskState.LOST is only used by the deprecated Mesos fine-grained scheduling mode,
            // where each executor corresponds to a single task, so mark the executor as failed.
            val execId = taskIdToExecutorId.getOrElse(tid, throw new Illeg