Principle
Executor:
def launchTask(context: ExecutorBackend, taskDescription: TaskDescription): Unit = {
// Instantiate a TaskRunner to execute this task
val tr = new TaskRunner(context, taskDescription)
// Add the task to the map of currently running tasks
runningTasks.put(taskDescription.taskId, tr)
/**
 * This is where the task is ultimately run:
 * the worker's thread pool executes org.apache.spark.executor.Executor.TaskRunner.run(),
 * an inner class defined in this same file.
 */
threadPool.execute(tr)
}
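As an aside, the threadPool above is a cached pool of daemon worker threads (Spark builds it through a helper along the lines of ThreadUtils.newDaemonCachedThreadPool("Executor task launch worker"); treat that exact name as an assumption from the Spark 2.x source). A minimal plain-JDK sketch of the same idea:

import java.util.concurrent.{Executors, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger

// Daemon threads so the JVM can exit even while idle workers remain;
// a cached pool grows and shrinks with the number of concurrently running tasks.
val taskWorkerFactory: ThreadFactory = new ThreadFactory {
  private val counter = new AtomicInteger(0)
  override def newThread(r: Runnable): Thread = {
    val t = new Thread(r, s"Executor task launch worker-${counter.getAndIncrement()}")
    t.setDaemon(true)
    t
  }
}
val threadPool = Executors.newCachedThreadPool(taskWorkerFactory)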
/**
 * This method is the main entry point for executing a task on the Executor. After the task
 * finishes, the Executor notifies the Driver by sending a StatusUpdate message that sets the
 * task's state to TaskState.FINISHED.
 *
 * While running the task, the Executor wraps the computed result in
 * org.apache.spark.scheduler.DirectTaskResult. When the result is shipped back to the Driver,
 * the strategy depends on its size: large results are stored in
 * org.apache.spark.storage.BlockManager keyed by taskId, while small results are sent back
 * directly. Historically the transfer went through Akka, so the directly returnable size was
 * bounded by spark.akka.frameSize (default 128, in MB, i.e. the largest message Akka could
 * carry was 128 MB). Akka also reserved some space (about 200 KB) for other data, so a result
 * smaller than 128 MB - 200 KB could be returned directly; otherwise, as long as it did not
 * exceed spark.driver.maxResultSize (default 1g), it was handed over via the BlockManager.
 * Details are covered in the Executor module. The full decision is:
 * (1) result larger than 1 GB: dropped;
 * (2) result at most 1 GB but larger than 128 MB - 200 KB: recorded in the BlockManager,
 *     keyed by its taskId, along with other metadata;
 * (3) result smaller than 128 MB - 200 KB: returned directly.
 * */
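To make that three-way decision concrete, here is a minimal, self-contained sketch of the size check (the constants and the helper name are illustrative, not the exact Spark fields; the real values come from spark.driver.maxResultSize and the RPC frame size minus a reserved buffer):

// Illustrative thresholds only.
val maxResultSize: Long       = 1L << 30                      // 1 GB (spark.driver.maxResultSize default)
val maxDirectResultSize: Long = (128L << 20) - (200L << 10)   // 128 MB - 200 KB

def resultRoute(resultSize: Long): String =
  if (maxResultSize > 0 && resultSize > maxResultSize) "drop (too large, report size only)"
  else if (resultSize > maxDirectResultSize)           "indirect (store in BlockManager, send block id)"
  else                                                 "direct (send bytes with the status update)"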
override def run(): Unit = {
threadId = Thread.currentThread.getId
Thread.currentThread.setName(threadName)
// Returns the managed bean for the JVM's thread system.
val threadMXBean = ManagementFactory.getThreadMXBean
// Create a memory manager for this task => manages the memory allocated by a single task.
val taskMemoryManager = new TaskMemoryManager(env.memoryManager, taskId)
// Record the deserialization start time
val deserializeStartTime = System.currentTimeMillis()
// If the JVM supports CPU-time measurement for the current thread, record the CPU time at which deserialization starts
val deserializeStartCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
threadMXBean.getCurrentThreadCpuTime
} else 0L
// The ClassLoader needed when loading concrete classes
Thread.currentThread.setContextClassLoader(replClassLoader)
// Create the closure serializer
val ser = env.closureSerializer.newInstance()
logInfo(s"Running $taskName (TID $taskId)")
// Update the task's state and start executing the task.
// In yarn-client mode this calls CoarseGrainedExecutorBackend.statusUpdate,
// setting the task's state to RUNNING.
// ExecutorBackend#statusUpdate sends a message to the Driver reporting the current state;
// in local mode, for example, the call goes to LocalSchedulerBackend.statusUpdate.
execBackend.statusUpdate(taskId, TaskState.RUNNING, EMPTY_BYTE_BUFFER)
// Record the start time and GC information
var taskStart: Long = 0
var taskStartCpu: Long = 0
startGCTime = computeTotalGcTime()
try {
// Must be set before updateDependencies() is called, in case fetching dependencies
// requires access to properties contained within (e.g. for access control).
Executor.taskDeserializationProps.set(taskDescription.properties)
// Download any dependencies (files/jars) the task is missing
updateDependencies(taskDescription.addedFiles, taskDescription.addedJars)
// Deserialize the Task
task = ser.deserialize[Task[Any]](
taskDescription.serializedTask, Thread.currentThread.getContextClassLoader)
task.localProperties = taskDescription.properties
// Set the TaskMemoryManager the task will use at runtime
task.setTaskMemoryManager(taskMemoryManager)
// If this task has been killed before we deserialized it, let's quit now. Otherwise,
// continue executing the task.
// In other words: was this task already marked as killed? For example, you submit a job,
// realize right away it is wrong and hit Ctrl+C; the task may be killed before it ever runs.
val killReason = reasonIfKilled
// If the task has been killed, throw an exception right away
if (killReason.isDefined) {
// Throw an exception rather than returning, because returning within a try{} block
// causes a NonLocalReturnControl exception to be thrown. The NonLocalReturnControl
// exception will be caught by the catch block, leading to an incorrect ExceptionFailure
// for the task.
throw new TaskKilledException(killReason.get)
}
logDebug("Task " + taskId + "'s epoch is " + task.epoch)
env.mapOutputTracker.updateEpoch(task.epoch)
// Run the actual task and measure its runtime.
taskStart = System.currentTimeMillis()
taskStartCpu = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
threadMXBean.getCurrentThreadCpuTime
} else 0L
var threwException = true
val value = try {
/** Call Task.run to start running the task */
val res = task.run(
taskAttemptId = taskId,
attemptNumber = taskDescription.attemptNumber,
metricsSystem = env.metricsSystem)
threwException = false
res
} finally {
// Release all allocated memory and pages, and check for memory leaks
val releasedLocks = env.blockManager.releaseAllLocksForTask(taskId)
val freedMemory = taskMemoryManager.cleanUpAllAllocatedMemory()
if (freedMemory > 0 && !threwException) {
val errMsg = s"Managed memory leak detected; size = $freedMemory bytes, TID = $taskId"
if (conf.getBoolean("spark.unsafe.exceptionOnMemoryLeak", false)) {
throw new SparkException(errMsg)
} else {
logWarning(errMsg)
}
}
if (releasedLocks.nonEmpty && !threwException) {
val errMsg =
s"${releasedLocks.size} block locks were not released by TID = $taskId:\n" +
releasedLocks.mkString("[", ", ", "]")
if (conf.getBoolean("spark.storage.exceptionOnPinLeak", false)) {
throw new SparkException(errMsg)
} else {
logInfo(errMsg)
}
}
}
task.context.fetchFailed.foreach { fetchFailure =>
// uh-oh. it appears the user code has caught the fetch-failure without throwing any
// other exceptions. Its *possible* this is what the user meant to do (though highly
// unlikely). So we will log an error and keep going.
logError(s"TID ${taskId} completed successfully though internally it encountered " +
s"unrecoverable fetch failures! Most likely this means user code is incorrectly " +
s"swallowing Spark's internal ${classOf[FetchFailedException]}", fetchFailure)
}
// Record the task finish time
val taskFinish = System.currentTimeMillis()
val taskFinishCpu = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
threadMXBean.getCurrentThreadCpuTime
} else 0L
// If the task has been killed, let's fail it.
task.context.killTaskIfInterrupted()
// Otherwise, serialize the task's execution result
val resultSer = env.serializer.newInstance()
val beforeSerialization = System.currentTimeMillis()
val valueBytes = resultSer.serialize(value)
val afterSerialization = System.currentTimeMillis()
// Record the relevant metrics
// Deserialization happens in two parts: first, we deserialize a Task object, which
// includes the Partition. Second, Task.run() deserializes the RDD and function to be run.
task.metrics.setExecutorDeserializeTime(
(taskStart - deserializeStartTime) + task.executorDeserializeTime)
task.metrics.setExecutorDeserializeCpuTime(
(taskStartCpu - deserializeStartCpuTime) + task.executorDeserializeCpuTime)
// We need to subtract Task.run()'s deserialization time to avoid double-counting
task.metrics.setExecutorRunTime((taskFinish - taskStart) - task.executorDeserializeTime)
task.metrics.setExecutorCpuTime(
(taskFinishCpu - taskStartCpu) - task.executorDeserializeCpuTime)
task.metrics.setJvmGCTime(computeTotalGcTime() - startGCTime)
task.metrics.setResultSerializationTime(afterSerialization - beforeSerialization)
// Note: accumulator updates must be collected after TaskMetrics is updated
val accumUpdates = task.collectAccumulatorUpdates()
// Build the DirectTaskResult that can be returned straight to the Driver,
// and serialize the task's result into it
// TODO: do not serialize value twice
val directResult = new DirectTaskResult(valueBytes, accumUpdates)
val serializedDirectResult = ser.serialize(directResult)
val resultSize = serializedDirectResult.limit
// directSend = sending directly back to the driver
// If the serialized result exceeds spark.driver.maxResultSize, it is dropped outright
val serializedResult: ByteBuffer = {
// Decide how to ship the result based on its size
if (maxResultSize > 0 && resultSize > maxResultSize) {
// Larger than the 1 GB limit: drop the result and only report its size
logWarning(s"Finished $taskName (TID $taskId). Result is larger than maxResultSize " +
s"(${Utils.bytesToString(resultSize)} > ${Utils.bytesToString(maxResultSize)}), " +
s"dropping it.")
ser.serialize(new IndirectTaskResult[Any](TaskResultBlockId(taskId), resultSize))
// If the serialized result is within maxResultSize but larger than the direct-send
// threshold (historically spark.akka.frameSize minus ~200 KB), return it via the BlockManager
} else if (resultSize > maxDirectResultSize) {
// The result exceeds the direct-send threshold, so store it in the BlockManager
val blockId = TaskResultBlockId(taskId)
env.blockManager.putBytes(
blockId,
new ChunkedByteBuffer(serializedDirectResult.duplicate()),
StorageLevel.MEMORY_AND_DISK_SER)
logInfo(
s"Finished $taskName (TID $taskId). $resultSize bytes result sent via BlockManager)")
// Return an IndirectTaskResult that points the Driver at the stored block
ser.serialize(new IndirectTaskResult[Any](blockId, resultSize))
} else {
// The result is small enough (below the direct-send threshold, historically
// spark.akka.frameSize - 200 KB) to be sent straight back to the Driver
logInfo(s"Finished $taskName (TID $taskId). $resultSize bytes result sent to driver")
serializedDirectResult
}
}
setTaskFinishedAndClearInterruptStatus()
/**
 * Update the task's state to FINISHED and notify the Driver that the task is done,
 * via ExecutorBackend.statusUpdate() => org.apache.spark.executor.CoarseGrainedExecutorBackend.statusUpdate()
 */
execBackend.statusUpdate(taskId, TaskState.FINISHED, serializedResult)
} catch {
case t: Throwable if hasFetchFailure && !Utils.isFatalError(t) =>
val reason = task.context.fetchFailed.get.toTaskFailedReason
if (!t.isInstanceOf[FetchFailedException]) {
// there was a fetch failure in the task, but some user code wrapped that exception
// and threw something else. Regardless, we treat it as a fetch failure.
val fetchFailedCls = classOf[FetchFailedException].getName
logWarning(s"TID ${taskId} encountered a ${fetchFailedCls} and " +
s"failed, but the ${fetchFailedCls} was hidden by another " +
s"exception. Spark is handling this like a fetch failure and ignoring the " +
s"other exception: $t")
}
setTaskFinishedAndClearInterruptStatus()
execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))
case t: TaskKilledException =>
logInfo(s"Executor killed $taskName (TID $taskId), reason: ${t.reason}")
setTaskFinishedAndClearInterruptStatus()
execBackend.statusUpdate(taskId, TaskState.KILLED, ser.serialize(TaskKilled(t.reason)))
case _: InterruptedException | NonFatal(_) if
task != null && task.reasonIfKilled.isDefined =>
val killReason = task.reasonIfKilled.getOrElse("unknown reason")
logInfo(s"Executor interrupted and killed $taskName (TID $taskId), reason: $killReason")
setTaskFinishedAndClearInterruptStatus()
execBackend.statusUpdate(
taskId, TaskState.KILLED, ser.serialize(TaskKilled(killReason)))
case CausedBy(cDE: CommitDeniedException) =>
val reason = cDE.toTaskFailedReason
setTaskFinishedAndClearInterruptStatus()
execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))
case t: Throwable =>
// Attempt to exit cleanly by informing the driver of our failure.
// If anything goes wrong (or this was a fatal exception), we will delegate to
// the default uncaught exception handler, which will terminate the Executor.
logError(s"Exception in $taskName (TID $taskId)", t)
// Collect latest accumulator values to report back to the driver
val accums: Seq[AccumulatorV2[_, _]] =
if (task != null) {
task.metrics.setExecutorRunTime(System.currentTimeMillis() - taskStart)
task.metrics.setJvmGCTime(computeTotalGcTime() - startGCTime)
task.collectAccumulatorUpdates(taskFailed = true)
} else {
Seq.empty
}
val accUpdates = accums.map(acc => acc.toInfo(Some(acc.value), None))
val serializedTaskEndReason = {
try {
ser.serialize(new ExceptionFailure(t, accUpdates).withAccums(accums))
} catch {
case _: NotSerializableException =>
// t is not serializable so just send the stacktrace
ser.serialize(new ExceptionFailure(t, accUpdates, false).withAccums(accums))
}
}
setTaskFinishedAndClearInterruptStatus()
execBackend.statusUpdate(taskId, TaskState.FAILED, serializedTaskEndReason)
// Don't forcibly exit unless the exception was inherently fatal, to avoid
// stopping other tasks unnecessarily.
if (Utils.isFatalError(t)) {
uncaughtExceptionHandler.uncaughtException(Thread.currentThread(), t)
}
} finally {
// Remove the task from the map of running tasks
runningTasks.remove(taskId)
}
}
private def hasFetchFailure: Boolean = {
task != null && task.context != null && task.context.fetchFailed.isDefined
}
}
/**
 * The logic is straightforward and can be summarized as follows:
 * 1. Create a task context instance, a TaskContextImpl named context, which mainly carries:
 *    the stageId of the task's stage, the partitionId of the task's data partition, the
 *    taskAttemptId, the attempt number attemptNumber, the task memory manager
 *    taskMemoryManager, the metrics system metricsSystem, the internal accumulators
 *    internalAccumulators, and the runningLocally flag (false);
 * 2. Store context in TaskContext's taskContext variable, which is a ThreadLocal[TaskContext];
 * 3. Set metrics information on the context, such as the host name localHostName and the
 *    internal accumulators internalAccumulators;
 * 4. Set the task thread to the current thread;
 * 5. If the task needs to be killed, call kill() without interrupting the thread;
 * 6. Call runTask() with the context to execute the task, and collect accumulators via the
 *    context's collectAccumulators() method;
 * 7. Finally, mark the task as completed on the context, release the memory this thread used
 *    for unrolling blocks, and clear the task context.
 */
final def run(
taskAttemptId: Long,
attemptNumber: Int,
metricsSystem: MetricsSystem): T = {
SparkEnv.get.blockManager.registerTask(taskAttemptId)
// Create a task context instance: a TaskContextImpl named context
context = new TaskContextImpl(
stageId,
partitionId,
taskAttemptId,
attemptNumber,
taskMemoryManager,
localProperties,
metricsSystem,
metrics)
// Store context in TaskContext's taskContext variable,
// which is a ThreadLocal[TaskContext]
TaskContext.setTaskContext(context)
// The task thread is the current thread
taskThread = Thread.currentThread()
if (_reasonIfKilled != null) {
// If the task needs to be killed, call kill() without interrupting the thread
kill(interruptThread = false, _reasonIfKilled)
}
// CallerContext records the app/job/stage/task identity into the Hadoop caller context
// (so the task can show up in, e.g., HDFS audit logs)
new CallerContext(
"TASK",
SparkEnv.get.conf.get(APP_CALLER_CONTEXT),
appId,
appAttemptId,
jobId,
Option(stageId),
Option(stageAttemptId),
Option(taskAttemptId),
Option(attemptNumber)).setCurrentContext()
try {
// Call runTask() with the task context to execute the task; accumulators are collected via the context's collectAccumulators()
runTask(context)
} catch {
case e: Throwable =>
// Catch all errors; run task failure callbacks, and rethrow the exception.
try {
context.markTaskFailed(e)
} catch {
case t: Throwable =>
e.addSuppressed(t)
}
// Mark the task as completed on the context
context.markTaskCompleted(Some(e))
throw e
} finally {
try {
// Call the task completion callbacks. If "markTaskCompleted" is called twice, the second
// one is no-op.
context.markTaskCompleted(None)
} finally {
try {
Utils.tryLogNonFatalError {
// Release memory used by this thread for unrolling blocks
SparkEnv.get.blockManager.memoryStore.releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP)
SparkEnv.get.blockManager.memoryStore.releaseUnrollMemoryForThisTask(
MemoryMode.OFF_HEAP)
// Notify any tasks waiting for execution memory to be freed to wake up and try to
// acquire memory again. This makes impossible the scenario where a task sleeps forever
// because there are no other tasks left to notify it. Since this is safe to do but may
// not be strictly necessary, we should revisit whether we can remove this in the
// future.
val memoryManager = SparkEnv.get.memoryManager
memoryManager.synchronized { memoryManager.notifyAll() }
}
} finally {
// Though we unset the ThreadLocal here, the context member variable itself is still
// queried directly in the TaskRunner to check for FetchFailedExceptions.
// Unset the TaskContext
TaskContext.unset()
}
}
}
}
runTask(context: TaskContext) is an abstract method implemented by ShuffleMapTask and ResultTask.
A ShuffleMapTask splits the elements of an RDD partition into multiple buckets,
based on the partitioner specified in the ShuffleDependency (HashPartitioner by default). A small reminder of how that partitioner works follows below.
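Here is a minimal, Spark-free sketch of the hash-and-mod logic, equivalent in spirit to HashPartitioner.getPartition (the helper name is ours, not Spark's):

// Fold negative hash codes into the non-negative range, mirroring Spark's Utils.nonNegativeMod.
def bucketFor(key: Any, numPartitions: Int): Int = {
  val raw = if (key == null) 0 else key.hashCode % numPartitions
  if (raw < 0) raw + numPartitions else raw
}

// Example: with 4 reduce-side buckets, the record's key alone decides which bucket it lands in.
val bucket = bucketFor("spark", 4)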
/**
 * The main logic has only two steps:
 * 1. Deserialize the RDD and the ShuffleDependency from the broadcast variable:
 *    1.1 record the deserialization start time deserializeStartTime;
 *    1.2 obtain the closure serializer ser from SparkEnv;
 *    1.3 call ser.deserialize() on taskBinary to recover the rdd and the dep (ShuffleDependency);
 *    1.4 compute the executor's deserialization time _executorDeserializeTime.
 * 2. Write the data out through the shuffleManager's writer:
 *    2.1 obtain the shuffleManager from SparkEnv;
 *    2.2 obtain a shuffle writer via shuffleManager.getWriter(); the partitionId identifies the
 *        partition of the current RDD this task processes, i.e. the write operates per partition;
 *    2.3 call rdd.iterator() on that partition and feed the result into writer.write();
 *    2.4 stop the writer and return the status.
 */
override def runTask(context: TaskContext): MapStatus = {
// Deserialize the RDD using the broadcast variable.
val threadMXBean = ManagementFactory.getThreadMXBean
// Deserialization start time.
// Here the data the task needs (the RDD and its dependency) is deserialized.
// The key question: how does the task get hold of this RDD at all?
// The tasks of a stage run in parallel on many executors, possibly on different machines,
// yet they all process the same RDD.
// The answer: the serialized RDD is shipped as a broadcast variable (taskBinary),
// and it is recovered from that broadcast here.
val deserializeStartTime = System.currentTimeMillis()
val deserializeStartCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
threadMXBean.getCurrentThreadCpuTime
} else 0L
// Obtain the closure serializer
val ser = SparkEnv.get.closureSerializer.newInstance()
// Call the closure serializer's deserialize() to recover the RDD and the ShuffleDependency from taskBinary
val (rdd, dep) = ser.deserialize[(RDD[_], ShuffleDependency[_, _, _])](
ByteBuffer.wrap(taskBinary.value), Thread.currentThread.getContextClassLoader)
// Compute how long the executor spent on deserialization
_executorDeserializeTime = System.currentTimeMillis() - deserializeStartTime
_executorDeserializeCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
threadMXBean.getCurrentThreadCpuTime - deserializeStartCpuTime
} else 0L
var writer: ShuffleWriter[Any, Any] = null
try {
// Obtain the shuffleManager
val manager = SparkEnv.get.shuffleManager
// Obtain the shuffle writer via shuffleManager.getWriter();
// the partitionId passed in identifies the partition of the current RDD,
// i.e. the write operates on a single partition
writer = manager.getWriter[Any, Any](dep.shuffleHandle, partitionId, context)
// For this RDD partition, call rdd.iterator() and feed the resulting records
// into writer.write()
writer.write(rdd.iterator(partition, context).asInstanceOf[Iterator[_ <: Product2[Any, Any]]])
// Stop the writer and return its status
writer.stop(success = true).get
} catch {
case e: Exception =>
try {
if (writer != null) {
writer.stop(success = false)
}
} catch {
case e: Exception =>
log.debug("Could not stop writer", e)
}
throw e
}
}
// The most important line is the writer.write(...) call above:
// it first calls rdd.iterator(), passing in the partition this task has to process.
// The core logic therefore lives in rdd.iterator(): that is where our own operators and
// functions are executed against this particular partition of the RDD.
// Once our operators/functions have run, the resulting records are partitioned by the
// ShuffleWriter (using the dependency's partitioner, HashPartitioner by default) and written
// into the bucket for each target partition:
//   writer.write(rdd.iterator(partition, context).asInstanceOf[Iterator[_ <: Product2[Any, Any]]])
// Finally a MapStatus is returned. MapStatus records where the data computed by this
// ShuffleMapTask is stored, which is essentially BlockManager-related information.
// BlockManager is Spark's low-level component for managing memory, cached data and disk data;
// it will be analyzed in depth after the shuffle discussion.
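For a feel of what a MapStatus carries: it essentially pairs the executor's BlockManager identity with the number of bytes written for each reduce partition. A conceptual stand-in only (the real MapStatus and BlockManagerId are private[spark] classes):

// Where the map output lives, and how large each reduce partition's slice is.
case class SimpleMapStatus(blockManagerLocation: String, bytesPerReducer: Array[Long])

// After writing the shuffle file, the writer knows each partition segment's length:
val partitionLengths = Array(1024L, 0L, 4096L)            // e.g. 3 reduce partitions
val status = SimpleMapStatus("executor-1 BlockManager", partitionLengths)
// Reducers later use this to decide which executor to fetch from and how much to expect.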
/**
* Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
* This should ''not'' be called by users directly, but is available for implementors of custom
* subclasses of RDD.
*
 * MappedRDD's iterator method is in fact the iterator method inherited from RDD. When a
 * partition task runs for the first time, nothing is cached yet, so computeOrReadCheckpoint
 * is called.
 * A note on iterator()'s fault tolerance: if one partition task fails while the others
 * succeed, the DAG can be rescheduled; the failed partition task recovers its state from the
 * checkpoint, while the partitions that succeeded have already cached their results in the
 * storage system, so the CacheManager's getOrCompute path simply fetches them without
 * re-executing anything.
*/
final def iterator(split: Partition, context: TaskContext): Iterator[T] = {
if (storageLevel != StorageLevel.NONE) {
// Storage level is not NONE: check the cache first, and compute only if nothing is cached
getOrCompute(split, context)
} else {
// If a checkpoint exists, read the result from it; otherwise compute directly
computeOrReadCheckpoint(split, context)
}
}
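A quick usage note: which branch iterator() takes is decided by the RDD's storage level, i.e. by whether persist()/cache() was called. A small, hedged example (the path is illustrative; sc is assumed to be an existing SparkContext):

import org.apache.spark.storage.StorageLevel

// storageLevel is NONE here, so iterator() goes straight to computeOrReadCheckpoint.
val words = sc.textFile("hdfs://namenode:8020/input").flatMap(_.split(" "))

// After persist(), storageLevel is no longer NONE, so iterator() takes the getOrCompute path
// and consults the BlockManager before recomputing.
val cached = words.persist(StorageLevel.MEMORY_AND_DISK)
cached.count()   // first action: partitions are computed and cached
cached.count()   // second action: partitions are read from the cache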
/**
* Gets or computes an RDD partition. Used by RDD.iterator() when an RDD is cached.
*
 * During iterative task computation, when the storage level indicates caching, this
 * getOrCompute path is taken (historically via the CacheManager).
 *
 * Processing logic:
 * 1. Fetch the Block from the storage system.
 * 2. If the Block is obtained, wrap it in an InterruptibleIterator and return it. If the Block
 *    is not cached yet, recompute it or read it from the checkpoint, write the data into the
 *    cache via putInBlockManager, then wrap it in an InterruptibleIterator and return it.
*
*/
private[spark] def getOrCompute(partition: Partition, context: TaskContext): Iterator[T] = {
// Build the BlockId for this RDD partition
val blockId = RDDBlockId(id, partition.index)
var readCachedBlock = true
// This method is called on executors, so we need call SparkEnv.get instead of sc.env.
SparkEnv.get.blockManager.getOrElseUpdate(blockId, storageLevel, elementClassTag, () => {
readCachedBlock = false
// If a checkpoint exists, read the intermediate result from it; otherwise call compute() to keep computing
computeOrReadCheckpoint(partition, context)
}) match {
// Left: the BlockManager found the block in its cache (or successfully stored the freshly computed one)
case Left(blockResult) =>
if (readCachedBlock) {
val existingMetrics = context.taskMetrics().inputMetrics
existingMetrics.incBytesRead(blockResult.bytes)
new InterruptibleIterator[T](context, blockResult.data.asInstanceOf[Iterator[T]]) {
override def next(): T = {
existingMetrics.incRecordsRead(1)
delegate.next()
}
}
} else {
new InterruptibleIterator(context, blockResult.data.asInstanceOf[Iterator[T]])
}
case Right(iter) =>
new InterruptibleIterator(context, iter.asInstanceOf[Iterator[T]])
}
}
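The Left/Right match above follows the contract of BlockManager.getOrElseUpdate: Left means the values can be read back from the storage system (they were already cached, or were just computed and successfully stored); Right means the computed values could not be cached (for example they did not fit in memory and the storage level forbids spilling to disk), so the raw iterator is handed back uncached. A tiny, self-contained analogue of that contract (the types and the size check are stand-ins, not Spark's):

import scala.collection.mutable

sealed trait CacheOutcome[+T]
case class FromStore[T](values: Seq[T])   extends CacheOutcome[T]   // ~ Left(BlockResult)
case class Uncachable[T](it: Iterator[T]) extends CacheOutcome[T]   // ~ Right(iterator)

def getOrElseUpdate[T](store: mutable.Map[String, Seq[T]],
                       key: String,
                       compute: () => Iterator[T]): CacheOutcome[T] =
  store.get(key) match {
    case Some(values) => FromStore(values)            // already cached: read back
    case None =>
      val values = compute().toSeq
      if (values.length <= 1000) {                    // pretend the block "fits"
        store(key) = values; FromStore(values)        // stored, then served from the store
      } else {
        Uncachable(values.iterator)                   // could not cache: return the iterator
      }
  }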
private[spark] def computeOrReadCheckpoint(split: Partition, context: TaskContext): Iterator[T] =
{
// Whether this RDD has been checkpointed and materialized, either reliably or locally
if (isCheckpointedAndMaterialized) {
firstParent[T].iterator(split, context)
} else {
// The RDD's compute, e.g. MappedRDD's compute()
compute(split, context)
}
}
This is where it gets interesting. What does compute actually do?
It applies, to one particular partition of the RDD, the operators and functions we defined on that RDD.
Where are those operators and functions? We don't seem to see them here.
The f below can be understood as our own operator/function, wrapped by Spark with some additional internal logic.
By the time execution reaches this point, the user-defined computation is being run against the RDD partition.
In MapPartitionsRDD:
override def compute(split: Partition, context: TaskContext): Iterator[U] =
f(context, split.index, firstParent[T].iterator(split, context))
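Where does that f come from? Each transformation builds a MapPartitionsRDD whose f wraps the user function; in the Spark 2.x source, map essentially does new MapPartitionsRDD(this, (context, pid, iter) => iter.map(cleanF)) (paraphrased from memory, so treat the details as an assumption). A tiny, Spark-free analogue of the wrapping:

// `userF` plays the role of the function passed to map(); `partitionF` plays the role of `f`,
// the per-partition wrapper that compute() invokes with the parent partition's iterator.
val userF: Int => Int = _ * 2
val partitionF: (Int, Iterator[Int]) => Iterator[Int] =
  (partitionIndex, parentIter) => parentIter.map(userF)

// "compute" for partition 0: feed the parent's iterator through the wrapped function.
val result = partitionF(0, Iterator(1, 2, 3)).toList   // List(2, 4, 6)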
Back in the TaskRunner's run() method:
/**
 * Update the task's state to FINISHED and notify the Driver that the task is done,
 * via ExecutorBackend.statusUpdate() => org.apache.spark.executor.CoarseGrainedExecutorBackend.statusUpdate()
 */
execBackend.statusUpdate(taskId, TaskState.FINISHED, serializedResult)
/**
 * Executor status-change event
 *
 * @param taskId
 * @param state
 * @param data
 */
override def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer) {
val msg = StatusUpdate(executorId, taskId, state, data)
driver match {
/**
 * Send the StatusUpdate message to the DriverEndpoint.
 * In standalone mode it is handled by
 * org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverEndpoint.receive
 */
case Some(driverRef) => driverRef.send(msg)
case None => logWarning(s"Drop $msg because has not yet connected to driver")
}
}
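On the driver side, DriverEndpoint.receive does roughly two things with a StatusUpdate: forward it to the TaskScheduler, and, if the task state is terminal, give the task's cores back to the executor and offer them to pending tasks. The sketch below is a simplified, self-contained rendering of that flow, not the verbatim Spark source:

import scala.collection.mutable

// Minimal stand-ins for the driver-side bookkeeping.
case class StatusUpdateMsg(executorId: String, taskId: Long, finished: Boolean, coresPerTask: Int)

def onStatusUpdate(msg: StatusUpdateMsg,
                   freeCores: mutable.Map[String, Int],
                   schedulerStatusUpdate: Long => Unit,
                   makeOffers: String => Unit): Unit = {
  schedulerStatusUpdate(msg.taskId)              // ~ scheduler.statusUpdate(taskId, state, data)
  if (msg.finished) {                            // ~ TaskState.isFinished(state)
    // Give the cores back to the executor that just finished the task...
    freeCores(msg.executorId) = freeCores.getOrElse(msg.executorId, 0) + msg.coresPerTask
    // ...and immediately try to schedule more tasks on it.
    makeOffers(msg.executorId)
  }
}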