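In Spark, the main job of DAGScheduler is to split a Job into Stages according to the dependencies between RDDs, where each Stage corresponds to one TaskSet. It then submits those Stages to TaskScheduler, taking the current cache state and data locality into account.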
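The idea of cutting a job into stages along RDD dependencies can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyDag`, `ToyRDD`, `Narrow`, and `Shuffle` are hypothetical names, not Spark classes, and the real DAGScheduler also tracks partitions, caching, and locality, which are left out here. The sketch assumes that a stage ends wherever a shuffle dependency is crossed.

```scala
// Toy model only: none of these types exist in Spark.
object ToyDag {
  sealed trait Dep { def parent: ToyRDD }
  final case class Narrow(parent: ToyRDD) extends Dep   // stays inside the same stage
  final case class Shuffle(parent: ToyRDD) extends Dep  // marks a stage boundary

  final case class ToyRDD(name: String, deps: List[Dep] = Nil)

  // Names of all RDDs in the same stage as `root`:
  // follow narrow dependencies, stop at shuffle dependencies.
  def stageMembers(root: ToyRDD): Set[String] =
    root.deps.foldLeft(Set(root.name)) {
      case (acc, Narrow(p))  => acc ++ stageMembers(p)
      case (acc, Shuffle(_)) => acc
    }

  // RDDs sitting directly behind a shuffle boundary of this stage;
  // each of them is the last RDD of a parent stage.
  def parentStageRoots(root: ToyRDD): List[ToyRDD] =
    root.deps.flatMap {
      case Narrow(p)  => parentStageRoots(p)
      case Shuffle(p) => List(p)
    }

  // All stages reachable from the final RDD, final stage first.
  def stages(finalRdd: ToyRDD): List[Set[String]] =
    (stageMembers(finalRdd) :: parentStageRoots(finalRdd).flatMap(stages)).distinct
}
```

For a chain textFile -> map -> (shuffle) -> reduceByKey -> mapValues, `ToyDag.stages` yields two stages: one containing mapValues and reduceByKey, and a parent stage containing map and textFile.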
private[spark]
class DAGScheduler(
    private[scheduler] val sc: SparkContext,
    private[scheduler] val taskScheduler: TaskScheduler,
    listenerBus: LiveListenerBus,
    mapOutputTracker: MapOutputTrackerMaster,
    blockManagerMaster: BlockManagerMaster,
    env: SparkEnv,
    clock: Clock = new SystemClock())
  extends Logging
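The class definition shows its main collaborators: SparkContext, the entry point of Spark; TaskScheduler, which executes tasks; MapOutputTrackerMaster, which tracks the map output information produced while RDDs are computed; and BlockManagerMaster, which manages block storage.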
RDD actions such as count and reduce trigger SparkContext.runJob, which in the end calls DAGScheduler.submitJob.
// DAGScheduler.submitJob
def submitJob[T, U](
    rdd: RDD[T],
    func: (TaskContext, Iterator[T]) => U,
    partitions: Seq[Int],
    callSite: CallSite,
    allowLocal: Boolean,
    resultHandler: (Int, U) => Unit,
    properties: Properties): JobWaiter[U] = {
  // Check to make sure we are not launching a task on a partition that does not exist.
  val maxPartitions = rdd.partitions.length
  partitions.find(p => p >= maxPartitions || p < 0).foreach { p =>
    throw new IllegalArgumentException(
      "Attempting to access a non-existent partition: " + p + ". " +
        "Total number of partitions: " + maxPartitions)
  }
  val jobId = nextJobId.getAndIncrement()
  if (partitions.size == 0) {
    return new JobWaiter[U](this, jobId, 0, resultHandler)
  }
  assert(partitions.size > 0)
  val func2 = func.asInstanceOf[(TaskContext, Iterator[_]) => _]
  val waiter = new JobWaiter(this, jobId, partitions.size, resultHandler)
  // post only enqueues the event; a separate thread drains the queue and handles it
  eventProcessLoop.post(JobSubmitted(
    jobId, rdd, func2, partitions.toArray, allowLocal, callSite, waiter,
    SerializationUtils.clone(properties)))
  waiter
}
JobSubmitted is a subclass of the DAGSchedulerEvent trait. Every kind of event that DAGScheduler can handle is wrapped as a DAGSchedulerEvent.
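The pattern of wrapping every event in one trait can be sketched in a few lines. This is a simplified illustration, not Spark's definitions: `ToyEvents` and `SchedulerEvent` are hypothetical names, and the case classes carry far fewer fields than the real events. Declaring the trait as `sealed` is an assumption of this sketch that lets the compiler check that a `match` covers every event type.

```scala
// Simplified stand-ins for illustration; the real events carry more fields.
object ToyEvents {
  sealed trait SchedulerEvent
  final case class JobSubmitted(jobId: Int, partitions: Seq[Int]) extends SchedulerEvent
  final case class StageCancelled(stageId: Int) extends SchedulerEvent

  // One handler per event type, selected by pattern matching.
  def describe(event: SchedulerEvent): String = event match {
    case JobSubmitted(jobId, partitions) =>
      s"job $jobId with ${partitions.length} partitions"
    case StageCancelled(stageId) =>
      s"cancel stage $stageId"
  }
}
```

Because each event is an immutable case class, it can be handed from the submitting thread to the processing thread without further synchronization on the event itself.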
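eventProcessLoop is an instance of DAGSchedulerEventProcessLoop, a class private to the scheduler package that wraps a DAGScheduler and extends EventLoop. It processes the events in the queue on a single thread by calling onReceive.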
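Notice: EventLoop.post only puts the event into the queue. The actual processing is done by the single eventThread, which iterates over the queue and calls EventLoop.onReceive(event) on each event it takes out. Different threads can therefore submit events at the same time without conflict, but there is no guarantee that an event is handled immediately.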
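The post/onReceive split described in the note can be sketched as follows. This is a minimal stand-in written for illustration, not Spark's EventLoop source: `ToyEventLoop` is a hypothetical name, and error handling and lifecycle hooks are omitted. It assumes a blocking queue shared between the posting threads and one daemon thread.

```scala
import java.util.concurrent.LinkedBlockingDeque

// Simplified sketch of a single-threaded event loop; not Spark's implementation.
abstract class ToyEventLoop[E](name: String) {
  private val queue = new LinkedBlockingDeque[E]()
  @volatile private var stopped = false

  private val eventThread = new Thread(name) {
    setDaemon(true)
    override def run(): Unit = {
      try {
        while (!stopped) {
          val event = queue.take() // blocks until an event is available
          onReceive(event)
        }
      } catch {
        case _: InterruptedException => // stop() interrupts a blocked take()
      }
    }
  }

  def start(): Unit = eventThread.start()

  def stop(): Unit = {
    stopped = true
    eventThread.interrupt()
  }

  // Safe to call from any thread: it only enqueues and returns immediately.
  def post(event: E): Unit = queue.put(event)

  // Always invoked on eventThread, one event at a time.
  protected def onReceive(event: E): Unit
}
```

Since only eventThread ever calls onReceive, the handler code does not need its own locking. The cost is that a slow handler delays every event queued behind it.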
DAGSchedulerEventProcessLoop overrides the onReceive method of its parent class. There we can see that JobSubmitted is handled by DAGScheduler.handleJobSubmitted.
// DAGSchedulerEventProcessLoop.onReceive
override def onReceive(event: DAGSchedulerEvent): Unit = event match {
  case JobSubmitted(jobId, rdd, func, partitions, allowLocal, callSite, listener, properties) =>
    dagScheduler.handleJobSubmitted(jobId, rdd, func, partitions, allowLocal, callSite,
      listener, properties)
  case StageCancelled(stageId) =>
    dagScheduler.handleStageCancellation(stageId)
  ...
}
// DAGScheduler.handleJobSubmitted
private[scheduler] def handleJobSubmitted(jobId: Int,
    finalRDD: RDD[_],
    func: (TaskContext, Iterator[_]) => _,
    partitions: Array[Int],
    allowLocal: Boolean,
    callSite: CallSite,
    listener: JobListener,
    properties: Properties) {
  var finalStage: ResultStage = null
  try {
    // New stage creation may throw an exception if, for example, jobs are run on a
    // HadoopRDD whose underlying HDFS files have been deleted.
    finalStage = newResultStage(finalRDD, partitions.size, jobId, callSite)
  } catch {
    case e: Exception =>
      logWarning("Creating new stage failed due to exception - job: " + jobId, e)
      listener.jobFailed(e)
      return
  }
  if (finalStage != null) {
    val job = new ActiveJob(jobId, finalStage, func, partitions, callSite, listener, properties)
    clearCacheLocs()
    logInfo("Got job %s (%s) with %d output partitions (allowLocal=%s)".format(
      job.jobId, callSite.shortForm, partitions.length, allowLocal))
    logInfo("Final stage: " + finalStage + "(" + finalStage.name +