Spark-Core Source Code Study Notes 4: Stage Division and Task Locality Computation and Dispatch

This article digs into how Spark-Core divides work into Stages and Tasks. Starting from an action operator, it follows the source code in detail to show how Stages are created, how Tasks are dispatched, and the key steps of Task locality computation, helping readers understand the core logic of Spark job execution.

Spark-Core Source Code Study Notes

This series records my review of the Spark source code. The goal is to untangle how Spark distributes and runs a program, tracing the key code paths so that we understand not just what happens but why. Side branches are only described in words, without drilling down, so that they do not obscure the main thread.
In the previous posts we covered the registration and startup of Master and Worker, of Driver and Executor, and the registration and launch of the Application. We initialized SparkContext, SchedulerBackend and TaskScheduler, and finally allocated hardware resources through the schedule() method. With all that groundwork laid, the remaining questions are: how is an application divided into Stages, and how is a Stage turned into concrete Tasks that are dispatched to Executors for execution? Let's start from the last line of the JavaWordCount application, `output = counts.collect();`. Every action eventually reaches SparkContext.runJob; below we trace `count()` as the example, and `collect()` follows the same path.
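
For orientation, here is a minimal Scala sketch of that WordCount pipeline (the original application is the Java example shipped with Spark; the input path below is a placeholder). Every line except the last merely builds up the RDD lineage; the collect() at the end is the action that triggers everything discussed in this post.

import org.apache.spark.{SparkConf, SparkContext}

object WordCountSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCountSketch"))
    // Transformations only record lineage; no job runs yet
    val counts = sc.textFile("input.txt")        // placeholder path
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)                        // introduces a ShuffleDependency
    // The action: this is the call that eventually reaches SparkContext.runJob
    val output = counts.collect()
    output.foreach(println)
    sc.stop()
  }
}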

Starting from an Action Operator

Internally, count calls SparkContext's runJob method. Skipping the several layers of indirection in between, here is the final call:

def count(): Long = sc.runJob(this, Utils.getIteratorSize _).sum
// ... skipping the chain of runJob overloads; below is the final call
/**
 * Run a function on a given set of partitions in an RDD and pass the results to the given
 * handler function. This is the main entry point for all actions in Spark.
 * @param rdd target RDD to run tasks on
 * @param func a function to run on each partition of the RDD
 * @param partitions set of partitions to run on; some jobs may not want to compute on all
 * partitions of the target RDD, e.g. for operations like `first()`
 * @param resultHandler callback to pass each result to
 */
def runJob[T, U: ClassTag](
    rdd: RDD[T],
    func: (TaskContext, Iterator[T]) => U,
    partitions: Seq[Int],
    resultHandler: (Int, U) => Unit): Unit = {
  if (stopped.get()) {
    throw new IllegalStateException("SparkContext has been shutdown")
  }
  val callSite = getCallSite
  val cleanedFunc = clean(func)
  logInfo("Starting job: " + callSite.shortForm)
  if (conf.getBoolean("spark.logLineage", false)) {
    logInfo("RDD's recursive dependencies:\n" + rdd.toDebugString)
  }
  // So it is ultimately dagScheduler that does the work; that is the focus of what follows
  dagScheduler.runJob(rdd, cleanedFunc, partitions, callSite, resultHandler, localProperties.get)
  progressBar.foreach(_.finishAll())
  // checkpoint, so the RDD can be reused
  rdd.doCheckpoint()
}
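
To make the relationship between an action and runJob concrete, the following small sketch (illustrative only, assuming an existing SparkContext named sc) computes the size of each partition through the public runJob overload and sums the results, which is essentially what count() does with Utils.getIteratorSize:

// Each element of the returned array is the result of running the function on one partition
val rdd = sc.parallelize(1 to 1000, numSlices = 4)
val perPartitionSizes: Array[Long] = sc.runJob(rdd, (iter: Iterator[Int]) => iter.size.toLong)
val total = perPartitionSizes.sum   // same value as rdd.count()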

Let's look at DAGScheduler's runJob method:

def runJob[T, U](...): Unit = {
  // Submit an action job to the scheduler.
  val waiter = submitJob(rdd, func, partitions, callSite, resultHandler, properties)
  // Preferred alternative to `Await.ready()`
  ThreadUtils.awaitReady(waiter.completionFuture, Duration.Inf)
}

Entering the submitJob method:

/**
 * Submit an action job to the scheduler.
 * @return a JobWaiter object that can be used to block until the job finishes executing
 *         or can be used to cancel the job.
 */
def submitJob[T, U](...): JobWaiter[U] = {
  val jobId = nextJobId.getAndIncrement()
  val func2 = func.asInstanceOf[(TaskContext, Iterator[_]) => _]
  // Instantiate a JobWaiter, which internally just tracks the job's state
  val waiter = new JobWaiter(this, jobId, partitions.size, resultHandler)
  // Recall the eventProcessLoop mentioned when DAGScheduler was instantiated: much like the
  // Dispatcher in the RPC layer, a loop thread processes the messages in a queue (eventQueue).
  // post simply puts a JobSubmitted case class into eventQueue for the loop thread to handle.
  eventProcessLoop.post(JobSubmitted(
    jobId, rdd, func2, partitions.toArray, callSite, waiter,
    SerializationUtils.clone(properties)))
  waiter
}
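
JobWaiter itself is small: conceptually it is a JobListener that feeds each finished task's result to resultHandler, counts completions, and completes a future once every partition has reported back, which is exactly what runJob blocks on above. A stripped-down sketch of that idea (not the actual Spark class) looks like this:

import scala.concurrent.{Future, Promise}

// Illustrative sketch of the idea behind JobWaiter: one result per partition,
// then complete a future that the submitting thread can wait on.
class SimpleJobWaiter[U](totalTasks: Int, resultHandler: (Int, U) => Unit) {
  private var finishedTasks = 0
  private val promise = Promise[Unit]()

  def completionFuture: Future[Unit] = promise.future

  def taskSucceeded(index: Int, result: U): Unit = synchronized {
    resultHandler(index, result)
    finishedTasks += 1
    if (finishedTasks == totalTasks) promise.success(())
  }

  def jobFailed(exception: Exception): Unit = promise.tryFailure(exception)
}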

Recall the logic of eventProcessLoop: it keeps taking concrete events off eventQueue and handles each one by calling doOnReceive(event). The details were covered in the previous post.

private def doOnReceive(event: DAGSchedulerEvent): Unit = event match {
  case JobSubmitted(jobId, rdd, func, partitions, callSite, listener, properties) =>
    dagScheduler.handleJobSubmitted(jobId, rdd, func, partitions, callSite, listener, properties)
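
The machinery around this is the generic event-loop pattern: post() appends an event to a blocking queue and returns immediately, while a dedicated daemon thread drains the queue and hands each event to onReceive/doOnReceive. A stripped-down sketch of that pattern (not the actual DAGSchedulerEventProcessLoop) is shown below:

import java.util.concurrent.LinkedBlockingDeque

// Minimal sketch of the event-loop pattern behind eventProcessLoop:
// producers post events, a single daemon thread consumes and handles them.
abstract class SimpleEventLoop[E](name: String) {
  private val eventQueue = new LinkedBlockingDeque[E]()
  @volatile private var stopped = false

  private val eventThread = new Thread(name) {
    setDaemon(true)
    override def run(): Unit = {
      try {
        while (!stopped) {
          onReceive(eventQueue.take())   // blocks until an event is available
        }
      } catch {
        case _: InterruptedException =>  // interrupted by stop(), exit quietly
      }
    }
  }

  def start(): Unit = eventThread.start()
  def stop(): Unit = { stopped = true; eventThread.interrupt() }
  def post(event: E): Unit = eventQueue.put(event)

  protected def onReceive(event: E): Unit
}

In our case, submitJob posts JobSubmitted, and the loop thread dispatches it to doOnReceive above, so handleJobSubmitted runs on the scheduler's own event thread.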

We are about to step into handleJobSubmitted. Everything that follows is important, covering Stage division, Task dispatch and more.

Stage

private[scheduler] def handleJobSubmitted(jobId: Int,
    finalRDD: RDD[_], // the RDD on which the count action was triggered
    func: (TaskContext, Iterator[_]) => _,
    partitions: Array[Int],
    callSite: CallSite,
    listener: JobListener,
    properties: Properties) {
  var finalStage: ResultStage = null
  try {
    // Stage division starts here; there is a lot going on, so we expand it separately below
    finalStage = createResultStage(finalRDD, func, partitions, jobId, callSite)
  } catch {
    ...
  }
  // A thin wrapper that, among other things, holds the finalStage created above
  val job = new ActiveJob(jobId, finalStage, callSite, listener, properties)
  // Clear the cached locations of persisted RDD partitions
  clearCacheLocs()
  val jobSubmissionTime = clock.getTimeMillis()
  jobIdToActiveJob(jobId) = job
  // bookkeeping
  activeJobs += job
  finalStage.setActiveJob(job)
  val stageIds = jobIdToStageIds(jobId).toArray
  val stageInfos = stageIds.flatMap(id => stageIdToStage.get(id).map(_.latestInfo))
  listenerBus.post(
    SparkListenerJobStart(job.jobId, jobSubmissionTime, stageInfos, properties))
  // Formally submit the Stage
  submitStage(finalStage)
}

Below we trace createResultStage and submitStage in turn:

/**
 * Create a ResultStage associated with the provided jobId.
 */
private def createResultStage(...): ResultStage = {
  // Get the parent stages of the ResultStage; the nested loops inside are expanded below
  val parents = getOrCreateParentStages(rdd, jobId)
  // After the call above, all ancestors of this RDD have been carved into Stages; what remains
  // is to wrap this last RDD in a ResultStage.
  // Grab an auto-incremented id and instantiate the ResultStage; a field inside binds it to the
  // current jobId, so ResultStage and job are in one-to-one correspondence.
  val id = nextStageId.getAndIncrement()
  val stage = new ResultStage(id, rdd, func, partitions, parents, jobId, callSite)
  stageIdToStage(id) = stage
  updateJobIdStageIdMaps(jobId, stage)
  // With all of the above done, return the ResultStage
  stage
}
private def getOrCreateParentStages(rdd: RDD[_], firstJobId: Int): List[Stage] = {
  // Collect the shuffle dependencies of the current RDD and turn each one into a ShuffleMapStage
  getShuffleDependencies(rdd).map { shuffleDep =>
    getOrCreateShuffleMapStage(shuffleDep, firstJobId)
  }.toList
}
private[scheduler] def getShuffleDependencies(
    rdd: RDD[_]): HashSet[ShuffleDependency[_, _, _]] = {
  // container for the result: the direct (not transitive) shuffle dependencies of this RDD
  val parents = new HashSet[ShuffleDependency[_, _, _]]
  val visited = new HashSet[RDD[_]]
  // an explicit stack instead of recursion, so a long lineage cannot overflow the call stack
  val waitingForVisit = new ArrayStack[RDD[_]]
  waitingForVisit.push(rdd)
  while (waitingForVisit.nonEmpty) {
    val toVisit = waitingForVisit.pop()
    if (!visited(toVisit)) {
      visited += toVisit
      toVisit.dependencies.foreach {
        // a ShuffleDependency marks a stage boundary: collect it and do not descend further
        case shuffleDep: ShuffleDependency[_, _, _] =>
          parents += shuffleDep
        // narrow dependencies stay inside the same stage, keep walking up the lineage
        case dependency =>
          waitingForVisit.push(dependency.rdd)
      }
    }
  }
  parents
}
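Tying this back to the example: reduceByKey is the only shuffle in the WordCount lineage, so getShuffleDependencies finds exactly one ShuffleDependency, which becomes one ShuffleMapStage feeding the final ResultStage. You can observe the boundary yourself with the public lineage APIs (a small illustrative snippet, assuming an existing SparkContext named sc; it is not part of the scheduler code):

import org.apache.spark.ShuffleDependency

// Inspect where the shuffle boundary, and hence the stage boundary, falls
val pairs = sc.parallelize(Seq(("a", 1), ("b", 1), ("a", 1)), numSlices = 2)
val counts = pairs.reduceByKey(_ + _)

println(counts.toDebugString)   // prints the lineage, same output as spark.logLineage above
counts.dependencies.foreach {
  case dep: ShuffleDependency[_, _, _] => println(s"shuffle dependency (stage boundary): $dep")
  case dep                             => println(s"narrow dependency (same stage): $dep")
}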