An Introduction to Spark Source Code and Tuning: Spark Core


Tencent Technology Engineering (腾讯技术工程)


Article link: https://blog.csdn.net/Tencent_TEG/article/details/104013475



Author: calvinrzluo, backend development engineer at Tencent IEG

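This article is based on the source code of Spark 2.4.4 and tries to analyze how parts of its Core module are implemented. If you spot any mistakes, corrections are welcome. To keep the discussion short, some details have been moved into comments in the source code, so the body text covers only the main points.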

Spark Core

RDD

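RDD (Resilient Distributed Dataset) is the basic data structure in Spark. An RDD is distributed and immutable, and it can be persisted to disk or to memory.

RDDs support two quite different kinds of operations: transformations and actions. A transformation produces another RDD from an existing RDD, whereas an action leaves the RDD context. For example, take is an action: it returns an array rather than an RDD, as shown below.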

scala> var rdd1 = sc.makeRDD(Seq(10, 4, 2, 12, 3))
rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[40] at makeRDD at <console>:21

scala> rdd1.take(1)
res0: Array[Int] = Array(10)

scala> rdd1.take(2)
res1: Array[Int] = Array(10, 4)

Transformations are lazy. Only when an eager action is encountered does Spark generate an execution plan for the whole chain and run it. These actions split a Spark Application into multiple Jobs.

Common actions include: reduce, collect, count, take(n), first, takeSample(withReplacement, num, [seed]), takeOrdered(n, [ordering]), saveAsTextFile(path), saveAsSequenceFile(path), saveAsObjectFile(path), countByKey(), and foreach(func).
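
The lazy/eager split described above can be illustrated without a cluster. The following is a minimal sketch in plain Scala, not Spark's implementation: ToyRDD, fromSeq and the evaluated counter are hypothetical names introduced only for this example. Transformations merely wrap a recipe, and only the take action runs it.

```scala
// A toy model of lazy transformations; this is not Spark's implementation.
object LazyDemo {
  // Counts how many elements a user function has actually processed.
  var evaluated = 0

  // A ToyRDD is only a recipe: nothing runs until compute() is called.
  final class ToyRDD[T](val compute: () => Iterator[T]) {
    // Transformations wrap the parent recipe and return a new ToyRDD.
    def map[U](f: T => U): ToyRDD[U] = new ToyRDD[U](() => compute().map(f))
    def filter(p: T => Boolean): ToyRDD[T] = new ToyRDD[T](() => compute().filter(p))

    // Action: runs the chain and returns a plain collection, not a ToyRDD.
    def take(n: Int): List[T] = compute().take(n).toList
  }

  // Hypothetical entry point, loosely playing the role of sc.makeRDD.
  def fromSeq[T](data: Seq[T]): ToyRDD[T] = new ToyRDD[T](() => data.iterator)
}
```

Building `LazyDemo.fromSeq(Seq(10, 4, 2, 12, 3)).map(...)` does not invoke the mapping function at all. Calling `take(2)` on the result pulls only the first two elements through the chain.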

Common RDDs

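RDD is an abstract class, abstract class RDD[T] extends Serializable with Logging, and Spark ships implementations such as ShuffledRDD and HadoopRDD. Every RDD has a compute method that describes how that RDD is computed. Note that some RDDs, for example CoGroupedRDD, may serve as intermediate results when computing other RDDs. Correspondingly, an RDD such as MapPartitionsRDD may be the result of transformations over several RDDs. Which RDD you get is decided by the operator you use.
Let's look at some RDDs in detail.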

ParallelCollectionRDD
This RDD is produced by parallelize.

scala> val arr = sc.parallelize(0 to 1000)
arr: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:24

HadoopRDD

class HadoopRDD[K, V] extends RDD[(K, V)] with Logging

FileScanRDD
This RDD is typically produced by statements such as spark.read.text(...), so it is implemented in the sql module.

class FileScanRDD(
   @transient private val sparkSession: SparkSession,
   readFunction: (PartitionedFile) => Iterator[InternalRow],
   @transient val filePartitions: Seq[FilePartition])
 extends RDD[InternalRow](sparkSession.sparkContext, Nil) {

MapPartitionsRDD

class MapPartitionsRDD[U, T] extends RDD[U]

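This RDD is the result of the map, mapPartitions, and mapPartitionsWithIndex operations.

Note that in earlier versions map produced a MappedRDD, filter produced a FilteredRDD, and flatMap produced a FlatMappedRDD. Those classes can no longer be found; all of these operators now produce a MapPartitionsRDD.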

scala> val a3 = arr.map(i => (i+1, i))
a3: org.apache.spark.rdd.RDD[(Int, Int)] = MapPartitionsRDD[2] at map at <console>:25
scala> val a3 = arr.filter(i => i > 3)
a3: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[4] at filter at <console>:25
scala> val a3 = arr.flatMap(i => Array(i))
a3: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[5] at flatMap at <console>:25
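
One way to see why a single RDD class is enough for map, filter, and flatMap: each operator can be expressed as a function from a partition's iterator to a new iterator. The sketch below models that idea in plain Scala. PartitionFns, PartitionFn and run are hypothetical names for this example, and partitions are modeled as nested Seqs rather than real Spark partitions.

```scala
// A simplified model: every operator is a function over one partition's iterator.
object PartitionFns {
  type PartitionFn[T, U] = Iterator[T] => Iterator[U]

  def mapFn[T, U](f: T => U): PartitionFn[T, U] = iter => iter.map(f)
  def filterFn[T](p: T => Boolean): PartitionFn[T, T] = iter => iter.filter(p)
  def flatMapFn[T, U](f: T => Iterable[U]): PartitionFn[T, U] = iter => iter.flatMap(f)

  // Applies the same partition function to each partition independently.
  def run[T, U](partitions: Seq[Seq[T]], fn: PartitionFn[T, U]): Seq[List[U]] =
    partitions.map(part => fn(part.iterator).toList)
}
```

Because all three operators share the shape `Iterator[T] => Iterator[U]`, one class that stores such a function can represent any of them.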

The result of join is also a MapPartitionsRDD, because flatMapValues, the last step of its execution, creates a MapPartitionsRDD.

scala> val rdd1 = sc.parallelize(Array((1,1),(1,2),(1,3),(2,1),(2,2),(2,3)))
rdd1: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[8] at parallelize at <console>:24

scala> val rdd2 = sc.parallelize(Array((1,1),(1,2),(1,3),(2,1),(2,2),(2,3)))
rdd2: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[9] at parallelize at <console>:24

scala> val rddj = rdd1.join(rdd2)
rddj: org.apache.spark.rdd.RDD[(Int, (Int, Int))] = MapPartitionsRDD[12] at join at <console>:27

ShuffledRDD
ShuffledRDD represents the result of a Shuffle operation. K and V are self-explanatory, and C is the combiner class.

class ShuffledRDD[K, V, C] extends RDD[(K, C)]

Take groupByKey as an example.

scala> val a2 = arr.map(i => (i+1, i))
a2: org.apache.spark.rdd.RDD[(Int, Int)] = MapPartitionsRDD[2] at map at <console>:25

scala> a2.groupByKey
res1: org.apache.spark.rdd.RDD[(Int, Iterable[Int])] = ShuffledRDD[3] at groupByKey at <console>:26

Note that groupByKey requires the key type K to be hashable by value; otherwise it fails. Array keys, for instance, are rejected:

scala> val a2 = arr.map(i => (Array.fill(10)(i), i))
a2: org.apache.spark.rdd.RDD[(Array[Int], Int)] = MapPartitionsRDD[2] at map at <console>:25

scala> a2.groupByKey
org.apache.spark.SparkException: HashPartitioner cannot partition array keys.
 at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKeyWithClassTag$1.apply(PairRDDFunctions.scala:84)
 at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKeyWithClassTag$1.apply(PairRDDFunctions.scala:77)
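
The reason array keys are rejected becomes clearer with a small model of hash partitioning. The sketch below is plain Scala and only mirrors the general idea of picking a partition from key.hashCode. HashPartitionDemo is a hypothetical name, not a Spark class. JVM arrays compare and hash by identity rather than by content, so two arrays with the same elements would generally land in unrelated partitions.

```scala
// A simplified model of hash partitioning; not Spark's HashPartitioner class.
object HashPartitionDemo {
  // Like %, but always returns a value in [0, mod).
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    raw + (if (raw < 0) mod else 0)
  }

  // Picks a partition from the key's hashCode; null keys go to partition 0.
  def getPartition(key: Any, numPartitions: Int): Int = key match {
    case null => 0
    case _    => nonNegativeMod(key.hashCode, numPartitions)
  }
}
```

A common workaround, assuming the array contents are what identify a key, is to convert the key to a value-based collection first, for example `arr.map(i => (Array.fill(10)(i).toSeq, i))`.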

CoGroupedRDD
```
class CoGroupedRDD[K] extends RDD[(K, Array[Iterable[_]])]
```
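First, we need to understand what the cogroup operation is. This method has several overloads. The version shown below produces, for every key in this, other1, or other2, an RDD[(K, (Iterable[V], Iterable[W1], Iterable[W2]))], which holds the collections of all values for that key in each of the three RDDs. It is easy to see that this operator can be used to implement Join and Union (although it is overkill for the latter).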
```
def cogroup[W1, W2](other1: RDD[(K, W1)], other2: RDD[(K, W2)], partitioner: Partitioner)
 : RDD[(K, (Iterable[V], Iterable[W1], Iterable[W2]))]
```
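The Partitioner here is an abstract class with two methods, numPartitions: Int and getPartition(key: Any): Int. By extending Partitioner you can customize how partitioning is done. The built-in implementations include RangePartitioner and HashPartitioner.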
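
To make the relationship between cogroup and join concrete, here is a minimal sketch over local collections in plain Scala. CogroupDemo is a hypothetical name and the code ignores partitioning entirely. It only shows that an inner join can be derived from the cogroup result by pairing up the values on each side, which is the role flatMapValues plays in the join described earlier.

```scala
// A local-collection model of cogroup and join; partitioning is ignored.
object CogroupDemo {
  // For every key present on either side, collect the values from both sides.
  def cogroup[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Map[K, (Seq[V], Seq[W])] = {
    val keys = (left.map(_._1) ++ right.map(_._1)).distinct
    keys.map { k =>
      val vs = left.filter(_._1 == k).map(_._2)
      val ws = right.filter(_._1 == k).map(_._2)
      (k, (vs, ws))
    }.toMap
  }

  // Inner join: a key with an empty side contributes no output rows.
  def join[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, W))] =
    cogroup(left, right).toSeq.flatMap { case (k, (vs, ws)) =>
      for (v <- vs; w <- ws) yield (k, (v, w))
    }
}
```

Keys that appear on only one side still show up in the cogroup result with an empty collection for the other side, which is why outer joins can be built on the same primitive.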

UnionRDD

class UnionRDD[T] extends RDD[T]

UnionRDD is typically obtained through the union operator.

scala> val a5 = arr.union(arr2)
a5: org.apache.spark.rdd.RDD[Int] = UnionRDD[7] at union at <console>:27

CoalescedRDD

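Common RDD extension functions

Besides RDD itself, Spark provides a number of extension function classes. An RDD gains their methods through implicit conversion.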

PairRDDFunctions
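This class is used to process key-value pairs. Compared with RDD, it provides methods such as groupByKey and join. Take combineByKey as an example. It involves three type parameters: K and V, which come from the RDD, and its own C. Compared with the (V, V) => V signature of the reduce and fold family, the extra C makes combineByKey more flexible, because combineByKey can turn a V into a C.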

   def combineByKey[C](
       createCombiner: V => C,
       mergeValue: (C, V) => C,
       mergeCombiners: (C, C) => C,
       partitioner: Partitioner,
       mapSideCombine: Boolean = true,
       serializer: Serializer = null): RDD[(K, C)] = {
       // implementation omitted
   }
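
The roles of the three functions are easier to see on a small example. The following sketch models combineByKey over in-memory partitions in plain Scala. CombineByKeyDemo is a hypothetical name, partitions are modeled as nested Seqs, and no shuffle or serialization is involved. createCombiner and mergeValue run within a partition, and mergeCombiners merges the per-partition results.

```scala
// A local model of combineByKey; not Spark's implementation.
object CombineByKeyDemo {
  def combineByKey[K, V, C](
      partitions: Seq[Seq[(K, V)]],
      createCombiner: V => C,
      mergeValue: (C, V) => C,
      mergeCombiners: (C, C) => C): Map[K, C] = {
    // Within each partition: start a combiner on the first value of a key,
    // then fold later values of that key into it.
    val perPartition: Seq[Map[K, C]] = partitions.map { part =>
      part.foldLeft(Map.empty[K, C]) { case (acc, (k, v)) =>
        val c = acc.get(k) match {
          case Some(existing) => mergeValue(existing, v)
          case None           => createCombiner(v)
        }
        acc.updated(k, c)
      }
    }
    // Across partitions: merge the combiners that belong to the same key.
    perPartition.foldLeft(Map.empty[K, C]) { (acc, m) =>
      m.foldLeft(acc) { case (a, (k, c)) =>
        a.updated(k, a.get(k).map(mergeCombiners(_, c)).getOrElse(c))
      }
    }
  }
}
```

For example, with V = Int and C = (Int, Int) holding a running sum and count, `combineByKey[String, Int, (Int, Int)](parts, v => (v, 1), (c, v) => (c._1 + v, c._2 + 1), (x, y) => (x._1 + y._1, x._2 + y._2))` yields what is needed for a per-key average, something a (V, V) => V reduce cannot express directly.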

OrderedRDDFunctions
This class provides methods such as sortByKey and filterByRange.

Spark architecture overview

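One characteristic of Spark's design is that it is separated from the underlying cluster manager. A Spark Application can be seen as a set of processes running on the cluster. We therefore need to distinguish between Spark concepts and concepts of the underlying cluster. For example, the familiar Master and Worker are cluster concepts that denote nodes, whereas Driver and Executor are Spark concepts that denote processes. According to Stack Overflow, the Driver may run on a Worker node or on the Master node, depending on the deploy mode.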

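The official website provides a diagram that lays out the basic architecture of a Spark cluster. SparkContext is the management core of the whole Application and is managed by the Driver. SparkContext manages all Executors and interacts with the underlying cluster manager to request resources.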

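Scheduling at the Stage level and above is handled by the DAGScheduler, while the TaskScheduler schedules a TaskSet. In Spark on Yarn mode, CoarseGrainedExecutorBackend and Executor correspond one to one. The backend is a process separate from the Worker's main process, and it can be seen with jps. A Task runs as a thread started by an Executor, and one Executor can run multiple Tasks.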
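
The "Task as a thread inside an Executor" relationship can be modeled with a standard JVM thread pool. This is a minimal sketch under stated assumptions: ToyExecutor and runTasks are hypothetical names, a fixed-size pool is used purely for illustration, and real Tasks carry serialized closures rather than a bare id.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit}

// A toy executor process: each task runs on a thread taken from a shared pool.
object ToyExecutor {
  // Runs every task id on the pool and returns (taskId, name of the thread that ran it).
  def runTasks(taskIds: Seq[Int], threads: Int): Seq[(Int, String)] = {
    val pool = Executors.newFixedThreadPool(threads)
    try {
      val futures = taskIds.map { id =>
        pool.submit(new Callable[(Int, String)] {
          def call(): (Int, String) = (id, Thread.currentThread().getName)
        })
      }
      // Wait for each task; results come back in submission order.
      futures.map(_.get(10, TimeUnit.SECONDS))
    } finally {
      pool.shutdown()
    }
  }
}
```

With three tasks and two threads, at most two distinct thread names appear in the result, which mirrors several Tasks sharing one Executor process.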

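In terms of implementation, CoarseGrainedExecutorBackend extends the ExecutorBackend trait. As an IsolatedRpcEndpoint (in the 2.4 branch the class extends ThreadSafeRpcEndpoint; IsolatedRpcEndpoint appears in later versions), it maintains the Executor object instance and interacts with the Driver through the DriverEndpoint instance it creates.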

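When the process starts, CoarseGrainedExecutorBackend calls its onStart() method to register itself with the Driver and emits an INFO log line beginning with "Connecting to driver". CoarseGrainedExecutorBackend handles commands from the Driver, including LaunchTask and KillTask, through the DriverEndpoint.receive method. Note that the scheduler package contains a CoarseGrainedSchedulerBackend whose implementation looks similar, so take care to tell the two apart when reading the code.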
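
The command handling described above follows a common pattern: a receive method defined as a partial function over message types. The sketch below shows only the shape of that pattern in plain Scala. The messages are hypothetical, simplified versions that carry just a task id, and Backend is not Spark's class.

```scala
// A toy message-dispatch endpoint; message fields are simplified assumptions.
object ToyBackend {
  sealed trait DriverCommand
  final case class LaunchTask(taskId: Long) extends DriverCommand
  final case class KillTask(taskId: Long) extends DriverCommand

  final class Backend {
    private var running = Set.empty[Long]
    def runningTasks: Set[Long] = running

    // Commands are dispatched by pattern matching; unknown messages are not handled.
    val receive: PartialFunction[Any, Unit] = {
      case LaunchTask(id) => running += id
      case KillTask(id)   => running -= id
    }
  }
}
```

Using a partial function means the endpoint can report, via isDefinedAt, whether it understands a given message before trying to process it.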

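The diagram below gives a more intuitive view of the relationship between Executors and the Driver. Note that a single Worker may run several Executors, and each Task may also run on multiple CPU cores.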

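Spark context

In code there are two ways to drive a Spark job: through SparkContext or through SparkSession.

SparkContext approach
SparkContext is a class that has existed since Spark was first created. We create a SparkContext directly from a SparkConf.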

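import org.apache.spark.{SparkConf, SparkContext}
val sparkConf = new SparkConf().setAppName("AppName").setMaster("local").set("spark.some.config.option", "some-value") // options are set on SparkConf, not on SparkContext
val sc = new SparkContext(sparkConf)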

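SparkSession approach
SparkSession is the API provided since Spark 2.0. Compared with SparkContext, it adds support for Spark SQL (it holds a SQLContext). For example, methods such as createDataFrame are accessed through SparkSession.

During builder.getOrCreate(), although what we finally get is a SparkSession, a SparkContext has in fact already been created internally and is held by that SparkSession.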

   val spark: SparkSession = SparkSession.builder() // returns a Builder
     .master("local").appName("AppName").config("spark.some.config.option", "some-value")
     .getOrCreate() // returns a SparkSession

   // SparkSession.scala
   val sparkContext = userSuppliedContext.getOrElse {
     val sparkConf = new SparkConf()
     options.foreach { case (k, v) => sparkConf.set(k, v) }

     // set a random app name if not given.
     if (!sparkConf.contains("spark.app.name")) {
       sparkConf.setAppName(java.util.UUID.randomUUID().toString)
     }

     SparkContext.getOrCreate(sparkConf)
     // Do not update `SparkConf` for existing `SparkContext`, as it's shared by all sessions.
   }

   applyExtensions(
     sparkContext.getConf.get(StaticSQLConf.SPARK_SESSION_EXTENSIONS).getOrElse(Seq.empty),
     extensions)

   session = new SparkSession(sparkContext, None, None, extensions)
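
The comment about not updating the SparkConf of an existing SparkContext reflects get-or-create semantics: the configuration passed in only takes effect when the context is first created. The following plain Scala sketch models that behavior. ToyContext and Context are hypothetical names, and the real SparkContext does far more than hold a configuration map.

```scala
// A toy get-or-create singleton; only the first configuration is ever used.
object ToyContext {
  final class Context(val conf: Map[String, String])

  private var active: Option[Context] = None

  // Returns the existing context if there is one; otherwise creates it from conf.
  def getOrCreate(conf: Map[String, String]): Context = synchronized {
    active.getOrElse {
      val created = new Context(conf)
      active = Some(created)
      created
    }
  }
}
```

A second call with a different configuration returns the same instance and silently ignores the new settings, which is why options meant for the context should be supplied before the first session is built.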

SparkEnv

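SparkEnv holds all the objects a Spark instance needs at runtime, including the Serializer, RpcEndpoint (earlier versions used Akka actors), BlockManager, MemoryManager, BroadcastManager, SecurityManager, MapOutputTrackerMaster/Worker, and so on.

SparkEnv is created by SparkContext and is accessed afterwards through the get method of the SparkEnv companion object.

On the Driver side, the SparkEnv is created by calling SparkEnv.createDriverEnv when the SparkContext is created. On the Executor side, it is created by calling SparkEnv.createExecutorEnv when the Executor's backend process, CoarseGrainedExecutorBackend, is created. Both methods end up calling the create method.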

// Driver side
private[spark] def createSparkEnv(
    conf: SparkConf,
    isLocal: Boolean,
    listenerBus: LiveListenerBus): SparkEnv = {
  SparkEnv.createDriverEnv(conf, isLocal, listenerBus, SparkContext.numDriverCores(master, conf))
}
_env = createSparkEnv(_conf, isLocal, listenerBus)
SparkEnv.set(_env)

// Executor side
// CoarseGrainedExecutorBackend.scala
val env = SparkEnv.createExecutorEnv(driverConf, arguments.executorId, arguments.bindAddress,
  arguments.hostname, arguments.cores, cfg.ioEncryptionKey, isLocal = false)

env.rpcEnv.setupEndpoint("Executor", backendCreateFn(env.rpcEnv, arguments, env))
arguments.workerUrl.foreach { url =>
  env.rpcEnv.setupEndpoint("WorkerWatcher", new WorkerWatcher(env.rpcEnv, url))
}
env.rpcEnv.awaitTermination()
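
The pattern described above, two entry points that funnel into one create method plus a companion-style get/set, can be sketched in a few lines of plain Scala. ToyEnv and Env are hypothetical names for this example, and the string fields stand in for real components. The sketch only illustrates how a shared create can branch on whether it is building the Driver's environment.

```scala
// A toy environment holder; component names are represented as plain strings.
object ToyEnv {
  val DriverId = "driver"

  final case class Env(executorId: String, mapOutputTracker: String)

  @volatile private var env: Option[Env] = None
  def set(e: Env): Unit = { env = Some(e) }
  def get: Option[Env] = env

  // Shared construction path: the role is derived from the executor id.
  private def create(executorId: String): Env = {
    val isDriver = executorId == DriverId
    Env(executorId, if (isDriver) "MapOutputTrackerMaster" else "MapOutputTrackerWorker")
  }

  def createDriverEnv(): Env = create(DriverId)
  def createExecutorEnv(executorId: String): Env = create(executorId)
}
```

Keeping one construction path means Driver and Executor environments stay structurally identical and differ only in the role-specific components chosen inside create.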

// SparkEnv.scala
// the create function
val blockManager = new BlockManager(...)