A Source-Code Walkthrough of Spark's ShuffleWriter Implementations (Part 3): SortShuffleWriter

L4mbert

Last modified 2022-02-09 23:22:22

First published 2022-02-09 23:19:57

Article link: https://blog.csdn.net/christopher_l1n/article/details/122851742

This walkthrough is based on the Spark 3.2 branch.

SortShuffleWriter

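Imagine having to sort a dataset far larger than the available memory. Loading all of it into memory to sort is clearly not feasible, so the data has to live on disk, which is known as external sorting. Spark uses merge sort¹ for this, and that is exactly what SortShuffleWriter does.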
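The idea of an external merge sort can be illustrated with a minimal sketch. It is not Spark's implementation: the object and method names are hypothetical, the "runs" are kept in memory as vectors instead of being written to disk, and `maxInMemory` is an assumed stand-in for a memory limit.

```scala
import scala.collection.mutable

object ExternalMergeSortSketch {
  // Phase 1: cut the input into chunks of at most `maxInMemory` elements and sort each one.
  // A real external sort would write every sorted run to its own file on disk.
  def sortedRuns(input: Iterator[Int], maxInMemory: Int): Vector[Vector[Int]] =
    input.grouped(maxInMemory).map(_.toVector.sorted).toVector

  // Phase 2: k-way merge of the sorted runs, driven by a heap over each run's head element.
  def mergeRuns(runs: Vector[Vector[Int]]): Vector[Int] = {
    // PriorityQueue is a max-heap, so the ordering is reversed to pop the smallest head first.
    val heap = mutable.PriorityQueue.empty[(Int, Int)](
      Ordering.by[(Int, Int), Int](_._1).reverse)
    val iters = runs.map(_.iterator)
    // Seed the heap with the first element of every non-empty run.
    for ((it, i) <- iters.zipWithIndex if it.hasNext) heap.enqueue((it.next(), i))
    val out = Vector.newBuilder[Int]
    while (heap.nonEmpty) {
      val (value, i) = heap.dequeue()
      out += value
      // Refill from the run the popped element came from.
      if (iters(i).hasNext) heap.enqueue((iters(i).next(), i))
    }
    out.result()
  }
}
```

Calling `mergeRuns(sortedRuns(Iterator(5, 3, 9, 1, 7, 2, 8), 3))` sorts three small runs independently and then merges them, so only one run's worth of data (plus one head element per run during the merge) would need to be in memory at a time.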

When writing, SortShuffleWriter picks a different data structure for sorting depending on whether mapSideCombine² is enabled: with mapSideCombine it uses map, otherwise it uses buffer.
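The difference between the two structures can be sketched roughly as follows. This is a simplified illustration, not Spark's code: it uses a plain `HashMap` and `ArrayBuffer` in place of Spark's own collections, and `CombineSketch`, `getPartition`, `insertWithMap` and `insertWithBuffer` are hypothetical names chosen for this example.

```scala
import scala.collection.mutable

object CombineSketch {
  // Hypothetical stand-in for a partitioner: assigns a key to one of `numPartitions` partitions.
  def getPartition(key: String, numPartitions: Int): Int =
    math.abs(key.hashCode % numPartitions)

  // With map-side combine: one entry per (partitionId, key); values are merged on insert.
  def insertWithMap(
      records: Iterator[(String, Int)],
      numPartitions: Int,
      createCombiner: Int => Int,
      mergeValue: (Int, Int) => Int): mutable.HashMap[(Int, String), Int] = {
    val map = mutable.HashMap.empty[(Int, String), Int]
    for ((k, v) <- records) {
      val id = (getPartition(k, numPartitions), k)
      map(id) = map.get(id) match {
        case Some(old) => mergeValue(old, v) // key seen before: merge into the existing value
        case None      => createCombiner(v)  // first occurrence: create the initial combiner
      }
    }
    map
  }

  // Without map-side combine: every record is appended as-is, duplicate keys included.
  def insertWithBuffer(
      records: Iterator[(String, Int)],
      numPartitions: Int): mutable.ArrayBuffer[((Int, String), Int)] = {
    val buffer = mutable.ArrayBuffer.empty[((Int, String), Int)]
    for ((k, v) <- records) {
      val id = (getPartition(k, numPartitions), k)
      buffer += ((id, v))
    }
    buffer
  }
}
```

With a summing combiner, the map version collapses repeated keys into a single entry, while the buffer version keeps one entry per input record. That is why a map is only worth its lookup cost when there is an aggregation to apply.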

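Merge sort has two phases: the first writes out sorted runs as separate files, and the second merges them into a single, globally sorted file. Let's first look at how Spark implements the first phase: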

// SortShuffleWriter
override def write(records: Iterator[Product2[K, V]]): Unit = {
  sorter = if (dep.mapSideCombine) {
    // With mapSideCombine, the aggregator and keyOrdering are passed in
    new ExternalSorter[K, V, C](
      context, dep.aggregator, Some(dep.partitioner), dep.keyOrdering, dep.serializer)
  } else {
    // In this case we pass neither an aggregator nor an ordering to the sorter, because we don't
    // care whether the keys get sorted in each partition; that will be done on the reduce side
    // if the operation being run is sortByKey.
    // Without mapSideCombine, both the aggregator and the keyOrdering are empty.
    // In line with the official comment above: when mapSideCombine is not needed, the order of
    // keys within each RDD partition does not matter here;
    // if ordering is required (e.g. sortByKey), it is done on the reduce side
    new ExternalSorter[K, V, V](
      context, aggregator = None, Some(dep.partitioner), ordering = None, dep.serializer)
  }
  // So insertAll checks whether an aggregator is present, picks the matching data structure,
  // sorts the records and writes them to external files; its implementation is examined next
  sorter.insertAll(records)

  // Phase two of the merge sort is omitted here
  // ...
}

// ExternalSorter
def insertAll(records: Iterator[Product2[K, V]]): Unit = {
  // TODO: stop combining if we find that the reduction factor isn't high
  val shouldCombine = aggregator.isDefined

  if (shouldCombine) {
    // This is where mapSideCombine is implemented: values with the same key are aggregated
    // Combine values in-memory first using our AppendOnlyMap
    val mergeValue = aggregator.get.mergeValue
    val createCombiner = aggregator.get.createCombiner
    var kv: Product2[K, V] = null
    val update = (hadValue: Boolean, oldValue: C) => {
      // Merge into the existing value for this key, or create the initial combiner
      if (hadValue) mergeValue(oldValue, kv._2) else createCombiner(kv._2)
    }
    while (records.hasNext) {
      // Count the number of records read
      addElementsRead()
      kv = records.next()
      // Update the value stored for (partition ID, key)
      map.changeValue((getPartition(kv._1), kv._1), update)
      // Decide whether the in-memory data must be spilled to a file; examined further below
      maybeSpillCollection(usingMap = true)
    }
  } else {
    // Stick values into our buffer
    // No aggregation of identical keys is needed
    while (records.hasNext) {
      addElementsRead()
      val kv = records.next()