[Spark 33] Spark Sort-based Shuffle

1. N partitions produce N MapTasks. If the number of ReduceTasks is not specified, it also defaults to N.

2. With N partitions, i.e. N MapTasks, and N ReduceTasks (the default when no ReduceTask count is specified), at most N .data files and at most N .index files are produced.

In other words, the number of shuffle files is determined by the number of MapTasks and is not affected by the number of ReduceTasks; a minimal job illustrating this setup follows.
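For illustration, a job of the following shape (the application name, data, and the partition count of 5 are assumptions for this sketch, not taken from the original run) launches 5 MapTasks, and since reduceByKey is called without an explicit partition count it also runs 5 ReduceTasks; with sort-based shuffle each MapTask writes at most one .data/.index pair:

  import org.apache.spark.{SparkConf, SparkContext}

  object SortShuffleDemo {
    def main(args: Array[String]): Unit = {
      // Sort-based shuffle is the default shuffle manager since Spark 1.2;
      // it is set explicitly here only to make the example self-describing.
      val conf = new SparkConf()
        .setAppName("SortShuffleDemo")
        .setMaster("local")
        .set("spark.shuffle.manager", "sort")
      val sc = new SparkContext(conf)

      // 5 input partitions => 5 MapTasks. With no explicit numPartitions,
      // reduceByKey reuses the parent's 5 partitions, i.e. 5 ReduceTasks.
      val words = sc.parallelize(Seq("a", "b", "a", "c", "b", "d"), 5)
      val counts = words.map(w => (w, 1)).reduceByKey(_ + _)
      counts.collect().foreach(println)

      sc.stop()
    }
  }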

 

As the listing below shows, there are 5 .index files but only 4 .data files. These counts are determined by the number of Mappers, not the number of Reducers. So why are there only 4 data files instead of 5? How do the Mappers write these files, and what do they contain? That is what the following analysis examines.

 

hadoop$ pwd
/tmp/spark-local-20150129214306-5f46
hadoop$ cd spark-local-20150129214306-5f46/
hadoop$ ls
0c  0d  0e  0f  11  13  15  29  30  32  36  38
hadoop$ find . -name "*.index" | xargs ls -l
-rw-rw-r-- 1 hadoop hadoop 48  1月 29 21:43 ./0c/shuffle_0_4_0.index
-rw-rw-r-- 1 hadoop hadoop 48  1月 29 21:43 ./0d/shuffle_0_3_0.index
-rw-rw-r-- 1 hadoop hadoop 48  1月 29 21:43 ./0f/shuffle_0_1_0.index
-rw-rw-r-- 1 hadoop hadoop 48  1月 29 21:43 ./30/shuffle_0_0_0.index
-rw-rw-r-- 1 hadoop hadoop 48  1月 29 21:43 ./32/shuffle_0_2_0.index
hadoop$ find . -name "*.data" | xargs ls -l
-rw-rw-r-- 1 hadoop hadoop 884  1月 29 21:43 ./0c/shuffle_0_0_0.data
-rw-rw-r-- 1 hadoop hadoop 685  1月 29 21:43 ./15/shuffle_0_1_0.data
-rw-rw-r-- 1 hadoop hadoop 710  1月 29 21:43 ./29/shuffle_0_3_0.data
-rw-rw-r-- 1 hadoop hadoop 693  1月 29 21:43 ./36/shuffle_0_2_0.data

 

ShuffleMapTask execution logic

Since the output of the N partitions is written into only a handful of data files, a Reducer that comes to fetch data must be able to tell which portion of each file belongs to it.

 

The reduceId in both the .data and .index file names is always 0, because IndexShuffleBlockManager fixes it at 0 when resolving the file, so all reducers' data for one map output shares a single file pair:

 

 

  def getDataFile(shuffleId: Int, mapId: Int): File = {
    // The same shuffleId and mapId always resolve to the same file:
    // the reduceId component of the block id is fixed at 0.
    blockManager.diskBlockManager.getFile(ShuffleDataBlockId(shuffleId, mapId, 0))
  }

  private def getIndexFile(shuffleId: Int, mapId: Int): File = {
    // Likewise, one index file per (shuffleId, mapId), shared by all reducers.
    blockManager.diskBlockManager.getFile(ShuffleIndexBlockId(shuffleId, mapId, 0))
  }
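This is also how a Reducer later finds the portion that belongs to it: the .index file written for each map output stores (numReducers + 1) cumulative byte offsets as 8-byte longs, which matches the 48-byte index files in the listing above (6 longs for 5 reduce partitions). A rough sketch of the lookup that IndexShuffleBlockManager performs when serving a given reduceId (the helper name segmentFor is mine, not Spark's):

  import java.io.{DataInputStream, File, FileInputStream}

  // Given the .index file of one map output and a reduceId, return the
  // (offset, length) of that reducer's segment in the matching .data file.
  def segmentFor(indexFile: File, reduceId: Int): (Long, Long) = {
    val in = new DataInputStream(new FileInputStream(indexFile))
    try {
      in.skipBytes(reduceId * 8)     // each stored offset is an 8-byte long
      val offset = in.readLong()     // where this reducer's data starts
      val nextOffset = in.readLong() // where the next reducer's data starts
      (offset, nextOffset - offset)  // length = difference of the two offsets
    } finally {
      in.close()
    }
  }

On the write side, it is ExternalSorter.writePartitionedFile (shown next) that puts all reduce partitions of one map task into that single data file and records how many bytes each partition occupies.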

 

 

 

 

  def writePartitionedFile(
      blockId: BlockId,
      context: TaskContext,
      outputFile: File): Array[Long] = {

    // Track location of each range in the output file
    val lengths = new Array[Long](numPartitions)

    if (bypassMergeSort && partitionWriters != null) {
      // We decided to write separate files for each partition, so just concatenate them. To keep
      // this simple we spill out the current in-memory collection so that everything is in files.
      spillToPartitionFiles(if (aggregator.isDefined) map else buffer)
      partitionWriters.foreach(_.commitAndClose())
      var out: FileOutputStream = null
      var in: FileInputStream = null
      try {
        out = new FileOutputStream(outputFile, true)
        for (i <- 0 until numPartitions) {
          in = new FileInputStream(partitionWriters(i).fileSegment().file)
          val size = org.apache.spark.util.Utils.copyStream(in, out, false, transferToEnabled)
          in.close()
          in = null
          lengths(i) = size
        }
      } finally {
        if (out != null) {
          out.close()
        }
        if (in != null) {
          in.close()
        }
      }
    } else {
      // Either we're not bypassing merge-sort or we have only in-memory data; get an iterator by
      // partition and just write everything directly.
      // this.partitionedIterator yields the data grouped by partition id
      for ((id, elements) <- this.partitionedIterator) {
        if (elements.hasNext) { // only write partitions that actually have elements
          val writer = blockManager.getDiskWriter(
            blockId, outputFile, ser, fileBufferSize, context.taskMetrics.shuffleWriteMetrics.get)
          for (elem <- elements) { // iterate over the partition and append each record
            writer.write(elem)
          }
          writer.commitAndClose() // close the writer after the whole partition is written
          val segment = writer.fileSegment() // the byte range this partition occupies in the output file
          lengths(id) = segment.length
        }
      }
    }

    context.taskMetrics.memoryBytesSpilled += memoryBytesSpilled
    context.taskMetrics.diskBytesSpilled += diskBytesSpilled
    context.taskMetrics.shuffleWriteMetrics.filter(_ => bypassMergeSort).foreach { m =>
      if (curWriteMetrics != null) {
        m.shuffleBytesWritten += curWriteMetrics.shuffleBytesWritten
        m.shuffleWriteTime += curWriteMetrics.shuffleWriteTime
      }
    }

    lengths
  }
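The lengths array returned above, one entry per reduce partition, is then turned into the index file: conceptually the index is just a running sum of these lengths starting at 0. A simplified sketch of that conversion (modelled on IndexShuffleBlockManager.writeIndexFile, not the verbatim Spark source):

  import java.io.{DataOutputStream, File, FileOutputStream}

  // Convert per-partition lengths into the cumulative offsets stored in the
  // .index file: offsets(0) = 0, offsets(i + 1) = offsets(i) + lengths(i).
  def writeIndex(indexFile: File, lengths: Array[Long]): Unit = {
    val out = new DataOutputStream(new FileOutputStream(indexFile))
    try {
      var offset = 0L
      out.writeLong(offset)   // the first segment starts at byte 0
      for (length <- lengths) {
        offset += length
        out.writeLong(offset) // end of this segment == start of the next
      }
    } finally {
      out.close()
    }
  }

With this layout, segmentFor above can recover any reducer's byte range from just two longs in the index file.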
