BypassMergeSortShuffleWriter
Conditions for use:
(1) No map-side aggregation (mapSideCombine) is required.
(2) The number of partitions is no more than spark.shuffle.sort.bypassMergeThreshold, which defaults to 200.
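To make the check concrete, here is a minimal Java sketch of this decision. The real check lives in SortShuffleWriter.shouldBypassMergeSort (which is Scala); the class and parameter names below are illustrative stand-ins for the fields of the ShuffleDependency:

```java
final class BypassDecision {
  // Sketch of the check in SortShuffleWriter.shouldBypassMergeSort.
  static boolean shouldBypassMergeSort(boolean mapSideCombine,
                                       int numPartitions,
                                       int bypassMergeThreshold) {
    // Map-side aggregation forces the sort-based path, so bypass is impossible.
    if (mapSideCombine) {
      return false;
    }
    // Otherwise bypass only while the partition count stays at or below
    // spark.shuffle.sort.bypassMergeThreshold (200 by default).
    return numPartitions <= bypassMergeThreshold;
  }
}
```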
Let's start with write():
```java
public void write(Iterator<Product2<K, V>> records) throws IOException {
  assert (partitionWriters == null);
  if (!records.hasNext()) {
    partitionLengths = new long[numPartitions];
    shuffleBlockResolver.writeIndexFileAndCommit(shuffleId, mapId, partitionLengths, null);
    mapStatus = MapStatus$.MODULE$.apply(blockManager.shuffleServerId(), partitionLengths);
    return;
  }
  final SerializerInstance serInstance = serializer.newInstance();
  final long openStartTime = System.nanoTime();
  // Create one writer per partition of the downstream stage
  partitionWriters = new DiskBlockObjectWriter[numPartitions];
  // ... and one FileSegment slot per partition
  partitionWriterSegments = new FileSegment[numPartitions];
  for (int i = 0; i < numPartitions; i++) {
    // Create a temporary shuffle file on disk
    final Tuple2<TempShuffleBlockId, File> tempShuffleBlockIdPlusFile =
      blockManager.diskBlockManager().createTempShuffleBlock();
    final File file = tempShuffleBlockIdPlusFile._2();
    final BlockId blockId = tempShuffleBlockIdPlusFile._1();
    // Each writer is bound to its own file
    partitionWriters[i] =
      blockManager.getDiskWriter(blockId, file, serInstance, fileBufferSize, writeMetrics);
  }
  // Creating the file to write to and creating a disk writer both involve interacting with
  // the disk, and can take a long time in aggregate when we open many files, so should be
  // included in the shuffle write time.
  writeMetrics.incWriteTime(System.nanoTime() - openStartTime);
  while (records.hasNext()) {
    final Product2<K, V> record = records.next();
    final K key = record._1();
    // Append the record to the file of its target partition; nothing is sorted here
    partitionWriters[partitioner.getPartition(key)].write(key, record._2());
  }
  for (int i = 0; i < numPartitions; i++) {
    final DiskBlockObjectWriter writer = partitionWriters[i];
    // A FileSegment describes a slice of a file: the file itself, a start offset, and a length
    partitionWriterSegments[i] = writer.commitAndGet();
    writer.close();
  }
  // write() handles the data of a single ShuffleMapTask, so there is exactly one
  // shuffleId and one mapId; together they uniquely identify this task's output
  File output = shuffleBlockResolver.getDataFile(shuffleId, mapId);
  File tmp = Utils.tempFileWith(output);
  try {
    // Merge the per-partition temp files into a single data file
    partitionLengths = writePartitionedFile(tmp);
    // Write the index file and commit
    shuffleBlockResolver.writeIndexFileAndCommit(shuffleId, mapId, partitionLengths, tmp);
  } finally {
    if (tmp.exists() && !tmp.delete()) {
      logger.error("Error while deleting temp file {}", tmp.getAbsolutePath());
    }
  }
  mapStatus = MapStatus$.MODULE$.apply(blockManager.shuffleServerId(), partitionLengths);
}
```
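The merge step is worth a closer look: writePartitionedFile simply concatenates the per-partition temp files in partition order and records each partition's byte length. A simplified, self-contained sketch of that idea (the class and method below are illustrative, not Spark's; the real method also supports NIO-based copying, deletes the temp files, and updates write metrics):

```java
import java.io.*;
import java.util.List;

final class ConcatSketch {
  // Illustrative stand-in for writePartitionedFile: concatenate the temp
  // files (one per partition, in partition order) into a single data file
  // and return the byte length contributed by each partition.
  static long[] concatPartitionFiles(List<File> partitionFiles, File output) throws IOException {
    long[] lengths = new long[partitionFiles.size()];
    try (FileOutputStream out = new FileOutputStream(output)) {
      for (int i = 0; i < partitionFiles.size(); i++) {
        File f = partitionFiles.get(i);
        if (f.exists()) {
          try (FileInputStream in = new FileInputStream(f)) {
            // Copy the whole temp file; its size is partition i's length.
            lengths[i] = in.transferTo(out); // Java 9+; a read/write loop otherwise
          }
        }
      }
    }
    return lengths;
  }
}
```

This also hints at why the bypassMergeThreshold cap exists: each task holds numPartitions open DiskBlockObjectWriters, each with its own file and buffer, so the approach only pays off while the partition count stays small.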
The overall flow:
- Each map task creates as many temp files as there are reduce partitions.
- Each record is appended directly to the file of the partition it belongs to; note that records are not buffered in memory first.
- After all records are written, the per-partition files are merged into a single data file and an index file is generated (see the sketch below).
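As for that index file: writeIndexFileAndCommit stores numPartitions + 1 longs, starting at 0, where each subsequent entry is the cumulative byte offset of a partition inside the data file, so a reducer fetching partition r only needs entries r and r + 1. A sketch of this layout, assuming the format just described (the helper class is illustrative, not Spark's API):

```java
import java.io.*;

final class IndexSketch {
  // Illustrative version of the index layout: numPartitions + 1 longs,
  // starting at 0, each one the cumulative byte offset into the data file.
  static void writeIndex(File indexFile, long[] partitionLengths) throws IOException {
    try (DataOutputStream out =
             new DataOutputStream(new BufferedOutputStream(new FileOutputStream(indexFile)))) {
      long offset = 0L;
      out.writeLong(offset);
      for (long length : partitionLengths) {
        offset += length;
        out.writeLong(offset);
      }
    }
  }

  // A reducer asking for partition r reads entries r and r + 1 to get
  // the [start, end) byte range of its block inside the data file.
  static long[] readRange(File indexFile, int reduceId) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(indexFile, "r")) {
      raf.seek((long) reduceId * 8);
      long start = raf.readLong();
      long end = raf.readLong();
      return new long[] {start, end};
    }
  }
}
```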