Tuning Production Parameters from the Principles of the MapReduce Shuffle

黄土高坡上的独孤前辈

Last modified: 2022-08-03 15:56:16

Category: Hadoop. Tags: mapreduce, hadoop

First published: 2020-07-29 20:20:05

Article link: https://blog.csdn.net/lihuazaizheli/article/details/107674269

Contents

1. MapReduce process diagrams
2. Map: splitting the input files
3. The circular buffer
4. Partition and sort before data is spilled to disk
- 4.1 How it works
- 4.2 Production tuning
5. Spill to disk
6. The shuffle
7. Reduce

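1. MapReduce process diagrams

Here are two good diagrams. The sections below explain them in detail; looking at the diagrams again after reading the explanation gives a deeper understanding.

[Figure: MapReduce process diagram 1 (image not available)]

[Figure: MapReduce process diagram 2 (image not available)]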

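2. Map: splitting the input files

Files on HDFS are first divided into input splits, and each split is processed by one map task, so the splitting determines the number of maps. What exactly determines the number of maps? See my other blog post: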

   https://blog.csdn.net/lihuazaizheli/article/details/107580462

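// Configuration parameters; see my blog post above for a walkthrough of the source code
mapreduce.input.fileinputformat.split.minsize // minimum split size for a map, default 0 bytes
mapreduce.input.fileinputformat.split.maxsize // maximum split size for a map, unlimited by default
dfs.blocksize  // HDFS block size, default 128 MB (dfs.block.size is the deprecated name)
Formula: splitSize = Math.max(minSize, Math.min(maxSize, blockSize));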
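The formula can be tried out with a few values. The following is a minimal sketch that only evaluates the expression quoted above; the class and method names are made up for illustration and are not part of Hadoop.

```java
// Sketch of the split size rule quoted above; all sizes are in bytes.
// SplitSizeDemo and computeSplitSize are illustrative names, not Hadoop API.
class SplitSizeDemo {
    static final long MB = 1024L * 1024L;

    // Same expression as the formula in the text.
    static long computeSplitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long block = 128 * MB;
        // Defaults (minSize 0, maxSize unlimited): the split size equals the block size.
        System.out.println(computeSplitSize(0L, Long.MAX_VALUE, block) / MB);
        // minSize above the block size: larger splits, fewer maps.
        System.out.println(computeSplitSize(256 * MB, Long.MAX_VALUE, block) / MB);
        // maxSize below the block size: smaller splits, more maps.
        System.out.println(computeSplitSize(0L, 64 * MB, block) / MB);
    }
}
```

In other words, raising split.minsize is the lever for fewer, larger maps, and lowering split.maxsize is the lever for more, smaller maps.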

In the figure above there are two map tasks, and their output goes into the circular buffer.

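3. The circular buffer

3.1 How it works

Map output is first written to an in-memory buffer, 100 MB by default (records are serialized before they are stored). When the data in the buffer reaches the threshold (100 MB * 0.8 = 80 MB by default), a background thread starts a spill, which writes the buffered data to disk.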

# Understanding the circular buffer
https://blog.csdn.net/qq_35468937/article/details/80669834
# Understanding the map-side spill
https://www.cnblogs.com/yesecangqiong/p/6283140.html
# MapReduce in detail and its performance optimization
https://www.jianshu.com/p/9e4d01b74600

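3.2 Production tuning

3.2.1 mapreduce.task.io.sort.mb (default: 100 MB)

Official description: The total amount of buffer memory to use while sorting files, in megabytes. By default, gives each merge stream 1MB, which should minimize seeks.

Adjust this according to the hardware, especially the amount of memory. A larger value reduces the number of spills and therefore disk I/O, which speeds up the map side; when enough memory is available this usually improves performance noticeably. Because the buffer is allocated from the map task's JVM heap, check the map heap size when changing this parameter and increase it if necessary.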

<property>
	<name>mapreduce.task.io.sort.mb</name>
	<value>300</value>
	<description>Size of the shuffle circular buffer, default 100 MB</description>
</property>

3.2.2 mapreduce.map.sort.spill.percent (default: 0.80)

Official description: The soft limit in the serialization buffer. Once reached, a thread will begin to spill the contents to disk in the background. Note that collection will not block if this threshold is exceeded while a spill is already in progress, so spills may be larger than this threshold when it is set to less than .5

By default the spill to disk starts when the buffer is 80% full. This value can be raised to reduce the number of spills.

<property>
	<name>mapreduce.map.sort.spill.percent</name>
	<value>0.88</value>
	<description>Spill threshold of the circular buffer, default 80%</description>
</property>
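To get a feel for how the two buffer settings interact, the number of spill files a map task produces can be estimated roughly. This is a back-of-the-envelope sketch, not Hadoop code: it assumes a spill happens each time the buffer fills to sort.mb × spill.percent and it ignores the per-record bookkeeping the real buffer also stores, so actual spill counts will be somewhat higher. The class and method names are hypothetical.

```java
// Rough estimate of map-side spill files; a simplification, not Hadoop's accounting.
class SpillEstimate {
    // mapOutputMb: serialized map output of one task, in MB (assumed known).
    static long estimateSpills(long mapOutputMb, int sortMb, double spillPercent) {
        double thresholdMb = sortMb * spillPercent; // fill level that triggers a spill
        return (long) Math.ceil(mapOutputMb / thresholdMb);
    }

    public static void main(String[] args) {
        // Defaults: 100 MB buffer, 0.80 threshold, 1000 MB of map output.
        System.out.println(estimateSpills(1000, 100, 0.80)); // 13
        // Settings from the examples above: 300 MB buffer, 0.88 threshold.
        System.out.println(estimateSpills(1000, 300, 0.88)); // 4
    }
}
```

Fewer spill files also means less merge work at the end of the map task.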

4. Partition and sort before data is spilled to disk

4.1 How it works

A custom partitioner and a custom comparator show how this works.

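import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

/**
 * Custom partitioner: extending Partitioner is all that is needed.
 * Traffic is the job's own map output key class; the value carries the phone number.
 * The job needs at least 3 reduce tasks, because partitions 0 to 2 are returned.
 **/
public class PhonePartitioner extends Partitioner<Traffic,Text> {

    @Override
    public int getPartition(Traffic traffic, Text text, int numPartitions) {
        // Route each record by the prefix of its phone number.
        String phone = text.toString();
        if(phone.startsWith("13")) {
            return 0;
        } else if(phone.startsWith("15")) {
            return 1;
        } else {
            return 2;
        }
    }
}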

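Custom sorting can be implemented by implementing the WritableComparable interface or by extending the WritableComparator class.

import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

/**
 * Compares Order keys by id in ascending order.
 * Extends WritableComparator to define the custom order.
 **/
public class OrderGroupingComparator extends WritableComparator {

    // The constructor must call the superclass constructor; passing true
    // lets the comparator create Order instances for deserialization.
    public OrderGroupingComparator(){
        super(Order.class, true);
    }

    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        Order order1 = (Order)a;
        Order order2 = (Order)b;

        int result;

        // Ascending by id: positive if a is larger, negative if smaller, 0 if equal.
        if(order1.getId() > order2.getId()) {
            result = 1;
        } else if(order1.getId() < order2.getId()) {
            result = -1;
        } else {
            result = 0;
        }

        return result;
    }
}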

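4.2 Production tuning

 In the figure above, the purple, green, and orange data spilled to disk are three partitions, and within each partition the records are sorted in ascending natural order of the key by default.

No tuning is needed for custom partitioning and sorting at this stage. It is enough to understand that the default partitioner assigns a partition from the hash of the key.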
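The hash rule is short enough to show directly. The sketch below reproduces the arithmetic used by Hadoop's default HashPartitioner, (hashCode & Integer.MAX_VALUE) % numReduceTasks, on plain Java objects. Using String and Integer keys here is an assumption for illustration; in a real job the hash comes from the Writable key class, so the partition numbers will differ.

```java
// Hash partitioning on plain Java keys; HashPartitionDemo is an illustrative name.
class HashPartitionDemo {
    // Clear the sign bit so the result is never negative, then take the remainder.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 3;
        for (String key : new String[] {"a", "b", "c", "hadoop"}) {
            System.out.println(key + " -> partition " + partitionFor(key, reducers));
        }
        // A negative hash code still maps to a valid partition.
        System.out.println(partitionFor(Integer.valueOf(-1), reducers));
    }
}
```

Because the partition depends only on the key, all records with the same key go to the same reducer, which is also why a few very frequent keys lead to skew.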

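5. Spill to disk

5.1 How it works

As described above, two parameters control when the circular buffer spills to disk. This section covers the merge.

While a map task runs it keeps producing spill files. Before the map task finishes, these spill files are merged into a single large file that is partitioned and sorted; this is the merge. In the figure, the data within the blue, purple, and orange partitions is combined into one large file.

One important optimization during the merge is local aggregation with a Combiner.

A spill writes to disk, and writing to disk naturally brings up compression. When the data volume is large, compress the map output: set mapreduce.map.output.compress to true and choose the codec with mapreduce.map.output.compress.codec.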

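5.2 Production tuning parameters

5.2.1 mapreduce.task.io.sort.factor (default: 10)

 Official description: The number of streams to merge at once while sorting files. This determines the number of open file handles

This parameter is the maximum number of spill files that can be merged at the same time. With 100 spill files, for example, the merge cannot finish in a single pass; increasing mapreduce.task.io.sort.factor reduces the number of merge passes and therefore the disk I/O.

Assessment: this parameter usually does not need to be adjusted in production.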

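5.2.2 mapreduce.map.combine.minspills (default: 3)

  The Combiner runs in the same JVM as the map. Whether it runs again during the merge is controlled by min.num.spills.for.combine (the older name of this parameter), which defaults to 3: when there are at least three spill files, the combine step is run during the merge, which reduces the data finally written to disk.

Assessment: this tuning parameter comes from other bloggers; I could no longer find it in the official Hadoop 2.x documentation and only found the following parameter:

 mapreduce.task.combine.progress.records (default: 10000)
  
 Official description: The number of records to process during combine output collection before sending a progress notification.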

5.2.3 Snappy compression for map output in production

<property>
	<name>mapreduce.map.output.compress</name>
	<value>true</value>
	<description>Compress map output</description>
</property>

<property>
	<name>mapreduce.map.output.compress.codec</name>
	<value>org.apache.hadoop.io.compress.SnappyCodec</value>
	<description>Compression codec class. Available codecs:
	org.apache.hadoop.io.compress.GzipCodec,
	org.apache.hadoop.io.compress.DefaultCodec (default),
	org.apache.hadoop.io.compress.BZip2Codec,
	com.hadoop.compression.lzo.LzoCodec,
	com.hadoop.compression.lzo.LzopCodec,
	org.apache.hadoop.io.compress.Lz4Codec,
	org.apache.hadoop.io.compress.SnappyCodec
	</description>
</property>
  
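  Assessment: common compression formats in big data include Snappy, Lzo, Gzip, and Bzip2. Snappy is chosen here because it compresses and decompresses quickly with a reasonable compression ratio (note that it is not splittable). For a detailed comparison see the following post:
  
  
  Hive data compression and storage format selection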
  https://blog.csdn.net/wjl7813/article/details/79285542

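6. The shuffle

The shuffle sends data to the reducers according to key, which causes disk and network I/O. If the keys are unevenly distributed, the result is data skew.

The Reducer source code below also describes the shuffle:
The Reducer copies the sorted output from each Mapper using HTTP across the network.
 
The key word in that description is copy. The following sections explain in detail what the copy does.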

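6.1 The Reducer source comments show Shuffle --> Sort --> SecondarySort

 Shuffle --> Sort --> SecondarySort
                sort     an optional secondary sort
 For example, a Hive SQL query that sorts on two columns: select name,age,sex from student order by age asc,sex desc;
 When submitted, this is converted into an MR job that performs a secondary sort.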

 * <p><code>Reducer</code> has 3 primary phases:</p>
 * <ol>
 *   <li>
 *   
 *   <h4 id="Shuffle">Shuffle</h4>
 *   
 *   <p>The <code>Reducer</code> copies the sorted output from each 
 *   {@link Mapper} using HTTP across the network.</p>
 *   </li>
 *   
 *   <li>
 *   <h4 id="Sort">Sort</h4>
 *   
 *   <p>The framework merge sorts <code>Reducer</code> inputs by 
 *   <code>key</code>s 
 *   (since different <code>Mapper</code>s may have output the same key).</p>
 *   
 *   <p>The shuffle and sort phases occur simultaneously i.e. while outputs are
 *   being fetched they are merged.</p>
 *      
 *   <h5 id="SecondarySort">SecondarySort</h5>
 *   
 *   <p>To achieve a secondary sort on the values returned by the value 
 *   iterator, the application should extend the key with the secondary
 *   key and define a grouping comparator. The keys will be sorted using the
 *   entire key, but will be grouped using the grouping comparator to decide
 *   which keys and values are sent in the same call to reduce.The grouping 
 *   comparator is specified via 
 *   {@link Job#setGroupingComparatorClass(Class)}. The sort order is
 *   controlled by 
 *   {@link Job#setSortComparatorClass(Class)}.</p>
@Checkpointable
@InterfaceAudience.Public
@InterfaceStability.Stable
public class Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT> {
......
}
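The SecondarySort paragraph in the Javadoc above can be illustrated without a cluster. The following sketch simulates the idea in plain Java: the key is extended with a secondary field, the sort comparator orders by the whole key, and the grouping comparator looks only at the primary field to decide which keys share one reduce call. The OrderKey class and its fields are hypothetical; in a real job the two comparators would be registered with Job#setSortComparatorClass and Job#setGroupingComparatorClass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of secondary sort; not Hadoop code.
class SecondarySortDemo {
    // Composite key: the natural key (id) extended with the secondary key (amount).
    static final class OrderKey {
        final int id;
        final int amount;
        OrderKey(int id, int amount) { this.id = id; this.amount = amount; }
    }

    // Sort comparator: whole key, id ascending, then amount descending.
    static final Comparator<OrderKey> SORT =
        Comparator.comparingInt((OrderKey k) -> k.id)
            .thenComparing(Comparator.comparingInt((OrderKey k) -> k.amount).reversed());

    // Grouping comparator: only the id decides which keys share one reduce call.
    static final Comparator<OrderKey> GROUP = Comparator.comparingInt((OrderKey k) -> k.id);

    // Returns, per id, the amounts in the order a reducer would iterate over them.
    static Map<Integer, List<Integer>> sortAndGroup(List<OrderKey> keys) {
        List<OrderKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT);
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        OrderKey previous = null;
        for (OrderKey key : sorted) {
            // A new group starts whenever the grouping comparator sees a different key.
            if (previous == null || GROUP.compare(previous, key) != 0) {
                groups.put(key.id, new ArrayList<>());
            }
            groups.get(key.id).add(key.amount);
            previous = key;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<OrderKey> keys = Arrays.asList(
            new OrderKey(2, 5), new OrderKey(1, 30), new OrderKey(1, 70), new OrderKey(2, 9));
        System.out.println(sortAndGroup(keys)); // {1=[70, 30], 2=[9, 5]}
    }
}
```

Within each group the values arrive already ordered by the secondary field, so the reducer can, for example, take the first value as the maximum without sorting anything itself.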

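6.2 Copy: the reducer fetches map output

1. First, the reduce task downloads its data from each map task over HTTP (this requires network transfer).
Because a failed reduce task has to be rerun, map output is deleted only after the whole job has finished.
The parameter mapreduce.job.reduce.slowstart.completedmaps controls what fraction of the maps must complete before reduce tasks are started.


2. Second, by default each reducer uses only 5 parallel threads to download map output.
When the reducer has enough memory and the network is good, mapreduce.reduce.shuffle.parallelcopies can be increased.
The large diagram shows 3 threads downloading data from the map side.


3. When a reducer's download thread fails while fetching map output, it waits for a limited time and then tries to download from elsewhere.
If the cluster network itself is the bottleneck, raising this timeout keeps download threads from being wrongly judged as failed.
Timeout parameters such as mapreduce.reduce.shuffle.read.timeout can generally be increased to keep MapReduce jobs stable.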

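6.3 MergeSort (key point): the three kinds of merge

The merge here is similar to the map-side merge, except that the data being merged was copied from different maps.

1. Copied data is first placed in an in-memory buffer and is spilled to disk only when memory use reaches a certain level. The size of this buffer is more flexible than on the map side: it is a fraction of the JVM heap size, set by mapreduce.reduce.shuffle.input.buffer.percent (default 0.7).

2. The threshold that starts the memory-to-disk merge is configured with mapreduce.reduce.shuffle.merge.percent (default 0.66); in other words, the spill threshold is 0.66.

A fixed fraction of the reduce task's maximum heap (usually set through mapreduce.reduce.java.opts, for example -Xmx1024m) is used to cache the fetched data. By default the reducer uses 70% of its heap for this. With mapreduce.reduce.shuffle.input.buffer.percent at 0.7 and a maximum heap of 1 GB, about 700 MB is available for caching downloaded data. As on the map side, the reducer does not wait until these 700 MB are full before writing to disk: once usage reaches a limit (a percentage), it starts flushing to disk, doing a sort-merge first. That limit is set by mapreduce.reduce.shuffle.merge.percent (default 0.66). As on the map side, this is a spill; if a Combiner is configured it is applied here too, and many spill files are produced on disk. This kind of merge keeps running until there is no more data from the map side, and then the disk-to-disk merge produces the final file.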
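The two percentages are applied one after the other, which is easy to get wrong when tuning. The following sketch only does that arithmetic; the class and method names are illustrative, and the heap size is whatever -Xmx the reduce task actually runs with.

```java
// Reduce-side shuffle memory arithmetic; illustrative names, sizes in MB.
class ShuffleMemoryBudget {
    // Memory available for holding fetched map output.
    static double shuffleBufferMb(double maxHeapMb, double inputBufferPercent) {
        return maxHeapMb * inputBufferPercent;
    }

    // Fill level of that buffer at which the in-memory merge (spill) starts.
    static double mergeTriggerMb(double shuffleBufferMb, double mergePercent) {
        return shuffleBufferMb * mergePercent;
    }

    public static void main(String[] args) {
        double buffer = shuffleBufferMb(1024, 0.70);  // about 717 MB of a 1 GB heap
        double trigger = mergeTriggerMb(buffer, 0.66); // about 473 MB
        System.out.printf("buffer=%.1f MB, merge starts at %.1f MB%n", buffer, trigger);
    }
}
```

So with the defaults and a 1 GB heap, the merge to disk begins when roughly 473 MB of map output is held in memory, not at 700 MB.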



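3. [Key point] The three kinds of merge

3.1 Memory to memory (memToMemMerger)

Hadoop defines a MemToMem merge, which merges map outputs held in memory and writes the result back to memory. It is disabled by default and can be enabled with mapreduce.reduce.merge.memtomem.enabled (default: false); the merge is triggered when the number of map outputs in memory reaches mapreduce.reduce.merge.memtomem.threshold.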


3.2 In-memory merge (inMemoryMerger)

When the data in the buffer reaches a configured threshold, it is merged in memory and written to local disk. The threshold can be configured in two ways:

(1) As a memory fraction: mapreduce.reduce.shuffle.merge.percent (default: 0.66)
 
   Official description: The usage threshold at which an in-memory merge will be initiated, expressed as
   a percentage of the total memory allocated to storing in-memory map outputs, as defined 
   by mapreduce.reduce.shuffle.input.buffer.percent.
    
    As mentioned above, part of the reduce JVM heap holds the input fetched from map tasks, and this parameter sets the fraction of that memory at which merging starts. If 500 MB is available for map output and mapreduce.reduce.shuffle.merge.percent is 0.66, the merge and write to disk is triggered when the data in memory reaches 330 MB.
         
(2) As a number of map outputs: mapreduce.reduce.merge.inmem.threshold (default: 1000)
    
   Official description: The threshold, in terms of the number of files for the in-memory merge process. When we 
	accumulate threshold number of files we initiate the in-memory merge and spill to disk.
	A value of 0 or less than 0 indicates we want to DON'T have any threshold and instead
	depend only on the ramfs's memory consumption to trigger the merge.
  
    This is configured with mapreduce.reduce.merge.inmem.threshold. During the merge, the files being merged are sorted as a whole. If the job has a Combiner, the combine function is run to reduce the amount of data written to disk.



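3.3 On-disk merge (onDiskMerger)

(1) Disk merge during the copy:

    While copied data keeps being written to disk, a background thread merges these files into larger sorted files. If the map output was compressed, it has to be decompressed in memory before it can be merged. This merge only reduces the work left for the final merge: part of the merging starts while map output is still being copied. The merged output is again fully sorted.

(2) Final merge on disk: the results of the merges above are combined in a final merge.

   At the end (after all map output has been copied to the reducer), the map output to be merged may come from merged files already on disk or from the memory buffer: the last map outputs written to memory may not have reached the threshold that triggers a merge, so they are still in memory.

   mapreduce.task.io.sort.factor (default: 10) is also the merge factor on the reduce side.
   
   
   The rounds do not necessarily merge equal numbers of files. The guiding principle is to minimize the amount of data written to disk over the whole merge. To achieve that, the final round should merge as much data as possible, because its output feeds reduce directly and does not have to be written to disk and read back. The final round therefore merges the maximum number of files, that is, the merge factor configured with mapreduce.task.io.sort.factor (default: 10).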
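One way to picture the "last round merges exactly the merge factor" rule is to plan the rounds up front. The sketch below is a simplified model under an assumption of mine: the first round is sized so that every later round merges a full factor of files and exactly factor inputs remain for the final round. It illustrates the principle described above and is not copied from Hadoop's merger; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified merge-round planner; a model of the principle, not Hadoop's implementation.
class MergePlan {
    // Returns how many files each round merges; the last entry is the final round.
    static List<Integer> plan(int files, int factor) {
        if (files < 1 || factor < 2) {
            throw new IllegalArgumentException("need files >= 1 and factor >= 2");
        }
        List<Integer> rounds = new ArrayList<>();
        int remaining = files;
        if (remaining > factor) {
            // Each intermediate round replaces n files by 1, removing n - 1 files.
            // Pick the first round so the later full rounds end on exactly `factor` files.
            int mod = (remaining - 1) % (factor - 1);
            int first = (mod == 0) ? factor : mod + 1;
            rounds.add(first);
            remaining = remaining - first + 1;
        }
        while (remaining > factor) {
            rounds.add(factor);
            remaining = remaining - factor + 1;
        }
        rounds.add(remaining); // final round: its output feeds reduce directly
        return rounds;
    }

    public static void main(String[] args) {
        System.out.println(plan(40, 10)); // [4, 10, 10, 10, 10]
        System.out.println(plan(5, 10));  // [5]
    }
}
```

With 40 files and a factor of 10, the small first round touches only 4 files, so less data is rewritten to disk than if every round merged 10 files, and the final round still merges the full 10 inputs.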

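6.4 Production tuning parameters

6.4.1 mapreduce.job.reduce.slowstart.completedmaps (default: 0.05)

   Official description: Fraction of the number of maps in the job which should be complete before reduces are scheduled for the job.

What value is appropriate?

   For mapreduce.job.reduce.slowstart.completedmaps:
   (1) If it is set too low, reducers request resources too early and waste them. When resources are tight, reducers that start early hold resources the maps still need, while the reducers cannot move to the next phase until all maps have finished. The two sides wait for each other, much like a deadlock, until the AppMaster kills reducers to release resources for the maps.
   (2) If it is set too high, for example 1, resources for reducers are requested only after all maps have finished. Map and reduce then effectively run serially and cannot overlap to make full use of resources. When there are many maps, it is generally better to start requesting reduce resources earlier.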

<property>
	<name>mapreduce.job.reduce.slowstart.completedmaps</name>
	<value>0.7</value>
	<description>Fraction of maps that must complete before reduce resources are requested and reducers start; default 0.05</description>
</property>

Assessment: in production, 0.7 or 0.8 generally works well.
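To see what a given fraction means for a concrete job, it can be turned into a number of maps. The sketch below assumes the fraction is multiplied by the total number of maps and rounded up; the rounding and the names are assumptions for illustration.

```java
// How many maps must finish before reducers are scheduled; illustrative only.
class SlowStartDemo {
    static int mapsBeforeReducers(int totalMaps, double slowstart) {
        return (int) Math.ceil(slowstart * totalMaps);
    }

    public static void main(String[] args) {
        int totalMaps = 200;
        System.out.println(mapsBeforeReducers(totalMaps, 0.05)); // 10: reducers start almost at once
        System.out.println(mapsBeforeReducers(totalMaps, 0.7));  // 140
        System.out.println(mapsBeforeReducers(totalMaps, 1.0));  // 200: map and reduce run serially
    }
}
```

For a 200-map job the default lets reducers occupy containers after only 10 maps, while 0.7 holds them back until 140 maps are done.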

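6.4.2 mapreduce.reduce.shuffle.parallelcopies (default: 5)

 Official description: The default number of parallel transfers run by reduce during the copy(shuffle) phase.

By default each reducer uses only 5 parallel threads to download map output. Even if 100 or more maps of the job finish within a short period, the reducer can download from at most 5 of them at a time. Consider increasing this parameter when:

(1) the job has many maps that finish quickly, so that each reducer can fetch its share of the data sooner
(2) the reducer has enough memory and the network is good
  
Assessment: usually not adjusted in production.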

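6.4.3 mapreduce.reduce.shuffle.read.timeout (default: 180000 ms)

Official description: 
Expert: The maximum amount of time (in milli seconds) reduce task waits for map output data to be available for reading after obtaining connection.

A download thread may fail while fetching the output of a map because the machine holding that output has failed, the intermediate file has been lost, or the network has briefly dropped. The download thread therefore does not wait forever: if the download still has not succeeded after a certain time, the thread gives up and later tries to download from another location (the map may have been rerun in the meantime). This parameter is the maximum time the download thread waits.


Assessment: timeouts like this can generally be increased. If reducers are not hitting download errors, it can also be left unchanged.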

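6.4.4 mapreduce.reduce.shuffle.input.buffer.percent (default: 0.70)

(1) Official description: The percentage of memory to be allocated from the maximum heap size to storing map outputs during the shuffle

In other words, the shuffle may use at most 0.7 × the reduce task's maximum heap to hold data in memory, that is, 70% of the JVM heap size.


(2) Assessment: this value can be increased in production, but it depends on the workload.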

<property>
	<name>mapreduce.reduce.shuffle.input.buffer.percent</name>
	<value>0.81</value>
	<description>Fraction of the reduce heap used to hold map output during the copy phase; default 0.70</description>
</property>

<property>
	<name>mapreduce.reduce.shuffle.memory.limit.percent</name>
	<value>0.25</value>
	<description>Maximum fraction of the shuffle memory that a single map output may use; default 0.25</description>
</property>


When using Phoenix BulkLoad, setting mapreduce.reduce.shuffle.input.buffer.percent too high
causes the following error:

Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3
Caused by: java.lang.OutOfMemoryError: Java heap space

(3) In that case mapreduce.reduce.shuffle.input.buffer.percent has to be lowered so that more data is spilled to disk. Setting the following
parameters in mapred-site.xml lets the Phoenix bulkLoad run without the error:

<property>
  <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
  <value>0.6</value>
  <description>Default 0.7; set to 0.6 here. Fraction of the heap used to hold map output during the copy phase</description>
</property>

<property>
  <name>mapreduce.reduce.shuffle.memory.limit.percent</name>
  <value>0.15</value>
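  <description>Default 0.25. Memory limit for a single shuffle (one map output); 0.15 means shuffle memory * 0.15.
        An output below this limit is fetched into memory, otherwise it goes to disk</description>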
</property>

<property>
  <name>mapreduce.reduce.shuffle.merge.percent</name>
  <value>0.9</value>
  <description>Default 0.66. The merge starts when the shuffled data reaches shuffle memory * 0.9; this is the buffer threshold at which spilling begins</description>
</property>
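A rough way to reason about the OutOfMemoryError above is to estimate how much of the heap the fetched map output can occupy in the worst case. The sketch below rests on an assumption of mine about the fetch accounting: in-memory copies may overshoot the shuffle buffer by up to one maximum-size map output, so the peak is about input.buffer.percent × (1 + memory.limit.percent) of the heap. Treat it as a model for comparing settings, not as a statement of what Hadoop guarantees; the names are hypothetical.

```java
// Worst-case share of the reduce heap held by fetched map output, under the stated assumption.
class ShuffleHeadroom {
    static double peakHeapFraction(double inputBufferPercent, double memoryLimitPercent) {
        // Shuffle buffer plus one in-flight map output of the maximum in-memory size.
        return inputBufferPercent * (1.0 + memoryLimitPercent);
    }

    public static void main(String[] args) {
        // Settings that triggered the error: the model exceeds the whole heap.
        System.out.printf("0.81 / 0.25 -> %.4f%n", peakHeapFraction(0.81, 0.25));
        // Defaults.
        System.out.printf("0.70 / 0.25 -> %.4f%n", peakHeapFraction(0.70, 0.25));
        // Settings that fixed the bulkLoad.
        System.out.printf("0.60 / 0.15 -> %.4f%n", peakHeapFraction(0.60, 0.15));
    }
}
```

Under this model the failing configuration can claim slightly more than the entire heap, while the corrected one stays near 69%, leaving room for the objects the reduce task itself needs.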


# Phoenix bulkLoad (bulk load CSV data into Phoenix)
HADOOP_CLASSPATH=/phoenix/lib-aux/*:/hbase/hbase-1.4.8/lib/hbase-protocol-1.4.8.jar:/hbase/hbase-1.4.8/conf \
/hadoop/hadoop-2.9.1/bin/hadoop jar /phoenix/lib-aux/phoenix-4.14.1-HBase-1.4-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool  \
-conf /datax/bulkLoadConfig/mapred-site.xml \
--zookeeper  zookeeper01,zookeeper02,zookeeper03:2181:/hbase/db \
--schema ODS \
--table XXX \
--input /tmp/XXX \
--output  /tmp/XXX

# References
Analysis of and fixes for out-of-memory errors in the MapReduce shuffle phase
https://blog.csdn.net/houzhizhen/article/details/84773884

6.4.5 mapreduce.reduce.shuffle.merge.percent (default: 0.66)

Official description: The usage threshold at which an in-memory merge will be initiated, expressed as a percentage of the total memory allocated to storing in-memory map outputs, as defined by mapreduce.reduce.shuffle.input.buffer.percent.

In other words, once the shuffle data held in the reducer's memory reaches this limit (a percentage of the shuffle memory), it starts being flushed to disk (after a sort-merge).


Assessment: this parameter is usually not adjusted in production; it needs adjusting only in specific cases such as the Phoenix bulkLoad above.

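7. Reduce

7.1 Introduction to reduce

The number of reducers determines the number of final output files.

Once a reducer has downloaded the data for its partition from all maps, the actual reduce computation begins.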

(1)mapreduce.reduce.input.buffer.percent(default 0.0)
The percentage of memory- relative to the maximum heap size- to retain map outputs during
the reduce. When the shuffle is concluded, any remaining map outputs in memory must consume 
less than this threshold before the reduce can begin.


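As this parameter shows, by default the reduce phase reads all of its input from disk. If the value is greater than 0, some of the map output stays cached in memory and is fed to reduce from there. When the reduce logic itself needs little memory, part of the heap can be given to this cache to speed up the computation. So by default all data is read from disk; if memory is plentiful, it is worth setting this parameter so that reduce reads from the cache, which feels a little like Spark's cache.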


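(2) In this phase the framework calls reduce(WritableComparable, Iterator, OutputCollector, Reporter) once for each <key, (list of values)> pair in the grouped input. The output of the reduce task is typically written to the file system by calling OutputCollector.collect(WritableComparable, Writable). These are the signatures of the older org.apache.hadoop.mapred API.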

7.2 Production tuning parameters

7.2.1 mapreduce.reduce.memory.mb (default: 1024 MB)

<property>
	<name>mapreduce.reduce.memory.mb</name>
	<value>2048</value>
	<description>Memory requested for each reduce container, in MB (3072)</description>
</property>

<property>
	<name>mapreduce.map.memory.mb</name>
	<value>1024</value>
	<description>Memory requested for each map container, in MB (3072)</description>
</property>

 Assessment: the container for a single map or reduce task is 1024 MB by default; in production it can be increased as needed.

7.2.2 mapred.child.java.opts (default: -Xmx200m)

JVM options for the processes that run map and reduce tasks. The official description:

 Java opts for the task processes. The following symbol, if present, will be interpolated: @taskid@ is
 replaced by current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable
 verbose gc logging to a file named for the taskid in /tmp and to set the heap maximum to be a gigabyte,
 pass a 'value' of: -Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc Usage of -Djava.library.path can cause
 programs to no longer function if hadoop native libraries are used. These values should instead be set as
 part of LD_LIBRARY_PATH in the map / reduce JVM env using the mapreduce.map.env and mapreduce.reduce.env config settings.

  <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1200m</value>
      <description>JVM heap for map and reduce tasks</description>
   </property>


Assessment: in production this can be raised to 5 to 10 times the default.
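When raising the heap, it helps to check it against the container sizes from 7.2.1, because the container limit applies to the whole task process and the heap has to fit inside it. The following is a small sketch of such a check; the parsing helper is hypothetical and only understands -Xmx values with a k, m, or g suffix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sanity check: does the -Xmx heap fit inside the container? Illustrative helper only.
class HeapVsContainer {
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG])");

    // Returns the -Xmx value in MB, or -1 if no -Xmx with a k, m, or g suffix is present.
    static long xmxMb(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        if (!m.find()) {
            return -1;
        }
        long value = Long.parseLong(m.group(1));
        switch (Character.toLowerCase(m.group(2).charAt(0))) {
            case 'k': return value / 1024;
            case 'g': return value * 1024;
            default:  return value; // already in MB
        }
    }

    // True if a heap size was found and it is strictly smaller than the container.
    static boolean heapFits(String javaOpts, int containerMb) {
        long heapMb = xmxMb(javaOpts);
        return heapMb >= 0 && heapMb < containerMb;
    }

    public static void main(String[] args) {
        String opts = "-Xmx1200m";
        System.out.println(heapFits(opts, 2048)); // true: fits the 2048 MB reduce container
        System.out.println(heapFits(opts, 1024)); // false: larger than the 1024 MB map container
    }
}
```

With the example values in this section, -Xmx1200m fits the 2048 MB reduce container but not the 1024 MB map container, so the map container size would need to be raised as well, or the map and reduce heaps set separately.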

7.2.3 Reduce output compression: Snappy or Lzo

<property>
	<name>mapreduce.output.fileoutputformat.compress</name>
	<value>true</value>
	<description>Compress the final job output</description>
</property>

<property>
	<name>mapreduce.output.fileoutputformat.compress.codec</name>
	<value>org.apache.hadoop.io.compress.SnappyCodec</value>
	<description>Compression codec class</description>
</property>

<property>
	<name>mapreduce.output.fileoutputformat.compress.type</name>
	<value>BLOCK</value>
	<description>Compression type</description>
</property>


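Assessment: (1) With Hive, parameters can control the size of the output files so that each output file is about 128 MB; in that case Snappy can be used.

      (2) If the output file size cannot be controlled, using Snappy means a later job that processes this data (say a 2 GB file) can start only one map, because a Snappy-compressed file cannot be split. In that case use a splittable format such as Lzo.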
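The effect on parallelism is simple arithmetic. The sketch below counts the maps a downstream job would get for one file, assuming one map per split and ignoring the small slack Hadoop allows on the last split; the names are illustrative.

```java
// Number of map tasks for one input file; a simplified count, not Hadoop's exact rule.
class MapCountDemo {
    static final long MB = 1024L * 1024L;

    static long mapsFor(long fileBytes, long splitBytes, boolean splittable) {
        if (!splittable) {
            return 1; // the whole file has to be read by a single map
        }
        return Math.max(1, (fileBytes + splitBytes - 1) / splitBytes); // round up
    }

    public static void main(String[] args) {
        long file = 2048 * MB;  // the 2 GB file from the example above
        long split = 128 * MB;
        System.out.println(mapsFor(file, split, false)); // 1: non-splittable compression
        System.out.println(mapsFor(file, split, true));  // 16: splittable format
    }
}
```

A single map reading 2 GB is the bottleneck the assessment above warns about, whereas a splittable format spreads the same file over 16 maps.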


  # References
  Hive data compression and storage format selection
  https://blog.csdn.net/wjl7813/article/details/79285542
  MapReduce in detail and its performance optimization
  https://blog.csdn.net/aijiudu/article/details/72353510
