How many Reduce tasks does a MapReduce job in Hive SQL use?

Source code analysis

Let's trace where the values behind the three "In order to" hints that Hive prints come from:
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
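
The way these three settings interact can be sketched roughly as follows. This is a simplified illustration, not Hive's actual code: the class `ReducerCountSketch` and the method `chooseReducers` are hypothetical names, the rule "a positive constant wins, otherwise estimate from the input size and cap at the maximum" is my reading of the hints above, and the numbers passed in are illustrative values rather than guaranteed defaults (check the configuration of your own Hive version).

```java
final class ReducerCountSketch {

    /**
     * Hypothetical helper showing how the three settings could combine.
     *
     * @param constantReducers value of mapreduce.job.reduces (non-positive means "not set" in this sketch)
     * @param inputBytes       total number of bytes fed to the reduce stage
     * @param bytesPerReducer  value of hive.exec.reducers.bytes.per.reducer
     * @param maxReducers      value of hive.exec.reducers.max
     */
    static int chooseReducers(int constantReducers, long inputBytes,
                              long bytesPerReducer, int maxReducers) {
        if (constantReducers > 0) {
            // A constant number of reducers was requested: use it as-is.
            return constantReducers;
        }
        // Otherwise estimate: ceil(inputBytes / bytesPerReducer) using integer math.
        long estimated = (inputBytes + bytesPerReducer - 1) / bytesPerReducer;
        // At least one reducer, at most maxReducers.
        estimated = Math.max(1L, estimated);
        return (int) Math.min((long) maxReducers, estimated);
    }

    public static void main(String[] args) {
        // Illustrative inputs only.
        System.out.println(chooseReducers(7, 10_000_000_000L, 256_000_000L, 1009));
        System.out.println(chooseReducers(-1, 1_000_000_000L, 256_000_000L, 1009));
        System.out.println(chooseReducers(-1, 1_000_000_000_000L, 256_000_000L, 1009));
    }
}
```

With these inputs the first call returns the constant 7, the second estimates 4 reducers, and the third is capped at 1009.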

 

By default, reduceTasks = NumOf(Reduces), that is, the number of reduce tasks is the estimated number of reducers.

// Divide it by 2 so that we can have more reducers
// BYTESPERREDUCER: bytes per reducer, i.e. how many bytes each reducer should process
// numberOfBytes: number of bytes the reduce stage receives
// maxReducers: maximum number of reducers for a MapReduce job, taken from the Hive configuration
long bytesPerReducer =
    context.getConf().getLongVar(HiveConf.ConfVars.BYTESPERREDUCER) / 2;
int numReducers = Utilities.estimateReducers(numberOfBytes, bytesPerReducer,
    maxReducers, false);
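
To see what the division by 2 does, here is a small arithmetic sketch. The class `HalvedLoadSketch` and the method `reducersFor` are hypothetical, the input size and per-reducer load are illustrative values, and the sketch ignores the `maxReducers` cap for brevity.

```java
final class HalvedLoadSketch {

    // ceil(inputBytes / bytesPerReducer), with a floor of one reducer.
    static long reducersFor(long inputBytes, long bytesPerReducer) {
        long n = (inputBytes + bytesPerReducer - 1) / bytesPerReducer;
        return Math.max(1L, n);
    }

    public static void main(String[] args) {
        long inputBytes = 1_024_000_000L;    // illustrative input size
        long bytesPerReducer = 256_000_000L; // illustrative configured load

        // Using the configured load directly.
        System.out.println(reducersFor(inputBytes, bytesPerReducer));     // 4
        // Halving the load, as the snippet above does, doubles the estimate.
        System.out.println(reducersFor(inputBytes, bytesPerReducer / 2)); // 8
    }
}
```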
// This is the function that computes the number of reducers.
// boolean powersOfTwo: whether to round the number of reducers to a power of two

public static int estimateReducers(long totalInputFileSize, long bytesPerReducer,
    int maxReducers, boolean powersOfTwo) {
  // Use at least bytesPerReducer as the input size, so an input smaller than
  // bytesPerReducer is treated as exactly bytesPerReducer
  double bytes = Math.max(totalInputFileSize, bytesPerReducer);
  // ceil rounds up, e.g. 0.1 becomes 1 and 1.5 becomes 2
  int reducers = (int) Math.ceil(bytes / bytesPerReducer);
  // At least 1
  reducers = Math.max(1, reducers);
  // If the computed value exceeds the maximum, use the maximum; otherwise keep the computed value
  reducers = Math.min(maxReducers, reducers);
  // floor(log2(reducers)) + 1
  int reducersLog = (int) (Math.log(reducers) / Math.log(2)) + 1;
  // 2 raised to reducersLog: the smallest power of two strictly greater than reducers
  int reducersPowerTwo = (int) Math.pow(2, reducersLog);

  if (powersOfTwo) {
    // If the original number of reducers was a power of two, use that
    if (reducersPowerTwo / 2 == reducers) {
      // nothing to do
    } else if (reducersPowerTwo > maxReducers) {
      // If the next power of two greater than the original number of reducers is greater
      // than the max number of reducers, use the preceding power of two, which is strictly
      // less than the original number of reducers and hence the max
      reducers = reducersPowerTwo / 2;
    } else {
      // Otherwise use the smallest power of two greater than the original number of reducers
      reducers = reducersPowerTwo;
    }
  }
  return reducers;
}
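
To make the branches concrete, the following self-contained sketch copies the logic of `estimateReducers` quoted above into a standalone class and runs it on a few inputs. The class name `EstimateReducersDemo` is hypothetical, and the sizes and limits are illustrative values chosen for the example, not values taken from any particular Hive configuration.

```java
final class EstimateReducersDemo {

    // Same logic as the estimateReducers method quoted above,
    // repeated here so the example compiles on its own.
    static int estimateReducers(long totalInputFileSize, long bytesPerReducer,
                                int maxReducers, boolean powersOfTwo) {
        double bytes = Math.max(totalInputFileSize, bytesPerReducer);
        int reducers = (int) Math.ceil(bytes / bytesPerReducer);
        reducers = Math.max(1, reducers);
        reducers = Math.min(maxReducers, reducers);
        int reducersLog = (int) (Math.log(reducers) / Math.log(2)) + 1;
        int reducersPowerTwo = (int) Math.pow(2, reducersLog);
        if (powersOfTwo) {
            if (reducersPowerTwo / 2 == reducers) {
                // already a power of two: keep it
            } else if (reducersPowerTwo > maxReducers) {
                reducers = reducersPowerTwo / 2; // round down to stay under the cap
            } else {
                reducers = reducersPowerTwo;     // round up to the next power of two
            }
        }
        return reducers;
    }

    public static void main(String[] args) {
        long perReducer = 256_000_000L; // illustrative load per reducer
        int max = 1009;                 // illustrative cap

        // Tiny input: still one reducer.
        System.out.println(estimateReducers(100L, perReducer, max, false));               // 1
        // Exactly five reducers' worth of data.
        System.out.println(estimateReducers(1_280_000_000L, perReducer, max, false));     // 5
        // Huge input: capped at the maximum.
        System.out.println(estimateReducers(1_000_000_000_000L, perReducer, max, false)); // 1009
        // powersOfTwo rounds 5 up to 8 when the cap allows it...
        System.out.println(estimateReducers(1_280_000_000L, perReducer, max, true));      // 8
        // ...and down to 4 when 8 would exceed the cap.
        System.out.println(estimateReducers(1_280_000_000L, perReducer, 6, true));        // 4
    }
}
```

Note that with `powersOfTwo` set to false, as in the call shown earlier, the result is simply the rounded-up ratio, clamped between 1 and `maxReducers`.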