Hive order by limit和sort by limit区别

一、导言

order by和sort by都是hive中的排序方式,区别在于order by只有一个reduce对全表数据进行排序,sort by会按照key值分区在不同的reduce内部排序。那么对排序结果取limit的话,Hive会怎么处理呢?

二、执行计划比较

数据表myhive.score

select * from score;
score.s_id      score.c_id      score.s_score   score.month
06      04      80      201806
06      05      80      201806
06      06      80      201806
04      05      80      201806
06      03      34      201806
07      02      89      201806
07      03      98      201806

1、order by limit
在日志中,该任务只有一个job,且Hive在编译时即已确认reduce数量为1,所以limit对reduce中所有数据排序结果取topN。

hive (myhive)> select * from score order by s_id limit 4;
2021-11-26 13:52:27,404 INFO  [ac29ef6d-3004-4b12-8f18-e7f01a936167 main] ql.Driver (SessionState.java:printInfo(1227)) - Total jobs = 1
2021-11-26 09:39:49,015 INFO  [ac29ef6d-3004-4b12-8f18-e7f01a936167 main] ql.Driver (Driver.java:launchTask(2662)) - Starting task [Stage-1:MAPRED] in serial mode
Number of reduce tasks determined at compile time: 1
2021-11-26 09:39:49,029 INFO  [ac29ef6d-3004-4b12-8f18-e7f01a936167 main] exec.Task (SessionState.java:printInfo(1227)) - Number of reduce tasks determined at compile time: 1
#编译时确定的reduce任务数:1

执行计划中,只对reduce结果进行limit操作

Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: score
            Statistics: Num rows: 4 Data size: 2340 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: s_id (type: string), c_id (type: string), s_score (type: int), month (type: string)
              outputColumnNames: _col0, _col1, _col2, _col3
              Statistics: Num rows: 4 Data size: 2340 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                key expressions: _col0 (<
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值