Hive Execution Order

Let's dig into the execution order of HQL.

from … on … join … where … group by … having … select … distinct … order by … limit
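That logical order can be sketched with a toy simulation: plain Python over in-memory rows, using the shape of the `count(*) ... having cc>2` query discussed below. The table contents are made up for illustration; this is not how Hive actually executes, just the clause-by-clause logic.

```python
# Toy rows for a hypothetical table sc2(sid, cid, score); made-up data.
rows = [
    {"sid": 5,  "cid": 1, "score": 50},
    {"sid": 12, "cid": 1, "score": 70},
    {"sid": 15, "cid": 1, "score": 80},
    {"sid": 18, "cid": 1, "score": 60},
    {"sid": 21, "cid": 2, "score": 90},
]

# FROM: start from the table's rows.
# WHERE: filter individual rows before any grouping.
filtered = [r for r in rows if r["sid"] > 10]

# GROUP BY cid: bucket rows by key, then compute aggregates per group.
groups = {}
for r in filtered:
    groups.setdefault(r["cid"], []).append(r)
aggregated = [{"cid": cid, "cc": len(rs)} for cid, rs in groups.items()]

# HAVING: filter whole groups. The alias cc already has a value at this
# point, which is why referencing it here can work.
kept = [g for g in aggregated if g["cc"] > 2]

# SELECT: project the output columns (100, cc).
result = [(100, g["cc"]) for g in kept]
print(result)
```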

I used to have a question about this:

explain 
select sid,min(score) as ms
from sc2 
where sid>10
group by sid
having ms>60 and sid>20
order by ms 
;

The statement above compiles fine. Why can an alias defined in SELECT be used in HAVING?

Later I came across this explanation: HiveSQL stores its metadata in MySQL, and aliases specified in SELECT can be used after HAVING.
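Another way to see it: HAVING is evaluated after the aggregates are computed, so by then the alias refers to an existing value. SQLite happens to accept the same syntax, so the behavior is easy to reproduce outside Hive (toy data below, not the real sc2 table):

```python
import sqlite3

# Quick check in SQLite (made-up rows): SQLite also accepts a SELECT
# alias in HAVING, same as the Hive query above.
con = sqlite3.connect(":memory:")
con.execute("create table sc2(sid int, cid int, score int)")
con.executemany(
    "insert into sc2 values (?, ?, ?)",
    [(5, 1, 50), (12, 1, 70), (15, 1, 80), (18, 1, 60), (21, 2, 90)],
)
rows = con.execute(
    "select 100, count(*) as cc from sc2 where sid > 10 group by cid having cc > 2"
).fetchall()
print(rows)  # groups that survive the HAVING filter
```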

 

Execution plan:

hive (myhive2)>  explain select 100,count(*) as cc from sc2 where sid>10 group by cid having cc>2;

The output is as follows:


OK
Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:  
          TableScan                                  // this is FROM
            alias: sc2
            Statistics: Num rows: 23 Data size: 190 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (sid > 10) (type: boolean)     // this is WHERE
              Statistics: Num rows: 7 Data size: 57 Basic stats: COMPLETE Column stats: NONE
              Select Operator   // I used to think this was the final SELECT, but it isn't; judging from the expressions below, it just projects the cid column needed downstream
                expressions: cid (type: int)
                outputColumnNames: cid
                Statistics: Num rows: 7 Data size: 57 Basic stats: COMPLETE Column stats: NONE
                Group By Operator
                  aggregations: count()
                  keys: cid (type: int)
                  mode: hash
                  outputColumnNames: _col0, _col1
                  Statistics: Num rows: 7 Data size: 57 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    key expressions: _col0 (type: int)
                    sort order: +
                    Map-reduce partition columns: _col0 (type: int)
                    Statistics: Num rows: 7 Data size: 57 Basic stats: COMPLETE Column stats: NONE
                    value expressions: _col1 (type: bigint)
      Reduce Operator Tree:
        Group By Operator                                  // this is GROUP BY
          aggregations: count(VALUE._col0)
          keys: KEY._col0 (type: int)
          mode: mergepartial
          outputColumnNames: _col0, _col1
          Statistics: Num rows: 3 Data size: 24 Basic stats: COMPLETE Column stats: NONE
          Filter Operator
            predicate: (_col1 > 2) (type: boolean)         // this is HAVING
            Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
            Select Operator                                 // this is SELECT
              expressions: 100 (type: int), _col1 (type: bigint)
              outputColumnNames: _col0, _col1
              Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: false
                Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
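Note that the plan contains two Group By Operators: a map-side partial aggregation (mode: hash) and a reduce-side merge (mode: mergepartial). A toy sketch of that two-phase count (the splits and rows are made up for illustration):

```python
# Two-phase count(*) as the plan describes: map-side partial
# aggregation (mode: hash), then reduce-side merge (mode: mergepartial).
map_splits = [
    [{"cid": 1}, {"cid": 1}, {"cid": 2}],  # rows seen by mapper 1 (already WHERE-filtered)
    [{"cid": 1}, {"cid": 2}],              # rows seen by mapper 2
]

# Map side: each mapper keeps a hash table of partial counts per key.
partials = []
for split in map_splits:
    counts = {}
    for r in split:
        counts[r["cid"]] = counts.get(r["cid"], 0) + 1
    partials.append(counts)

# Shuffle: rows with the same key (_col0) reach the same reducer.
# Reduce side: merge the partial counts into the final count per cid.
merged = {}
for counts in partials:
    for cid, c in counts.items():
        merged[cid] = merged.get(cid, 0) + c
print(merged)
```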

 
