Presto Source Code Analysis (Compared with Hive Execution Plans)

1 Comparing aggregation operations

1.1 Presto GROUP BY

explain select sum(totalprice),orderpriority from orders group by orderpriority;

 - Output[_col0, orderpriority] => [sum:double, orderpriority:varchar(15)]
         _col0 := sum
     - RemoteExchange[GATHER] => sum:double, orderpriority:varchar(15)
         - Project => [sum:double, orderpriority:varchar(15)]
             - Aggregate(FINAL)[orderpriority] => [orderpriority:varchar(15), $hashvalue:bigint, sum:double] //final aggregation
                     sum := "sum"("sum_9")
                 - RemoteExchange[REPARTITION] => orderpriority:varchar(15), sum_9:double, $hashvalue:bigint //pulls results remotely from the partial-aggregation nodes
                     - Aggregate(PARTIAL)[orderpriority] => [orderpriority:varchar(15), $hashvalue_11:bigint, sum_10:double] 
                             sum_10 := "sum"("totalprice") //partial aggregation
                         - Project => [$hashvalue_11:bigint, orderpriority:varchar(15), totalprice:double]
                                 $hashvalue_11 := "combine_hash"(BIGINT '0', COALESCE("$operator$hash_code"("orderpriority"), 0))
                             - TableScan[tpch:tpch:orders:sf1.0, originalConstraint = true] => [totalprice:double, orderpriority:varchar(15)] //table scan
                                     totalprice := tpch:totalprice
                                     orderpriority := tpch:orderpriority
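
The plan above reads bottom-up: each scan task sums totalprice per orderpriority (Aggregate(PARTIAL)), the partial sums are redistributed by the hash of the grouping key (RemoteExchange[REPARTITION]), each receiving task merges them (Aggregate(FINAL)), and the results are gathered for output. The following is a minimal Python sketch of that data flow, not Presto's actual implementation. The function names, the use of crc32 as a stand-in for $hashvalue, and the sample rows are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def partial_aggregate(rows):
    """Aggregate(PARTIAL): sum totalprice per orderpriority within one split."""
    sums = defaultdict(float)
    for orderpriority, totalprice in rows:
        sums[orderpriority] += totalprice
    return dict(sums)


def repartition(partials, num_partitions):
    """RemoteExchange[REPARTITION]: route every key to one partition by hash.

    crc32 is only a stand-in for the plan's $hashvalue column (assumption).
    """
    partitions = [[] for _ in range(num_partitions)]
    for partial in partials:
        for key, partial_sum in partial.items():
            index = zlib.crc32(key.encode("utf-8")) % num_partitions
            partitions[index].append((key, partial_sum))
    return partitions


def final_aggregate(partition):
    """Aggregate(FINAL): merge the partial sums that share a key."""
    sums = defaultdict(float)
    for key, partial_sum in partition:
        sums[key] += partial_sum
    return dict(sums)


def run_query(splits, num_partitions=2):
    """Run partial -> repartition -> final, then gather into one result."""
    partials = [partial_aggregate(split) for split in splits]
    result = {}
    for partition in repartition(partials, num_partitions):
        # RemoteExchange[GATHER]: keys are disjoint across partitions.
        result.update(final_aggregate(partition))
    return result


# Hypothetical sample data: two splits of (orderpriority, totalprice) rows.
splits = [
    [("1-URGENT", 10.0), ("2-HIGH", 5.0)],
    [("1-URGENT", 2.5), ("2-HIGH", 1.0), ("3-MEDIUM", 4.0)],
]
totals = run_query(splits)
```

Because all rows with the same key land in the same partition after the repartition step, each final aggregation can finish its groups without coordinating with the others.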

1.2 Hive GROUP BY

explain select sum(o_totalprice),o_orderpriority from orders where o_orderkey>100 group by o_orderpriority;


STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan //table scan
            alias: orders
            Statistics: Num rows: 6000000 Data size: 596779236 Basic stats: COMPLETE Column stats: NONE
            Filter Operator //row filter
              predicate: (o_orderkey > 100) (type: boolean)
              Statistics: Num rows: 2000000 Data size: 198926412 Basic stats: COMPLETE Column stats: NONE
              Select Operator //column projection
                expressions: o_orderpriority (type: string), o_totalprice (type: double)
                outputColumnNames: o_orderpriority, o_totalprice
                Statistics: Num rows: 2000000 Data size: 198926412 Basic stats: COMPLETE Column stats: NONE
                Group By Operator //map-side partial GROUP BY over the scanned, filtered, and projected rows, hash-based
                  aggregations: 