presto结合grouping、grouping set实现统计宽表

数据准备

  • 准备一张表t_order,字段有:
province:省
city:市
area:区
channel:渠道,1为小程序,2为app
price:价格

该表表示某个区域的某个渠道的订单交易额,这里忽略时间维度。

  • 创建表
create table demo.t_order (
    province string,
    city string,
    area string,
    channel int,
    price int
)
row format delimited fields terminated by ',';
  • t_order.txt
01,0101,010101,1,60
01,0101,010101,2,55
01,0101,010102,1,70
01,0101,010102,2,85
01,0101,010103,1,66
01,0101,010103,2,73
01,0102,010201,1,64
01,0102,010201,2,41
01,0102,010202,1,33
01,0102,010202,2,28
01,0102,010203,1,75
01,0102,010203,2,61
01,0103,010301,1,34
01,0103,010301,2,96
01,0103,010302,1,88
01,0103,010302,2,71
01,0103,010303,1,70
01,0103,010303,2,64
  • hive加载数据
load data local inpath '/export/data/t_order.txt' overwrite into table demo.t_order;

Grouping set

  • 根据(省、市、区),(省、市),(省)进行分组
select
    province,
    city,
    area

from demo.t_order
group by
    grouping sets (
    province,
    (province, city),
    (province, city, area)
    )

返回的数据为
在这里插入图片描述
即有多个组合,根据不同城市进行组合

Grouping

往上面的sql语句中添加Grouping操作,增加一个grouping_type字段,表示是根据某个维度进行分组

select
    province,
    city,
    area,
    case when grouping (province, city, area) = 0
            then 'area'
         when grouping (province, city) = 0
            then 'city'
         when grouping (province) = 0
            then 'province'
    end as group_type
from demo.t_order
group by
    grouping sets (
    province,
    (province, city),
    (province, city, area)
    )

得到的结果为:
在这里插入图片描述
grouping (province, city, area) = 0为例,当分组的数据中包含了同个区域的省、市、区时,grouping 返回的值为二进制的000,即数值为0,以此类推

  • 根据对应的分组,返回不同渠道的订单总额
select
    province,
    city,
    area,
    case when grouping (province, city, area) = 0
            then 'area'
         when grouping (province, city) = 0
            then 'city'
         when grouping (province) = 0
            then 'province'
        end as group_type,
    case when grouping (province, city, area) = 0
            then sum(case when channel = 1 then price else 0 end)
         when grouping (province, city) = 0
            then sum(case when channel = 1 then price else 0 end)
         when grouping (province) = 0
            then sum(case when channel = 1 then price else 0 end)
        end as mp_price,
    case when grouping (province, city, area) = 0
             then sum(case when channel = 2 then price else 0 end)
         when grouping (province, city) = 0
             then sum(case when channel = 2 then price else 0 end)
         when grouping (province) = 0
             then sum(case when channel = 2 then price else 0 end)
        end as app_price
from demo.t_order
group by
    grouping sets (
        province,
        (province, city),
        (province, city, area)
    )

返回结果为:
在这里插入图片描述

  • 2
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值