LATERAL VIEW EXPLODE函数详解及应用

需求背景:在进行统计分析的时候有时候会有类似这样的需求 比如求某个平台某一天所有的订单总和,或者淘宝所有pc 端的交易总和,这个时候我们可以基于原本基础的数据进行炸裂处理之后得出结结果值,方便后续进行查询,这种提前进行预聚合的思想长用于即席查询分析的场景中,比如为了适用于某张报表的多条件查询可以采用此种方式结合预聚合的方式进行操作。

数据说明:

+-----+-----------+------------+------------+---------+---------+
|id   |device_type|business_gmv|order_source|pay_money|event_day|
+-----+-----------+------------+------------+---------+---------+
|10001|jingdong   |16          |1           |1000     |20211020 |
|10004|jingdong   |15          |1           |2000     |20211021 |
+-----+-----------+------------+------------+---------+---------+

使用炸裂函数进行如下操作:

select
   device_type_all,
   business_gmv_all,
   order_source,
   pay_money,
   event_day_all
from order_hbi
LATERAL VIEW OUTER EXPLODE(array('ALL', device_type)) col1 AS device_type_all
LATERAL VIEW OUTER EXPLODE(array('ALL', business_gmv)) col1 AS business_gmv_all
LATERAL VIEW OUTER EXPLODE(array('ALL', event_day)) col1 AS event_day_all
order by event_day_all asc

语句说明:
1.首先将device_type 和ALL的组合数组 进行炸裂,那么这里原来的两行数据在各自加上ALL 之后会得到一共四行数据:

+---------------+----------------+------------+---------+-------------+
|device_type_all|business_gmv_all|order_source|pay_money|event_day_all|
+---------------+----------------+------------+---------+-------------+
|ALL            |16              |1           |1000     |20211020     |
|jingdong       |16              |1           |1000     |20211020     |
|ALL            |15              |1           |2000     |20211021     |
|jingdong       |15              |1           |2000     |20211021     |
+---------------+----------------+------------+---------+-------------+

2.将business_gmv 和ALL 的组合数组进行炸裂,则由上述的数据再次翻倍,即每一行在基于business_gmv 进行和ALL 的炸裂则得到如下的八行数据

+---------------+----------------+------------+---------+-------------+
|device_type_all|business_gmv_all|order_source|pay_money|event_day_all|
+---------------+----------------+------------+---------+-------------+
|ALL            |ALL             |1           |1000     |20211020     |
|ALL            |16              |1           |1000     |20211020     |
|jingdong       |ALL             |1           |1000     |20211020     |
|jingdong       |16              |1           |1000     |20211020     |
|ALL            |ALL             |1           |2000     |20211021     |
|jingdong       |15              |1           |2000     |20211021     |
|ALL            |15              |1           |2000     |20211021     |
|jingdong       |ALL             |1           |2000     |20211021     |
+---------------+----------------+------------+---------+-------------+

再基于上述结果进行event_day 的炸裂,则最终可以得到如下的结果的16行数据:

+---------------+----------------+------------+---------+-------------+
|device_type_all|business_gmv_all|order_source|pay_money|event_day_all|
+---------------+----------------+------------+---------+-------------+
|ALL            |ALL             |1           |1000     |20211020     |
|ALL            |16              |1           |1000     |20211020     |
|jingdong       |ALL             |1           |1000     |20211020     |
|jingdong       |16              |1           |1000     |20211020     |
|jingdong       |15              |1           |2000     |20211021     |
|ALL            |ALL             |1           |2000     |20211021     |
|ALL            |15              |1           |2000     |20211021     |
|jingdong       |ALL             |1           |2000     |20211021     |
|ALL            |ALL             |1           |1000     |ALL          |
|ALL            |16              |1           |1000     |ALL          |
|jingdong       |16              |1           |1000     |ALL          |
|jingdong       |15              |1           |2000     |ALL          |
|jingdong       |ALL             |1           |1000     |ALL          |
|ALL            |ALL             |1           |2000     |ALL          |
|ALL            |15              |1           |2000     |ALL          |
|jingdong       |ALL             |1           |2000     |ALL          |
+---------------+----------------+------------+---------+-------------+

基于上述结果统计的目的是可以统计多纬度的指标的聚合结果

4.加上group by 进行统计结果的分析,聚合订单金额

select
   device_type_all,
   business_gmv_all,
   order_source,
   sum(pay_money) pay_money,
   event_day_all
from order_hbi
LATERAL VIEW OUTER EXPLODE(array('ALL', device_type)) col1 AS device_type_all
LATERAL VIEW OUTER EXPLODE(array('ALL', business_gmv)) col1 AS business_gmv_all
LATERAL VIEW OUTER EXPLODE(array('ALL', event_day)) col1 AS event_day_all
group by device_type_all,business_gmv_all,order_source,event_day_all
order by event_day_all asc

可以得到如下的结果:

+---------------+----------------+------------+---------+-------------+
|device_type_all|business_gmv_all|order_source|pay_money|event_day_all|
+---------------+----------------+------------+---------+-------------+
|jingdong       |ALL             |1           |1000.0   |20211020     |
|ALL            |ALL             |1           |1000.0   |20211020     |
|ALL            |16              |1           |1000.0   |20211020     |
|jingdong       |16              |1           |1000.0   |20211020     |
|jingdong       |15              |1           |2000.0   |20211021     |
|ALL            |15              |1           |2000.0   |20211021     |
|jingdong       |ALL             |1           |2000.0   |20211021     |
|ALL            |ALL             |1           |2000.0   |20211021     |
|ALL            |16              |1           |1000.0   |ALL          |
|jingdong       |16              |1           |1000.0   |ALL          |
|ALL            |ALL             |1           |3000.0   |ALL          |
|ALL            |15              |1           |2000.0   |ALL          |
|jingdong       |15              |1           |2000.0   |ALL          |
|jingdong       |ALL             |1           |3000.0   |ALL          |
+---------------+----------------+------------+---------+-------------+

比如我们可以直接从这个结果中选择所有设备类型(device_type_all)并且所有的bussiness_gmv 并且所有日期的订单总量则可以直接选取如下的数据,订单总金额为3000

|ALL            |ALL             |1           |3000.0   |ALL  

也可以宣组jingdong 所有business_gmv_all 所有日期的订单总额则为:3000

|jingdong       |ALL             |1           |3000.0   |ALL          |

假如不想要某个字段的聚合结果可以用 !=ALL 先过滤掉,比如:business_gmv_all !=‘ALL’

SELECT device_type_all,sum(pay_money) gmv,event_day_all
FROM (
select
   device_type_all,business_gmv_all,order_source,
   sum(pay_money) pay_money,
   event_day_all
from order_hbi
LATERAL VIEW OUTER EXPLODE(array('ALL', device_type)) col1 AS device_type_all
LATERAL VIEW OUTER EXPLODE(array('ALL', business_gmv)) col1 AS business_gmv_all
LATERAL VIEW OUTER EXPLODE(array('ALL', event_day)) col1 AS event_day_all
group by device_type_all,business_gmv_all,order_source,event_day_all
order by event_day_all asc
) tmp
where business_gmv_all !='ALL'
group by device_type_all,event_day_all

得到如下结果:可以选取想要的数据避免重复计算,直接选取即可

+---------------+------+-------------+
|device_type_all|gmv   |event_day_all|
+---------------+------+-------------+
|ALL            |1000.0|20211020     |
|jingdong       |1000.0|20211020     |
|jingdong       |2000.0|20211021     |
|ALL            |2000.0|20211021     |
|ALL            |3000.0|ALL          |
|jingdong       |3000.0|ALL          |
+---------------+------+-------------+
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

Michealkz

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值