Hive窗口函数案例

数据准备:
name,orderdate,cost

jack,2017-01-01,10 
tony,2017-01-02,15 
jack,2017-02-03,23 
tony,2017-01-04,29 
jack,2017-01-05,46 
jack,2017-04-06,42 
tony,2017-01-07,50 
jack,2017-01-08,55 
mart,2017-04-08,62 
mart,2017-04-09,68 
neil,2017-05-10,12 
mart,2017-04-11,75 
neil,2017-06-12,80 
mart,2017-04-13,94

需求:
(1)查询在 2017 年 4 月份购买过的顾客及总人数
(2)查询顾客的购买明细及月购买总额
(3)上述的场景,要将 cost 按照日期进行累加
(4)查询每个顾客上次的购买时间
(5)查询前 20%时间的订单信息

建表并导入数据:

create table business( 
name string,
orderdate string,
cost int ) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; 

load data local inpath "/opt/module/datas/business.txt" into table business

(1)查询在 2017 年 4 月份购买过的顾客及总人数:

select name,count(*) over()
from business
where month(orderdate)='04'
group by name;

结果:

mart	2
jack	2

(2)查询顾客的购买明细及月购买总额

select name,orderdate,cost,sum(cost) over(partition by name,month(orderdate))
from business
group by name,orderdate,cost;

结果:

jack	2017-01-05	46	111
jack	2017-01-08	55	111
jack	2017-01-01	10	111
jack	2017-02-03	23	23
jack	2017-04-06	42	42
mart	2017-04-09	68	299
mart	2017-04-13	94	299
mart	2017-04-11	75	299
mart	2017-04-08	62	299
neil	2017-05-10	12	12
neil	2017-06-12	80	80
tony	2017-01-02	15	94
tony	2017-01-04	29	94
tony	2017-01-07	50	94

(3)上述的场景,要将 cost 按照日期进行累加

select name,orderdate,cost,sum(cost) over(distribute by name sort by cost)
from business;

结果:

jack	2017-01-05	46	111
jack	2017-01-08	55	111
jack	2017-01-01	10	111
jack	2017-02-03	23	23
jack	2017-04-06	42	42
mart	2017-04-09	68	299
mart	2017-04-13	94	299
mart	2017-04-11	75	299
mart	2017-04-08	62	299
neil	2017-05-10	12	12
neil	2017-06-12	80	80
tony	2017-01-02	15	94
tony	2017-01-04	29	94
tony	2017-01-07	50	94

(4)查询每个顾客上次的购买时间

select name,orderdate,lag(orderdate,1,'2000-01-01') over(distribute by name sort by orderdate)
from business;

结果:

jack	2017-01-01	2000-01-01
jack	2017-01-05	2017-01-01
jack	2017-01-08	2017-01-05
jack	2017-02-03	2017-01-08
jack	2017-04-06	2017-02-03
mart	2017-04-08	2000-01-01
mart	2017-04-09	2017-04-08
mart	2017-04-11	2017-04-09
mart	2017-04-13	2017-04-11
neil	2017-05-10	2000-01-01
neil	2017-06-12	2017-05-10
tony	2017-01-02	2000-01-01
tony	2017-01-04	2017-01-02
tony	2017-01-07	2017-01-04

(5)查询前 20%时间的订单信息

select * from 
(select name,orderdate,cost,ntile(5) over(order by orderdate) as t 
from business) t1
where t=1

结果:

jack	2017-01-01	10	1
tony	2017-01-02	15	1
tony	2017-01-04	29	1
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值