Hive之窗口函数(partition) / order by / row_number / date_sub 等函数联合使用案例(9)

案例1:求出连续销售3天的店铺 

一   数据源 ,将数据源(结构化数据)导入到本地一个新建的文件中

数据源 : 
name,ctime, cost
a,2020-02-10,600
a,2020-03-01,200
a,2020-03-02,300
a,2020-03-03,200
a,2020-03-04,400
a,2020-03-05,600
a,2020-02-05,200
a,2020-02-06,300
a,2020-02-07,200
a,2020-02-08,400
b,2020-02-05,200
b,2020-02-06,300
b,2020-02-08,200
b,2020-02-09,400
b,2020-02-10,600
c,2020-01-31,200
c,2020-02-01,300
c,2020-02-02,200
c,2020-02-03,400
c,2020-02-10,600

将数据添加到linux本地,生成一个静态文件
vi /root/hive/business/sell.log

二   建表 ,将数据文件加载到表里面,查询数据加载情况

删除表
drop table tb_sell;

建表
create table tb_sell(
name string,
ctime string,
cost double
)
row format delimited fields terminated by ",";

加载数据
load data local inpath "/root/hive/business/sell.log" into table tb_sell;

查询数据加载情况
select * from tb_sell;
+---------------+----------------+---------------+
| tb_sell.name  | tb_sell.ctime  | tb_sell.cost  |
+---------------+----------------+---------------+
| a             | 2020-02-10     | 600.0         |
| a             | 2020-03-01     | 200.0         |
| a             | 2020-03-02     | 300.0         |
| a             | 2020-03-03     | 200.0         |
| a             | 2020-03-04     | 400.0         |
| a             | 2020-03-05     | 600.0         |
| a             | 2020-02-05     | 200.0         |
| a             | 2020-02-06     | 300.0         |
| a             | 2020-02-07     | 200.0         |
| a             | 2020-02-08     | 400.0         |
| b             | 2020-02-05     | 200.0         |
| b             | 2020-02-06     | 300.0         |
| b             | 2020-02-08     | 200.0         |
| b             | 2020-02-09     | 400.0         |
| b             | 2020-02-10     | 600.0         |
| c             | 2020-01-31     | 200.0         |
| c             | 2020-02-01     | 300.0         |
| c             | 2020-02-02     | 200.0         |
| c             | 2020-02-03     | 400.0         |
| c             | 2020-02-10     | 600.0         |
+---------------+----------------+---------------+

三  需求实现思路/步骤

1   将相同字段分为一个窗口并且为同一字段每一行进行编号 ,相同店铺分为一个窗口

select
*,
row_number() over(partition by name)     按照名字分区并且相同区的进行编号  
from
tb_sell;
+---------------+----------------+---------------+----------------------+
| tb_sell.name  | tb_sell.ctime  | tb_sell.cost  | row_number_window_0  |
+---------------+----------------+---------------+----------------------+
| a             | 2020-02-10     | 600.0         | 1                    |
| a             | 2020-02-08     | 400.0         | 2                    |
| a             | 2020-02-07     | 200.0         | 3                    |
| a             | 2020-02-06     | 300.0         | 4                    |
| a             | 2020-02-05     | 200.0         | 5                    |
| a             | 2020-03-05     | 600.0         | 6                    |
| a             | 2020-03-04     | 400.0         | 7                    |
| a             | 2020-03-03     | 200.0         | 8                    |
| a             | 2020-03-02     | 300.0         | 9                    |
| a             | 2020-03-01     | 200.0         | 10                   |
| b             | 2020-02-10     | 600.0         | 1                    |
| b             | 2020-02-09     | 400.0         | 2    
评论 3
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值