案例1:求出连续销售3天的店铺
一 数据源 ,将数据源(结构化数据)导入到本地一个新建的文件中
数据源 :
name,ctime, cost
a,2020-02-10,600
a,2020-03-01,200
a,2020-03-02,300
a,2020-03-03,200
a,2020-03-04,400
a,2020-03-05,600
a,2020-02-05,200
a,2020-02-06,300
a,2020-02-07,200
a,2020-02-08,400
b,2020-02-05,200
b,2020-02-06,300
b,2020-02-08,200
b,2020-02-09,400
b,2020-02-10,600
c,2020-01-31,200
c,2020-02-01,300
c,2020-02-02,200
c,2020-02-03,400
c,2020-02-10,600
将数据添加到linux本地,生成一个静态文件
vi /root/hive/business/sell.log
二 建表 ,将数据文件加载到表里面,查询数据加载情况
删除表
drop table tb_sell;
建表
create table tb_sell(
name string,
ctime string,
cost double
)
row format delimited fields terminated by ",";
加载数据
load data local inpath "/root/hive/business/sell.log" into table tb_sell;
查询数据加载情况
select * from tb_sell;
+---------------+----------------+---------------+
| tb_sell.name | tb_sell.ctime | tb_sell.cost |
+---------------+----------------+---------------+
| a | 2020-02-10 | 600.0 |
| a | 2020-03-01 | 200.0 |
| a | 2020-03-02 | 300.0 |
| a | 2020-03-03 | 200.0 |
| a | 2020-03-04 | 400.0 |
| a | 2020-03-05 | 600.0 |
| a | 2020-02-05 | 200.0 |
| a | 2020-02-06 | 300.0 |
| a | 2020-02-07 | 200.0 |
| a | 2020-02-08 | 400.0 |
| b | 2020-02-05 | 200.0 |
| b | 2020-02-06 | 300.0 |
| b | 2020-02-08 | 200.0 |
| b | 2020-02-09 | 400.0 |
| b | 2020-02-10 | 600.0 |
| c | 2020-01-31 | 200.0 |
| c | 2020-02-01 | 300.0 |
| c | 2020-02-02 | 200.0 |
| c | 2020-02-03 | 400.0 |
| c | 2020-02-10 | 600.0 |
+---------------+----------------+---------------+
三 需求实现思路/步骤
1 将相同字段分为一个窗口并且为同一字段每一行进行编号 ,相同店铺分为一个窗口
select
*,
row_number() over(partition by name) 按照名字分区并且相同区的进行编号
from
tb_sell;
+---------------+----------------+---------------+----------------------+
| tb_sell.name | tb_sell.ctime | tb_sell.cost | row_number_window_0 |
+---------------+----------------+---------------+----------------------+
| a | 2020-02-10 | 600.0 | 1 |
| a | 2020-02-08 | 400.0 | 2 |
| a | 2020-02-07 | 200.0 | 3 |
| a | 2020-02-06 | 300.0 | 4 |
| a | 2020-02-05 | 200.0 | 5 |
| a | 2020-03-05 | 600.0 | 6 |
| a | 2020-03-04 | 400.0 | 7 |
| a | 2020-03-03 | 200.0 | 8 |
| a | 2020-03-02 | 300.0 | 9 |
| a | 2020-03-01 | 200.0 | 10 |
| b | 2020-02-10 | 600.0 | 1 |
| b | 2020-02-09 | 400.0 | 2