给定订单表 sale_order
id | sale_dt | user_id | sku_id | sale_count | price | amount |
---|---|---|---|---|---|---|
1 | 2019-01-01 | 1 | 1001 | 2 | 100 | 200 |
2 | 2019-01-02 | 2 | 1001 | 1 | 100 | 100 |
3 | 2019-02-10 | 3 | 1001 | 2 | 80 | 160 |
4 | 2019-02-11 | 2 | 1002 | 2 | 100 | 200 |
5 | 2019-03-01 | 3 | 1002 | 1 | 100 | 100 |
6 | 2019-03-01 | 3 | 1001 | 1 | 50 | 50 |
7 | 2019-03-02 | 3 | 1003 | 4 | 100 | 400 |
问题1,根据订单表 sale_order生成如下数据,请写出sql, 最好是hive支持的
月份 | 销量 | 销售额 |
---|---|---|
2019-01 | 3 | 300 |
2019-02 | 4 | 360 |
2019-03 | 6 | 550 |
计算:
select
substr(sale_dt,1,7) as `月份`,
sum(sale_count) as `销量`,
sum(amount) as `销售额`
from sale_order
group by substr(sale_dt,1,7)
;
问题2, 根据订单表 sale_order生成如下数据,请写出sql, 最好是hive支持的
user_id | 2019_01 | 2019_02 | 2019_03 |
---|---|---|---|
1 | 200 | 0 | 0 |
2 | 100 | 200 | 0 |
3 | 0 | 160 | 550 |
计算:
with tmp as(
select
user_id,
substr(sale_dt,1,7) as dt,
sum(amount) as am
from sale_order
group by user_id,substr(sale_dt,1,7)
)
select
user_id,
max(`2019_01`) as `2019_01`,
max(`2019_02`) as `2019_02`,
max(`2019_03`) as `2019_03`
from
(
select
user_id,
case when dt='2019-01' then am else 0 end as `2019_01`,
case when dt='2019-02' then am else 0 end as `2019_02`,
case when dt='2019-03' then am else 0 end as `2019_03`
from tmp
) t
group by user_id;
问题3,根据订单表 sale_order生成如下数据,请写出sql, 最好是hive支持的
月份 | 累计销量 | 累计销售额 |
---|---|---|
2019-01 | 3 | 300 |
2019-02 | 7 | 660 |
2019-03 | 13 | 1210 |
计算:
with tmp as(
select
substr(sale_dt,1,7) as dt,
sum(sale_count) as sum_cnts,
sum(amount) as sum_amt
from sale_order
group by substr(sale_dt,1,7)
)
select
dt as `月份`,
sum(sum_cnts) over(order by dt rows between unbounded preceding and current row) as `累计销量`,
sum(sum_amt) over(order by dt rows between unbounded preceding and current row) as `累计销售额`
from tmp
;
问题4, 另有商品表products如下,请问题,请给出累计销售量和累计销售额排名,并写出sql。
sku_id | sku_name |
---|---|
1001 | 商品1 |
1002 | 商品2 |
1003 | 商品3 |
计算:(感觉需求不完整,我个人理解是求每个月每个商品的销售量排名,销售额排名,并且每个月每个商品的销售量和销售额是累加的)
with tmp as(
select
substr(sale_dt,1,7) as dt,
sku_name,
sum(sale_count) as sum_cnts,
sum(amount) as sum_amt
from sale_order
join products on sale_order.sku_id=products.sku_id
group by substr(sale_dt,1,7),sku_name
)
select
dt,
sku_name,
row_number() over(partition by dt order by `累计销售量` desc) as `累计销售量排名`,
row_number() over(partition by dt order by `累计销售额` desc) as `累计销售额排名`
from
(
select
dt,
sku_name,
sum(sum_cnts) over(partition by sku_name order by dt rows between unbounded preceding and current row) as `累计销售量`,
sum(sum_amt) over(partition by sku_name order by dt rows between unbounded preceding and current row) as `累计销售额`
from tmp
) t
;
问题5,根据如上数据,请写出一些结论性的观点。