SQL面试题挑战01:打折日期交叉问题
题目
如下为某平台的商品促销数据,字段含义分别为品牌名称、打折开始日期、打折结束日期,现在要计算每个品牌的打折销售天数(注意其中的交叉日期)。比如vivo的打折销售天数就为17天。
brand start_date end_date
xiaomi 2021-06-05 2021-06-09
xiaomi 2021-06-11 2021-06-21
vivo 2021-06-05 2021-06-15
vivo 2021-06-09 2021-06-21
honor 2021-06-05 2021-06-21
honor 2021-06-09 2021-06-15
redmi 2021-06-17 2021-06-26
huawei 2021-06-05 2021-06-26
huawei 2021-06-09 2021-06-15
huawei 2021-06-17 2021-06-21
数据
drop table if exists discount;
create table discount(
brand varchar(20),
start_date date,
end_date date
);
insert into discount
select 'xiaomi' as brand ,'2021-06-05' as start_date,'2021-06-09' as end_date
union all
select 'xiaomi' as brand ,'2021-06-11' as start_date,'2021-06-21' as end_date
union all
select 'vivo' as brand ,'2021-06-05' as start_date,'2021-06-15' as end_date
union all
select 'vivo' as brand ,'2021-06-09' as start_date,'2021-06-21' as end_date
union all
select 'honor' as brand ,'2021-06-05' as start_date,'2021-06-21' as end_date
union all
select 'honor' as brand ,'2021-06-09' as start_date,'2021-06-15' as end_date
union all
select 'honor' as brand ,'2021-06-17' as start_date,'2021-06-26' as end_date
union all
select 'huawei' as brand ,'2021-06-05' as start_date,'2021-06-26' as end_date
union all
select 'huawei' as brand ,'2021-06-09' as start_date,'2021-06-15' as end_date
union all
select 'huawei' as brand ,'2021-06-17' as start_date,'2021-06-21' as end_date;
查询语句
思路:
- 给每行添加一个字段max_end_date,这个字段记录了该行之前中的最大结束日期。具体做法:开窗,按照品牌分组,按照打折开始日期排序。
- 通过和max_end_date比较就知道打折日期是否重复计算。具体而言:
- 如果end_date<=max_end_date,那么说明在前面已经统计过这段时间(前面开始的早结束的更晚)。
- start_date>max_end_date或者max_end_date是空(第一行)说明这行没有被统计过,就直接计算即可,注意求差+1
- 否则 就有日期交叉的情况,比如vivo的两条数据,这个时候就要避免重复统计。因此这时候的开始日期就是max_end_date+1天,然后再和end_date求差即可。
- 最后将之前所得的天数累加即可
select
brand,
sum(days)total
from
(
select
brand,
case
when end_date<=max_end_date then 0
when start_date>max_end_date or max_end_date is null then datediff(end_date,start_date)+1
else datediff(end_date,max_end_date)
end as days
from
(
select
brand,
start_date,
end_date,
max(end_date) over (partition by brand order by start_date rows between unbounded preceding and 1 preceding )max_end_date
from
discount
)t0
)t1
group by
brand;
注意:这里是MySQL 8语法,Hive SQL语法会有出入