连续登录问题(详细解答)
一、题目
1、连续登录问题
间隔两天也算连续登录,最终结果要用户的开始时间、结束时间和最大连续登录天数
2、数据
with
xie as(
select '1' as user_id,'2023-02-01' as login_date
union all
select '1' as user_id,'2023-02-03' as login_date
union all
select '1' as user_id,'2023-02-05' as login_date
union all
select '1' as user_id,'2023-02-08' as login_date
union all
select '1' as user_id,'2023-02-10' as login_date
union all
select '2' as user_id,'2023-02-02' as login_date
union all
select '2' as user_id,'2023-02-04' as login_date
union all
select '2' as user_id,'2023-02-06' as login_date
)
3、结果预览
二、思路
1、第一步:按user_id进行分组,把符合连续登录规则的算成同一组,并打上标记,思路图如下:
2、第二步:按user_id和标记进行分组,取出连续登录的用户的连续登录开始时间和结束时间,再拿结束时间减开始时间得到连续登录的天数,思路图如下:
3、第三步:用rank()开窗,按user_id进行分组,按连续登录的天数排序,最后取排名等于1的,就是最大连续登录天数,思路图如下:
三、答案
sql
with
xie as(
select '1' as user_id,'2023-02-01' as login_date
union all
select '1' as user_id,'2023-02-03' as login_date
union all
select '1' as user_id,'2023-02-05' as login_date
union all
select '1' as user_id,'2023-02-08' as login_date
union all
select '1' as user_id,'2023-02-10' as login_date
union all
select '2' as user_id,'2023-02-02' as login_date
union all
select '2' as user_id,'2023-02-04' as login_date
union all
select '2' as user_id,'2023-02-06' as login_date
)
select user_id
,min_login_date
,max_login_date
,days
from(
select user_id
,min_login_date
,max_login_date
,days
,rank() over (partition by user_id order by days desc) as rn
from(
select user_id
,min(login_date) as min_login_date
,max(login_date) as max_login_date
,datediff(max(login_date),min(login_date))+1 as days
from(
select user_id
,login_date
,sum(aa) over (partition by user_id order by login_date) as bb
from (
select user_id
,login_date
,if(datediff(login_date,lag(login_date,1,'0000-00-00') over (partition by user_id order by login_date))>2,1,0) as aa
from xie
)t1
)tt1
group by user_id,bb
)ttt1
)tttt1
where rn = 1