sql连续登录问题
今天刷到一个视频,需求是计算一个库表里,查询2022年1月份里连续三天登录的用户以及连续登录的时间段。
首先,t_login记录了用户每次登录的时间信息。
一天有多次登录行为,只需记录一条;
(使用函数导致索引失效)
select distinct uid,date(login_time) ymd
from t_login
where login_time between '2022-01-01 00:00:00' and timestamp '2022-01-31 23:59:59';
比较粗暴方式,将上面的表自关联三次,缺点,扩展性差。
select t1.uid,t1.ymd,t2.ymd,t3.ymd
from
(select distinct uid,date(login_time) ymd
from t_login
where login_time between '2022-01-01 00:00:00' and timestamp '2022-01-31 23:59:59') t1
join
(select distinct uid,date(login_time) ymd
from t_login
where login_time between '2022-01-01 00:00:00' and timestamp '2022-01-31 23:59:59') t2
on
(t1.uid=t2.uid and datediff(t2.ymd,t1.tmd)=1)
join
(select distinct uid,date(login_time) ymd
from t_login
where login_time between '2022-01-01 00:00:00' and timestamp '2022-01-31 23:59:59') t3
on
(t2.uid=t3.uid and datediff(t3.ymd,t2.tmd)=1);
方式二
使用窗口函数,分区限制条件,得到
with t1 as (select distinct uid,date(login_time) ymd
from t_login
where login_time between timestamp '2022-01-01 00:00:00'
and timestamp '2022-01-31 23:59:59')
select uid,ymd,
row_number() over (partition by uid order by ymd) num
from t1;
只要不连续,插值一定存在结果跳跃。
with t1 as (select distinct uid,date(login_time) ymd
from t_login
where login_time between timestamp '2022-01-01 00:00:00'
and timestamp '2022-01-31 23:59:59')
t2 as(select uid,ymd,
date_sub(ymd,interval row_number() over (partition by uid order by ymd) day) sub_date
from t1)
select uid,min(ymd),max(ymd),count(*)
from t2
group by uid,sub_date
having count(*)>=3;
妈的后悔了,浪费时间了。哎没时间感叹。