需求:有一张用户登录表,字段有id(用户id),time(用户登录的时间);计算连续登录天数为3天的用户。
创表语句
create table continous
(
id INT ,
time DATETIME
);
INSERT into continous VALUES(201,'2017-1-1');
INSERT into continous VALUES(201,'2017-1-2');
INSERT into continous VALUES(201,'2017-1-5');
INSERT into continous VALUES(202,'2017-1-2');
INSERT into continous VALUES(202,'2017-1-3');
INSERT into continous VALUES(203,'2017-1-3');
INSERT into continous VALUES(201,'2017-1-4');
INSERT into continous VALUES(202,'2017-1-4');
INSERT into continous VALUES(201,'2017-1-5');
INSERT into continous VALUES(202,'2017-1-5');
INSERT into continous VALUES(201,'2017-1-6');
INSERT into continous VALUES(203,'2017-1-6');
INSERT into continous VALUES(203,'2017-1-7');
SELECT * FROM continous order BY id;
思路一:使用ROW_NUMBER()
第零步:去重
我们可以看到201号用户一天内登录了两次,但是我们还是得按照一天来算,所以需要先去重
select DISTINCT date(time) as time,id from continous;
第一步:根据用户分组,登陆时间排序(使用排名函数Mysql8.0后才有的),结果按照登陆时间升序排列
SELECT ID,time,row_number() OVER(PARTITION BY ID order by time) as sort
FROM (select DISTINCT date(time) as time,id from continous) temp_0;
temp_0就是第零步得到的结果,上面整条sql语句的结果是
第二步:判断是否连续:利用登陆时间,排序得到的序列号,如果减去【使用date_sub()函数】后得到的结果是同一天,则说明是连续的;否则是不连续的
例如:2017-01-01减去它对应的sort,就是2016-12-31;2017-01-02减去它对应的sort也是2016-12-31;到2017-01-04减去它对应的sort就是2017-01-01;到目前为止该用户已经连续登录两天了。就是这个逻辑,我们可以使用date_sub()函数。
SELECT *,DATE_SUB(time,INTERVAL sort DAY) AS result FROM
(
SELECT ID,time,row_number() OVER(PARTITION BY ID order by time) as sort
FROM (select DISTINCT date(time) as time,id from continous) temp_0
) temp_1;
temp_1就是第一步得到的结果,上面整条sql语句的结果是
根据第二步的结果以id分组和result分组求和
SELECT id,COUNT(*) FROM
(
SELECT *,DATE_SUB(time,INTERVAL sort DAY) AS result FROM
(
SELECT ID,time,row_number() OVER(PARTITION BY ID order by time) as sort --
FROM (select DISTINCT date(time) as time,id from continous) temp_0
) temp_1
)temp_2
GROUP BY id,result having count(*)>=1;
having后过滤的条件大于等于多少就是连续多少天登录了,我这里写的是大于等于1,就是连续登录一天以上的用户了
我们就可以看到201用户最大登录天数是3天,202用户是4天,203用户是两天,有了这个表你想怎么求就怎么求了
思路二:使用LEAD()
第零步:还是去重
-- 第0步,因为用户可能一天登录多次,去重,按一次算的
select DISTINCT date(time) as time,id from continous;
第一步:使用lead()窗口函数取它下面两行日期作为新的一列别名为next_2_time,如果下面两行为空就默认为1970年
-- temp_0就是上面第0步的结果
select id,time,lead(time,2,'1970-01-01') OVER (PARTITION BY ID ORDER BY time) next_2_time FROM
(
select DISTINCT date(time) as time,id from continous
) temp_0;
第二步:使用datediff函数(两个日期类型相减)用next_2_time减去time得到一个新列diff
SELECT *,datediff(next_2_time,time) diff FROM
(
select id,time,lead(time,2,'1970-01-01') OVER (PARTITION BY ID ORDER BY time) next_2_time FROM
(
select DISTINCT date(time) as time,id from continous
) temp_0
) temp_1;
第三步:过滤出diff为2的就是连续登录三天的用户记录了
SELECT * FROM
(
SELECT *,datediff(next_2_time,time) diff FROM
(
select id,time,lead(time,2,'1970-01-01') OVER (PARTITION BY ID ORDER BY time) next_2_time FROM
(
select DISTINCT date(time) as time,id from continous
) temp_0
) temp_1
) temp_2
WHERE diff=2;
最后我们将id去重就是连续登录三天的用户了
看懂了点个赞不过分吧