数据表如下
... | ... |
2021-12-01 | 2 |
2021-12-02 | 1 |
2021-12-03 | 0 |
2021-12-04 | 0 |
2021-12-05 | 0 |
2021-12-06 | 1 |
... | ... |
首先计算并标记出是否打卡
select a.day,a.num,
CASE when ISNULL(b.num, 0)=0 then 0 else 1 end as clocked,
CASE when ISNULL(b.num, 0)=0 then 1 else 0 end as unclocked
from a
user | day | num | clocked | unclocked |
... | ... | |||
1 | 2021-12-01 | 2 | 1 | 0 |
1 | 2021-12-02 | 1 | 1 | 0 |
1 | 2021-12-03 | 0 | 0 | 1 |
1 | 2021-12-04 | 0 | 0 | 1 |
1 | 2021-12-05 | 0 | 0 | 1 |
1 | 2021-12-06 | 1 | 1 | 0 |
2 | ... | ... |
希望得到结果:最长连续打卡记录或者最长连续未打卡记录,计算方法一样
user | clocked_days | unclocked_days |
1 | 2 | 3 |
2 | ... | ... |
在此处可以用是否打卡的标识位与一个递增的数字依次做减法的差,如果差值相等说明是连续打卡。
所以第一步先补上一列递增数字,此处使用row_number()窗口函数
select user,day,
row_number () OVER (partition BY user ORDER BY day) date_rank
from a where clocked=1
然后计算打卡日期day与递增数date_rank之间的差
SELECT
user,
day,
date_rank,
( DATEDIFF(day,'2020-12-31',day) - date_rank ) AS day_cha
FROM
( SELECT user, day, row_number () OVER (partition BY userORDER BY day) date_rank FROM REPORT_CLOCK WHERE unclocked = 1) t1
得出结果,连续打卡日期的day_char都是相等的。
user | day | date_rank | day_cha |
1 | 2021-12-01 | 1 | 0 |
1 | 2021-12-02 | 2 | 0 |
1 | 2021-12-06 | 3 | 1 |
2 | ... |
然后根据user跟day_char group by之后获取count之后的结果
SELECT
user,
day_cha,
COUNT ( day) flag_days
FROM
(
SELECT
user,
day,
date_rank,
( DATEDIFF(day,'2020-12-31',day) - date_rank ) AS day_cha
FROM
( SELECT user, day, row_number () OVER (partition BY userORDER BY day) date_rank FROM REPORT_CLOCK WHERE clocked = 1) t1
) t2
GROUP BY
user,
day_cha
user | day_cha | flag_days |
1 | 0 | 2 |
1 | 1 | 1 |
2 | ... | ... |
此时,flag_days的最大值即为最长连续打卡时间,day_cha最大值对应的flag_days即为最近连续打卡日期