🥸 2020 抖音春招
问题:有用户行为记录表t_act_records表,包含两个字段:uid(用户ID),imp_date(日期)
计算2020年每个月,每个用户连续签到的最多天数
计算2020年每个月,连续2天都有登陆的用户名单
计算2020年每个月,连续5天都有登陆的用户数
解答:
/* 2020年每个月,每个用户连续签到的最多天数
1. 选定时间范围 Year(impdate)=2020/where impdate between 20200101 and 20201231
2. 连续时间:排序- row_number() over(partition by month(impdate),userid) as rank, 相减- date_sub(impdate,rank) as sign
3. 按月份聚合 group by month(impdate) */
select userid, max(cnt),month(impdate)
from(
select userid,impdate,date_sub(impdate,rank) as sign,count(1) as cnt
from(
select userid,impdate,row_number() over(partition by month(impdate),userid) as rank
from t_act_records
where Year(impdate)=2020
group by userid,impdate
)
group by userid,impdate
)
group by userid,month(impdate)
/* 2020年每个月,连续2天都有登陆的用户名单
1. 选定时间范围 Year(impdate)=2020/where impdate between 20200101 and 20201231
2. 连续时间:排序- row_number() over(partition by month(impdate),userid) as rank, 相减- date_sub(impdate,rank) as sign
3. 连续 2 天都登录
4. 按月份聚合 group by month(impdate) */
select userid, month(impdate)
from(
select userid,impdate,date_sub(impdate,rank) as sign
from(
select userid,impdate,row_number() over(partition by month(impdate),userid) as rank
from t_act_records
where Year(impdate)=2020
)
)
group by month(impdate),userid
having count(sign)>=2
/* 2020年每个月,连续5天都有登陆的用户数*/
select count(distinct userid), month(impdate)
from(
select userid,impdate,date_sub(impdate,rank) as sign
from(
select userid,impdate,row_number() over(partition by month(impdate),userid) as rank
from t_act_records
where Year(impdate)=2020
)
)
group by month(impdate),userid
having count(sign)>=5
此类问题的思路:
- 按要求排序:row_number() over (partition by xxx order by) as rank
- 求时间差:date_sub(date, rank) as diff
- 根据题目要求对diff进行限制