提示:文章写完后,目录可以自动生成,如何生成可参考右边的帮助文档
目录
前言
连续登陆问题应该也是高频。
可以看我之前发过的一篇文章有总结过常规连续登陆问题的解法。
然后在b站上看到过其他连续登陆问题的视频,感觉还是可以写篇文章再次整理和复习
参考b站数分奶爸华老师的视频
一、有哪些变种的连续登陆问题呢?
连续n小时在线的游戏用户,连续n天有商品卖出的店铺,连续n天听歌的用户,连续n天打车的用户...
二、基本解题思路
查询八月份连续登陆的5天的用户有哪些
1,首先构造一个用户PN345,看看他连续登陆的特征
先进行日期和用户的去重,保证每天登陆只有一条记录,避免影响之后的排序
1,用distinct 去重
2,聚合函数子查询去重(记得表别名)
select...
from t1
(select user_id,load_dt
from t1
group by user_id,load_dt)t2
2,添加辅助列。(case when,date(),排序窗口函数)
1,通过row_number窗口函数添加连续编号的列
row_number() over(partition by user_id order by login_date) rnk
2,日期列-序号列=date_sub列
date_sub(load_dt-interval row_number() over(partition by user_id order by login_date) day) as date_sub
3,date_sub列进行过滤处理
...
group by user_id,date_sub
having count(date_sub)>5
4,以上步骤得到的都是连续登陆五天的用户,最后一步只需要计算人头就ok。
总结代码:
select user_id,count(distinct user_id)
from(
select user_id,load_dt,
row_number() over(partition by user_id order by load_dt) rnk,
date_sub(load_dt-interval row_number() over(partition by user_id order by load_dt) day) date_sub
from
(select user_id,load_dt
from u_login
group by user_id,load_dt)t1)t2
group by user_id,date_sub
having count(date_sub)>5
三、美团小鹏汽车腾讯微保都考过的连续登录问题真题拆解
连续快充12次及以上的用户
1,增加三列辅助列
select*
row_number() over(partition by user_id order by charge_time) rnk1,
sum(charge_type) over(partition by user_id order by charge_time) rnk2,
row_number() over(partition by user_id order by charge_time)-sum(charge_type) over(partition by user_id order by charge_time) diff
from xp_t1
知识点:
1,聚合窗口函数的运用
rnk2是根据充电次数累加type,如果一直是快充就是一直累加1,如果突然慢冲就累加0
所以rnk1-rnk2得到的序号列可以看出相同的diff代表连续相同的type充电(此处注意会相同diff的第一个会是慢冲,计数时要按照大于等于13来算,这样就是快充大于等于12的用户了。)
2,聚合函数计数和过滤
select*,count(user_id) times
from
(select*
row_number() over(partition by user_id order by charge_time) rnk1,
sum(charge_type) over(partition by user_id order by charge_time) rnk2,
row_number() over(partition by user_id order by charge_time)-sum(charge_type) over(partition by user_id order by charge_time) diff
from xp_t1)t1
group by user_id,diff
having count(user_id)>=13
或者在外面过滤
select user_id, max(times)
from
(select user_id ,count(user_id) times
from
(select*
row_number() over(partition by user_id order by charge_time) rnk1,
sum(charge_type) over(partition by user_id order by charge_time) rnk2,
row_number() over(partition by user_id order by charge_time)-sum(charge_type) over(partition by user_id order by charge_time) diff
from xp_t1
)t1
group by user_id,diff
)t2
group by user_id
having max(times)>=13
总结
自链接也可以解决连续登陆,但是像上题一样连续快充12次就不太适合用连表解决。
用窗口函数解决连续登陆问题的话,不仅可以用排序也可以用前后函数。
主要是添加辅助列然后将辅助列之间进行操作找规律。