select t1.dt,
round(count(t2.dt)/count(t1.dt),2) as uv_left_rate from
(select uid,min(date(in_time)) as dt
from tb_user_log
group by uid
) t1
left join (select uid,date(in_time) as dt
from tb_user_log
union select uid,date(out_time) as dt
from tb_user_log) t2 on t1.uid=t2.uid and t1.dt=date_sub(t2.dt, interval 1 day)
where date_format(t1.dt,‘%Y-%m’)=‘2021-11’
group by t1.dt
order by t1.dt;
拆解流程:
1.先查询出每个用户第一次登陆时间(最小登陆时间)–每天新用户表
2. 因为涉及到跨天活跃,所以要进行并集操作,将登录时间和登出时间取并集,这里union会去重–用户活跃表
3.将每天新用户表和用户活跃表左连接,只有是同一用户并且该用户第2天依旧登陆才会保留整个记录,否则右表记录为空
4.得到每天新用户第二天是否登陆表后,开始计算每天的次日留存率:根据日期分组计算,次日活跃用户个数/当天新用户个数.
新用户表:
select uid,min(date(in_time)) as dt
from tb_user_log
group by uid;
用户最早登陆时间作为新用户min(date(in_time))
用户活跃时间表:
select uid,date(in_time) as dt
from tb_user_log
union (select uid,date(out_time) as dt
from tb_user_log) 把用户登陆时间和登出时间做联结去重
聚合判定t1.uid=t2.uid and t1.dt=date_sub(t2.dt, interval 1 day),用户活跃时间减去1天要等于新用户最早登陆时间
次日时间:where date_format(t1.dt,‘%Y-%m’)=‘2021-11’