留存率是衡量用户质量的最重要指标之一,因此计算用户留存率是用户数据分析中必须掌握的技能之一,同样也成为了面试经典sql之一。
留存率指标中,通常需要关注次日留存、3日留存、7日留存和月留存。对新增用户而言,需要关注更细颗粒度的数据,也就是7日内每天的留存率。
代码实现
select
dt
,count( if(id=lead_id and datediff(lead_dt,dt) =1 ,id, null ) ) as `1日留存`
,count( if(id=lead_id7 and datediff(lead_dt7,dt)=7 ,id, null ) ) as `7日留存`
from
(
select
id
,dt
,lead(dt,1) over(partition by id order by dt asc ) as lead_dt
,lead(id,1) over(partition by id order by dt asc ) as lead_id
,lead(dt,7) over(partition by id order by dt asc ) as lead_dt7
,lead(id,7) over(partition by id order by dt asc ) as lead_id7
from
(
select 'slm' as id, '2018-12-26' as dt
union all select 'slm' as id, '2018-12-27' as dt
union all select 'slm' as id, '2018-12-28' as dt
union all select 'hh ' as id, '2018-12-26' as dt
union all select 'hh ' as id, '2018-12-28' as dt
) a
) b
group by dt
order by dt