目录
题目来源 n套SQL面试题--行转列、留存、日活等,原文答案有错误,这里完全按题目需求进行查询
(题目清晰版本可以参考数据分析SQL面试题目9套汇总,答案同样存在错误)
第二套【窗口函数 实现分组取TOP N】
思路:
(1)先处理场景重复的情况,建立子表a
(2)添加一列row_n,按id分组进行排序
(3)取每组前两名,按id分组后,在组内用连接字符串
select concat(temp.id, '-', group_concat(temp.scene seperator '-'))
from
(select id, scene, time, row_number() over(partition by id order by scene, time) as row_n
from
(select id, scene, min(time) as time
from tb
group by id, scene
order by id, scene) a
) temp
where row_n<=2
group by id
第三套 【日活、留存:行转列+datediff函数】
这里的留存定义比较奇葩,但这样子计算比较简单,正常来说留存应该考虑的是当日注册用户在N日仍然登录的比例。
思路:
(1)复用表a,连接的条件为uid相同
(2)通过datediff()筛选出b.dayno和a.dayno相差1,3,7天,行转列
update userinfo set dayno=str_to_date(dayno,'%Y-%m-%d');
select
a.dayno 日期,count(distinct a.uid) 活跃,
count(distinct case when datediff(b.dayno,a.dayno)=1 then a.uid end) 次留,
count(distinct case when datediff(b.dayno,a.dayno)=3 then a.uid end) 三留,
count(distinct case when datediff(b.dayno,a.dayno)=7 then a.uid end) 七留,
concat(count(distinct case when datediff(b.dayno,a.dayno)=1 then a.uid end)/count(distinct a.uid)*100,'%') 次日留存率,
concat(count(distinct case when datediff(b.dayno,a.dayno)=3 then a.uid end)/count(distinct a.uid)*100,'%') 三日留存率,
concat(count(distinct case when datediff(b.dayno,a.dayno)=7 then a.uid end)/count(distinct a.uid)*100,'%') 七日留存率
from userinfo a
left join userinfo b
on a.uid=b.uid
where a.app_name='相机'
A