链家——留存问题
找出表1中相机APP的当日用户活跃数以及次留,三留,七留。所得结果表如表2所示。
表1:
| uid | app_name | duration | times | dayno |
|---|---|---|---|---|
| 01 | 相机 | 1 | 2 | 2018-05-01 |
| 02 | 微信 | 2 | 3 | 2018-05-02 |
| 03 | 美团 | 4 | 2 | 2018-05-03 |
| 04 | 微信 | 6 | 3 | 2018-05-01 |
| 05 | 相机 | 3 | 1 | 2018-05-03 |
| 06 | 相机 | 2 | 3 | 2018-05-01 |
| 07 | 相机 | 2 | 2 | 2018-05-02 |
| 08 | 微信 | 1 | 1 | 2018-05-01 |
| 09 | 美团 | 3 | 2 | 2018-05-02 |
| 10 | 相机 | 4 | 3 | 2018-05-03 |
| 11 | 相机 | 5 | 4 | 2018-05-02 |
| 12 | 美团 | 6 | 5 | 2018-05-01 |
| 13 | 微信 | 7 | 1 | 2018-05-02 |
| 14 | 相机 | 2 | 2 | 2018-05-03 |
| 15 | 相机 | 1 | 3 | 2018-05-01 |
| 01 | 美团 | 1 | 2 | 2018-05-01 |
| 02 | 微信 | 1 | 3 | 2018-05-04 |
| 03 | 相机 | 3 | 2 | 2018-05-03 |
| 04 | 微信 | 4 | 3 | 2018-05-01 |
| 05 | 相机 | 4 | 2 | 2018-05-03 |
| 06 | 相机 | 2 | 2 | 2018-05-04 |
| 07 | 美团 | 2 | 3 | 2018-05-04 |
| 08 | 微信 | 1 | 2 | 2018-05-04 |
| 09 | 相机 | 3 | 3 | 2018-05-04 |
| 10 | 相机 | 4 | 3 | 2018-05-03 |
| 11 | 相机 | 5 | 4 | 2018-05-01 |
| 12 | 美团 | 6 | 5 | 2018-05-02 |
| 13 | 微信 | 5 | 4 | 2018-05-03 |
| 14 | 相机 | 2 | 2 | 2018-05-02 |
| 15 | 相机 | 1 | 2 | 2018-05-01 |
| 01 | 美团 | 1 | 2 | 2018-05-05 |
| 02 | 微信 | 1 | 3 | 2018-05-06 |
| 03 | 相机 | 3 | 2 | 2018-05-04 |
| 04 | 微信 | 4 | 3 | 2018-05-01 |
| 05 | 相机 | 4 | 2 | 2018-05-04 |
| 06 | 相机 | 2 | 2 | 2018-05-04 |
| 07 | 美团 | 2 | 3 | 2018-05-05 |
| 08 | 微信 | 1 | 2 | 2018-05-06 |
| 09 | 相机 | 3 | 3 | 2018-05-06 |
| 10 | 相机 | 4 | 3 | 2018-05-05 |
| 11 | 相机 | 5 | 4 | 2018-05-04 |
| 12 | 美团 | 6 | 5 | 2018-05-02 |
| 13 | 微信 | 5 | 4 | 2018-05-05 |
| 14 | 相机 | 2 | 2 | 2018-05-06 |
| 15 | 相机 | 1 | 2 | 2018-05-06 |
| 01 | 美团 | 1 | 2 | 2018-05-08 |
| 02 | 微信 | 1 | 3 | 2018-05-06 |
| 03 | 相机 | 3 | 2 | 2018-05-04 |
| 04 | 微信 | 4 | 3 | 2018-05-10 |
| 05 | 相机 | 4 | 2 | 2018-05-08 |
| 06 | 相机 | 2 | 2 | 2018-05-07 |
| 07 | 美团 | 2 | 3 | 2018-05-09 |
| 08 | 微信 | 1 | 2 | 2018-05-09 |
| 09 | 相机 | 3 | 3 | 2018-05-09 |
| 10 | 相机 | 4 | 3 | 2018-05-05 |
| 11 | 相机 | 5 | 4 | 2018-05-05 |
| 12 | 美团 | 6 | 5 | 2018-05-10 |
| 13 | 微信 | 5 | 4 | 2018-05-09 |
| 14 | 相机 | 2 | 2 | 2018-05-06 |
| 15 | 相机 | 1 | 2 | 2018-05-06 |
| 01 | 美团 | 1 | 2 | 2018-05-08 |
| 02 | 微信 | 1 | 3 | 2018-05-06 |
| 03 | 相机 | 3 | 2 | 2018-05-04 |
| 04 | 微信 | 4 | 3 | 2018-05-10 |
| 05 | 相机 | 4 | 2 | 2018-05-10 |
| 06 | 相机 | 2 | 2 | 2018-05-07 |
| 07 | 美团 | 2 | 3 | 2018-05-09 |
| 08 | 微信 | 1 | 2 | 2018-05-08 |
| 09 | 相机 | 3 | 3 | 2018-05-09 |
| 10 | 相机 | 4 | 3 | 2018-05-10 |
| 11 | 相机 | 5 | 4 | 2018-05-05 |
| 12 | 美团 | 6 | 5 | 2018-05-10 |
| 13 | 微信 | 5 | 4 | 2018-05-08 |
| 14 | 相机 | 2 | 2 | 2018-05-07 |
| 15 | 相机 | 1 | 2 | 2018-05-07 |
| 01 | 美团 | 1 | 2 | 2018-05-08 |
| 02 | 微信 | 1 | 3 | 2018-05-06 |
| 03 | 相机 | 3 | 2 | 2018-05-10 |
| 04 | 微信 | 4 | 3 | 2018-05-10 |
| 05 | 相机 | 4 | 2 | 2018-05-10 |
| 06 | 相机 | 2 | 2 | 2018-05-07 |
| 07 | 美团 | 2 | 3 | 2018-05-09 |
| 08 | 微信 | 1 | 2 | 2018-05-08 |
| 09 | 相机 | 3 | 3 | 2018-05-10 |
| 10 | 相机 | 4 | 3 | 2018-05-10 |
| 11 | 相机 | 5 | 4 | 2018-05-05 |
| 12 | 美团 | 6 | 5 | 2018-05-10 |
| 13 | 微信 | 5 | 4 | 2018-05-08 |
| 14 | 相机 | 2 | 2 | 2018-05-07 |
| 15 | 相机 | 1 | 2 | 2018-05-07 |
表2:
| dayno | 用户活跃数 | 次日留存数 | 三日留存数 | 七日留存数 | 次日留存率 | 三日留存率 | 七日留存率 |
|---|---|---|---|---|---|---|---|
| 2018-05-01 | 4 | 1 | 0 | 2 | 25.00% | 0.00% | 50.00% |
| 2018-05-02 | 3 | 1 | 1 | 0 | 33.33% | 33.33% | 0.00% |
| 2018-05-03 | 4 | 2 | 1 | 0 | 50.00% | 25.00% | 0.00% |
| 2018-05-04 | 5 | 1 | 1 | 3 | 20.00% | 20.00% | 60.00% |
| 2018-05-05 | 2 | 0 | 0 | 0 | 0.00% | 0.00% | 0.00% |
| 2018-05-06 | 3 | 2 | 0 | 0 | 66.67% | 0.00% | 0.00% |
| 2018-05-07 | 3 | 0 | 0 | 0 | 0.00% | 0.00% | 0.00% |
| 2018-05-08 | 1 | 0 | 1 | 0 | 0.00% | 100.00% | 0.00% |
| 2018-05-09 | 1 | 1 | 0 | 0 | 100.00% | 0.00% | 0.00% |
| 2018-05-10 | 4 | 0 | 0 | 0 | 0.00% | 0.00% | 0.00% |
// 创建表格代码
create table userinfo(
uid varchar(10),
app_name varchar(20),
duration int(10),
times int(10),
dayno varchar(30)
);
insert into userinfo values
(01,'相机',1,2,'2018-05-01'),
(02,'微信',2,3,'2018-05-02'),
(03,'美团',4,2,'2018-05-03'),
(04,'微信',6,3,'2018-05-01'),
(05,'相机',3,1,'2018-05-03'),
(06,'相机',2,3,'2018-05-01'),
(07,'相机',2,2,'2018-05-02'),
(08,'微信',1,1,'2018-05-01'),
(09,'美团',3,2,'2018-05-02'),
(10,'相机',4,3,'2018-05-03'),
(11,'相机',5,4,'2018-05-02'),
(12,'美团',6,5,'2018-05-01'),
(13,'微信',7,1,'2018-05-02'),
(14,'相机',2,2,'2018-05-03'),
(15,'相机',1,3,'2018-05-01'),
(01,'美团',1,2,'2018-05-01'),
(02,'微信',1,3,'2018-05-04'),
(03,'相机',3,2,'2018-05-03'),
(04,'微信',4,3,'2018-05-01'),
(05,'相机',4,2,'2018-05-03'),
(06,'相机',2,2,'2018-05-04'),
(07,'美团',2,3,'2018-05-04'),
(08,'微信',1,2,'2018-05-04'),
(09,'相机',3,3,'2018-05-04'),
(10,'相机',4,3,'2018-05-03'),
(11,'相机',5,4,'2018-05-01'),
(12,'美团',6,5,'2018-05-02'),
(13,'微信',5,4,'2018-05-03'),
(14,'相机',2,2,'2018-05-02'),
(15,'相机',1,2,'2018-05-01'),
(01,'美团',1,2,'2018-05-05'),
(02,'微信',1,3,'2018-05-06'),
(03,'相机',3,2,'2018-05-04'),
(04,'微信',4,3,'2018-05-01'),
(05,'相机',4,2,'2018-05-04'),
(06,'相机',2,2,'2018-05-04'),
(07,'美团',2,3,'2018-05-05'),
(08,'微信',1,2,'2018-05-06'),
(09,'相机',3,3,'2018-05-06'),
(10,'相机',4,3,'2018-05-05'),
(11,'相机',5,4,'2018-05-04'),
(12,'美团',6,5,'2018-05-02'),
(13,'微信',5,4,'2018-05-05'),
(14,'相机',2,2,'2018-05-06'),
(15,'相机',1,2,'2018-05-06'),
(01,'美团',1,2,'2018-05-08'),
(02,'微信',1,3,'2018-05-06'),
(03,'相机',3,2,'2018-05-04'),
(04,'微信',4,3,'2018-05-10'),
(05,'相机',4,2,'2018-05-08'),
(06,'相机',2,2,'2018-05-07'),
(07,'美团',2,3,'2018-05-09'),
(08,'微信',1,2,'2018-05-09'),
(09,'相机',3,3,'2018-05-09'),
(10,'相机',4,3,'2018-05-05'),
(11,'相机',5,4,'2018-05-05'),
(12,'美团',6,5,'2018-05-10'),
(13,'微信',5,4,'2018-05-09'),
(14,'相机',2,2,'2018-05-06'),
(15,'相机',1,2,'2018-05-06'),
(01,'美团',1,2,'2018-05-08'),
(02,'微信',1,3,'2018-05-06'),
(03,'相机',3,2,'2018-05-04'),
(04,'微信',4,3,'2018-05-10'),
(05,'相机',4,2,'2018-05-10'),
(06,'相机',2,2,'2018-05-07'),
(07,'美团',2,3,'2018-05-09'),
(08,'微信',1,2,'2018-05-08'),
(09,'相机',3,3,'2018-05-09'),
(10,'相机',4,3,'2018-05-10'),
(11,'相机',5,4,'2018-05-05'),
(12,'美团',6,5,'2018-05-10'),
(13,'微信',5,4,'2018-05-08'),
(14,'相机',2,2,'2018-05-07'),
(15,'相机',1,2,'2018-05-07'),
(01,'美团',1,2,'2018-05-08'),
(02,'微信',1,3,'2018-05-06'),
(03,'相机',3,2,'2018-05-10'),
(04,'微信',4,3,'2018-05-10'),
(05,'相机',4,2,'2018-05-10'),
(06,'相机',2,2,'2018-05-07'),
(07,'美团',2,3,'2018-05-09'),
(08,'微信',1,2,'2018-05-08'),
(09,'相机',3,3,'2018-05-10'),
(10,'相机',4,3,'2018-05-10'),
(11,'相机',5,4,'2018-05-05'),
(12,'美团',6,5,'2018-05-10'),
(13,'微信',5,4,'2018-05-08'),
(14,'相机',2,2,'2018-05-07'),
(15,'相机',1,2,'2018-05-07');
--答案:
--次日留存:select * from 第一天 left join 第二天 on uid相同
--第一天活跃用户:
select * from userinfo where app_name = '相机' and dayno = '2018-05-01';
--次日留存left join+子查询:
select * from
(select * from userinfo where app_name = '相机' and dayno = '2018-05-01') a
left join
(select * from userinfo where app_name = '相机' and dayno = '2018-05-02') b
on a.uid = b.uid;
--计算活跃用户数:
select a.dayno,count(a.uid) as '用户活跃数',count(b.uid) as '次日留存数'
from
(select * from userinfo where app_name = '相机' and dayno = '2018-05-01') a
left join
(select * from userinfo where app_name = '相机' and dayno = '2018-05-02') b
on a.uid = b.uid;
-- 可用CASE...WHEN和DATEDIFF计算次留,三留,七留:
select a.dayno,count(distinct a.uid) as '用户活跃数',
count(distinct case when datediff(b.dayno,a.dayno) = 1 then b.uid else null end) as '次日留存数',
count(distinct case when datediff(b.dayno,a.dayno) = 2 then b.uid else null end) as '三日留存数',
count(distinct case when datediff(b.dayno,a.dayno) = 6 then b.uid else null end) as '七日留存数'
from
(select * from userinfo where app_name = '相机' and dayno = '2018-05-01') a
left join
(select * from userinfo where app_name = '相机') b
on a.uid = b.uid;
--最终答案:
select a.dayno,count(distinct a.uid) as '用户活跃数',
count(distinct case when datediff(b.dayno,a.dayno) = 1 then b.uid else null end) as '次日留存数',
count(distinct case when datediff(b.dayno,a.dayno) = 2 then b.uid else null end) as '三日留存数',
count(distinct case when datediff(b.dayno,a.dayno) = 6 then b.uid else null end) as '七日留存数',
concat(round(count(distinct case when datediff(b.dayno,a.dayno) = 1 then b.uid else null end)/count(distinct a.uid)*100,2),"%") as '次日留存率',
concat(round(count(distinct case when datediff(b.dayno,a.dayno) = 2 then b.uid else null end)/count(distinct a.uid)*100,2),"%") as '三日留存率',
concat(round(count(distinct case when datediff(b.dayno,a.dayno) = 6 then b.uid else null end)/count(distinct a.uid)*100,2),"%") as '七日留存率'
from
(select * from userinfo where app_name = '相机') a
left join
(select * from userinfo where app_name = '相机') b
on a.uid = b.uid
group by dayno;
总结: 遇到只有一个表,但是需要计数时间间隔的问题,就要想到用自联结来求时间间隔
类似题目:找出连续出现N次的内容
链家面试题:如何分析留存率?
行列互换
将表1转换为表2
表1:
| name | english | maths | music |
|---|---|---|---|
| Jim | 90 | 88 | 99 |
表2:
| name | subject | score |
|---|---|---|
| Jim | english | 90 |
| Jim | maths | 88 |
| Jim | music | 99 |
--创建表格
create table a (
name varchar(20),
english int,
maths int,
music int);
insert into a values
("Jim",90,88,99);
--答案:
select name,'english' as subject,english as score
from a
union
select name,'maths' as subject,maths as score
from a
union
select name,'music' as subject,music as score
from a;
分组求和
将表1计算后得到表2,并描述计算过程
表1:
| FDATE | VALUE |
|---|---|
| 2018/11/23 | 10 |
| 2018/12/31 | 3 |
| 2019/2/9 | 53 |
| 2019/3/31 | 23 |
| 2019/7/8 | 11 |
| 2019/7/31 | 10 |
表2:
--创建表格
create table A2 (
FDATE datetime,
value int
);
insert into a2 values
("2018/11/23",10),
("2018/12/31",3),
("2019/2/9",53),
("2019/3/31",23),
("2019/7/8",11),
("2019/7/31",10);
--答案:
SELECT A.*,
SUM(VALUE) OVER (PARTITION BY FYEAR ORDER BY FMONTH) AS YSUM,
SUM(VALUE) OVER (ORDER BY FYEAR,FMONTH) AS SUM
FROM
(SELECT YEAR(FDATE) FYEAR, MONTH(FDATE) FMONTH,
SUM(VALUE) AS VALUE
FROM A2 GROUP BY YEAR(FDATE),MONTH(FDATE)) A
用户行为分析
表1:
表2:
| name | 最近登陆时间 | 登录次数 |
|---|---|---|
| test1 | 2007-10-25-14:20:10 | 1 |
| test2 | 2007-05-25-14:22:36 | 2 |
| test4 | 2007-11-25-16:31:26 | 1 |
表3:
--创建表格
create table userlog
(
id int ,
name varchar(10),
EmailAddress varchar(50),
lastlogon varchar(50)
);
insert into userlog values(100,'test4','test4@yahoo.cn','2007-11-25 16:31:26'),
(19,'test1','test4@yahoo.cn','2007-10-25 14:13:46'),
(42,'test1','test4@yahoo.cn','2007-10-25 14:20:10'),
(45,'test2','test4@yahoo.cn','2007-4-25 14:17:39'),
(49,'test2','test4@yahoo.cn','2007-5-25 14:22:36');
--第1题答案:
SELECT name,
max(date_format(lastlogon,'%Y-%m-%d-%T')) AS 最近登陆时间,
count(distinct date_format(lastlogon,'%Y-%m-%d')) as 登录次数
from userlog
group by name
--第2题答案:
SELECT name,
max(date_format(lastlogon,'%Y-%m-%d-%T')) AS 最近登陆时间,
count(distinct date_format(lastlogon,'%Y-%m-%d')) as 登录次数
from userlog
group by name





