1097 Game Play Analysis V
SQL Schema
Create table If Not Exists Activity_1097 (player_id int, device_id int, event_date date, games_played int);
Truncate table Activity_1097;
insert into Activity_1097 (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5');
insert into Activity_1097 (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-02', '6');
insert into Activity_1097 (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1');
insert into Activity_1097 (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-01', '0');
insert into Activity_1097 (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5');
Activity table
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
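(player_id, event_date) is the primary key of this table.
This table shows the activity of players of some game.
Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on some day using some device.
The install date of a player is defined as the first login day of that player.
The day-1 retention of an install date X is defined as M/N, rounded to two decimal places, where N is the number of players whose install date is X and M is the number of those players who logged back in on the day right after X.
Write an SQL query that reports, for each install date, the number of players who installed the game on that day and the day-1 retention.
The query result format is shown in the following example:
Activity table: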
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-03-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-01 | 0 |
| 3 | 4 | 2016-07-03 | 5 |
+-----------+-----------+------------+--------------+
Result table:
+------------+----------+----------------+
| install_dt | installs | Day1_retention |
+------------+----------+----------------+
| 2016-03-01 | 2 | 0.50 |
| 2017-06-25 | 1 | 0.00 |
+------------+----------+----------------+
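Players 1 and 3 installed the game on 2016-03-01, but only player 1 logged back in on 2016-03-02, so the day-1 retention of 2016-03-01 is 1/2 = 0.50.
Player 2 installed the game on 2017-06-25 but did not log back in on 2017-06-26, so the day-1 retention of 2017-06-25 is 0/1 = 0.00.
Solution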
select a1.event_date as install_dt, count(*) as installs,
round(avg(a2.event_date is not null),2) as Day1_retention
from (select player_id, min(event_date) as event_date
from Activity_1097
group by player_id) a1
left join Activity_1097 a2
on a1.player_id = a2.player_id and datediff(a2.event_date,a1.event_date)=1
group by a1.event_date;
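The second query (the one that selects first_date) uses a window function to attach each player's install date, i.e. the earliest event_date, to every row of that player.
It then counts the rows whose event_date is exactly 1 day after the install date and divides by the number of distinct players, grouped by install date. Because (player_id, event_date) is the primary key, each player contributes at most one such row.
The first query relies on is not null instead: a2.event_date is not null evaluates to 1 when the left join found a next-day login and to 0 when it did not, so avg() over the group is the number of retained players divided by all players in that group.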
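The is not null behaviour can be checked in isolation. The snippet below is a minimal sketch, assuming MySQL, where a boolean expression yields 1 or 0; the derived table t and its column d are made up for illustration and are not part of the problem.

```sql
-- Two illustrative rows: one with a date, one with null.
-- (d is not null) is 1 for the first row and 0 for the second,
-- so avg() returns the share of non-null rows, here one half.
select avg(d is not null) as share_not_null
from (select '2016-03-02' as d
      union all
      select null) t;
```

In the first query the same expression is averaged per install date, which is why no explicit division is needed there.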
select
    first_date as install_dt,
    count(distinct player_id) as installs,
    round(sum(if(datediff(event_date, first_date) = 1, 1, 0)) / count(distinct player_id), 2)
        as Day1_retention
from (select
          player_id,
          event_date,
          min(event_date) over (partition by player_id) as first_date
      from Activity_1097) tm
group by first_date;
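As a further variant of the left-join approach, the next-day condition can be written as an equality against date_add() rather than with datediff(). This is only a sketch, assuming MySQL and the Activity_1097 table from the schema; the subquery alias install_dt is introduced here for illustration. Comparing the bare column may let the optimizer use the primary key on (player_id, event_date), though that depends on the engine and the data.

```sql
select a1.install_dt,
       count(*) as installs,
       -- a2.player_id is null when no login exists on the day after install
       round(sum(case when a2.player_id is not null then 1 else 0 end) / count(*), 2)
           as Day1_retention
from (select player_id, min(event_date) as install_dt
      from Activity_1097
      group by player_id) a1
left join Activity_1097 a2
       on a2.player_id = a1.player_id
      and a2.event_date = date_add(a1.install_dt, interval 1 day)
group by a1.install_dt;
```

On the sample data this should produce the same two rows as the Result table: 2 installs with retention 0.50 for 2016-03-01, and 1 install with retention 0.00 for 2017-06-25.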