550. Game Play Analysis IV
Question
Table: Activity
+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| player_id    | int     |
| device_id    | int     |
| event_date   | date    |
| games_played | int     |
+--------------+---------+
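(player_id, event_date) is the primary key (a combination of columns with unique values) of this table.
This table shows the activity of players of some games.
Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on some day using some device.
Write a solution to report the fraction of players who logged in again on the day after their first login, rounded to 2 decimal places. In other words, count the players who logged in on at least two consecutive days starting from their first login date, then divide that count by the total number of players.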
The result format is shown in the following example.
Example 1:
Input:
Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1         | 2         | 2016-03-01 | 5            |
| 1         | 2         | 2016-03-02 | 6            |
| 2         | 3         | 2017-06-25 | 1            |
| 3         | 1         | 2016-03-02 | 0            |
| 3         | 4         | 2018-07-03 | 5            |
+-----------+-----------+------------+--------------+
Output:
+-----------+
| fraction  |
+-----------+
| 0.33      |
+-----------+
Explanation:
Only the player with id 1 logged in again on the day after their first login, so the answer is 1/3 = 0.33.
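To make the definition concrete before looking at SQL, here is a minimal Python sketch that applies it directly to the example rows. The function name `next_day_fraction` and the tuple layout are assumptions made for illustration only; they are not part of the problem or of the solutions that follow.

```python
from datetime import date, timedelta


def next_day_fraction(rows):
    """rows: iterable of (player_id, event_date) pairs (hypothetical layout)."""
    logins = {}
    for player_id, event_date in rows:
        logins.setdefault(player_id, set()).add(event_date)
    if not logins:
        return None
    # A player counts if the day after their first login is also a login day.
    returned = sum(
        1 for dates in logins.values()
        if min(dates) + timedelta(days=1) in dates
    )
    return round(returned / len(logins), 2)


# Example rows from the problem statement (device_id and games_played omitted).
activity = [
    (1, date(2016, 3, 1)),
    (1, date(2016, 3, 2)),
    (2, date(2017, 6, 25)),
    (3, date(2016, 3, 2)),
    (3, date(2018, 7, 3)),
]
print(next_day_fraction(activity))  # 0.33
```

Only player 1 has a login on the day after their first login, so the sketch reproduces 1/3 rounded to 0.33.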
Answer
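This approach pairs every login with the same player's next login date, then keeps only the rows whose login date is the player's first login (treated here as the registration date) and checks whether the next login came exactly one day later.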
/* LEAD shifts each player's next login date up onto the current row.
   next login date - current login date = 1 means logins on consecutive days. */
SELECT
    ROUND(SUM(IF(d.differ = 1, 1, 0)) / SUM(1), 2) AS fraction
FROM (
    SELECT
        player_id,
        IFNULL(DATEDIFF(t.next_date, t.event_date), 0) AS differ  -- 1 means the player logged in again the next day
    FROM (
        SELECT
            player_id,
            event_date,
            IFNULL(LEAD(event_date, 1) OVER (PARTITION BY player_id ORDER BY event_date), 0) AS next_date
        FROM
            Activity
    ) t
    WHERE
        -- keep only each player's first login row
        (event_date, player_id) IN (
            SELECT
                MIN(event_date),
                player_id
            FROM
                Activity
            GROUP BY
                player_id
        )
) d
Put the current login date and the next login date on the same row:
SELECT
    player_id,
    event_date,
    IFNULL(LEAD(event_date, 1) OVER (PARTITION BY player_id ORDER BY event_date), 0) AS next_date
FROM
    Activity
output:
| player_id | event_date | next_date |
| --------- | ---------- | ---------- |
| 1 | 2016-03-01 | 2016-03-02 |
| 1 | 2016-03-02 | "0" |
| 2 | 2017-06-25 | "0" |
| 3 | 2016-03-02 | 2018-07-03 |
| 3 | 2018-07-03 | "0" |
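The pairing that LEAD performs can be sketched in plain Python as follows. This is an illustration of the idea rather than of how MySQL implements window functions; the helper name `with_next_login` is hypothetical, and a missing next login is represented as `None` instead of the `"0"` placeholder used in the query.

```python
from collections import defaultdict
from datetime import date


def with_next_login(rows):
    """Pair each (player_id, event_date) with that player's next login date, or None."""
    by_player = defaultdict(list)
    for player_id, event_date in rows:
        by_player[player_id].append(event_date)

    paired = []
    for player_id in sorted(by_player):
        dates = sorted(by_player[player_id])
        # Zip the dates against the same list shifted by one position;
        # the last login of each player has no successor.
        for current, nxt in zip(dates, dates[1:] + [None]):
            paired.append((player_id, current, nxt))
    return paired


activity = [
    (1, date(2016, 3, 1)),
    (1, date(2016, 3, 2)),
    (2, date(2017, 6, 25)),
    (3, date(2016, 3, 2)),
    (3, date(2018, 7, 3)),
]
paired = with_next_login(activity)
for row in paired:
    print(row)
```

Grouping by player corresponds to PARTITION BY player_id, and sorting the dates corresponds to ORDER BY event_date.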
Compute the difference between the two dates:
SELECT
    player_id,
    t.event_date,
    next_date,
    IFNULL(DATEDIFF(t.next_date, t.event_date), 0) AS differ  -- 1 means the player logged in again the next day
FROM (
    SELECT
        player_id,
        event_date,
        IFNULL(LEAD(event_date, 1) OVER (PARTITION BY player_id ORDER BY event_date), 0) AS next_date
    FROM
        Activity
) t
output:
| player_id | event_date | next_date | differ |
| --------- | ---------- | ---------- | ------ |
| 1 | 2016-03-01 | 2016-03-02 | 1 |
| 1 | 2016-03-02 | "0" | 0 |
| 2 | 2017-06-25 | "0" | 0 |
| 3 | 2016-03-02 | 2018-07-03 | 853 |
| 3 | 2018-07-03 | "0" | 0 |
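At this point, a differ of 1 marks a login that is followed by another login on the very next day, and a 0 marks a row with no later login, which is where a run of consecutive logins ends. The full query then keeps only each player's first-login row, so the fraction is the share of those rows whose differ is 1.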
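A faster approach
Compute each player's first login date (treated as the registration date), then left join Activity back onto that result to pick up any login record dated exactly one day after the first login. Players with such a matching row logged in on two consecutive days starting from their first login.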
SELECT
    -- a.player_id is NULL when there is no next-day login, so COUNT skips that player
    ROUND(COUNT(DISTINCT a.player_id) / COUNT(DISTINCT t.player_id), 2) AS fraction
FROM (
    SELECT
        player_id,
        MIN(event_date) AS first_date
    FROM Activity
    GROUP BY player_id
) AS t
LEFT JOIN Activity a
    ON t.player_id = a.player_id
    AND DATEDIFF(a.event_date, t.first_date) = 1
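To try the join-based query without a MySQL server, the following Python sketch runs an equivalent statement against an in-memory SQLite database. Two substitutions are assumptions made only for this sketch: SQLite has no DATEDIFF, so the day difference is computed with `julianday`, and the numerator is multiplied by 1.0 because SQLite would otherwise perform integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Activity ("
    "player_id INT, device_id INT, event_date TEXT, games_played INT)"
)
# Example rows from the problem statement, with dates stored as ISO strings.
conn.executemany(
    "INSERT INTO Activity VALUES (?, ?, ?, ?)",
    [
        (1, 2, "2016-03-01", 5),
        (1, 2, "2016-03-02", 6),
        (2, 3, "2017-06-25", 1),
        (3, 1, "2016-03-02", 0),
        (3, 4, "2018-07-03", 5),
    ],
)

query = """
SELECT ROUND(COUNT(DISTINCT a.player_id) * 1.0 / COUNT(DISTINCT t.player_id), 2) AS fraction
FROM (
    SELECT player_id, MIN(event_date) AS first_date
    FROM Activity
    GROUP BY player_id
) AS t
LEFT JOIN Activity a
    ON t.player_id = a.player_id
   AND julianday(a.event_date) - julianday(t.first_date) = 1
"""
fraction = conn.execute(query).fetchone()[0]
print(fraction)
```

Because the join is a LEFT JOIN, players without a next-day login still appear once with a NULL `a.player_id`, which keeps them in the denominator while excluding them from the numerator.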