SELECT user_grade, ROUND(COUNT(uid) / MAX(user_cnt), 2) as ratio
FROM (
SELECT uid, user_cnt,
CASE
WHEN last_dt_diff >= 30 THEN "流失用户"
WHEN last_dt_diff >= 7 THEN "沉睡用户"
WHEN first_dt_diff < 7 THEN "新晋用户"
ELSE "忠实用户"
END as user_grade
FROM (
SELECT uid, user_cnt,
TIMESTAMPDIFF(DAY,first_dt,cur_dt) as first_dt_diff,
TIMESTAMPDIFF(DAY,last_dt,cur_dt) as last_dt_diff
FROM (
SELECT uid, MIN(DATE(in_time)) as first_dt,
MAX(DATE(out_time)) as last_dt
FROM tb_user_log
GROUP BY uid
) as t_uid_first_last
LEFT JOIN (
SELECT MAX(DATE(out_time)) as cur_dt,
COUNT(DISTINCT uid) as user_cnt
FROM tb_user_log
) as t_overall_info ON 1
) as t_user_info
) as t_user_grade
GROUP BY user_grade
ORDER BY ratio DESC;
最内层子查询 (
SELECT uid, MIN(DATE(in_time)) as first_dt, MAX(DATE(out_time)) as last_dt FROM tb_user_log GROUP BY uid
):
- 这个查询从
tb_user_log
表中为每个用户提取最早登录日期(first_dt
)和最后登录日期(last_dt
)。次内层子查询 (
SELECT MAX(DATE(out_time)) as cur_dt, COUNT(DISTINCT uid) as user_cnt FROM tb_user_log
):
- 这个查询计算当前日期(
cur_dt
)和总用户数(user_cnt
)。连接两个子查询 (
LEFT JOIN ... ON 1
):
- 通过左连接将两个子查询的结果结合起来,
ON 1
表示无条件连接,因为左表(t_uid_first_last
)已经包含了所有需要的用户信息,而右表(t_overall_info
)只包含一个总体的记录。计算日期差异 (
TIMESTAMPDIFF(DAY,first_dt,cur_dt) as first_dt_diff, TIMESTAMPDIFF(DAY,last_dt,cur_dt) as last_dt_diff
):
- 计算用户第一次登录到当前日期的时间差(
first_dt_diff
)和用户最后一次登录到当前日期的时间差(last_dt_diff
)。用户等级分类 (
CASE WHEN last_dt_diff >= 30 THEN "流失用户" ... ELSE "忠实用户" END as user_grade
):
- 使用
CASE
语句根据last_dt_diff
和first_dt_diff
将用户分为四个等级:
- 如果用户最后一次登录距离现在超过30天,则标记为“流失用户”。
- 如果用户最后一次登录距离现在超过7天但少于30天,则标记为“沉睡用户”。
- 如果用户第一次登录距离现在少于7天,则标记为“新晋用户”。
- 否则,用户标记为“忠实用户”。
外部查询 (
SELECT user_grade, ROUND(COUNT(uid) / MAX(user_cnt), 2) as ratio ... GROUP BY user_grade ORDER BY ratio DESC
):
- 这个查询使用前面的结果作为临时表
t_user_grade
。COUNT(uid)
计算每个用户等级的用户数。MAX(user_cnt)
获取总用户数。ROUND(COUNT(uid) / MAX(user_cnt), 2)
计算每个用户等级在总用户中的比例,并保留两位小数。GROUP BY user_grade
按用户等级分组结果。ORDER BY ratio DESC
按比例降序排序结果。