NIUKE SQL:大厂面试真题(二)【某度信息流】

SQL162 2021年11月每天的人均浏览文章时长

场景逻辑说明:artical_id-文章ID代表用户浏览的文章的ID,artical_id-文章ID为0表示用户在非文章内容页(比如App内的列表页、活动页等)

问题:统计2021年11月每天的人均浏览文章时长(秒数),结果保留1位小数,并按时长由短到长排序

SELECT
    dt,
    ROUND(SUM(viiew_len_sec)/COUNT(DISTINCT uid),1) AS avg_viiew_len_sec

FROM
(
    SELECT
        uid,
        DATE(in_time) AS dt,
        TIMESTAMPDIFF(SECOND,in_time,out_time) AS viiew_len_sec
    FROM tb_user_log
    WHERE artical_id<>0
    AND YEAR(in_time)=2021 AND MONTH(in_time)=11
) AS T

GROUP BY dt
ORDER BY avg_viiew_len_sec

SQL163 每篇文章同一时刻最大在看人数

问题:统计每篇文章同一时刻最大在看人数,如果同一时刻有进入也有离开时,先记录用户数增加再记录减少,结果按最大人数降序

WITH A AS 
(
    SELECT artical_id,in_time AS dt,1 AS diff
    FROM tb_user_log
    WHERE artical_id!=0
    UNION ALL
    SELECT artical_id,out_time AS dt,-1 AS diff
    FROM tb_user_log
    WHERE artical_id!=0
)

SELECT 
    artical_id,
    MAX(uv) AS max_uv
FROM 
(
    SELECT artical_id,SUM(diff)over(partition by artical_id order by dt,diff DESC) AS uv
    FROM A
) AS B

GROUP BY artical_id
ORDER BY max_uv DESC

SQL164 2021年11月每天新用户的次日留存率

问题:统计2021年11月每天新用户的次日留存率(保留2位小数)

次日留存率为当天新增的用户数中第二天又活跃了的用户数占比

如果in_time-进入时间和out_time-离开时间跨天了,在两天里都记为该用户活跃过,结果按日期升序

SELECT
    A.dt,
    ROUND(COUNT(second_day_user)/COUNT(new_user),2) AS uv_left_rate
FROM 
(SELECT
    uid AS new_user,
    MIN(DATE(in_time)) AS dt
FROM tb_user_log
GROUP BY uid) AS A
LEFT JOIN 
(SELECT
    uid AS second_day_user,
    DATE((in_time)) AS dt
FROM tb_user_log
UNION
SELECT
    uid AS second_day_user,
    DATE((out_time)) AS dt
FROM tb_user_log) AS B 
ON new_user=second_day_user AND TIMESTAMPDIFF(DAY,A.dt,B.dt)=1
WHERE YEAR(A.dt)=2021 AND MONTH(A.dt)=11
GROUP BY A.dt
ORDER BY A.dt

SQL165 统计活跃间隔对用户分级结果

问题:统计活跃间隔对用户分级后,各活跃等级用户占比,结果保留两位小数,且按占比降序排序

用户等级标准简化为:忠实用户(近7天活跃过且非新晋用户)、新晋用户(近7天新增)、沉睡用户(近7天未活跃但更早前活跃过)、流失用户(近30天未活跃但更早前活跃过)。

假设今天就是数据中所有日期的最大值。

近7天表示包含当天T的近7天,即闭区间[T-6, T]

SELECT
    user_grade,
    ROUND(COUNT(uid)/MAX(user_cnt),2) AS ratio
FROM
(
SELECT
    uid,
    user_cnt,
    (CASE 
    WHEN TIMESTAMPDIFF(DAY,first_use_time,to_day)<7 THEN '新晋用户'
    WHEN TIMESTAMPDIFF(DAY,last_use_time,to_day) BETWEEN 7 AND 29 THEN '沉睡用户'
    WHEN TIMESTAMPDIFF(DAY,last_use_time,to_day)>=30 THEN '流失用户'
    ELSE '忠实用户' END ) AS user_grade

FROM
    (
    SELECT
        uid,
        DATE(MAX(out_time)) AS last_use_time,
        DATE(MIN(in_time)) AS first_use_time
    FROM tb_user_log
    GROUP BY uid 
    ) AS T1
    LEFT JOIN
    (
    SELECT
        DATE(MAX(out_time)) AS to_day,
        COUNT(DISTINCT uid) AS user_cnt
    FROM tb_user_log
    ) AS T2
    ON 1=1
)AS T3

GROUP BY user_grade
ORDER BY ratio DESC
LEFT JOIN ON 1=1

left join on 1=1表示使用左连接将两个表连接起来,并且使用1=1作为连接条件(on 1=1)

这意味着任何非空的行都会被连接

通常用于将两个表连接在一起,而不需要指定实际的连接条件

可以用于需要额外添加列的情况

使用1=1作为连接条件的效果是将左表中的所有行与右表中的所有行匹配,因此结果集中将包含左表中的所有行和右表中的所有行,不管它们是否匹配

SQL166 每天的日活数及新用户占比

统计每天的日活数及新用户占比

新用户占比=当天的新用户数÷当天活跃用户数(日活数)

如果in_time-进入时间和out_time-离开时间跨天了,在两天里都记为该用户活跃过

新用户占比保留2位小数,结果按日期升序排序

SELECT
    dt,
    dau,
    ROUND(IFNULL(new_user_cnt,0)/dau,2) AS uv_new_ratio

FROM 
(
    SELECT 
        DATE(time) AS dt,
        COUNT(uid) AS new_user_cnt
    FROM (SELECT DISTINCT uid,MIN(in_time) AS time FROM tb_user_log GROUP BY uid) AS T1 
    GROUP BY dt
) AS T2
RIGHT JOIN
(
    SELECT
    dt,
    COUNT(uid) AS dau
    FROM
    (
        SELECT 
            uid,
            DATE(in_time) AS dt
        FROM tb_user_log
        UNION
        SELECT 
            uid,
            DATE(out_time) AS dt
        FROM tb_user_log
    ) AS T3
    GROUP BY dt
) AS T4
USING(dt)
ORDER BY dt

SQL167 连续签到领金币

从2021年7月7日0点开始,用户每天签到可以领1金币,并可以开始累积签到天数,连续签到的第3、7天分别可额外领2、6金币

每连续签到7天后重新累积签到天数(即重置签到天数:连续第8天签到时记为新的一轮签到的第一天,领1金币)

问题:计算每个用户2021年7月以来每月获得的金币数(该活动到10月底结束,11月1日开始的签到不再获得金币)。结果按月份、ID升序排序

注:如果签到记录的in_time-进入时间和out_time-离开时间跨天了,也只记作in_time对应的日期签到了

WITH t1 AS( 
	SELECT
		DISTINCT uid,
		DATE(in_time) AS dt,
		RANK() OVER (PARTITION BY uid ORDER BY DATE(in_time)) AS rn 
	FROM tb_user_log
	WHERE
		DATE(in_time) BETWEEN '2021-07-07' AND '2021-10-31' 
        AND artical_id = 0 AND sign_in = 1
),
t2 AS (
	SELECT
	    *,
	    CASE RANK() OVER (PARTITION BY DATE_SUB(dt,INTERVAL rn day),uid ORDER BY dt )%7 
		WHEN 3 THEN 3
		WHEN 0 THEN 7
		ELSE 1
	END AS day_coin
	FROM t1
)
	SELECT
		uid,
        DATE_FORMAT(dt,'%Y%m') AS month, 
        SUM(day_coin) AS coin
	FROM t2
	GROUP BY uid,DATE_FORMAT(dt,'%Y%m')
	ORDER BY DATE_FORMAT(dt,'%Y%m'),uid
INTERVAL

1. 当函数使用时,interval()为比较函数,如:interval(10,1,3,5,7)  结果4
原理:10为被比较数,后面1,3,5,7为比较数,将后面四个依次与10比较,看后面数字组有多少个少于10,则返回其个数。前提是后面数字组为从小到大排列,否则返回结果0


2. 当关键词使用时,表示为设置时间间隔,常用在date_add()与date_sub()函数里,
如:interval 1 day ,解释为将时间间隔设置为1天

  • 8
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值