【数据库】窗口函数实战(一)

本系列包含:


窗口函数实战(一)

本篇文章是笔者在牛客网上摘选的几道比较有挑战性的SQL窗口函数编程题。

1. 试卷完成数同比2020年的增长率及排名变化【困难】(原题链接

我们首先来分析一下题目的要求:

  • 时间上,是 2020年上半年2021年上半年
  • 是已完成的试卷(submit_time 不为空),不考虑未完成的试卷。
  • 排名变化 中的排名,是该类型试卷的当年完成情况的排名。

下面提供一种比较优雅的解法,视觉上也很容易理解。

  • 将2020年和2021年的数据先分成两张表,在通过 inner join 连接,得到的就是能进行同比的试卷类别(tag)。这样相同类别的试卷的结果都在一行,方便计算增长率、排名变化。
  • 有时候适当使用 with as 语句能让代码的逻辑更清晰。
with a as (
    select tag, count(submit_time) as exam_cnt,
           rank() over (order by count(submit_time) desc) as exam_cnt_rank_20
    from exam_record
    left outer join examination_info using(exam_id)
    where date_format(submit_time,'%Y%m') between '202001' and '202006'
    group by tag
),
b as (
    select tag, count(submit_time) as exam_cnt,
           rank() over (order by count(submit_time) desc) as exam_cnt_rank_21
    from exam_record
    left outer join examination_info using(exam_id)
    where date_format(submit_time,'%Y%m') between '202101' and '202106'
    group by tag
)
select a.tag, a.exam_cnt, b.exam_cnt, 
       concat(round((b.exam_cnt-a.exam_cnt) / a.exam_cnt * 100, 1),'%') as growth_rate,
       exam_cnt_rank_20, exam_cnt_rank_21,
       cast(exam_cnt_rank_21 as signed) - cast(exam_cnt_rank_20 as signed) as rank_delta
from a
inner join b using(tag)
order by growth_rate desc, exam_cnt_rank_21 desc

这一道题的有几个需要注意的点:

  • 如何构建查询表,能比较方便地去计算同期(2020上半年、2021上半年)的增长率、排名变化。
  • 窗口函数得到的排名,是 unsigned 类型,需要转换成 signed 类型,再进行计算。

此处再提供另外一种解法。

SELECT tag, exam_cnt_20, exam_cnt_21,
       CONCAT(ROUND((exam_cnt_21 - exam_cnt_20) / exam_cnt_20 * 100, 1), '%') as growth_rate,
       exam_cnt_rank_20, exam_cnt_rank_21,
       CAST(exam_cnt_rank_21 as SIGNED) - CAST(exam_cnt_rank_20 AS SIGNED) rank_delta
FROM (
    SELECT tag, exam_cnt_20, exam_cnt_21,
           RANK() OVER (ORDER BY exam_cnt_20 DESC) exam_cnt_rank_20,
           RANK() OVER (ORDER BY exam_cnt_21 DESC) exam_cnt_rank_21
    FROM (
        SELECT tag,
               SUM(IF(DATE_FORMAT(submit_time, '%Y-%m') BETWEEN '2020-01' AND '2020-06', 1, 0)) exam_cnt_20,
               SUM(IF(DATE_FORMAT(submit_time, '%Y-%m') BETWEEN '2021-01' AND '2021-06', 1, 0)) exam_cnt_21
        FROM exam_record
        LEFT JOIN examination_info USING(exam_id)
        GROUP BY tag
    ) a
) b
WHERE exam_cnt_20 != 0 AND exam_cnt_21 != 0
ORDER BY growth_rate DESC, exam_cnt_rank_21 DESC

第二种解法给我们提供的思路是:

  • 有时候计数不一定要用 count,也可以用 sumif 的组合。
  • 这里的几个 select 嵌套子查询,在思路上也还是很清晰的。

其实这一题最核心的点,就是在一开始的时候要把 2020 年和 2021 年分成两列对待。我一开始参考了题目下面的 “解释” 那张表,反而造成后续的计算很麻烦。

2.未完成率较高的50%用户近三个月答卷情况【困难】(原题链接

我们首先来分析一下题目要求:

  • SQL 试卷未完成率较高的前 50%用户。注意:不是未完成率超过 50% 的用户。
  • 用户等级为 6 或 7 级。
  • 有作答记录(start_time)的近 3 个月。

这一题的查询限制条件本身就非常多,而又需要用到至少两个窗口函数,所以层层嵌套几乎是难免的,只能尽量通过调整语句格式让代码更加易懂一些。在这里,给出一种解法。

with a as ( 
    select exam_record.uid, level,
           count(start_time) - count(submit_time) as incomplete_cnt, 
           count(start_time) as total_cnt, 
           (count(start_time) - count(submit_time)) / count(start_time) as incomplete_rate
    from exam_record
    left outer join examination_info on exam_record.exam_id = examination_info.exam_id
    left outer join user_info on exam_record.uid = user_info.uid
    where tag = 'SQL'
    group by exam_record.uid
    order by exam_record.uid
),
b as (
    select *, percent_rank() over (order by incomplete_rate) as percent_rank_
    from a
),
c as (
    select *
    from b
    where percent_rank_ >= 0.5 and level in (6,7)
)
select uid, start_month, count(start_month) as total_cnt, count(submit_time) as complete_cnt
from (
    select uid, start_month, submit_time,
           dense_rank() over (partition by uid order by start_month desc) as dense_rank_
    from (
        select uid, date_format(start_time,'%Y%m') as start_month, submit_time
        from exam_record
        where uid in ( select uid from c )
    ) d
) e
where dense_rank_ <= 3
group by uid, start_month
order by uid, start_month
  • with 后面的 a、b、c 三张表,都是为了找到 SQL 试卷未完成率较高的前 50%,且等级为 6 或 7 级的用户 uid
  • 然后再对时间排序(dense_rank),找到近 3 个月的答题记录,进行统计即可。
  • 5
    点赞
  • 8
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 5
    评论
评论 5
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

G皮T

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值