根据本题主要用来记录初次学习coalesce函数、group by with rollup函数的一些想法。
原题链接https://www.nowcoder.com/practice/f6b4770f453d4163acc419e3d19e6746?tpId=240&tqId=2183006&ru=%2Fpractice%2F4963f6d63dde48d787aaa2b43460fb4b&qru=%2Fta%2Fsql-advanced%2Fquestion-ranking&sourceUrl=%2Fexam%2Foj%3Fpage%3D1%26tab%3DSQL%25E7%25AF%2587%26topicId%3D240
这里牛客给了一句提示:
我的初始思路:首先我是想用union
来联合两个select
来做这道题,先搜索每个月的数据最后再汇总。根据需要的目标来对每一列进行检索。
第一版sql
select
date_format(submit_time,"%y%m") as submit_month,
count(question_id) month_q_cnt,
month_q_cnt/每个月份的天数 avg_day_q_cnt
from practice_record
where year(submint_time)='2021'
group by submit_month
order by submit_month
在这里我卡住了,我不知道月份怎么求这里学习了解了日期处理的两个函数
DAYOFMONTH(d) 函数返回 d 是一个月中的第几天,范围为 1~31。
LAST_DAY()函数是取某个月最后一天的日期。
这两个函数比较好理解不再赘述。
根据这两个我改进了一下初始代码,以下为第二版
select
year_mon as submit_month,
count(question_id) month_q_cnt,
round(month_q_cnt/t.days_month, 3) avg_day_q_cnt
from (
select
question_id,
dayofmonth(last_day(submit_time)) as days_month,
date_format(submit_time,"%y%m") as year_mon
from practice_record
where year(submint_time)='2021'
) as t
group by submit_month
order by submit_month
但是这个跑不通,在days_month
有问题
在这里我单独跑了一下内层select
这里看到问题了吗,同一个月份,question_id
有多个值!!!!虽然都是同样的数,但是可能不知道要取哪一个,这里我试了max
、min
都可以继续完成接下来的操作。另外在date_format(submit_time,"%Y%m") as year_mon
这一句中记得用大写的Y才是完整年份,谨记。
select
year_mon as submit_month,
count(question_id) month_q_cnt,
round(count(question_id)/any_value(t.days_month), 3) avg_day_q_cnt
from (
select
question_id,
dayofmonth(last_day(submit_time)) as days_month,
date_format(submit_time,"%Y%m") as year_mon
from practice_record
where year(submit_time)=2021
) as t
group by submit_month
order by submit_month
这样就可以进行union
下一步
select
year_mon as submit_month,
count(question_id) month_q_cnt,
round(count(question_id) / any_value (t.days_month), 3) avg_day_q_cnt
from
(
select
question_id,
dayofmonth (last_day (submit_time)) as days_month,
date_format (submit_time, "%Y%m") as year_mon
from
practice_record
where
year (submit_time) = 2021
) as t
group by
submit_month
union
select
'2021汇总' as submit_month,
count(question_id) as month_q_cnt,
round(count(question_id) / 31, 3) as avg_day_q_cnt
from
practice_record
where
year (submit_time) = 2021
order by
submit_month
这样就可以a掉这道题了,a之前还有个问题就是order by
的位置,union
在没有括号的情况下只能使用一个order by
a了之后当然要去欣赏大佬的做法,这里可以注意到大佬的做法非常简洁,我们就注意到题目一开始给的注意我们是不是遗忘了,好像就是在提示在聚合函数里下功夫。然后我们就一起学习一下下面这个代码吧。
select
coalesce(year_mon, '2021汇总') as submit_month,
count(question_id) as month_q_cnt,
round(count(question_id) / max(t.days_month), 3) as avg_day_cnt
from
(
select
question_id,
dayofmonth (last_day (submit_time)) as days_month,
date_format (submit_time, '%Y%m') as year_mon
from
practice_record
where
year (submit_time) = 2021
) as t
group by
t.year_mon
with
rollup;
知识点讲解:
COALESCE()
函数返回参数列表中第一个不是NULL
的值。如果全部参数都是NULL
,该函数将返回 NULL。WITH ROLLUP
可以实现在分组统计数据基础上再进行相同的统计(SUM,AVG,COUNT…)
。
首先应该会产生几个疑问:
- 为什么这里
coalesce
函数可以用来简化代码 order by
不需要了吗WITH ROLLUP
很迷,第一次接触我确实对这个很疑惑。
接下来我们从另一个例子进行切入,这部分copy另一个网站https://www.runoob.com/mysql/mysql-group-by-statement.html
例如我们将以上的数据表按名字进行分组,再统计每个人登录的次数:
+----+--------+---------------------+--------+
| id | name | date | signin |
+----+--------+---------------------+--------+
| 1 | 小明 | 2016-04-22 15:25:33 | 1 |
| 2 | 小王 | 2016-04-20 15:25:47 | 3 |
| 3 | 小丽 | 2016-04-19 15:26:02 | 2 |
| 4 | 小王 | 2016-04-07 15:26:14 | 4 |
| 5 | 小明 | 2016-04-11 15:26:40 | 4 |
| 6 | 小明 | 2016-04-04 15:26:54 | 2 |
+----+--------+---------------------+--------+
mysql> SELECT name, COUNT(*) FROM employee_tbl GROUP BY name;
+--------+----------+
| name | COUNT(*) |
+--------+----------+
| 小丽 | 1 |
| 小明 | 3 |
| 小王 | 2 |
+--------+----------+
mysql> SELECT name, SUM(signin) as signin_count FROM employee_tbl GROUP BY name WITH ROLLUP;
+--------+--------------+
| name | signin_count |
+--------+--------------+
| 小丽 | 2 |
| 小明 | 7 |
| 小王 | 7 |
| NULL | 16 |
+--------+--------------+
4 rows in set (0.00 sec)
通过这两个对比我们可以看到多了with rollup
后会多一行name
为空的信息。这时候就需要体现coalesce
的作用了
coalesce 语法:
select coalesce(a,b,c);
参数说明:如果anull,则选择b;如果bnull,则选择c;如果a!=null,则选择a;如果a b c 都为null ,则返回为null(没意义)。
以下实例中如果名字为空我们使用总数代替:
mysql> SELECT coalesce(name, '总数'), SUM(signin) as signin_count FROM employee_tbl GROUP BY name WITH ROLLUP;
+--------------------------+--------------+
| coalesce(name, '总数') | signin_count |
+--------------------------+--------------+
| 小丽 | 2 |
| 小明 | 7 |
| 小王 | 7 |
| 总数 | 16 |
+--------------------------+--------------+
4 rows in set (0.01 sec)
到这里我们就大概能够明白了这两个搭配的作用吧。
对我们原来的题目也就更加清晰的理解了
最后一个问题,为啥没有order by 也会自动排序呢?希望大家可以告诉我!!