SQL(入门实战07)

最新推荐文章于 2024-04-11 12:55:43 发布

林园857

最新推荐文章于 2024-04-11 12:55:43 发布

阅读量770

点赞数

分类专栏： mysql 文章标签：数据库开发

本文链接：https://blog.csdn.net/weixin_44464850/article/details/124547372

版权

mysql 专栏收录该内容

24 篇文章 7 订阅

订阅专栏

1.统计作答次数

有一个试卷作答记录表exam_record，请从中统计出总作答次数total_pv、试卷已完成作答数complete_pv、已完成的试卷数complete_exam_cnt。

示例数据 exam_record表（uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分）：

在这里插入图片描述

示例输出：

在这里插入图片描述

解释：表示截止当前，有11次试卷作答记录，已完成的作答次数为7次（中途退出的为未完成状态，其交卷时间和份数为NULL），已完成的试卷有9001和9002两份。

示例1

drop table if exists exam_record;
CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-01-02 09:01:01', '2020-01-02 09:21:01', 80),
(1001, 9001, '2021-05-02 10:01:01', '2021-05-02 10:30:01', 81),
(1001, 9001, '2021-06-02 19:01:01', '2021-06-02 19:31:01', 84),
(1001, 9002, '2021-09-05 19:01:01', '2021-09-05 19:40:01', 89),
(1001, 9001, '2021-09-02 12:01:01', null, null),
(1001, 9002, '2021-09-01 12:01:01', null, null),
(1002, 9002, '2021-02-02 19:01:01', '2021-02-02 19:30:01', 87),
(1002, 9001, '2021-05-05 18:01:01', '2021-05-05 18:59:02', 90),
(1003, 9001, '2021-02-06 12:01:01', null, null),
(1003, 9001, '2021-09-07 10:01:01', '2021-09-07 10:31:01', 89),
(1004, 9001, '2021-09-06 12:01:01', null, null);

输出

11|7|2

题解

思路：
1.总作答次数：count(exam_id) as total_pv；
2.试卷已完成作答数，count(A)会忽略A的值为null的行：
count(submit_time) as complete_pv；
3.已完成的试卷数，已完成时才计数用if判断，试卷可能被完成多次，
需要去重用distinct：
count(distinct if(submit_time is not null, exam_id, null)) as complete_exam_cnt

select 
count(exam_id)as total_pv,
count(submit_time) as complete,
count(distinct if(submit_time is not null, exam_id,null)) as complete_exam_cnt
from exam_record record

2.得分不小于平均分的最低分

请从试卷作答记录表中找到SQL试卷得分不小于该类试卷平均得分的用户最低得分。

示例数据 exam_record表（uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分）

在这里插入图片描述

examination_info表（exam_id试卷ID, tag试卷类别, difficulty试卷难度, duration考试时长, release_time发布时间）

在这里插入图片描述

示例输出数据：

在这里插入图片描述

解释：试卷9001和9002为SQL类别，作答这两份试卷的得分有[80,89,87,90]，平均分为86.5，不小于平均分的最小分数为87

示例1

drop table if exists examination_info;
CREATE TABLE  examination_info (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    exam_id int UNIQUE NOT NULL COMMENT '试卷ID',
    tag varchar(32) COMMENT '类别标签',
    difficulty varchar(8) COMMENT '难度',
    duration int NOT NULL COMMENT '时长',
    release_time datetime COMMENT '发布时间'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

drop table if exists exam_record;
CREATE TABLE  exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO examination_info(exam_id,tag,difficulty,duration,release_time) VALUES
  (9001, 'SQL', 'hard', 60, '2020-01-01 10:00:00'),
  (9002, 'SQL', 'easy', 60, '2020-02-01 10:00:00'),
  (9003, '算法', 'medium', 80, '2020-08-02 10:00:00');

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-01-02 09:01:01', '2020-01-02 09:21:01', 80),
(1002, 9001, '2021-09-05 19:01:01', '2021-09-05 19:40:01', 89),
(1002, 9002, '2021-09-02 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', null, null),
(1002, 9001, '2021-02-02 19:01:01', '2021-02-02 19:30:01', 87),
(1002, 9002, '2021-05-05 18:01:01', '2021-05-05 18:59:02', 90),
(1003, 9002, '2021-02-06 12:01:01', null, null),
(1003, 9003, '2021-09-07 10:01:01', '2021-09-07 10:31:01', 86),
(1004, 9003, '2021-09-06 12:01:01', null, null);

题解

思路
1.
要找类别为SQL的试卷平均得分：
得分信息在exam_record，试卷类别在表examination_info中，因此要将两个表以exam_id连接。
从连接后的表中找到类别为SQL的试卷的分数。知识点：select...from...where...
计算得分的平均值。

2.找到类别SQL的试卷得分大于平均得分的最小值：
得分信息在exam_record，试卷类别在表examination_info中，因此要将两个表以exam_id连接。
从连接后的表中找到类别为SQL的试卷且分数大于刚刚找到的平均分的分数。知识点：select...from...where...and...
从中选出最小值。


select min(record.score) as min_score_over_avg
from exam_record record
left join examination_info info
on record.exam_id=info.exam_id
where info.tag='SQL'and  record.score>=
(
select avg(record.score)
from exam_record record
left join examination_info info
on record.exam_id=info.exam_id
where info.tag='SQL'
)

3.平均活跃天数和月活人数

用户在牛客试卷作答区作答记录存储在表exam_record中，内容如下：

exam_record表（uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分）

在这里插入图片描述

请计算2021年每个月里试卷作答区用户平均月活跃天数avg_active_days和月度活跃人数mau，上面数据的示例输出如下：

在这里插入图片描述

解释：2021年7月有2人活跃，共活跃了3天（1001活跃1天，1002活跃2天），平均活跃天数1.5；2021年9月有4人活跃，共活跃了5天，平均活跃天数1.25，结果保留2位小数。

注：此处活跃指有交卷行为。

示例1

drop table if exists exam_record;
CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2021-07-02 09:01:01', '2021-07-02 09:21:01', 80),
(1002, 9001, '2021-09-05 19:01:01', '2021-09-05 19:40:01', 81),
(1002, 9002, '2021-09-02 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', null, null),
(1002, 9001, '2021-07-02 19:01:01', '2021-07-02 19:30:01', 82),
(1002, 9002, '2021-07-05 18:01:01', '2021-07-05 18:59:02', 90),
(1003, 9002, '2021-07-06 12:01:01', null, null),
(1003, 9003, '2021-09-07 10:01:01', '2021-09-07 10:31:01', 86),
(1004, 9003, '2021-09-06 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 81),
(1005, 9001, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 88),
(1006, 9002, '2021-09-02 12:11:01', '2021-09-02 12:31:01', 89),
(1007, 9002, '2020-09-02 12:11:01', '2020-09-02 12:31:01', 89);

题解：

思路：
1.根据提交时间进行月分组  date_format(submit_time, '%Y%m')
2.根据提交时间submit_time不为空筛选活跃的的人
3.利用COUNT(DISTINCT expr,[expr...])检索到的行中去重后非NULL值的数目.去重同一天产生的数据
4.保留两位小数 round(expr,2)

select 
 date_format(submit_time, '%Y%m') as month,
 round((count(distinct uid, 
 date_format(submit_time,'%y%m%d'))) / count(distinct uid), 2) as avg_active_days,
  count(distinct uid) as mau
from exam_record
where submit_time is not null
and year(submit_time) = 2021
group by date_format(submit_time, '%Y%m')

拓展

count( DISTINCT expr,[expr…])函数

COUNT( DISTINCT expr,[expr...])
DISTINCT:  去重
expr: 字段或表达式
[expr]: 0个或多个字段

count(expr) ：返回select语句检索到的行中非NULL值的数目，
相当于select expr from 表名，得到n行记录，去掉字段expr值为null的行数，即为count(expr) 的值；

如果expr为*，即count(*)，则select * from 表名结果中把每个字段都为null的行去掉，最后得到的行数即为count(*)的值

COUNT(DISTINCT expr,[expr...])：
返回select语句检索到的行中去重后非NULL值的数目
[expr...]：表示可以写多个字段，COUNT(DISTINCT *) 为无效写法

举例
COUNT(DISTINCT field1, field2, field3)，
表示去掉field1, field2, field3的行，
field1, field2, field3组合的结果去重后，再统计个数

4.月总刷题数和日均刷题数

现有一张题目练习记录表practice_record，示例内容如下：

在这里插入图片描述

请从中统计出2021年每个月里用户的月总刷题数month_q_cnt 和日均刷题数avg_day_q_cnt（按月份升序排序）以及该年的总体情况，示例数据输出如下：

在这里插入图片描述

解释：2021年8月共有2次刷题记录，日均刷题数为2/31=0.065（保留3位小数）；2021年9月共有3次刷题记录，日均刷题数为3/30=0.100；2021年共有5次刷题记录（年度汇总平均无实际意义，这里我们按照31天来算5/31=0.161）

示例1：

drop table if exists practice_record;
CREATE TABLE  practice_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    question_id int NOT NULL COMMENT '题目ID',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO practice_record(uid,question_id,submit_time,score) VALUES
(1001, 8001, '2021-08-02 11:41:01', 60),
(1002, 8001, '2021-09-02 19:30:01', 50),
(1002, 8001, '2021-09-02 19:20:01', 70),
(1002, 8002, '2021-09-02 19:38:01', 70),
(1003, 8002, '2021-08-01 19:38:01', 80);

输出

202108|2|0.065
202109|3|0.100
2021汇总|5|0.161

题解

思路：
1.筛选2021年的刷题记录：where year(submit_time)='2021'
2.获取当月的天数：DAYOFMONTH(LAST_DAY(submit_time))
3.获取当月的年月：DATE_FORMAT(submit_time, "%Y%m") as y_m
4.统计每月的总刷题数和日均刷题数：先按年月分组：group by y_m
5.统计月总刷题数：count(1)
6.月均刷题数=月总刷题数÷该月天数：count(1) / max(days_of_month)
7.汇总情况：with rollup

方式一：
select ifnull(y_m, '2021汇总') as submit_month,
    count(1) as month_q_cnt,
    round(count(1) / max(days_of_month), 3) as avg_day_q_cnt
from (
    select 
        DAYOFMONTH(LAST_DAY(submit_time)) as days_of_month,
        DATE_FORMAT(submit_time, "%Y%m") as y_m
    from practice_record
    where year(submit_time)='2021'
) as t_month_stat
group by y_m
with rollup

方式二：

with tmp as (
	select DATE_FORMAT(submit_time,'%Y%m')as submit_month,
	count(question_id)as month_q_cnt 
	from practice_record 
	where year(submit_time)=2021 
	group by submit_month 
    order by submit_month
)
	  select submit_month,month_q_cnt,
      round(month_q_cnt/
      datediff(concat(submit_month,'01')+interval 1 month,
               concat(submit_month,'01')),3)as avg_day_q_cnt 
	from tmp 
union (
	select '2021汇总' as submit_month,
    sum(month_q_cnt)as month_q_cnt,
    round(sum(month_q_cnt)/31,3) as avg_day_q_cnt 
    from tmp
		);

方式三：

	select * from 
	(
	select 
    date_format(submit_time,'%Y%m') as submit_month,
    count(1) as month_q_cnt,
 	round(count(1)/
       max(day(last_day(date_format(submit_time,'%Y%m%d')))),3) as avg_day_q_cnt
 	from practice_record
 	where year(submit_time) = 2021
 	group by date_format(submit_time,'%Y%m') 
union all
	select
      '2021汇总'   as submit_month,
       count(1)   as month_q_cnt,
       round(count(1)/31,3)   as avg_day_q_cnt
	from practice_record 
	where year(submit_time) = 2021
    ) t
    order by submit_month

拓展

with()函数

mysql版本在8.0之前不能使用with的写法。

with t1 as (
    select 
		    emp.emp_level as type,
	      count(emp.id) as n
	  from
		    emp
		group by
		    emp.emp_level
),
t2 as (
    select 
		    d.depot_level as type,
	      count(d.id) as n
	  from
		    depot as d
		group by
		    d.depot_level
)
select 
    t1.n/t2.n
from
    t1
left join
    t2
on t1.type = t2.type

特别是从多张表中取数据时，而且每张表的数据量又很大时，使用with写法可以先筛选出来一张数据量较少的表，避免全表join。