SQL(入门实战07)

1.统计作答次数

有一个试卷作答记录表exam_record,请从中统计出总作答次数total_pv、试卷已完成作答数complete_pv、已完成的试卷数complete_exam_cnt。

示例数据 exam_record表(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分):

在这里插入图片描述

示例输出:

在这里插入图片描述

解释:表示截止当前,有11次试卷作答记录,已完成的作答次数为7次(中途退出的为未完成状态,其交卷时间和份数为NULL),已完成的试卷有9001和9002两份。

示例1
drop table if exists exam_record;
CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-01-02 09:01:01', '2020-01-02 09:21:01', 80),
(1001, 9001, '2021-05-02 10:01:01', '2021-05-02 10:30:01', 81),
(1001, 9001, '2021-06-02 19:01:01', '2021-06-02 19:31:01', 84),
(1001, 9002, '2021-09-05 19:01:01', '2021-09-05 19:40:01', 89),
(1001, 9001, '2021-09-02 12:01:01', null, null),
(1001, 9002, '2021-09-01 12:01:01', null, null),
(1002, 9002, '2021-02-02 19:01:01', '2021-02-02 19:30:01', 87),
(1002, 9001, '2021-05-05 18:01:01', '2021-05-05 18:59:02', 90),
(1003, 9001, '2021-02-06 12:01:01', null, null),
(1003, 9001, '2021-09-07 10:01:01', '2021-09-07 10:31:01', 89),
(1004, 9001, '2021-09-06 12:01:01', null, null);
输出

11|7|2

题解
思路:
1.总作答次数:count(exam_id) as total_pv;
2.试卷已完成作答数,count(A)会忽略A的值为null的行:
count(submit_time) as complete_pv;
3.已完成的试卷数,已完成时才计数用if判断,试卷可能被完成多次,
需要去重用distinctcount(distinct if(submit_time is not null, exam_id, null)) as complete_exam_cnt

select 
count(exam_id)as total_pv,
count(submit_time) as complete,
count(distinct if(submit_time is not null, exam_id,null)) as complete_exam_cnt
from exam_record record

2.得分不小于平均分的最低分

请从试卷作答记录表中找到SQL试卷得分不小于该类试卷平均得分的用户最低得分。

示例数据 exam_record表(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分)

在这里插入图片描述

examination_info表(exam_id试卷ID, tag试卷类别, difficulty试卷难度, duration考试时长, release_time发布时间)

在这里插入图片描述

示例输出数据:

在这里插入图片描述

解释:试卷9001和9002为SQL类别,作答这两份试卷的得分有[80,89,87,90],平均分为86.5,不小于平均分的最小分数为87

示例1
drop table if exists examination_info;
CREATE TABLE  examination_info (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    exam_id int UNIQUE NOT NULL COMMENT '试卷ID',
    tag varchar(32) COMMENT '类别标签',
    difficulty varchar(8) COMMENT '难度',
    duration int NOT NULL COMMENT '时长',
    release_time datetime COMMENT '发布时间'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

drop table if exists exam_record;
CREATE TABLE  exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO examination_info(exam_id,tag,difficulty,duration,release_time) VALUES
  (9001, 'SQL', 'hard', 60, '2020-01-01 10:00:00'),
  (9002, 'SQL', 'easy', 60, '2020-02-01 10:00:00'),
  (9003, '算法', 'medium', 80, '2020-08-02 10:00:00');

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-01-02 09:01:01', '2020-01-02 09:21:01', 80),
(1002, 9001, '2021-09-05 19:01:01', '2021-09-05 19:40:01', 89),
(1002, 9002, '2021-09-02 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', null, null),
(1002, 9001, '2021-02-02 19:01:01', '2021-02-02 19:30:01', 87),
(1002, 9002, '2021-05-05 18:01:01', '2021-05-05 18:59:02', 90),
(1003, 9002, '2021-02-06 12:01:01', null, null),
(1003, 9003, '2021-09-07 10:01:01', '2021-09-07 10:31:01', 86),
(1004, 9003, '2021-09-06 12:01:01', null, null);
题解
思路
1.
要找类别为SQL的试卷平均得分:
得分信息在exam_record,试卷类别在表examination_info中,因此要将两个表以exam_id连接。
从连接后的表中找到类别为SQL的试卷的分数。知识点:select...from...where...
计算得分的平均值。

2.找到类别SQL的试卷得分大于平均得分的最小值:
得分信息在exam_record,试卷类别在表examination_info中,因此要将两个表以exam_id连接。
从连接后的表中找到类别为SQL的试卷且分数大于刚刚找到的平均分的分数。知识点:select...from...where...and...
从中选出最小值。


select min(record.score) as min_score_over_avg
from exam_record record
left join examination_info info
on record.exam_id=info.exam_id
where info.tag='SQL'and  record.score>=
(
select avg(record.score)
from exam_record record
left join examination_info info
on record.exam_id=info.exam_id
where info.tag='SQL'
)

3.平均活跃天数和月活人数

用户在牛客试卷作答区作答记录存储在表exam_record中,内容如下:

exam_record表(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分)

在这里插入图片描述

请计算2021年每个月里试卷作答区用户平均月活跃天数avg_active_days和月度活跃人数mau,上面数据的示例输出如下:

在这里插入图片描述

解释:2021年7月有2人活跃,共活跃了3天(1001活跃1天,1002活跃2天),平均活跃天数1.5;2021年9月有4人活跃,共活跃了5天,平均活跃天数1.25,结果保留2位小数。

注:此处活跃指有交卷行为。

示例1
drop table if exists exam_record;
CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2021-07-02 09:01:01', '2021-07-02 09:21:01', 80),
(1002, 9001, '2021-09-05 19:01:01', '2021-09-05 19:40:01', 81),
(1002, 9002, '2021-09-02 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', null, null),
(1002, 9001, '2021-07-02 19:01:01', '2021-07-02 19:30:01', 82),
(1002, 9002, '2021-07-05 18:01:01', '2021-07-05 18:59:02', 90),
(1003, 9002, '2021-07-06 12:01:01', null, null),
(1003, 9003, '2021-09-07 10:01:01', '2021-09-07 10:31:01', 86),
(1004, 9003, '2021-09-06 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 81),
(1005, 9001, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 88),
(1006, 9002, '2021-09-02 12:11:01', '2021-09-02 12:31:01', 89),
(1007, 9002, '2020-09-02 12:11:01', '2020-09-02 12:31:01', 89);
题解:
思路:
1.根据提交时间进行月分组  date_format(submit_time, '%Y%m')
2.根据提交时间submit_time不为空筛选活跃的的人
3.利用COUNT(DISTINCT expr,[expr...])检索到的行中去重后非NULL值的数目.去重同一天产生的数据
4.保留两位小数 round(expr,2)

select 
 date_format(submit_time, '%Y%m') as month,
 round((count(distinct uid, 
 date_format(submit_time,'%y%m%d'))) / count(distinct uid), 2) as avg_active_days,
  count(distinct uid) as mau
from exam_record
where submit_time is not null
and year(submit_time) = 2021
group by date_format(submit_time, '%Y%m')

拓展

count( DISTINCT expr,[expr…])函数

COUNT( DISTINCT expr,[expr...])
DISTINCT:  去重
expr: 字段或表达式
[expr]: 0个或多个字段

count(expr) :返回select语句检索到的行中非NULL值的数目,
相当于select expr from 表名,得到n行记录,去掉字段expr值为null的行数,即为count(expr) 的值;

如果expr为*,即count(*),则select * from 表名结果中把每个字段都为null的行去掉,最后得到的行数即为count(*)的值

COUNT(DISTINCT expr,[expr...]):
返回select语句检索到的行中去重后非NULL值的数目
[expr...]:表示可以写多个字段,COUNT(DISTINCT *) 为无效写法

举例
COUNT(DISTINCT field1, field2, field3),
表示去掉field1, field2, field3的行,
field1, field2, field3组合的结果去重后,再统计个数

4.月总刷题数和日均刷题数

现有一张题目练习记录表practice_record,示例内容如下:

在这里插入图片描述

请从中统计出2021年每个月里用户的月总刷题数month_q_cnt 和日均刷题数avg_day_q_cnt(按月份升序排序)以及该年的总体情况,示例数据输出如下:

在这里插入图片描述

解释:2021年8月共有2次刷题记录,日均刷题数为2/31=0.065(保留3位小数);2021年9月共有3次刷题记录,日均刷题数为3/30=0.100;2021年共有5次刷题记录(年度汇总平均无实际意义,这里我们按照31天来算5/31=0.161)

示例1:
drop table if exists practice_record;
CREATE TABLE  practice_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    question_id int NOT NULL COMMENT '题目ID',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO practice_record(uid,question_id,submit_time,score) VALUES
(1001, 8001, '2021-08-02 11:41:01', 60),
(1002, 8001, '2021-09-02 19:30:01', 50),
(1002, 8001, '2021-09-02 19:20:01', 70),
(1002, 8002, '2021-09-02 19:38:01', 70),
(1003, 8002, '2021-08-01 19:38:01', 80);
输出

202108|2|0.065
202109|3|0.100
2021汇总|5|0.161

题解
思路:
1.筛选2021年的刷题记录:where year(submit_time)='2021'
2.获取当月的天数:DAYOFMONTH(LAST_DAY(submit_time))
3.获取当月的年月:DATE_FORMAT(submit_time, "%Y%m") as y_m
4.统计每月的总刷题数和日均刷题数:先按年月分组:group by y_m
5.统计月总刷题数:count(1)
6.月均刷题数=月总刷题数÷该月天数:count(1) / max(days_of_month)
7.汇总情况:with rollup

方式一:
select ifnull(y_m, '2021汇总') as submit_month,
    count(1) as month_q_cnt,
    round(count(1) / max(days_of_month), 3) as avg_day_q_cnt
from (
    select 
        DAYOFMONTH(LAST_DAY(submit_time)) as days_of_month,
        DATE_FORMAT(submit_time, "%Y%m") as y_m
    from practice_record
    where year(submit_time)='2021'
) as t_month_stat
group by y_m
with rollup

方式二:

with tmp as (
	select DATE_FORMAT(submit_time,'%Y%m')as submit_month,
	count(question_id)as month_q_cnt 
	from practice_record 
	where year(submit_time)=2021 
	group by submit_month 
    order by submit_month
)
	  select submit_month,month_q_cnt,
      round(month_q_cnt/
      datediff(concat(submit_month,'01')+interval 1 month,
               concat(submit_month,'01')),3)as avg_day_q_cnt 
	from tmp 
union (
	select '2021汇总' as submit_month,
    sum(month_q_cnt)as month_q_cnt,
    round(sum(month_q_cnt)/31,3) as avg_day_q_cnt 
    from tmp
		);

方式三:

	select * from 
	(
	select 
    date_format(submit_time,'%Y%m') as submit_month,
    count(1) as month_q_cnt,
 	round(count(1)/
       max(day(last_day(date_format(submit_time,'%Y%m%d')))),3) as avg_day_q_cnt
 	from practice_record
 	where year(submit_time) = 2021
 	group by date_format(submit_time,'%Y%m') 
union all
	select
      '2021汇总'   as submit_month,
       count(1)   as month_q_cnt,
       round(count(1)/31,3)   as avg_day_q_cnt
	from practice_record 
	where year(submit_time) = 2021
    ) t
    order by submit_month

拓展

with()函数

mysql版本在8.0之前不能使用with的写法。

with t1 as (
    select 
		    emp.emp_level as type,
	      count(emp.id) as n
	  from
		    emp
		group by
		    emp.emp_level
),
t2 as (
    select 
		    d.depot_level as type,
	      count(d.id) as n
	  from
		    depot as d
		group by
		    d.depot_level
)
select 
    t1.n/t2.n
from
    t1
left join
    t2
on t1.type = t2.type

特别是从多张表中取数据时,而且每张表的数据量又很大时,使用with写法可以先筛选出来一张数据量较少的表,避免全表join

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值