SQL(进阶实战04)

1.试卷完成数同比2020年的增长率及排名变化

现有试卷信息表examination_info(exam_id试卷ID, tag试卷类别, difficulty试卷难度, duration考试时长, release_time发布时间):

idexam_idtagdifficultydurationrelease_time
19001SQLhard602021-01-01 10:00:00
29002C++hard802021-01-01 10:00:00
39003算法hard802021-01-01 10:00:00
49004PYTHONmedium702021-01-01 10:00:00

试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分):

iduidexam_idstart_timesubmit_timescore
1100190012020-08-02 10:01:012020-08-02 10:31:0189
2100290012020-04-01 18:01:012020-04-01 18:59:0290
3100190012020-04-01 09:01:012020-04-01 09:21:5980
5100290012021-03-02 19:01:012021-03-02 19:32:0020
8100390012021-05-02 12:01:012021-05-02 12:31:0198
13100390012020-01-02 10:01:012020-01-02 10:31:0189
9100190022020-02-02 12:01:012020-02-02 12:20:0199
10100290022021-02-02 12:01:012020-02-02 12:43:0181
11100190022020-01-02 19:01:012020-01-02 19:59:0169
16100290022020-02-02 12:01:01
17100290022020-03-02 12:11:01
18100190022021-05-05 18:01:01
4100290032021-01-20 10:01:012021-01-20 10:10:0181
6100190032021-04-02 19:01:012021-04-02 19:40:0189
15100290032021-01-01 18:01:012021-01-01 18:59:0290
7100490042020-05-02 12:01:012020-05-02 12:20:0199
12100190042021-09-02 12:11:01
14100290042020-01-01 12:11:012020-01-01 12:31:0183

请计算2021年上半年各类试卷的做完次数相比2020年上半年同期的增长率(百分比格式,保留1位小数),以及做完次数排名变化,按增长率和21年排名降序输出。

由示例数据结果输出如下:

tagexam_cnt_20exam_cnt_21growth_rateexam_cnt_rank_20exam_cnt_rank_21rank_delta
SQL32-33.3%121

解释:2020年上半年有3个tag有作答完成的记录,分别是C++、SQL、PYTHON,它们被做完的次数分别是3、3、2,做完次数排名为1、1(并列)、3;

2021年上半年有2个tag有作答完成的记录,分别是算法、SQL,它们被做完的次数分别是3、2,做完次数排名为1、2;具体如下:

tagstart_yearexam_cntexam_cnt_rank
C++202031
SQL202031
PYTHON202023
算法202131
SQL202122

因此能输出同比结果的tag只有SQL,从2020到2021年,做完次数3=>2,减少33.3%(保留1位小数);排名1=>2,后退1名。

示例1
drop table if exists examination_info,exam_record;
CREATE TABLE examination_info (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    exam_id int UNIQUE NOT NULL COMMENT '试卷ID',
    tag varchar(32) COMMENT '类别标签',
    difficulty varchar(8) COMMENT '难度',
    duration int NOT NULL COMMENT '时长',
    release_time datetime COMMENT '发布时间'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO examination_info(exam_id,tag,difficulty,duration,release_time) VALUES
  (9001, 'SQL', 'hard', 60, '2020-01-01 10:00:00'),
  (9002, 'C++', 'hard', 80, '2020-01-01 10:00:00'),
  (9003, '算法', 'hard', 80, '2020-01-01 10:00:00'),
  (9004, 'PYTHON', 'medium', 70, '2020-01-01 10:00:00');
	
INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-08-02 10:01:01', '2020-08-02 10:31:01', 89),
(1002, 9001, '2020-04-01 18:01:01', '2020-04-01 18:59:02', 90),
(1001, 9001, '2020-04-01 09:01:01', '2020-04-01 09:21:59', 80),
(1002, 9003, '2021-01-20 10:01:01', '2021-01-20 10:10:01', 81),
(1002, 9001, '2021-03-02 19:01:01', '2021-03-02 19:32:00', 20),
(1001, 9003, '2021-04-02 19:01:01', '2021-04-02 19:40:01', 89),
(1004, 9004, '2020-05-02 12:01:01', '2020-05-02 12:20:01', 99),
(1003, 9001, '2021-05-02 12:01:01', '2021-05-02 12:31:01', 98),
(1001, 9002, '2020-02-02 12:01:01', '2020-02-02 12:20:01', 99),
(1002, 9002, '2020-02-02 12:01:01', '2020-02-02 12:43:01', 81),
(1001, 9002, '2020-01-02 19:01:01', '2020-01-02 19:59:01', 69),
(1001, 9004, '2021-09-02 12:11:01', null, null),
(1003, 9001, '2020-01-02 10:01:01', '2020-01-02 10:31:01', 89),
(1002, 9004, '2020-01-01 12:11:01', '2020-01-01 12:31:01', 83),
(1002, 9003, '2021-01-01 18:01:01', '2021-01-01 18:59:02', 90),
(1002, 9002, '2020-02-02 12:01:01', null, null),
(1002, 9002, '2020-03-02 12:11:01', null, null),
(1001, 9002, '2021-05-05 18:01:01', null, null);
输出

SQL|3|2|-33.3%|1|2|1

思路
1.找到所有tag在20202021上半年的完成数
2.计算growth_rate,并生成各tag完成数排名
3.取出所有需要的字段,并计算排名变化(2021-2020年,无需取绝对值)
4.筛选出20202021年均有完成记录的tag,并按题目要求排序
题解
方式一:
SELECT #第三步:取出所有需要的字段,并计算排名变化(2021-2020年,无需取绝对值)
  tag,
  exam_cnt_20,
  exam_cnt_21,
  growth_rate,
  exam_cnt_rank_20,
  exam_cnt_rank_21,
  CAST(exam_cnt_rank_21 as SIGNED) - CAST(exam_cnt_rank_20 AS SIGNED) rank_delta
FROM (
SELECT #第二步:计算growth_rate,并生成各tag完成数排名
  tag,
  exam_cnt_20,
  exam_cnt_21,
  IFNULL(
      CONCAT(
      ROUND((exam_cnt_21 - exam_cnt_20)/exam_cnt_20 * 100, 1),'%'),0) growth_rate,
  	  RANK() OVER (ORDER BY exam_cnt_20 DESC) exam_cnt_rank_20,
  	  RANK() OVER (ORDER BY exam_cnt_21 DESC) exam_cnt_rank_21
  FROM (
  SELECT #第一步:找到所有tag在2020,2021上半年的完成数
    tag,
    SUM(IF(DATE_FORMAT(submit_time, '%Y-%m') 
           BETWEEN '2020-01' AND '2020-06', 1, 0)) exam_cnt_20,
    SUM(IF(DATE_FORMAT(submit_time, '%Y-%m') 
           BETWEEN '2021-01' AND '2021-06', 1, 0)) exam_cnt_21
  FROM exam_record
  LEFT JOIN examination_info ei USING(exam_id)
  GROUP BY 1) t1
) t2
# 第四步:筛选出2020和2021年均有完成记录的tag,并按题目要求排序
WHERE exam_cnt_20 != 0 AND exam_cnt_21 != 0 
ORDER BY 4 desc, 6 desc

方式二:
	select 
	tag,
	exam_cnt_20,
	exam_cnt_21,
	concat(round((exam_cnt_21-exam_cnt_20)/exam_cnt_20*100,1),"%") 	growth_rate,
		exam_cnt_rank_20,
		exam_cnt_rank_21,
	cast(exam_cnt_rank_21 as SIGNED)-cast(exam_cnt_rank_20 as SIGNED) rank_delta
	from
	(
	select #根据去年获取下一年的试卷次数和排名
	start_year,
	tag,
	exam_cnt as exam_cnt_20,
	lead(exam_cnt)over(partition by tag order by start_year) exam_cnt_21,
	exam_cnt_rank as exam_cnt_rank_20,
	lead(exam_cnt_rank)over(partition by tag order by start_year)exam_cnt_rank_21
	from 
		(
	select #每类试卷2020/2021完成数及排名统计
	start_year,
	tag,
	exam_cnt,
	rank()over(partition by start_year order by exam_cnt desc) as exam_cnt_rank
	from 
		(
		select #获取每年上半年试卷次数
		date_format(start_time,"%Y") start_year,
		tag,
		count(*)exam_cnt
		from examination_info
		inner join exam_record 
		using(exam_id)
		where score is not null and date_format(start_time,"%m")<=6
		group by tag,date_format(start_time,"%Y")
		)t
		group by tag,start_year
		)t1
	)t2
	WHERE exam_cnt_21 IS NOT NULL
	ORDER BY growth_rate DESC, exam_cnt_rank_21 DESC;

拓展
Cast(字段名 as 转换的类型 ),其中类型可以为:

CHAR[(N)] 字符型 
DATE  日期型
DATETIME  日期和时间型
DECIMAL  float型
SIGNED  int
TIME  时间型

示例1:
作用: 四舍五入
SELECT CAST('9.0' AS DECIMAL) FROM DUAL; -- 9

SELECT CAST('9.5' AS DECIMAL) FROM DUAL; -- 10

SELECT CAST('9.6' AS DECIMAL) FROM DUAL; -- 10

SELECT CAST('9.3' AS DECIMAL) FROM DUAL; -- 9

2.对试卷得分做min-max归一化

现有试卷信息表examination_info(exam_id试卷ID, tag试卷类别, difficulty试卷难度, duration考试时长, release_time发布时间):

idexam_idtagdifficultydurationrelease_time
19001SQLhard602020-01-01 10:00:00
29002C++hard802020-01-01 10:00:00
39003算法hard802020-01-01 10:00:00
49004PYTHONmedium702020-01-01 10:00:00

试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分)

iduidexam_idstart_timesubmit_timescore
6100390012020-01-02 12:01:012020-01-02 12:31:0168
9100190012020-01-02 10:01:012020-01-02 10:31:0189
1100190012020-01-01 09:01:012020-01-01 09:21:5990
12100290022021-05-05 18:01:01(NULL)(NULL)
3100490022020-01-01 12:01:012020-01-01 12:11:0160
2100390022020-01-01 19:01:012020-01-01 19:30:0175
7100190022020-01-02 12:01:012020-01-02 12:43:0181
10100290022020-01-01 12:11:012020-01-01 12:31:0183
4100390022020-01-01 12:01:012020-01-01 12:41:0190
5100290022020-01-02 19:01:012020-01-02 19:32:0090
11100290042021-09-06 12:01:01(NULL)(NULL)
8100190052020-01-02 12:11:01(NULL)(NULL)

在物理学及统计学数据计算时,有个概念叫min-max标准化,也被称为离差标准化,是对原始数据的线性变换,使结果值映射到[0 - 1]之间。

转换函数为:

img

请你将用户作答高难度试卷的得分在每份试卷作答记录内执行min-max归一化后缩放到[0,100]区间,并输出用户ID、试卷ID、归一化后分数平均值;最后按照试卷ID升序、归一化分数降序输出。(注:得分区间默认为[0,100],如果某个试卷作答记录中只有一个得分,那么无需使用公式,归一化并缩放后分数仍为原分数)。

由示例数据结果输出如下:

uidexam_idavg_new_score
1001900198
100390010
1002900288
1003900275
1001900270
100490020

解释:高难度试卷有9001、9002、9003;

作答了9001的记录有3条,分数分别为68、89、90,按给定公式归一化后分数为:0、95、100,而后两个得分都是用户1001作答的,因此用户1001对试卷9001的新得分为(95+100)/2≈98(只保留整数部分),用户1003对于试卷9001的新得分为0。最后结果按照试卷ID升序、归一化分数降序输出。

示例1
drop table if exists examination_info,exam_record;
CREATE TABLE examination_info (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    exam_id int UNIQUE NOT NULL COMMENT '试卷ID',
    tag varchar(32) COMMENT '类别标签',
    difficulty varchar(8) COMMENT '难度',
    duration int NOT NULL COMMENT '时长',
    release_time datetime COMMENT '发布时间'
)CHARACTER SET utf8 COLLATE utf8_bin;

CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO examination_info(exam_id,tag,difficulty,duration,release_time) VALUES
  (9001, 'SQL', 'hard', 60, '2020-01-01 10:00:00'),
  (9002, 'C++', 'hard', 80, '2020-01-01 10:00:00'),
  (9003, '算法', 'hard', 80, '2020-01-01 10:00:00'),
  (9004, 'PYTHON', 'medium', 70, '2020-01-01 10:00:00'),
  (9005, 'WEB', 'hard', 80, '2020-01-01 10:00:00'),
  (9006, 'PYTHON', 'hard', 80, '2020-01-01 10:00:00'),
  (9007, 'web', 'hard', 80, '2020-01-01 10:00:00'),
  (9008, 'Web', 'medium', 70, '2020-01-01 10:00:00'),
  (9009, 'WEB', 'medium', 70, '2020-01-01 10:00:00'),
  (9010, 'SQL', 'medium', 70, '2020-01-01 10:00:00');
	
INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-01-01 09:01:01', '2020-01-01 09:21:59', 90),
(1003, 9002, '2020-01-01 19:01:01', '2020-01-01 19:30:01', 75),
(1004, 9002, '2020-01-01 12:01:01', '2020-01-01 12:11:01', 60),
(1003, 9002, '2020-01-01 12:01:01', '2020-01-01 12:41:01', 90),
(1002, 9002, '2020-01-02 19:01:01', '2020-01-02 19:32:00', 90),
(1003, 9001, '2020-01-02 12:01:01', '2020-01-02 12:31:01', 68),
(1001, 9002, '2020-01-02 12:01:01', '2020-01-02 12:43:01', 81),
(1001, 9005, '2020-01-02 12:11:01', null, null),
(1001, 9001, '2020-01-02 10:01:01', '2020-01-02 10:31:01', 89),
(1002, 9002, '2020-01-01 12:11:01', '2020-01-01 12:31:01', 83),
(1002, 9004, '2021-09-06 12:01:01', null, null),
(1002, 9002, '2021-05-05 18:01:01', null, null);
说明:分数归一化

假设集合A 收入的最小值和最大值分别为 12,000 美元和 98,000 美元。我们想将收入映射到范围 [1, 100]。那么73,600 美元映射到多少?
在这里插入图片描述

思路
1.统计每次高难度试卷被完成的分数以及该卷的最高最低分
2.关联试卷作答表和试卷信息表:exam_record JOIN examination_info USING(exam_id)
3.筛选做完了高难度的记录:WHERE score IS NOT NULL AND difficulty="hard"
4.按用户ID和试卷ID分组:GROUP BY uid, exam_id
5.计算归一化后分数,需分情况讨论:
如果最高分和最低分相等,即只有一个分值:直接返回原分值
否则归一化后取整:ROUND(AVG((score - min_score) / (max_score - min_score) * 100))
题解
方式一:
select 
uid,
exam_id,
if(is_score=1,mix_score,
   round(avg(((score-mix_score)/(max_score-mix_score))*100),0)) avg_new_score
from exam_record
inner join
(
select  exam_id, min(score) mix_score,max(score) max_score, if(min(score)=max(score),1,0) is_score
from exam_record 
inner join examination_info 
using(exam_id)
where difficulty='hard' and score IS NOT NULL
group by exam_id
) t1
using(exam_id)
where score is not null
group by uid,exam_id
order by exam_id,avg_new_score desc



方式二:
SELECT uid,exam_id,
ROUND(SUM(max_min)/COUNT(max_min),0) avg_new_score
FROM (
	SELECT exam_id,uid,score,
	IF(min_x=max_x,score,(score-min_x)*100/(max_x-min_x))max_min/*如果某试卷只有一个得分*/
	FROM (
		SELECT uid,a.exam_id,score,
		MIN(score) OVER(PARTITION BY exam_id) min_x, #求每类试卷的得分最小值
		MAX(score)OVER(PARTITION BY exam_id) max_x #求每类试卷的得分最大值
		FROM exam_record a
		LEFT JOIN examination_info b ON a.exam_id=b.exam_id
		WHERE difficulty='hard' #难度为'hard'
		AND score IS NOT NULL #分数不为空
		)t1
		
	)t2

GROUP BY exam_id,uid
ORDER BY exam_id,avg_new_score DESC;#按照试卷ID升序,评价分降序排序

3.每份试卷每月作答数和截止当月的作答总数

现有试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分):

iduidexam_idstart_timesubmit_timescore
1100190012020-01-01 09:01:012020-01-01 09:21:5990
2100290012020-01-20 10:01:012020-01-20 10:10:0189
3100290012020-02-01 12:11:012020-02-01 12:31:0183
4100390012020-03-01 19:01:012020-03-01 19:30:0175
5100490012020-03-01 12:01:012020-03-01 12:11:0160
6100390012020-03-01 12:01:012020-03-01 12:41:0190
7100290012020-05-02 19:01:012020-05-02 19:32:0090
8100190022020-01-02 19:01:012020-01-02 19:59:0169
9100490022020-02-02 12:01:012020-02-02 12:20:0199
10100390022020-02-02 12:01:012020-02-02 12:31:0168
11100190022020-02-02 12:01:012020-02-02 12:43:0181
12100190022020-03-02 12:11:01(NULL)(NULL)

请输出每份试卷每月作答数和截止当月的作答总数。

示例1
drop table if exists exam_record;
CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-01-01 09:01:01', '2020-01-01 09:21:59', 90),
(1002, 9001, '2020-01-20 10:01:01', '2020-01-20 10:10:01', 89),
(1002, 9001, '2020-02-01 12:11:01', '2020-02-01 12:31:01', 83),
(1003, 9001, '2020-03-01 19:01:01', '2020-03-01 19:30:01', 75),
(1004, 9001, '2020-03-01 12:01:01', '2020-03-01 12:11:01', 60),
(1003, 9001, '2020-03-01 12:01:01', '2020-03-01 12:41:01', 90),
(1002, 9001, '2020-05-02 19:01:01', '2020-05-02 19:32:00', 90),
(1001, 9002, '2020-01-02 19:01:01', '2020-01-02 19:59:01', 69),
(1004, 9002, '2020-02-02 12:01:01', '2020-02-02 12:20:01', 99),
(1003, 9002, '2020-02-02 12:01:01', '2020-02-02 12:31:01', 68),
(1001, 9002, '2020-02-02 12:01:01', '2020-02-02 12:43:01', 81),
(1001, 9002, '2020-03-02 12:11:01', null, null);

输出

9001|202001|2|2
9001|202002|1|3
9001|202003|3|6
9001|202005|1|7
9002|202001|1|1
9002|202002|3|4
9002|202003|1|5

思路
1.求每份试卷每个月的作答数,要以exam_id,开始作答时间的月份进行分组。直接获取试卷ID、作答月份。
2.对每组开始作答时间计数得到每份试卷每个月的作答数
3.以试卷分组,月份递增获取每份试卷每个月的累计和。sum() over() as cum_exam_cnt 
题解
方式一:
select 
    exam_id,
    start_month, 
    month_cnt,
    SUM(month_cnt) over(
    partition by exam_id
    order by start_month) as cum_exam_cnt
    from 
    (
    select exam_id,
        DATE_FORMAT(start_time, "%Y%m")as start_month,
        count(start_time) month_cnt
    FROM exam_record
    group by exam_id, DATE_FORMAT(start_time, "%Y%m")-- 每份试卷被作答月份
    )t

方式二:
	SELECT DISTINCT
    exam_id,
    DATE_FORMAT(start_time,'%Y%m') start_month,
    count(start_time) over(
        partition by exam_id,DATE_FORMAT(start_time,'%Y%m')
    ) month_cnt,
    count(start_time) over(
        partition by exam_id
        order by DATE_FORMAT(start_time,'%Y%m') 
    ) cum_exam_cnt
FROM
    exam_record

4. 每月及截止当月的答题情况

现有试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分):

iduidexam_idstart_timesubmit_timescore
1100190012020-01-01 09:01:012020-01-01 09:21:5990
2100290012020-01-20 10:01:012020-01-20 10:10:0189
3100290012020-02-01 12:11:012020-02-01 12:31:0183
4100390012020-03-01 19:01:012020-03-01 19:30:0175
5100490012020-03-01 12:01:012020-03-01 12:11:0160
6100390012020-03-01 12:01:012020-03-01 12:41:0190
7100290012020-05-02 19:01:012020-05-02 19:32:0090
8100190022020-01-02 19:01:012020-01-02 19:59:0169
9100490022020-02-02 12:01:012020-02-02 12:20:0199
10100390022020-02-02 12:01:012020-02-02 12:31:0168
11100190022020-01-02 19:01:012020-02-02 12:43:0181
12100190022020-03-02 12:11:01(NULL)(NULL)

请输出自从有用户作答记录以来,每月的试卷作答记录中月活用户数、新增用户数、截止当月的单月最大新增用户数、截止当月的累积用户数。结果按月份升序输出。
由示例数据结果输出如下:

start_monthmaumonth_add_uvmax_month_add_uvcum_sum_uv
2020012222
2020024224
2020033024
2020051024
说明
month1001100210031004
20200111
2020021111
202003111
2020051

由上述矩阵可以看出,2020年1月有2个用户活跃(mau=2),当月新增用户数为2;

2020年2月有4个用户活跃,当月新增用户数为2,最大单月新增用户数为2,当前累积用户数为4。

实例1
drop table if exists exam_record;
CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-01-01 09:01:01', '2020-01-01 09:21:59', 90),
(1002, 9001, '2020-01-20 10:01:01', '2020-01-20 10:10:01', 89),
(1002, 9001, '2020-02-01 12:11:01', '2020-02-01 12:31:01', 83),
(1003, 9001, '2020-03-01 19:01:01', '2020-03-01 19:30:01', 75),
(1004, 9001, '2020-03-01 12:01:01', '2020-03-01 12:11:01', 60),
(1003, 9001, '2020-03-01 12:01:01', '2020-03-01 12:41:01', 90),
(1002, 9001, '2020-05-02 19:01:01', '2020-05-02 19:32:00', 90),
(1001, 9002, '2020-01-02 19:01:01', '2020-01-02 19:59:01', 69),
(1004, 9002, '2020-02-02 12:01:01', '2020-02-02 12:20:01', 99),
(1003, 9002, '2020-02-02 12:01:01', '2020-02-02 12:31:01', 68),
(1001, 9002, '2020-02-02 12:01:01', '2020-02-02 12:43:01', 81),
(1001, 9002, '2020-03-02 12:11:01', null, null);
输出

202001|2|2|2|2
202002|4|2|2|4
202003|3|0|2|4
202005|1|0|2|4

思路
1.统计每月用户第一次出现的月份 min()over()
2.统计用户作答的次数和最大的用户作答数 
count(distinct uid) as mau,
count(distinct case when start_month=first_month then uid else null end)as month_add_uv	
3.计算用户作答月份最大的数和截止当月的累积用户数 
max(month_add_uv)over(order by start_month) max_month_add_uv,
sum(month_add_uv)over(order by start_month) cum_sum_uv
题解

    select 
    start_month,mau,month_add_uv,
    max(month_add_uv)over(order by start_month) max_month_add_uv,
    sum(month_add_uv)over(order by start_month) cum_sum_uv
    from (
    select 
        start_month,
        count(distinct uid) as mau,
        count(distinct case
              when start_month=first_month then uid
              else null end) as month_add_uv	
    from (
    select 
    uid,
    date_format(start_time,"%Y%m")as start_month,
    min(date_format(start_time,"%Y%m"))
    over(partition by uid 
    order by date_format(start_time,"%Y%m"))first_month
    from exam_record
    )t
    group by start_month
)t1

5.统计有未完成状态的试卷的未完成数和未完成率

现有试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分),数据如下:

iduidexam_idstart_timesubmit_timescore
1100190012020-01-02 09:01:012020-01-02 09:21:0180
2100190012021-05-02 10:01:012021-05-02 10:30:0181
3100190012021-09-02 12:01:01(NULL)(NULL)

请统计有未完成状态的试卷的未完成数incomplete_cnt和未完成率incomplete_rate。由示例数据结果输出如下:

exam_idincomplete_cntcomplete_rate
900110.333

解释:试卷9001有3次被作答的记录,其中两次完成,1次未完成,因此未完成数为1,未完成率为0.333(保留3位小数)

示例1
drop table if exists exam_record;
CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-01-02 09:01:01', '2020-01-02 09:21:01', 80),
(1001, 9001, '2021-05-02 10:01:01', '2021-05-02 10:30:01', 81),
(1001, 9001, '2021-09-02 12:01:01', null, null);
题解
select exam_id,
sum(if(submit_time is null,1,0))incomplete_cnt,
round(sum(if(submit_time is null,1,0))/count(exam_id),3) complete_rate
from exam_record
group by exam_id
having incomplete_cnt>0
  • 2
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值