count和distinct配合使用

牛客SQL题 SQL126 平均活跃天数和月活人数

题目

用户在牛客试卷作答区作答记录存储在表exam_record中,内容如下:
exam_record表(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分)
(和原题的数据相比我加了一条数据)
在这里插入图片描述
请计算2021年每个月里试卷作答区用户平均月活跃天数avg_active_days和月度活跃人数mau,上面数据的示例输出如下:

monthavg_active_daysmau
2021071.502
20210091.254

解释:2021年7月有2人活跃,共活跃了3天,其中1001活跃1天(2021-07-02),1002活跃2天(2021-07-02和2021-07-05,2021-07-02当天活跃两次),平均活跃天数1.5;2021年9月有4人活跃,共活跃了5天,平均活跃天数1.25,结果保留2位小数。
注:此处活跃指有交卷行为。
在这里插入图片描述

建表语句放在这里,方便大家测试
drop table if exists exam_record;
CREATE TABLE exam_record (
    id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid int NOT NULL COMMENT '用户ID',
    exam_id int NOT NULL COMMENT '试卷ID',
    start_time datetime NOT NULL COMMENT '开始时间',
    submit_time datetime COMMENT '提交时间',
    score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2021-07-02 09:01:01', '2021-07-02 09:21:01', 80),
(1002, 9001, '2021-09-05 19:01:01', '2021-09-05 19:40:01', 81),
(1002, 9002, '2021-09-02 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', null, null),
(1002, 9001, '2021-07-02 19:01:01', '2021-07-02 19:30:01', 82),
(1002, 9002, '2021-07-05 18:01:01', '2021-07-05 18:59:02', 90),
(1003, 9002, '2021-07-06 12:01:01', null, null),
(1003, 9003, '2021-09-07 10:01:01', '2021-09-07 10:31:01', 86),
(1004, 9003, '2021-09-06 12:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 81),
(1005, 9001, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 88),
(1006, 9002, '2021-09-02 12:11:01', '2021-09-02 12:31:01', 89),
(1007, 9002, '2020-09-02 12:11:01', '2020-09-02 12:31:01', 89),
(1002, 9003, '2021-07-02 20:01:01', '2021-07-02 20:30:01', 60);
分析

根据题目要求需要筛选出2021年的记录;活跃指有交卷行为,所以分数不为null
先写个大致结构

select
    as month,
    as avg_active_days,
    as mau
from exam_record
where year(submit_time) = 2021 and submit_time is not null

查询结果是2021年每个月里试卷作答区用户平均月活跃天数avg_active_days和月度活跃人数mau,所以要分组group by,查询结果第一列是年月可以使用date_format函数

select
    date_format(submit_time, '%Y%m') as month,
    as avg_active_days,
    as mau
from exam_record
where year(submit_time) = 2021 and submit_time is not null
group by date_format(submit_time, '%Y%m')

查询结果第三列是月度活跃人数count去重后的uid即可

select
    date_format(submit_time, '%Y%m') as month,
    as avg_active_days,
    count(distinct uid) as mau
from exam_record
where year(submit_time) = 2021 and submit_time is not null
group by date_format(submit_time, '%Y%m')

剩下第二列用户平均月活跃天数,需要知道月度活跃人数共计活跃天数
月度活跃人数结果的第三列count(distinct uid)可以求出
共计活跃天数可以使用uid和时间两个约束来求

count(distinct uid, date_format(submit_time, '%Y%m%d'))

原本满足条件的有4条数据:
(1001, 2021-07-02)
(1002, 2021-07-02)
(1002, 2021-07-05)
(1002, 2021-07-02)
distinct的作用是对数据库表中一个或者多个字段重复的数据进行过滤,只返回其中的一条数据给用户,distinctuid年月日之后剩下3条数据:
(1001, 2021-07-02)
(1002, 2021-07-02)
(1002, 2021-07-05)
在这里插入图片描述
用户平均月活跃天数 = 共计活跃天数 / 月度活跃人数
结果保留两位小数

round(count(distinct uid, date_format(submit_time, '%Y%m%d'))/count(distinct uid), 2)

最终代码:

select
    date_format(submit_time, '%Y%m') as month,
    round(count(distinct uid, date_format(submit_time, '%Y%m%d'))/count(distinct uid), 2) as avg_active_days,
    count(distinct uid) as mau
from exam_record
where year(submit_time) = 2021 and submit_time is not null
group by date_format(submit_time, '%Y%m')
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值