SQL刷题记录贴

原创已于 2025-05-20 13:23:03 修改 · 766 阅读

10 ·

CC 4.0 BY-SA版权

文章标签：

#sql #数据库 #mysql

于 2025-04-17 17:29:20 首次发布

知识点：

1.在 SQL 中，判断 NULL 的值需要使用 IS NULL 或 IS NOT NULL，而不能直接用 != null 来比较。NULL 在 SQL 中表示一个未知的值，因此不能使用普通的比较运算符（如 != 或 =）进行比较。

正确：

select device_id,gender,age,university from user_profile where age is not null;

select device_id,gender,age,university from user_profile where age!='';

select device_id,gender,age,university from user_profile where age<>'';

- **`IS`** 用于判断特殊值（如 `NULL`、`TRUE`、`FALSE` 等）。
- **`NOT LIKE`** 是模式匹配操作符，直接使用，不需要加 `IS`。

因此，像 `NOT LIKE` 这样的运算符，在用来匹配字符串时，直接使用即可，无需 `IS`。

2.group by和 order by后面不加括号，order by在group by后面,

**排序字段之间需要用逗号** `,` 分隔，而不是直接连接。

3.在 SQL 查询中，执行的顺序大致如下：

FROM: 从哪个表或视图获取数据。
WHERE: 过滤数据。
GROUP BY: 按照指定的列对数据进行分组。
HAVING: 对分组后的数据进行过滤（通常用于聚合条件）。
SELECT: 从分组后的数据中选择列。
ORDER BY: 对结果进行排序。
LIMIT: 限制返回的结果行数（如果有的话）。

所以HAVING不能使用select中的别名！

4.去重加distinct！COUNT(DISTINCT q1.device_id, q1.date) 计算的是，q1 表中 不同设备和日期的组合 的数量。

5.什么时候需要两个字段分组：

假设有以下数据：

学校 A 和学校 B 的用户分别参与了难度为 easy 和 medium 的题目。
如果我们只按学校分组，难度将无法区分；
如果只按难度分组，学校也无法区分。
因此，按学校和难度同时分组能够准确计算每个学校和每个难度下的答题统计数据。

6.根据 SQL 标准，SELECT 列表中的列要么应该包含在 GROUP BY 子句中，要么应该使用聚合函数进行处理。

7.UNION：合并多个查询的结果集，并去除重复的行。它在返回结果时会进行去重操作。

UNION ALL：合并多个查询的结果集，但不会去除重复的行。返回结果中可能包含重复的记录。

8.假设有以下字符串：'apple,banana,cherry'

我们希望提取分隔符 , 前面的第一个部分（即 apple）

SELECT SUBSTRING_INDEX('apple,banana,cherry', ',', 1) AS first_part;
我们希望提取分隔符 , 后面的最后一个部分（即 cherry）。

SELECT SUBSTRING_INDEX('apple,banana,cherry', ',', -1) AS last_part;
我们希望提取第二部分（即 banana）。

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('apple,banana,cherry,date', ',', 2), ',', -1) AS second_part;

blog_url是sql中的一个字段，举例http:/url/bisdgboy777，我怎么提取最后斜杠后的bisdgboy777？SELECT SUBSTRING_INDEX(blog_url, '/', -1) AS extracted_value
FROM your_table;

9.select substring_index(substring_index(profile,',',3),',',-1) as age
,count(age) as number from user_submit group by age;
为什么count(age)不正确，count(*)正确
`count(age)` 只会统计 `age` 列中非 `NULL` 的值

`count(*)` 计算的是所有行的数量，包括 `NULL` 和非 `NULL` 的值

10.

- **`LIKE`**：用于模式匹配，`%` 表示零个或多个任意字符。
- **`'北%'`**：这个条件会匹配所有以“北”开头的大学名，后面可以跟任意字符。

如果你使用 `=` 运算符，SQL会要求完全匹配，而不能使用通配符 `%`，因此需要改成 `LIKE`。

11.

^[1-9][0-9]{9}$：**`^`**：表示匹配字符串的开头。 **`[1-9]`**：表示第一个数字必须是 1 到 9 之间的任意一个数字（不能是 0）。 *`[0-9]{9}`**：表示接下来的 9 个字符必须是数字（0 到 9 之间的任意数字）。 - **`$`**：表示匹配字符串的结尾。 - **总结**：这部分正则表达式匹配的是一个 10 位数字的电话号码，第一位数字不能是 0。

12.

OVER (
[PARTITION BY partition_expression]
[ORDER BY order_expression]
[ROWS | RANGE frame_clause]
)

SQL窗口函数OVER详细用法，一学就会-CSDN博客

题目

day1

select * from user_profile;

day2

select device_id,gender,age,university from user_profile;

day3

select distinct university from user_profile;

day4

select device_id from user_profile limit 2;

day5

select device_id user_infos_example from user_profile limit 2;

*day6

select device_id,university from user_profile where university like '北%';

day7

select device_id,gender,age,university from user_profile where age>24;

day8

select device_id,gender,age from user_profile where age between 20 and 23;

day9

select device_id,gender,age,university from user_profile where university!='复旦大学';

*day10

select device_id,gender,age,university from user_profile where age<>'';

day11

select device_id,gender,age,university,gpa from user_profile where gender='male' and gpa>3.5;

day12

select device_id,gender,age,university,gpa from user_profile where university='北京大学' or gpa>3.7;

day13

select device_id,gender,age,university,gpa from user_profile where university in ('北京大学','复旦大学','山东大学');

day14

select device_id,gender,age,university,gpa from user_profile where (gpa>3.5 and university='山东大学') or (gpa>3.8 and university='复旦大学') order by device_id asc ;

day15

select device_id,age,university from user_profile where university like '北京%';

day16

select round(max(gpa),1) gpa from user_profile where university='复旦大学';

day17

select count(gender) male_num,round(avg(gpa),1) avg_gpa from user_profile where gender='male';

day18

select

gender,

university,

count(gender) user_num,

round(avg(active_days_within_30), 1) avg_active_day,

round(avg(question_cnt), 1) avg_question_cnt

from

user_profile

group by

gender, university

order by

gender asc,

university asc;

day19

select university,round(avg(question_cnt),3) avg_question_cnt,round(avg(answer_cnt),3) avg_answer_cnt from user_profile group by university having avg(question_cnt) <5 or avg(answer_cnt)<20 ;

day20

select university,round(avg(question_cnt),4) avg_question_cnt from user_profile group by university order by avg_question_cnt;

day21

SELECT p.device_id, p.question_id, p.result

FROM (

SELECT q.device_id, q.question_id, q.result, u.university, u.id AS user_id

FROM question_practice_detail q

JOIN user_profile u ON q.device_id = u.device_id

) p

WHERE p.university = '浙江大学'

ORDER BY p.question_id;

day22

select university,(count(question_id)/count(distinct(u.device_id))) avg_answer_cnt from user_profile u join question_practice_detail q on u.device_id=q.device_id group by university order by university ;

day23

select

u.university,

s.difficult_level,

round(

count(s.question_id) / count(distinct u.device_id),

) avg_answer_cnt

from

user_profile u

join (

select

p.device_id,

p.question_id,

q.difficult_level

from

question_practice_detail p

left join question_detail q on p.question_id = q.question_id

) s on u.device_id = s.device_id

GROUP BY

u.university,

s.difficult_level;

day24

select u.university,p.difficult_level,round(count(q.question_id)/count(distinct u.device_id),4) avg_answer_cnt

from user_profile u

join question_practice_detail q on u.device_id=q.device_id

join question_detail p on q.question_id=p.question_id

where u.university='山东大学' group by u.university,p.difficult_level;

day25

select device_id,gender,age,gpa from user_profile where university='山东大学'

union all

select device_id,gender,age,gpa from user_profile where gender='male';

day26

select (case when age>=25 then '25岁及以上' else '25岁以下' end)

as age_cut,

count(distinct device_id) number

from user_profile group by age_cut;

day27

select device_id,gender,

if(age<20,'20岁以下',

if(age between 20 and 24,'20-24岁',if(age>=25, '25岁及以上','其他')))as age_cut

from user_profile;

day28

select day(date) as day,count(question_id) as question_cnt from question_practice_detail where month(date)=8 group by date;

day29

select count(distinct q2.device_id,q2.date)/count(distinct q1.device_id,q1.date) as avg_ret

from question_practice_detail as q1 left join

question_practice_detail as q2

on q1.device_id = q2.device_id

and datediff(q2.date,q1.date)=1;

day30

select substring_index(profile,',',-1) as gender ,count(device_id)

from user_submit

group by gender;

day31

select device_id,

substring_index(blog_url,'/',-1) as user_name

from user_submit;

day32

select substring_index(substring_index(profile,',',3),',',-1) as age

,count(*) as number from user_submit group by age;

day33

select device_id,university,gpa

from user_profile

where (university,gpa) in

(select university,min(gpa) as gpa from user_profile group by university)

order by university;

day34

select u.device_id,u.university,

sum(case when q.result is null then 0 else 1 end) as question_cnt,

sum(case when q.result="right" then 1 else 0 end) as right_question_cnt

from

user_profile u left join question_practice_detail q

on u.device_id=q.device_id where u.university="复旦大学"

and (month(q.date) =8 or month(q.date) is null)

group by u.device_id;

day35

select d.difficult_level as difficult_level ,

sum(case when q.result="right" then 1 else 0 end)/count(d.question_id ) as correct_rate

from user_profile as u

join question_practice_detail as q on u.device_id=q.device_id

join question_detail as d on q.question_id=d.question_id

where u.university="浙江大学"

group by d.difficult_level

order by correct_rate asc;

day36

select device_id,age from user_profile order by age asc;

day37

select device_id,gpa,age from user_profile

order by gpa asc , age asc;

day38

select device_id,gpa,age from user_profile order by gpa desc,age desc;

day39

select count(distinct device_id) did_cnt,

count(question_id) question_cnt

from question_practice_detail where month(date)=8;

day40

select * from contacts

where phone_number regexp '^[1-9][0-9]{9}$|^[1-9][0-9]{2}-[0-9]{3}-[0-9]{4}'

order by id;

day41

select *,sum(profit) over(order by profit_date) as cumulative_profit

from daily_profits;

day42

select *,

abs(value) absolute_value,

ceil(value) ceiling_value,

floor(value) floor_value,

round(value,1) rounded_value

from numbers

order by id;