文章目录
练习一: 各部门工资最高的员工
创建Employee 表,包含所有员工信息,每个员工有其对应的 Id, salary 和 department Id。
create table Employee
( Id integer(4) not null,
employee_name varchar(128) not null,
salary integer(128) not null,
departmentid integer(4) not null,
primary key(ID)
);
insert into Employee values(1,'Joe',70000,1);
insert into Employee values(2,'Henry',80000,2);
insert into Employee values(3,'Sam',60000,2);
insert into Employee values(4,'Max',90000,1);
创建Department 表,包含公司所有部门的信息
create table department
(Id integer(4) not null,
department_name varchar(128) not null,
primary key(Id)
);
insert into department values(1,'IT');
insert into department values(2,'Sales');
编写一个 SQL 查询,找出每个部门工资最高的员工。例如,根据上述给定的表格,Max 在 IT 部门有最高工资,Henry 在 Sales 部门有最高工资。
解题思路:
- 根据employee表中的departmentid与department表中的Id一致,采用关联子查询。
- 由于group by语句后必须和select语句后的字段一致(聚合函数内的字段除外),故引入employee as
p2作为新的查询表格,来获取employee_name信息,并通过最高工资来限制查询到的记录。注:个人认为本方法并非是最简便的方法。
select p1.department_name as Department,
p2.employee_name as Employee,
p3.max_salary as Salary
from department as p1, employee as p2,
(select max(salary)as max_salary,departmentid
from employee
group by departmentid)as p3
where p3.departmentid = p1.Id
and p2.salary = p3.max_salary;
练习二: 换座位
小美是一所中学的信息科技老师,她有一张 seat 座位表,平时用来储存学生名字和与他们相对应的座位 id。
其中纵列的id是连续递增的
小美想改变相邻俩学生的座位。
你能不能帮她写一个 SQL query 来输出小美想要的结果呢?
参考:【LeetCode】626.换座位
创建seat表:
create table seat
(id integer(4) not null,
student varchar(128) not null,
primary key(id));
insert into seat values(1,'Abbot');
insert into seat values(2,'Doris');
insert into seat values(3,'Emerson');
insert into seat values(4,'Green');
insert into seat values(5,'Jemes');
解题思路:
- 所有偶数行前移1
- 所有奇数行后移1
- 最后一行为奇数行,不做移动
select p.id ,p.student
from(
select id-1 as id,student from seat where mod(id,2)=0
union
select id+1 as id,student from seat where mod(id,2)=1 and id != (select count(*) from seat)
union
select id,student from seat where id = (select count(*) from seat)
)as p
order by id;
练习三: 分数排名
编写一个 SQL 查询来实现分数排名。如果两个分数相同,则两个分数排名(Rank)相同。请注意,平分后的下一个名次应该是下一个连续的整数值。换句话说,名次之间不应该有“间隔”。
创建以下score表:
create table score
(Id integer(4) not null,
Score float(64,2) not null,
primary key(Id));
insert into score values(1,3.50);
insert into score values(2,3.65);
insert into score values(3,4.00);
insert into score values(4,3.85);
insert into score values(5,4.00);
insert into score values(6,3.65);
解题思路:
- 根据排序不跳过重复后的次序,故采用DENSE_RANK专用窗口函数
- 根据分数进行降序排序
select Score, DENSE_RANK() OVER (ORDER BY Score desc) AS Rank1
from score
练习四:连续出现的数字
编写一个 SQL 查询,查找所有至少连续出现三次的数字。
创建log表:
create table log
(Id integer(4) not null,
Num integer(4) not null,
primary key(Id));
insert into log values(1, 1);
insert into log values(2, 1);
insert into log values(3, 1);
insert into log values(4, 2);
insert into log values(5, 1);
insert into log values(6, 2);
insert into log values(7, 2);
解题思路:
- 由于需要查询连续三次出现的数字,故第一想法是采用窗口函数。
- 先查询出当前位置Num值,再依次查询出当前位置加1的值Num2与当前位置加2处的值Num3。
- 进行比较,当Num=Num2=Num3时,取出此时相等的Num值,即为至少连续出现三次的数字。
补充窗口函数的知识:MySQL操作实战(二):窗口函数
select p.Num as ConsecutiveNums
from (select Id,Num,
lead(Num,1) over ()as Num2,
lead(Num,2) over ()as Num3
from log)as p
where p.Num = p.Num2
and p.Num = p.Num3;
练习五:树节点
对于tree表,id是树节点的标识,p_id是其父节点的id。
创建表tree:
create table tree
(id integer(4) not null,
p_id integer(4),
primary key(id));
insert into tree values(1,null);
insert into tree values(2,1);
insert into tree values(3,1);
insert into tree values(4,2);
insert into tree values(5,2);
解题思路:
- 查询结果分三种情况,root、inner、leaf,故采用case查询语句
- 当父节点为空时,该节点为根节点root
- 当节点没有子节点,即不在p_id字段的节点,即为叶节点leaf
- 其它节点即为中间节点Inner
注:
not in 用法补充
如果 not in 后面跟的是子查询,子查询中只要包含一个 null 的返回值,则会造成 整个 not in 字句返回空值,查询不会返回任何结果。
详情参考:Mysql 用 not in 不返回任何结果的原因及解决方案
select id,
case when p_id is null then 'Root'
when id not in(select p_id from tree as p1 where p_id is not null) then 'Leaf'
else 'Inner'
end as Type
from tree
练习六:至少有五名直接下属的经理
Employee表包含所有员工及其上级的信息。每位员工都有一个Id,并且还有一个对应主管的Id(ManagerId)。
创建员工信息表employee5:
create table employee5
(id integer(4) not null,
employee_name varchar(128) not null,
department varchar(4) not null,
managerid integer(4),
primary key(id));
insert into employee5 values(101,'John','A',null);
insert into employee5 values(102,'Dan','A',101);
insert into employee5 values(103,'James','A',101);
insert into employee5 values(104,'Amy','A',101);
insert into employee5 values(105,'Anne','A',101);
insert into employee5 values(106,'Ron','B',101);
针对Employee表,写一条SQL语句找出有5个下属的主管
解题思路:
- 采用聚合函数count来统计各主管对应下属人数
- 使用关联子查询,通过count统计的结果大于等于5锁定主管信息
select p1.employee_name
from employee5 as p1,
(select managerid,count(managerid) as managerid_count
from employee5
group by managerid)as p3
where p1.id = p3.managerid
and p3.managerid_count >=5;
练习七: 分数排名
练习三的分数表,实现排名功能,但是排名需要是非连续的,如下:
解题思路:
- 根据排序不连续,故采用RANK专用窗口函数
- 根据分数进行降序排序
select Score, RANK() OVER (ORDER BY Score desc) AS Rank1
from score
练习八:查询回答率最高的问题
求出survey_log表中回答率最高的问题,表格的字段有:uid, action, question_id, answer_id, q_num, timestamp。
uid是用户id;action的值为:“show”, “answer”, “skip”;当action是"answer"时,answer_id不为空,相反,当action是"show"和"skip"时为空(null);q_num是问题的数字序号。
写一条sql语句找出回答率最高的问题。
创建survey_log表格:
create table survey_log
(uid integer(4) not null,
action varchar(128) not null,
question_id integer(4) not null,
answer_id integer(64),
q_num integer(4) not null,
timestamp integer(4),
primary key(uid,action,question_id));
insert into survey_log values(5,'show',285,null,1,123);
insert into survey_log values(5,'answer',285,124124,1,124);
insert into survey_log values(5,'show',369,null,2,125);
insert into survey_log values(5,'skip',369,null,2,126);
解题思路:
- 自己的想法:
拿到问题的第一反应是进行分组,分别统计出各个问题action各个类别的数目,再进行计算回答率进行排序。
效率低下,代码量冗余。因此借鉴如下编程思路:- SQL学习笔记——task06:SQL综合练习
以问题id(question_id)进行分组,然后分别计算各个问题(分组)的回答情况,进行降序排序,输出第一行记录,即为最高回答率。注:
回答率=action为(answer) 个数除以action为(show)个数。
select question_id as survey_log
from survey_log
group by question_id
order by count(answer_id)/count(action='show')desc
limit 1;
练习九:各部门前3高工资的员工
将项目7中的employee表清空,重新插入以下数据(其实是多插入5,6两行):
重新创建employee表:
create table employee7
(id integer(4) not null,
name varchar(128) not null,
salary integer(64) not null,
departmentid integer(4) not null,
primary key(id));
insert into employee7 values(1,'Joe',70000,1);
insert into employee7 values(2,'Henry',80000,1);
insert into employee7 values(3,'Sam',60000,1);
insert into employee7 values(4,'Max',90000,1);
insert into employee7 values(5,'Janet',69000,1);
insert into employee7 values(6,'Randy',85000,1);
解题思路:
- 分别查询各部门员工工资逆序排序结果
- 取各部门工资前三的记录的并,得到结果
注:也可以使用连结函数join,由于此处学习并不太熟悉,后续补充。
(select p2.department_name as Department,
p1.name as Employee,
p1.salary as Salary
from employee7 as p1,
department as p2
where p1.departmentid = 1
and p1.departmentid = p2.Id
order by Salary desc
limit 3)
union
(select p5.department_name as Department,
p4.name as Employee,
p4.salary as Salary
from employee7 as p4,
department as p5
where p4.departmentid = 2
and p4.departmentid = p5.Id
order by Salary desc
limit 3)
tips:
- 两集合相并(union)时,order by只能有一个,且写在最后,否则就会报错,但union前后集合用括号括起来后不存在此问题。
- limit函数同上。
练习十:平面上最近距离
point_2d表包含一个平面内一些点(超过两个)的坐标值(x,y)。
写一条查询语句求出这些点中的最短距离并保留2位小数。
创建point_2d表:
create table point_2d
(x integer(8) not null,
y integer(8) not null,
primary key(x,y));
insert into point_2d values(-1,-1);
insert into point_2d values(0,0);
insert into point_2d values(-1,-2);
最短距离是1,从点(-1,-1)到点(-1,-2)。所以输出结果为:1
解题思路:
- 计算任意两点之间的距离(使用两个完全相同的表格),采用round,sqrt,power函数
- 相同点不须进行距离的计算
- 对计算结果进行排序,找出距离最短的点
代码如下:
select p1.x,p1.y,p2.x,p2.y,
round(sqrt(power(p1.x-p2.x,2)+power(p1.y-p2.y,2)),2) as shortest
from point_2d as p1,point_2d as p2
where p1.x != p2.x
or p1.y != p2.y
order by shortest
运行结果如下:
分析:
本方法两点之间的距离均会进行两次运算,效率低下。 解决办法:后续有时间再做补充。
练习十一:行程和用户
Trips 表中存所有出租车的行程信息。每段行程有唯一键 Id,Client_Id 和 Driver_Id 是 Users 表中 Users_Id 的外键。Status 是枚举类型,枚举成员为 (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’)。
创建表trips:
create table trips
(id integer(4) not null,
client_id integer(4) not null,
driver_id integer(4) not null,
city_id integer(4) not null,
status varchar(64) not null,
request_at date,
primary key(id));
insert into trips values(1,1,10,1,'completed',20131001);
insert into trips values(2,2,11,1,'cancelled_by_driver',20131001);
insert into trips values(3,3,12,6,'completed',20131001);
insert into trips values(4,4,13,6,'cancelled_by_client',20131001);
insert into trips values(5,1,10,1,'completed',20131002);
insert into trips values(6,2,11,6,'completed',20131002);
insert into trips values(7,3,12,6,'completed',20131002);
insert into trips values(8,2,12,12,'completed',20131003);
insert into trips values(9,3,10,12,'completed',20131003);
insert into trips values(10,4,13,12,'cancelled_by_driver',20131003);
创建用户表users:
drop table if exists users;
create table users
(users_id integer(4) not null,
banned varchar(4) not null,
role varchar(32) not null,
primary key(users_id));
insert into users values(1,'No','client');
insert into users values(2,'Yes','client');
insert into users values(3,'No','client');
insert into users values(4,'No','client');
insert into users values(10,'No','driver');
insert into users values(11,'No','driver');
insert into users values(12,'No','driver');
insert into users values(13,'No','driver');
写一段 SQL 语句查出2013年10月1日至2013年10月3日期间非禁止用户的取消率。基于上表,你的 SQL 语句应返回如下结果,取消率(Cancellation Rate)保留两位小数。
解题思路:
- 表trips与表users进行关联,查询所有非禁止用户的出租车行程信息
- 通过request_at进行分组
- 通过count(if())语句,分别统计各日期非禁止用户取消订单数和各日期总订单数,并计算比率,保留两位小数。
SQL语句:
select request_at,
round((count(if(p.status='cancelled_by_driver',true,null))+count(if(p.status='cancelled_by_client',true,null)))/count(p.status),2)as 'Cancellation Rate'
from (select t.id,t.client_id,t.driver_id,t.status,t.request_at,u.banned_users_id
from trips as t,(select users_id as banned_users_id from users where banned='Yes')as u
where t.client_id != u.banned_users_id
and t.driver_id != u.banned_users_id) as p
group by request_at
补充:
MYSQL中数据类型介绍
MySQL操作实战(二):窗口函数
Mysql 用 not in 不返回任何结果的原因及解决方案
详细题目列表详见:DataWhale 组队学习