连续出现的数字
解法:
1、比较直接,不考虑效率的做法是,直接从MySQL出发:
因为题目要求了是连续出现三次以上的,所以中间不可以有出现过其他的数子。因而我们可以视为对一个表进行了三次的查询,也就是可以假设有三张完全一样的表存在。
select distinct a.Num as ConsecutiveNums from Logs as a,Logs as b,Logs as c
where a.Num=b.Num and b.Num = c.Num and a.id=b.id+1 and b.id=c.id+1;
2、利用用户变量实现对连续出现的值进行计数:
select distinct Num as ConsecutiveNums from(
select Num ,
case
when @num = Num then @count := @count +1
when (@num :=Num) is not null then @count := 1
end as CN
from Logs,(select @num := null,@count :=null) as t
) as temp
where temp.CN >=3
与自关联或自连接相比,这种方法的效率更高,不受Logs表中的Id是否连续的限制,而且可以任意设定某个值连续出现的次数。
分析:
1)逻辑:构建两个变量@num
和@count
,前者用于与Num
做比较判断,后者用于@num
和Num
相等时的条件计数;
2)(select @num := null,@count :=null) as t
这句的作用是初始化两个变量,并将初始化后的变量放到一张临时表t
中,:=
符号在MySQL中是赋值的意思;
3)when @num = Num then @count := @count +1
和when (@num :=Num) is not null then @count := 1
这两个语句不能交换顺序,赋值语句永远非NULL,所以一旦执行顺序来到了第二个when,@count 是一定会被赋值为1的,后者放到前面的话就达不到计数的目的;
4)(@num := Num) is not null
这部分去掉后面加的判断,SQL也能正常执行,上面SQL中case when的这种用法,when后是判断条件,赋值后又加判断,我原以为这样会好理解点;
5)case when
本质是一个函数,有值时就返回内部处理得到的值,无值就返回NULL,针对每一个Num
,上面SQL中的case when 都会有一个计数,并把这个计数返回给CN
。
PS:MySQL8.0以后的版本开始支持窗口函数,使用窗口函数也能很好的解决此类问题。关于MySQL的窗口函数,可以参考译文:https://blog.csdn.net/qq_41080850/article/details/86416106
超过经理收入的员工
#自链接
select e1.Name as Employee from Employee e1, Employee e2 where e1.ManagerId =e2.Id and e1.Salary > e2.Salary;
#子链接
#先查询出经理的工资,在查询出比经理工资高的员工
select e.Name as Employee from Employee e where Salary >
(select Salary from Employee where id = e.ManagerId)
查找重复的电子邮箱
#解法一,对一张表看做两张一模一样的表
select distinct a.Email from Person as a,Person as b where a.Email=b.Email and a.Id != b.Id;
#解法二:使用group by与having相结合,先分组再选择
select Email from Person group by Email having count(Email)>1
#解法三;先依据email 使用 group by 进行分组,用count(1) 计数 存入一个临时的表 c ,然后再以c.num>1为条件查询出所要的。
select Email from (select count(1) as num,Email from Person group by Email) c where c.num >1;
#解法四:使用join(或者是inner join)
select distinct(p1.Email) from Person p1 join Person p2 on p1.Email=p2.Email and p1.Id != p2.Id;
解法一的速度是最快的。
从不订购的客户
#498ms
select c.name as Customers from Customers c left join Orders o on o.CustomerId =c.id where o.id is null;
#532ms
select c.Name as Customers from Customers c where not exists (select 1 from Orders o where o.CustomerId = c.Id);
# 455ms
select c.Name as Customers from Customers c where c.Id not in (select distinct o.CustomerId from Orders o);
分析:
其中上图黑色框里的sql解决的问题是:不在表里的数据,也就是在表A里的数据,但是不在表B里的数据。
部门工资最高的员工
解法
方法一:
先从员工表里查询工资最高的员工所在的部门与工资,接着再联合两张表一起查询。具体的查询条件是e.DepartmentId = d.id and (e.salary,e.DepartmentId)
在子查询的结果中。
select
d.name as Department, e.name as Employee,e.salary as salary
from
Employee e, Department d
where
e.DepartmentId = d.id
and
(e.salary,e.DepartmentId)
in
(select max(salary),DepartmentId from Employee group by DepartmentId);
方法二:
使用递归集合
select D.Name as Department,E.name as Employee,E.Salary
from Employee as E cross join Department as D
where E.DepartmentId=D.id
and exists(select null
from Employee as P
where P.DepartmentId=E.DepartmentId
and E.Salary<=P.Salary
having count(distinct P.Salary)=1)
方法三:
使用left join
,效率要高于方法一。
select t3.name department, t2.name Employee, salary
from (
select departmentid, max(salary) max_salary
from Employee
group by departmentid
) t1
left join Employee t2
on t1.departmentid = t2.departmentid and t1.max_salary = t2.salary
left join department t3
on t1.departmentid = t3.id
方法四:
将最大工资表查询出来加入连接,共连接三表
SELECT D.Name AS Department, E.Name AS Employee, E.Salary
FROM Employee E INNER JOIN Department D ON E.DepartmentId = D.Id
INNER JOIN (SELECT DepartmentId, MAX(Salary) AS MaxSalary
FROM Employee
GROUP BY DepartmentId ) S ON E.DepartmentId = S.DepartmentId
WHERE E.Salary = MaxSalary
当然,也还可以使用窗口函数来获取最高,第二高等的工资,具体如下:
- 每个部门最高
-- 每个部门最高
SELECT S.NAME, S.EMPLOYEE, S.SALARY
FROM (SELECT D.NAME,
T.NAME EMPLOYEE,
T.SALARY,
ROW_NUMBER() OVER(PARTITION BY T.DEPARTMENTID ORDER BY T.SALARY DESC) RN
FROM EMPLOYEE T
LEFT JOIN DEPARTMENT D
ON T.DEPARTMENTID = D.ID) S
WHERE S.RN = 1
- 每个部门前2高
-- 每个部门前2高
SELECT S.NAME, S.EMPLOYEE, S.SALARY
FROM (SELECT D.NAME,
T.NAME EMPLOYEE,
T.SALARY,
ROW_NUMBER() OVER(PARTITION BY T.DEPARTMENTID ORDER BY T.SALARY DESC) RN
FROM EMPLOYEE T
LEFT JOIN DEPARTMENT D
ON T.DEPARTMENTID = D.ID) S
WHERE S.RN <= 2
- 每个部门第一第三高
-- 每个部门第一第三高
SELECT S.NAME, S.EMPLOYEE, S.SALARY
FROM (SELECT D.NAME,
T.NAME EMPLOYEE,
T.SALARY,
ROW_NUMBER() OVER(PARTITION BY T.DEPARTMENTID ORDER BY T.SALARY DESC) RN
FROM EMPLOYEE T
LEFT JOIN DEPARTMENT D
ON T.DEPARTMENTID = D.ID) S
WHERE S.RN = 1 OR S.RN = 3
部门工资前三高的所有员工
这个是一个经典的TOPN问题
对于分组内取前几名的问题,可以先group by然后用having count()来筛选
在这个题中,我们可以找每个部门的工资前三名,那么先在子查询中用Employee和自己做连接,连接条件是【部门相同但是工资比我高】,那么接下来按照having count(Salary) <= 2来筛选,筛选的原理是:如果【跟我一个部门而且工资比我高的人数】不超过2个,那么我一定是部门工资前三,这样内层查询可以查询出所有符合要求的员工ID,接下来外层查询就简单了。
解法
select
d.name as Department, e.name as Employee,e.salary as salary
from Employee as e left join Department as d on e.DepartmentId =d.id
where e.id in(
select e1.id from Employee as e1 left join Employee as e2
on e1.DepartmentId = e2.DepartmentId and e1.salary<e2.salary
group by e1.id
having count(distinct e2.salary)<=2
)
and e.DepartmentId in (select id from Department)
order by d.id asc, e.salary desc;
使用 JOIN 和子查询
解析:公司里前 3 高的薪水意味着有不超过 3 个工资比这些值大。所以我们可以统计有多少人的工资比 e1.Salary
高,然后就是把表 Employee 和表 Department 连接来获得部门信息。
SELECT
d.Name AS 'Department', e1.Name AS 'Employee', e1.Salary
FROM
Employee e1
JOIN
Department d ON e1.DepartmentId = d.Id
WHERE
3 > (SELECT
COUNT(DISTINCT e2.Salary)
FROM
Employee e2
WHERE
e2.Salary > e1.Salary
AND e1.DepartmentId = e2.DepartmentId
);
窗口函数
select Department, Employee, Salary
from
(
select
D.Name as Department,
E.Name as Employee,
E.Salary as Salary,
dense_rank() over(partition by D.Name order by E.Salary desc) as dense_rank_
from Employee E join Department D on E.DepartmentId = D.Id
) as tmp
where dense_rank_ <= 3
窗口函数的一些使用技巧
同分不同名-连续:row_number()
同分同名-连续:dense_rank()
同分同名-不连续:rank()