嵌套查询和连接查询的效率_LeetCode 查找：groupby+having/嵌套查询/连接

最新推荐文章于 2023-07-14 10:25:15 发布

weixin_39603908

最新推荐文章于 2023-07-14 10:25:15 发布

阅读量808

点赞数

文章标签：嵌套查询和连接查询的效率

leetcode 182. 查找重复的电子邮箱

题目：编写一个 SQL 查询，查找 Person表中所有重复的电子邮箱。

Person表：

Create table If Not Exists Person (Id int, Email varchar(255))
Truncate table Person
insert into Person (Id, Email) values ('1', 'a@b.com')
insert into Person (Id, Email) values ('2', 'c@d.com')
insert into Person (Id, Email) values ('3', 'a@b.com')

示例：

+----+---------+
| Id | Email   |
+----+---------+
| 1  | a@b.com |
| 2  | c@d.com |
| 3  | a@b.com |
+----+---------+

根据以上输入，你的查询应返回以下结果：

+---------+
| Email   |
+---------+
| a@b.com |
+---------+

说明：所有电子邮箱都是小写字母。

解法1：having配合GROUP BY

SELECT Email FROM Person GROUP BY Email having COUNT(id) > 1 ;

解法2：distinct配合自连接

SELECT distinct p1.Email from Person p1,Person p2 where p1.Email = p2.Email and p1.id != p2.id;

解法3：嵌套查找

select Email from (select Email,count(*) as num from Person group by Email) as a where a.num>1;

leetcode 183. 从不订购的客户

题目：某网站包含两个表，Customers表和Orders表。编写一个 SQL 查询，找出所有从不订购任何东西的客户。

Customers 表：

+----+-------+
| Id | Name  |
+----+-------+
| 1  | Joe   |
| 2  | Henry |
| 3  | Sam   |
| 4  | Max   |
+----+-------+

Orders 表：

+----+------------+
| Id | CustomerId |
+----+------------+
| 1  | 3          |
| 2  | 1          |
+----+------------+

例如给定上述表格，你的查询应返回：

+-----------+
| Customers |
+-----------+
| Henry     |
| Max       |
+-----------+

解法1：嵌套查询

SELECT Name as Customers FROM Customers WHERE Id not in (SELECT distinct(CustomerId) from Orders);

解法2：左连接

select  a.name as customers from customers a  
left join Orders b 
on a.id=b.customerid
where b.customerid is null

leetcode: 197. 上升的温度

题目：给定一个 Weather 表，编写一个 SQL 查询，来查找与之前（昨天的）日期相比温度更高的所有日期的 Id。

+---------+------------------+------------------+
| Id(INT) | RecordDate(DATE) | Temperature(INT) |
+---------+------------------+------------------+
|       1 |       2015-01-01 |               10 |
|       2 |       2015-01-02 |               25 |
|       3 |       2015-01-03 |               20 |
|       4 |       2015-01-04 |               30 |
+---------+------------------+------------------+

例如，根据上述给定的 Weather 表格，返回如下 Id:

+----+
| Id |
+----+
|  2 |
|  4 |
+----+

解法1：自连接配合DATEDIFF

SELECT w2.Id 
FROM Weather w1,Weather w2 
WHERE w2.Temperature > w1.Temperature 
    AND DATEDIFF(w2.RecordDate,w1.RecordDate)=1;

解法2：内连接配合DATEDIFF

SELECT
    a.id AS 'Id'
FROM
    Weather a
        JOIN
    Weather b ON DATEDIFF(a.RecordDate, b.RecordDate) = 1
        AND a.Temperature > b.Temperature;

leetcode:596. 超过5名学生的课

题目：有一个courses 表，有: student (学生) 和 class (课程)。

请列出所有超过或等于5名学生的课。

例如,表:

+---------+------------+
| student | class      |
+---------+------------+
| A       | Math       |
| B       | English    |
| C       | Math       |
| D       | Biology    |
| E       | Math       |
| F       | Computer   |
| G       | Math       |
| H       | Math       |
| I       | Math       |
+---------+------------+

应该输出:

+---------+
| class   |
+---------+
| Math    |
+---------+

Note:
学生在每个课中不应被重复计算。

解法：GROUP BY 配合 having

注意题目可能会多次出现同一个学生选了同一课，因此要加distinct

SELECT class FROM courses GROUP BY class having count(distinct student) >= 5;

where,having与 group by连用的区别

select 列a,聚合函数 from 表名 where 过滤条件 group by 列a having 过滤条件

group by 字句也和where条件语句结合在一起使用。当结合在一起时，where在前，group by 在后。即先对select xx from xx的记录集合用where进行筛选，然后再使用group by 对筛选后的结果进行分组使用having字句对分组后的结果进行筛选

需要注意having和where的用法区别：

1.having只能用在group by之后，对分组后的结果进行筛选(即使用having的前提条件是分组)。

2.where肯定在group by 之前

3.where后的条件表达式里不允许使用聚合函数，而having可以。

leetcode: 626. 换座位

小美是一所中学的信息科技老师，她有一张 seat 座位表，平时用来储存学生名字和与他们相对应的座位 id。

其中纵列的 id 是连续递增的

小美想改变相邻俩学生的座位。

你能不能帮她写一个 SQL query 来输出小美想要的结果呢？

示例：

+---------+---------+
|    id   | student |
+---------+---------+
|    1    | Abbot   |
|    2    | Doris   |
|    3    | Emerson |
|    4    | Green   |
|    5    | Jeames  |
+---------+---------+

假如数据输入的是上表，则输出结果如下：

+---------+---------+
|    id   | student |
+---------+---------+
|    1    | Doris   |
|    2    | Abbot   |
|    3    | Green   |
|    4    | Emerson |
|    5    | Jeames  |
+---------+---------+

注意：

如果学生人数是奇数，则不需要改变最后一个同学的座位。

解法1：CASE WHEN

改变id，如果id是奇数，并且是最后一个同学，就直接返回；如果id是奇数，就将id+1；如果id是偶数，就就id-1。交换id，相当于是交换student。

SELECT (CASE 
            WHEN MOD(id,2) = 1 AND id = (SELECT COUNT(*) FROM seat) THEN id
            WHEN MOD(id,2) = 1 THEN id+1
            ElSE id-1
        END) AS id, student
FROM seat
ORDER BY id;

解法2：CASE WHEN

同上，用id&1来判断id的二进制数最后一位是否为1，是1就表示为奇数，否则为偶数

select (case
            when id & 1 = 1 and id = (select count(*) from seat) then id
            when id & 1 = 1 then id + 1
            else id -1
        end) as id, student
from seat
order by id

解法3：IF

思路跟上面一样，这里用了IF判断，<>表示不等于

SELECT IF(id%2<>0, IF(id<>(SELECT COUNT(*) FROM seat), id+1, id), id-1) id,student 
  FROM seat
 ORDER BY id

解法4 IF NULL

select s1.id as id, IFNULL(s2.student,s1.student) as student 
from seat as s1 
     left join seat as s2 
on (s1.id%2=1 and (s1.id+1)=s2.id) 
    or (s2.id%2=1 and (s2.id+1)=s1.id) order by s1.id asc

IFNULL(expr1,expr2)

如果expr1不是NULL，IFNULL()返回expr1，否则它返回expr2。IFNULL()返回一个数字或字符串值
具体用法如：现有学生表（tbl_student）和分数表（score）,查询学生表的所有字段和学生相对于的英语成绩（english_score）sql如下：
select stu.*,IFNULL(score.english_score,0) from tbl_student stu,tbl_score score where 1=1 and stu.stu_id=score.stu_id
以上sql中，如果score表中的english_score有值，则IFNULL(score.english_score,0)就显示english_score，否则，显示0

leetcode：180. 连续出现的数字

题目：编写一个 SQL 查询，查找所有至少连续出现三次的数字。

+----+-----+
| Id | Num |
+----+-----+
| 1  |  1  |
| 2  |  1  |
| 3  |  1  |
| 4  |  2  |
| 5  |  1  |
| 6  |  2  |
| 7  |  2  |
+----+-----+

例如，给定上面的 Logs 表， 1 是唯一连续出现至少三次的数字。

+-----------------+
| ConsecutiveNums |
+-----------------+
| 1               |
+-----------------+

解法1：自连接

三张表连接，要求Id连续，num相同

SELECT distinct l1.Num ConsecutiveNums
FROM Logs l1 JOIN Logs l2 on l1.id = l2.id-1
JOIN Logs l3 on l1.Id = l3.Id - 2
WHERE l1.num = l2.num AND l2.num = l3.num;

注意：上面那种写法有一个致命问题，连续N个语句，而且表Id不一定是连续的，通过判断Id是否相差1是有问题的（测试用例中就有数据是不连续的，只是不小于3次的情况下正确，如果不小于4次就会计算错误了）

解法2：OVER

SELECT DISTINCT Num ConsecutiveNums FROM
  (SELECT Num,(RowNo - ROW_NUMBER() OVER (PARTITION BY Num ORDER BY RowNo)) Rank FROM 
	(SELECT Num,ROW_NUMBER() OVER (ORDER BY Id) RowNo FROM Logs) A ) B
GROUP BY Num,Rank
HAVING COUNT(1) >=3

over不能单独使用，要和分析函数：rank(),dense_rank(),row_number()等一起使用。

其参数：over（partition by columnname1 order by columnname2）
含义：按columname1指定的字段进行分组排序，或者说按字段columnname1的值进行分组排序。
例如：employees表中，有两个部门的记录：department_id ＝10和20
select department_id，rank（） over（partition by department_id order by salary) from employees就是指在部门10中进行薪水的排名，在部门20中进行薪水排名。如果是partition by org_id，则是在整个公司内进行排名。

解法3：变量

初始化临时表中变量，要先执行t，再执行logs中的case when then

select distinct Num as ConsecutiveNums
from (
  select Num, 
    case 
      when @prev = Num then @count := @count + 1
      when (@prev := Num) is not null then @count := 1
    end as CNT
  from Logs, (select @prev := null,@count := null) as t
) as temp
where temp.CNT >= 3

与自关联或自连接相比，这种方法的效率更高，不受Logs表中的Id是否连续的限制，而且可以任意设定某个值连续出现的次数。

leetcode:184. 部门工资最高的员工

题目：Employee 表包含所有员工信息，每个员工有其对应的 Id, salary 和 department Id。

+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Henry | 80000  | 2            |
| 3  | Sam   | 60000  | 2            |
| 4  | Max   | 90000  | 1            |
+----+-------+--------+--------------+

Department 表包含公司所有部门的信息。

+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+

编写一个 SQL 查询，找出每个部门工资最高的员工。例如，根据上述给定的表格，Max 在 IT 部门有最高工资，Henry 在 Sales 部门有最高工资。

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| Sales      | Henry    | 80000  |
+------------+----------+--------+

解法1：内连接配合MAX

SELECT d.Name AS Department,
       e.Name AS Employee,
       e.Salary AS Salary
FROM Employee AS e 
INNER JOIN Department AS d 
ON e.DepartmentId = d.Id
AND e.Salary >= (SELECT MAX(DISTINCT Salary) FROM Employee WHERE DepartmentId = d.Id)

解法2：内连接配合GROUPY BY，MAX

select 
    Department.name AS Department,
    Employee.name AS Employee,
    Salary
from 
    Employee
        Join
    Department on Employee.DepartmentId = Department.Id
where 
    (Employee.DepartmentId, Salary) IN
    (
        select 
            DepartmentId,  MAX(Salary)
        from
            Employee
        group by DepartmentId
    )

leetcode:601. 体育馆的人流量

X 市建了一个新的体育馆，每日人流量信息被记录在这三列信息中：序号 (id)、日期 (date)、 人流量 (people)。

请编写一个查询语句，找出高峰期时段，要求连续三天及以上，并且每天人流量均不少于100。

例如，表 stadium：

+------+------------+-----------+
| id   | date       | people    |
+------+------------+-----------+
| 1    | 2017-01-01 | 10        |
| 2    | 2017-01-02 | 109       |
| 3    | 2017-01-03 | 150       |
| 4    | 2017-01-04 | 99        |
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-08 | 188       |
+------+------------+-----------+

对于上面的示例数据，输出为：

+------+------------+-----------+
| id   | date       | people    |
+------+------------+-----------+
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-08 | 188       |
+------+------------+-----------+

Note:
每天只有一行记录，日期随着 id 的增加而增加。

解法1：

（1）s1.id-s2.id=1,s2.id-s3.id=1，相当于s3 s2 s1 的顺序三个连续的

（2）s2.id-s1.id=1,s1.id-s3.id=1，相当于s3 s1 s2 的顺序三个连续的

（3）s3.id-s2.id=1,s2.id-s1.id=1，相当于s1 s2 s3 的顺序三个连续的

select distinct s1.*
from stadium s1, stadium s2, stadium s3
where s1.people >= 100 and s2.people>= 100 and s3.people >= 100
and
(
    (s1.id - s2.id = 1 and s2.id - s3.id =1)
   or
   (s2.id - s1.id = 1 and s1.id - s3.id =1) 
   or
   (s3.id - s2.id = 1 and s2.id - s1.id = 1) 
) order by s1.id;

leetcode：177. 第N高的薪水

题目：编写一个 SQL 查询，获取 Employee 表中第 n 高的薪水（Salary）。

+----+--------+
| Id | Salary |
+----+--------+
| 1  | 100    |
| 2  | 200    |
| 3  | 300    |
+----+--------+

例如上述 Employee 表，n = 2 时，应返回第二高的薪水 200。如果不存在第 n 高的薪水，那么查询应返回 null。

+------------------------+
| getNthHighestSalary(2) |
+------------------------+
| 200                    |
+------------------------+

解法1：IF NULL配合limit

第 n 高的薪水，则需要逆序排列，取第 n 个即可。而在程序里面我们要这样写：desc limit n-1,1意思是逆序排列从下标为 n-1 的元素开始取1个。当我们输入 n 为2的时候，第2高的薪水，就要选择一个下标为 1 的属性Salary。

注意这里的limit n-1,1相当于limit 1 offset n-1

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
set N=N-1;
  RETURN (
      select IFNULL((select distinct Salary from Employee order by Salary desc
            limit N,1),NULL) as getNthHighestSalary
  );
END

ps：在sql语句中，limit关键字是最后才用到的。以下条件的出现顺序一般是：where->group by->having-order by->limit

参考文献：

【1】where,having与 group by连用的区别

【2】【LeetCode】601.体育馆的人流量

【3】MySQL的Limit详解