196. Delete Duplicate Emails

一枚卷毛

Category: IV. Database (LeetCode)

Article link: https://blog.csdn.net/jm729926980/article/details/78450029

Write a SQL query to delete all duplicate email entries in a table named Person, keeping only unique emails based on its smallest Id.

+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
| 3  | john@example.com |
+----+------------------+
Id is the primary key column for this table.

For example, after running your query, the above Person table should have the following rows:

+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
+----+------------------+

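Problem summary: delete the duplicate emails. (Note that the deletion must be performed on the original table; a plain SELECT query will produce no result.)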

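My brain short-circuited today and I just could not come up with an answer to this simple problem. While searching for solutions, though, I ran into a few that I did not understand, so I will walk through them here and try to make sense of them: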

Start with the one that is easy to understand:

DELETE FROM Person WHERE Id NOT IN
(SELECT Id FROM (SELECT MIN(Id) Id FROM Person GROUP BY Email) p);

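First group the table by Email and find the smallest Id in each group, then delete every row whose Id is not in that set. The second SELECT Id FROM looks redundant, but it is there because of MySQL: MySQL does not allow a statement to modify a table and select from that same table in a subquery, which fails with the error

You can't specify target table 'Person' for update in FROM clause

so the intermediate derived table p has to be introduced.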
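The two steps described above (find the smallest Id per Email, then delete everything else) can be tried out with a small sketch. It makes one loud assumption: an in-memory SQLite database, driven from Python's standard `sqlite3` module, stands in for MySQL. SQLite does not raise the MySQL error quoted above, so the derived table p is kept only to mirror the original statement. The sample rows are the ones from the problem statement.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Email TEXT)")
conn.executemany(
    "INSERT INTO Person (Id, Email) VALUES (?, ?)",
    [(1, "john@example.com"), (2, "bob@example.com"), (3, "john@example.com")],
)

# Step 1: the smallest Id in each Email group, i.e. the rows to keep.
keep_ids = sorted(
    row[0]
    for row in conn.execute("SELECT MIN(Id) AS Id FROM Person GROUP BY Email")
)

# Step 2: delete every row whose Id is not in that set.
# The wrapper "SELECT Id FROM (...) p" is the same one the MySQL query needs.
conn.execute(
    """
    DELETE FROM Person WHERE Id NOT IN
    (SELECT Id FROM (SELECT MIN(Id) Id FROM Person GROUP BY Email) p)
    """
)

remaining = conn.execute("SELECT Id, Email FROM Person ORDER BY Id").fetchall()
print(keep_ids)   # [1, 2]
print(remaining)  # [(1, 'john@example.com'), (2, 'bob@example.com')]
```

Printing `keep_ids` before the delete makes it easy to see that the rows that survive are exactly the ones the inner GROUP BY query selected.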

In fact, the following code also works (please excuse the inconsistent capitalization):

Delete  from Person where Id in
(select Id from 
 (select p1.Id from Person p1,Person p2 where p1.Id > p2.Id and p1.Email = p2.Email) p);

But since we are joining the table with itself anyway, there is a way to avoid the subquery altogether:

Delete p2 from Person p1,Person p2 where p1.Email = p2.Email and p2.id > p1.id;

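I did not quite understand this one at first, because I had never seen DELETE followed by a table alias. This is MySQL's multi-table DELETE syntax: the name between DELETE and FROM says which table's rows are removed from the joined result. After thinking it over for a while, the explanation that holds together is: treat p2 as the original table and delete each of its rows whose Id is greater than the Id of another row with the same Email. (Maybe this could double as a deduplication tool for form fields?)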
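To see which rows p2 refers to, it can help to run the same self-join as a SELECT and look at the matched pairs. The sketch below again assumes an in-memory SQLite database in place of MySQL (SQLite has no multi-table DELETE, so only the join is shown), and it adds a hypothetical fourth row, not in the original problem, so that one email appears three times.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; only the join is shown.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Email TEXT)")
conn.executemany(
    "INSERT INTO Person (Id, Email) VALUES (?, ?)",
    [
        (1, "john@example.com"),
        (2, "bob@example.com"),
        (3, "john@example.com"),
        (4, "john@example.com"),  # hypothetical extra duplicate
    ],
)

# Every (p1, p2) pair with the same Email where p2 has the larger Id.
pairs = conn.execute(
    """
    SELECT p1.Id, p2.Id
    FROM Person p1, Person p2
    WHERE p1.Email = p2.Email AND p2.Id > p1.Id
    ORDER BY p1.Id, p2.Id
    """
).fetchall()

# The p2 side of those pairs is the set of rows the DELETE would target.
doomed_ids = sorted({p2_id for _, p2_id in pairs})
print(pairs)       # [(1, 3), (1, 4), (3, 4)]
print(doomed_ids)  # [3, 4]
```

Id 4 shows up on the p2 side twice, once against Id 1 and once against Id 3, but it is still a single row to delete. The row with the smallest Id for each email never appears on the p2 side, which is why it is the one that survives.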

Run time: solution 1 < solution 2 < solution 3.
