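When working with SQL, we often need to delete duplicate rows without deleting all of them: one row from each group of duplicates has to be kept. Here two rows count as duplicates when both uid and qid match (in practice the check can be extended to more columns).
For example: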
id | uid | qid |
---|---|---|
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 2 |
4 | 2 | 2 |
5 | 3 | 3 |
6 | 2 | 2 |
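In the table above, the rows with id 3, 4, and 6 are duplicates of one another. We want to keep one of them and delete the rest. The steps are as follows:
Select the duplicated groups
SELECT MIN(id) AS id, uid, qid, COUNT(*)
FROM A GROUP BY uid, qid
HAVING COUNT(*) > 1;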
The result:
id | uid | qid | count(*) |
---|---|---|---|
3 | 2 | 2 | 3 |
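To try this step without a database server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite rather than the engine used in this post, and `make_table` and `duplicate_groups` are hypothetical helper names introduced only for illustration.

```python
import sqlite3

# Sample rows (id, uid, qid) copied from the example table.
ROWS = [(1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 2, 2)]


def make_table():
    """Create an in-memory copy of table A (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, uid INTEGER, qid INTEGER)")
    conn.executemany("INSERT INTO A VALUES (?, ?, ?)", ROWS)
    return conn


def duplicate_groups(conn):
    """Return (smallest id, uid, qid, row count) for each duplicated (uid, qid)."""
    return conn.execute(
        "SELECT MIN(id), uid, qid, COUNT(*) FROM A "
        "GROUP BY uid, qid HAVING COUNT(*) > 1"
    ).fetchall()


print(duplicate_groups(make_table()))  # [(3, 2, 2, 3)]
```

Selecting MIN(id) instead of a bare id keeps the query valid in engines that reject non-aggregated columns outside the GROUP BY list.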
Use IN to list every row that belongs to a duplicated group
SELECT *
FROM A
WHERE (uid, qid) IN
(SELECT A.`uid`,A.`qid`
FROM A
GROUP BY A.`uid`,A.`qid`
HAVING COUNT(*)>1)
The result:
id | uid | qid |
---|---|---|
3 | 2 | 2 |
4 | 2 | 2 |
6 | 2 | 2 |
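Not every engine accepts the row-value form `(uid, qid) IN (...)`. As a possible alternative, the same rows can be found by joining A to the list of duplicated groups. The sketch below assumes SQLite through Python's sqlite3 module; the alias `dup` is a name chosen here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, uid INTEGER, qid INTEGER)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 2, 2)],
)

# Join each row to the (uid, qid) pairs that occur more than once.
rows_in_dup_groups = conn.execute(
    "SELECT A.id, A.uid, A.qid FROM A "
    "JOIN (SELECT uid, qid FROM A GROUP BY uid, qid HAVING COUNT(*) > 1) AS dup "
    "ON A.uid = dup.uid AND A.qid = dup.qid "
    "ORDER BY A.id"
).fetchall()

print(rows_in_dup_groups)  # [(3, 2, 2), (4, 2, 2), (6, 2, 2)]
```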
Select the rows to delete
SELECT *
FROM A
WHERE (uid, qid) IN
(SELECT `uid`,`qid`
FROM A
GROUP BY A.`uid`,A.`qid`
HAVING COUNT(*) > 1)
AND id NOT IN
(SELECT MIN(id)
FROM A
GROUP BY A.`uid`,A.`qid`
HAVING COUNT(*) > 1) ;
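This keeps the row with the smallest id in each duplicated group and selects all the others (here, the rows with id 4 and 6).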
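The two conditions can arguably be collapsed into one. If MIN(id) is taken over every (uid, qid) group, including groups with a single row, then any row whose id is not one of those minimums must be a surplus duplicate. A minimal sketch of that simplification, assuming SQLite through Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, uid INTEGER, qid INTEGER)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 2, 2)],
)

# Keep the smallest id of every (uid, qid) group; everything else is surplus.
to_delete = conn.execute(
    "SELECT id, uid, qid FROM A "
    "WHERE id NOT IN (SELECT MIN(id) FROM A GROUP BY uid, qid) "
    "ORDER BY id"
).fetchall()

print(to_delete)  # [(4, 2, 2), (6, 2, 2)]
```

This relies on id being unique and never NULL; with NULL ids, NOT IN would behave differently.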
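Delete the rows
-- Create a staging table (MySQL cannot DELETE from A while a subquery in the same statement reads A)
CREATE TABLE F(id INTEGER, uid INTEGER, qid INTEGER);
-- Insert the rows to be deleted into the staging table
INSERT INTO F
SELECT * FROM A
WHERE (uid, qid) IN
(SELECT `uid`,`qid` FROM A GROUP BY A.`uid`,A.`qid` HAVING COUNT(*) > 1)
AND id NOT IN
(SELECT MIN(id) FROM A GROUP BY A.`uid`,A.`qid` HAVING COUNT(*) > 1);
-- Delete from A every row listed in the staging table
DELETE FROM A WHERE id IN (SELECT id FROM F);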
SELECT * FROM A;
Result
The result is as follows:
id | uid | qid |
---|---|---|
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 2 |
5 | 3 | 3 |
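The whole deletion can also be sketched as a single statement, without a staging table, by wrapping the list of ids to keep in a derived table. The example below was written against SQLite through Python's sqlite3 module. The derived-table wrapper is the usual way to make such a statement acceptable to MySQL too, but that is an assumption here, not something verified. The names `keep` and `keep_id` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, uid INTEGER, qid INTEGER)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 2, 2)],
)

# Delete every row whose id is not the smallest id of its (uid, qid) group.
conn.execute(
    "DELETE FROM A WHERE id NOT IN ("
    "  SELECT keep_id FROM ("
    "    SELECT MIN(id) AS keep_id FROM A GROUP BY uid, qid"
    "  ) AS keep"
    ")"
)
conn.commit()

remaining = conn.execute("SELECT id, uid, qid FROM A ORDER BY id").fetchall()
print(remaining)  # [(1, 1, 1), (2, 1, 2), (3, 2, 2), (5, 3, 3)]
```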
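Note: if you do not need to keep one row from each group, the task is much simpler and a single SQL statement is enough. The inner SELECT is wrapped in a derived table because MySQL does not allow the subquery of a DELETE to read directly from the table being deleted:
DELETE FROM A WHERE (uid,qid) IN (SELECT uid,qid FROM (SELECT uid,qid FROM A GROUP BY uid,qid HAVING COUNT(*)>1) AS dup);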
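As a quick check of the "delete every duplicate" variant, the sketch below runs it against the sample data. It assumes SQLite 3.15 or later (needed for row-value IN) through Python's sqlite3 module; `dup` is an illustrative alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, uid INTEGER, qid INTEGER)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 2, 2)],
)

# Remove every row of any (uid, qid) group that has more than one row.
conn.execute(
    "DELETE FROM A WHERE (uid, qid) IN ("
    "  SELECT uid, qid FROM ("
    "    SELECT uid, qid FROM A GROUP BY uid, qid HAVING COUNT(*) > 1"
    "  ) AS dup"
    ")"
)
conn.commit()

survivors = conn.execute("SELECT id, uid, qid FROM A ORDER BY id").fetchall()
print(survivors)  # [(1, 1, 1), (2, 1, 2), (5, 3, 3)]
```

No row with (uid, qid) = (2, 2) survives, which is the difference from the keep-one procedure.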