一方法:
查询表中重复数据
select * from employee where employeeId in (select employeeId from employee group by employeeId having count(employeeId) > 1)
删除表中多余的重复记录,重复记录是根据单个字段(employeeId)来判断,只留有rowid最小的记录
delete from employee where employeeId in (select employeeId from employee group by employeeId having count(employeeId) > 1)
and rowid not in (select min(rowid) from employee group by employeeId having count(employeeId )>1)
查找表中多余的重复记录(多个字段)
select * from employee e
where (e.employeeId,e.phoneNo) in (select employeeId,phoneNo from employee group by employeeId,phoneNo having count(*) > 1)
删除表中多余的重复记录(多个字段),只留有rowid最小的记录
delete from employee e
where (e.employeeId, e.phoneNo) in (select employeeId, phoneNo from employee group by employeeId,phoneNo having count(*) > 1)
and rowid not in (select min(rowid) from employee group by employeeId,phoneNo having count(*)>1)
查找表中多余的重复记录(多个字段),不包含rowid最小的记录
select * from employee e
where (e.employeeId,e.phoneNo) in (select employeeId, phoneNo from employee group by employeeId,phoneNo having count(*) > 1)
and rowid not in (select min(rowid) from employee group by employeeId,phoneNo having count(*)>1)
通用:
delete from table t
where (t.字段1, t.字段2, … , t.字段n) in (select 字段1, 字段2, … , 字段n from table group by 字段1, 字段2, … , 字段n having count(*) > 1)
and rowid not in (select min(rowid) from table group by 字段1, 字段2, … , 字段n having count(*)>1)
此外:
如果只是查询的时候, 不显示重复, 只需select distinct 字段 from table….
---------------------
二删除方法:DELETE FROM hr.employees t1
WHERE t1.ROWID NOT IN (
SELECT MIN(t2.ROWID)
FROM hr.employees t2
GROUP BY t2.employee_id --按照想要唯一保留的字段进行分组
);
这个明显就比方法一好多了,子查询中我们先选除了rowid,然后按照我们想要保留的唯一字段进行分组,并取每组最小的rowid(注意是子查询表的rowid);然后在用not in删除除开最小的rowid以外的所有记录。
怎么样,这个方法是不是瞬间解决并且非常好理解?但是你以为这样就结束了?no no no
三删除方法:DELETE FROM hr.employees t1 WHERE t1.rowid > ( SELECT MIN(t2.rowid) FROM hr.employees t2 WHERE t1.employee_id = t2.employee_id --按照想要唯一保留的字段进行匹配 );
这个方式看起来和方法二差不多,但是想要说的是,他用的是连接,他用的是连接,不敢说连接一定比group by快,但是基本上不会输group by,而且在一般的情况下也是最快的了。而且外层的">"可以用到索引,就是各种快。
方法也同样说一下,子查询中按照要保留的字段对t1和t2进行关联,然后选择出最小的rowid(注意是子查询表的rowid),然后在外层用">"只保留每个匹配结果最小的一条记录。然后就瞬间删除重复的记录