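Uploading data to the platform had been failing for three or four days, with tens of thousands of record errors a day. It scared me to death. I kept hunting for the cause and even got chewed out by my manager. In the end the culprit turned out to be duplicate rows in the extracted data, which made the backend jar throw errors on every run. I was at my wits' end, spent a full day researching it, and finally figured it out.
I didn't really understand how the rowid column is used, so I asked an expert: rowid is a pseudocolumn that acts like an ID identifying each row.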
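To make the rowid idea concrete, here is a minimal sketch for Oracle. The table `demo_rowid` and its columns are hypothetical names made up for illustration; they are not from the original post.

```sql
-- Hypothetical demo table (names are assumptions for illustration only).
CREATE TABLE demo_rowid (id NUMBER, name VARCHAR2(20));

INSERT INTO demo_rowid VALUES (1, 'a');
INSERT INTO demo_rowid VALUES (1, 'a');  -- exact duplicate of the row above
INSERT INTO demo_rowid VALUES (2, 'b');

-- ROWID is a pseudocolumn: SELECT * does not return it,
-- so it has to be named explicitly in the select list.
SELECT ROWID, id, name FROM demo_rowid;
```

The two duplicate rows have identical column values but different rowids, which is what lets the statements in this post keep one copy and remove the others.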
Note: if you are new to this, back up the table first before running anything.
**--------Quick backup statement-----**
create table table1 as select * from table2 ;
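In the backup statement, `table1` is the new backup copy and `table2` is the existing table. A quick sanity check after the backup might look like the sketch below. Note that `CREATE TABLE ... AS SELECT` copies the data and column definitions but not indexes, primary or foreign keys, or triggers, so the copy is a data backup only. The restore steps are an assumption about how you might roll back, shown commented out on purpose.

```sql
-- Compare row counts to confirm the backup copied every row.
SELECT (SELECT COUNT(*) FROM table1) AS backup_rows,
       (SELECT COUNT(*) FROM table2) AS source_rows
FROM dual;

-- One possible way to restore if a later delete goes wrong
-- (assumes no foreign keys or triggers interfere; uncomment to use):
-- DELETE FROM table2;
-- INSERT INTO table2 SELECT * FROM table1;
-- COMMIT;
```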
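**SQL statements for finding and deleting duplicate records**
1. Find the duplicate records in a table, where duplicates are determined by a single column (Id)
select * from table_name where Id in (select Id from table_name group by Id having count(Id) > 1)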
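2. Delete the duplicate records in a table, where duplicates are determined by a single column (Id), keeping only the record with the smallest rowid
DELETE from table_name WHERE (id) IN ( SELECT id FROM table_name GROUP BY id HAVING COUNT(id) > 1) AND ROWID NOT IN (SELECT MIN(ROWID) FROM table_name GROUP BY id HAVING COUNT(*) > 1);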
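The single-column delete can also be written more compactly. The sketch below is an alternative formulation, not the original author's statement, and `demo_single` is a hypothetical table used only for illustration. Dropping the `HAVING` clause means the subquery returns the smallest rowid of every group, including ids that occur once, so those rows are kept as well.

```sql
-- Hypothetical demo table (names are assumptions for illustration only).
CREATE TABLE demo_single (id NUMBER, name VARCHAR2(20));
INSERT INTO demo_single VALUES (1, 'a');
INSERT INTO demo_single VALUES (1, 'a');
INSERT INTO demo_single VALUES (1, 'a');
INSERT INTO demo_single VALUES (2, 'b');

-- Keep the smallest rowid for each id and delete every other row.
DELETE FROM demo_single
WHERE ROWID NOT IN (SELECT MIN(ROWID) FROM demo_single GROUP BY id);

-- Check the result: each id should now have cnt = 1.
SELECT id, COUNT(*) AS cnt FROM demo_single GROUP BY id ORDER BY id;

-- Nothing is permanent until COMMIT; use ROLLBACK to undo the delete.
ROLLBACK;
```

One behavioral difference to be aware of: rows whose id is NULL are grouped together by `GROUP BY`, so this version reduces them to a single row, while the `id IN (...)` form never matches NULL and leaves such rows alone.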
3. Find the duplicate records in a table (duplicates determined by multiple columns)
select * from table_name a where (a.Id,a.seq) in(select Id,seq from table_name group by Id,seq having count(*) > 1)
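4. Delete the duplicate records in a table (duplicates determined by multiple columns), keeping only the record with the smallest rowid
delete from table_name a where (a.Id,a.seq) in (select Id,seq from table_name group by Id,seq having count(*) > 1) and rowid not in (select min(rowid) from table_name group by Id,seq having count(*)>1)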
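Another common way to express the multi-column delete in Oracle is a correlated subquery that compares rowids directly. This is a sketch of that alternative under stated assumptions; `demo_multi` and its columns are hypothetical.

```sql
-- Hypothetical demo table (names are assumptions for illustration only).
CREATE TABLE demo_multi (id NUMBER, seq NUMBER, name VARCHAR2(20));
INSERT INTO demo_multi VALUES (1, 1, 'a');
INSERT INTO demo_multi VALUES (1, 1, 'a');
INSERT INTO demo_multi VALUES (1, 1, 'a');
INSERT INTO demo_multi VALUES (1, 2, 'b');

-- Delete a row when another row with the same (id, seq) has a smaller rowid,
-- which leaves exactly the smallest-rowid row of each group.
DELETE FROM demo_multi a
WHERE a.ROWID > (SELECT MIN(b.ROWID)
                 FROM demo_multi b
                 WHERE b.id = a.id
                   AND b.seq = a.seq);

-- Check the result: each (id, seq) pair should now have cnt = 1.
SELECT id, seq, COUNT(*) AS cnt FROM demo_multi GROUP BY id, seq ORDER BY id, seq;

ROLLBACK;  -- undo the delete while experimenting
```

Because the comparison uses `=`, rows with NULL in id or seq never match and are left untouched. On a large table, an index on (id, seq) will likely help the correlated lookup.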
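5. Find the duplicate records in a table (duplicates determined by multiple columns), excluding the record with the smallest rowid
select * from table_name a where (a.Id,a.seq) in (select Id,seq from table_name group by Id,seq having count(*) > 1) and rowid not in (select min(rowid) from table_name group by Id,seq having count(*)>1)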
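The "extra" copies can also be listed with the `ROW_NUMBER()` analytic function, which numbers the rows inside each duplicate group. This is an alternative sketch rather than the method used above; `demo_rn` and its columns are hypothetical.

```sql
-- Hypothetical demo table (names are assumptions for illustration only).
CREATE TABLE demo_rn (id NUMBER, seq NUMBER, name VARCHAR2(20));
INSERT INTO demo_rn VALUES (1, 1, 'a');
INSERT INTO demo_rn VALUES (1, 1, 'a');
INSERT INTO demo_rn VALUES (1, 1, 'a');
INSERT INTO demo_rn VALUES (2, 1, 'b');

-- Number the rows within each (id, seq) group in rowid order.
-- rn = 1 is the row to keep; rn > 1 marks the extra copies.
SELECT rid, id, seq, name
FROM (
  SELECT t.ROWID AS rid, t.id, t.seq, t.name,
         ROW_NUMBER() OVER (PARTITION BY t.id, t.seq ORDER BY t.ROWID) AS rn
  FROM demo_rn t
)
WHERE rn > 1;
```

Reviewing this result before deleting is a cheap safety check: the `rid` values returned are the rows a keep-the-smallest-rowid delete would remove.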
Reposted from: https://www.cnblogs.com/252e/archive/2012/09/13/2682817.html