多说两句:最近工作在做旧数据的导入,按照要求要去重复,所以总结了一下,写下第一篇博客来简单记录,这只能在数据库层面来操作,并且是单表,无关联,而实际,唉,写程序处理外键关系,在写程序删除。
需求:根据某个字段,去除重复记录。例如:mermber表,name字段存在重复值,有多个张三存在,要求只保留最新添加的一条记录,删除其他多余重复的记录;
id | create_time | name |
---|---|---|
1 | 2017-12-16 | 张三 |
2 | 2017-11-15 | 张三 |
3 | 2017-11-12 | 李四 |
实现:
- 查询重复数据sql
//查询重复数据sql
select * from member where name in (select name from member group by name having(count(*))>1);
结果:
id | create_time | name |
---|---|---|
1 | 2017-12-16 | 张三 |
2 | 2017-11-15 | 张三 |
- 删除重复记录sql
//删除重复数据sql
delete from member where name in
(select name from member group by name having count(name) > 1)
and
create_time in (select min(create_time) from member group by name having count(name)>1);
结果:
id | create_time | name |
---|---|---|
1 | 2017-12-16 | 张三 |
3 | 2017-11-12 | 李四 |
- 重复数据多条
id | create_time | name |
---|---|---|
1 | 2017-12-16 | 张三 |
2 | 2017-11-15 | 张三 |
2 | 2017-11-16 | 张三 |
3 | 2017-11-12 | 李四 |
sql实现:
//删除多条重复数据sql
delete from member where name in
(select name from member group by name having count(name) > 1)
and
create_time not in (select max(create_time) from member group by name having count(name)>1);
结果:
id | create_time | name |
---|---|---|
1 | 2017-12-16 | 张三 |
3 | 2017-11-12 | 李四 |
注:数据无价,生产环境批量删除操作,请先做好备份。