问题描述:
现有表A和表B为父子表,有外键关联。A表主键为a_id,主要字段有a_name、a_class,B表主键为b_id,与A表关联字段为a_id。现要根据A表的a_name去重,并且把A表的a_class字段进行列合并,并更新到A表中。
解决方法:
- 将A表中的数据备份:
create table A_bak as select * from A;
- 删除子表B表中和A表重复数据关联的数据:
delete from B b
where exists (select a.a_id
from A a
where a. a_name in
(select aa. a_name
from A aa
group by aa. a_name
having count(aa. a_name) > 1)
and rowid not in (select min(rowid)
from A aaa
group by aaa. a_name
having count(*) > 1)
and b.a_id = a.a_id);
- 删除A表中的重复数据:
delete from A a
where a.a_name in (select aa.a_name
from A aa
group by aa.a_name
having count(aa.a_name) > 1)
and rowid not in (select min(rowid)
from A aaa
group by aaa.a_name
having count(*) > 1);
- 修改A表a_class为列合并去重后的数据:
update A a
set a.a_class =
(select dd.a_class
from (select bak.a_name,
wmsys.wm_concat(distinct bak.a_class) as a_class
from A_bak bak
group by bak.a_name) dd
where dd.a_name = a.a_name);
总结:
1、查找表中多余的重复记录,重复记录是根据单个字段(Id)来判断
select *
from 表
where Id in (select Id from 表 group byId having count(Id) > 1)
2、删除表中多余的重复记录,重复记录是根据单个字段(Id)来判断,只留有rowid最小的记录
delete from 表
where (id) in (select id from 表 group by id having count(id) > 1)
and rowid not in
(select min(rowid) from 表 group by id having count(*) > 1);
3、查找表中多余的重复记录(多个字段)
select *
from 表 a
where (a.Id, a.seq) in
(select Id, seq from 表 group by Id, seq having count(*) > 1)
4、删除表中多余的重复记录(多个字段),只留有rowid最小的记录
delete from 表 a
where (a.Id, a.seq) in
(select Id, seq from 表 group by Id, seq having count(*) > 1)
and rowid not in
(select min(rowid) from 表 group by Id, seq having count(*) > 1)
5、查找表中多余的重复记录(多个字段),不包含rowid最小的记录
select *
from 表 a
where (a.Id, a.seq) in
(select Id, seq from 表 group by Id, seq having count(*) > 1)
and rowid not in
(select min(rowid) from 表 group by Id, seq having count(*) > 1)
- 去重再列合并
select a_name,wmsys.wm_concat(distinct a_class) as a_class from A group by a_name;
如有问题,敬请指正!