Oracle处理重复数据问题

问题描述:

现有表A和表B为父子表,有外键关联。A表主键为a_id,主要字段有a_name、a_class,B表主键为b_id,与A表关联字段为a_id。现要根据A表的a_name去重,并且把A表的a_class字段进行列合并,并更新到A表中。

解决方法:
  1. 将A表中的数据备份:
create table A_bak as select * from A;
  1. 删除子表B表中和A表重复数据关联的数据:
delete from B b
 where exists (select a.a_id
          from A a
         where a. a_name in
               (select aa. a_name
                  from A aa
                 group by aa. a_name
                having count(aa. a_name) > 1)
           and rowid not in (select min(rowid)
                               from A aaa
                              group by aaa. a_name
                             having count(*) > 1)
           and b.a_id = a.a_id);
  1. 删除A表中的重复数据:
delete from A a
 where a.a_name in (select aa.a_name
                           from A aa
                          group by aa.a_name
                         having count(aa.a_name) > 1)
   and rowid not in (select min(rowid)
                       from A aaa
                      group by aaa.a_name
                     having count(*) > 1);
  1. 修改A表a_class为列合并去重后的数据:
update A a
   set a.a_class =
       (select dd.a_class
          from (select bak.a_name,
                       wmsys.wm_concat(distinct bak.a_class) as a_class
                  from A_bak bak
                 group by bak.a_name) dd
         where dd.a_name = a.a_name);
总结:

1、查找表中多余的重复记录,重复记录是根据单个字段(Id)来判断

select *
  from 表
 where Id in (select Id from 表 group byId having count(Id) > 1)

2、删除表中多余的重复记录,重复记录是根据单个字段(Id)来判断,只留有rowid最小的记录

delete from 表
 where (id) in (select id from 表 group by id having count(id) > 1)
   and rowid not in
       (select min(rowid) from 表 group by id having count(*) > 1);

3、查找表中多余的重复记录(多个字段)

select *
  from 表 a
 where (a.Id, a.seq) in
       (select Id, seq from 表 group by Id, seq having count(*) > 1)

4、删除表中多余的重复记录(多个字段),只留有rowid最小的记录

delete from 表 a
 where (a.Id, a.seq) in
       (select Id, seq from 表 group by Id, seq having count(*) > 1)
   and rowid not in
       (select min(rowid) from 表 group by Id, seq having count(*) > 1)

5、查找表中多余的重复记录(多个字段),不包含rowid最小的记录

select *
  from 表 a
 where (a.Id, a.seq) in
       (select Id, seq from 表 group by Id, seq having count(*) > 1)
   and rowid not in
       (select min(rowid) from 表 group by Id, seq having count(*) > 1)
  1. 去重再列合并
select a_name,wmsys.wm_concat(distinct a_class) as a_class from A group by a_name;

如有问题,敬请指正!

总结部分内容转载自https://www.cnblogs.com/zfox2017/p/7676237.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值