创建测试表:
-- 创建test2
CREATE TABLE TEST2
(
ID NUMBER(8) PRIMARY KEY NOT NULL,
NAME CHAR(8) NOT NULL,
AGE NUMBER(8) NULL,
SEX CHAR(2) NULL,
HOBBY VARCHAR2(200) NULL
);
insert into TEST2 (ID, NAME, AGE, SEX, HOBBY)
values (7, '武器大师', 14, '男', '打野');
insert into TEST2 (ID, NAME, AGE, SEX, HOBBY)
values (8, '嫦娥 ', 14, '女', '打野');
insert into TEST2 (ID, NAME, AGE, SEX, HOBBY)
values (9, '猴子 ', 14, '男', '吃桃');
insert into TEST2 (ID, NAME, AGE, SEX, HOBBY)
values (1, '齐天大圣', 14, '男', '打野');
insert into TEST2 (ID, NAME, AGE, SEX, HOBBY)
values (2, '猪八戒 ', 48, '男', '吃货');
insert into TEST2 (ID, NAME, AGE, SEX, HOBBY)
values (5, '紫霞仙子', 22, '女', '至尊宝');
insert into TEST2 (ID, NAME, AGE, SEX, HOBBY)
values (6, '哇哈哈 ', 34, '女', '睡觉');
**
在这里我以age和hobby两者都重复作为重复条件
**
1、要删除的数据量比较小的话,直接查询出重复的数据,然后delete删除,2个方法:
方法1:
select a.* from test2 a where (a.age,a.hobby) in
(select age,hobby from test2 group by age,hobby having count(1)>1)
and
a.rowid not in
(select min(rowid) from test2 group by age,hobby having count(1)>1);
方法2,利用rowid进行比较删除:
select m.* from test2 m where m.rowid>
(select min(rowid) from test2 n where m.age=n.age and m.hobby=n.hobby);
2、数据量大的话,不但很费时间,还有可能导致数据库卡死,这个时候可以通过创建临时表解决:
create table temp_table
as
select * from
having count(条件)>1 --查询重复的数据,并只显示一条
union all
select * from……
having count(条件)=1
例如:
create table temp_table
as
select * from test2 aa where aa.rowid in --注意这里是in,不是not in
(select min(rowid) from test2 group by age,hobby having count(1)>1)
--上半部分是查询重复数据,并显示一条
union all
select * from test2 where (age,hobby) in
(select age,hobby from test2 group by age,hobby having count(1)=1);
--下半部分是查询非重复数据
另一种是利用比较rowid的方法,上面已经写过,如下,变换一下:
select * from test2 h where h.rowid in
(select min(rowid) from test2 j where h.age=j.age and h.hobby=j.hobby)
-- 直接查询去重之后的数据
以上是将去重的数据放入创建的临时表中,然后清空数据重复的表:
truncate table test2;
接着将临时表的数据转到目标表:
insert into test2 select * from temp_table;
最后删除临时表:
drop table temp_table(临时表);
最后的效果:
开始的状态: