Deleting Duplicate Records in SQL

zztfj

Article link: https://blog.csdn.net/zztfj/article/details/3105679

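After studying SQL for a while, I noticed that a test table I had created (with no indexes, so nothing enforced uniqueness) had accumulated many duplicate records. I later collected several ways of deleting them. In Oracle, duplicates can be removed by using each row's unique rowid, or by going through a temporary table ... This post covers only a few simple, practical methods, which I hope are useful to others (the table employee serves as the example).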

SQL> desc employee
Name                                      Null?    Type
----------------------------------------- -------- ------------------
emp_id                                             NUMBER(10)
emp_name                                           VARCHAR2(20)
salary                                             NUMBER(10,2)

1. The duplicate records can be queried with the following statements:

SQL> select * from employee;
    EMP_ID EMP_NAME                                     SALARY
---------- ---------------------------------------- ----------
         1 sunshine                                      10000
         1 sunshine                                      10000
         2 semon                                         20000
         2 semon                                         20000
         3 xyz                                           30000
         2 semon                                         20000

SQL> select distinct * from employee;
    EMP_ID EMP_NAME                                     SALARY
---------- ---------------------------------------- ----------
         1 sunshine                                      10000
         2 semon                                         20000
         3 xyz                                           30000

SQL> select emp_id, emp_name, salary from employee
     group by emp_id, emp_name, salary having count(*) > 1;
    EMP_ID EMP_NAME                                     SALARY
---------- ---------------------------------------- ----------
         1 sunshine                                      10000
         2 semon                                         20000

SQL> select * from employee e1
     where e1.rowid in (select max(e2.rowid) from employee e2
     where e1.emp_id=e2.emp_id and
     e1.emp_name=e2.emp_name and e1.salary=e2.salary);
    EMP_ID EMP_NAME                                     SALARY
---------- ---------------------------------------- ----------
         1 sunshine                                      10000
         3 xyz                                           30000
         2 semon                                         20000
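
To see how many copies each duplicated row has, the GROUP BY query above can be extended with a count. This is a minimal sketch that assumes the employee table and sample rows shown earlier; the column alias copies is only illustrative.

```sql
-- List each duplicated row once, together with its number of copies.
select emp_id, emp_name, salary, count(*) as copies
from employee
group by emp_id, emp_name, salary
having count(*) > 1
order by emp_id;
```

With the six sample rows, this should report 2 copies of (1, sunshine, 10000) and 3 copies of (2, semon, 20000).
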
2. Several ways to delete the duplicates:
(1) Using a temporary table
SQL> create table temp_emp as (select distinct * from employee);

SQL> truncate table employee;  -- empty the employee table

SQL> insert into employee select * from temp_emp;  -- copy the deduplicated rows back
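
The steps above leave two loose ends: the inserted rows are not yet committed, and temp_emp still exists. The following is a hedged sketch of the finishing steps, assuming temp_emp is the helper table created above and is no longer needed afterwards. The aliases employee_rows and temp_rows are illustrative.

```sql
-- Confirm that every deduplicated row was copied back before dropping anything.
select (select count(*) from employee) as employee_rows,
       (select count(*) from temp_emp) as temp_rows
from dual;

-- If the two counts match, make the insert permanent and remove the helper table.
commit;
drop table temp_emp;
```

Because TRUNCATE is a DDL statement in Oracle and cannot be rolled back, it is worth checking the contents of temp_emp before the truncate step, not only afterwards.
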

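(2) Deleting duplicates by using the unique rowid. Every row in an Oracle table has a rowid, which records the data file, block, and row position where the row is stored and is therefore unique to that row. Duplicate records may have identical values in every column, but their rowids always differ. It is therefore enough to identify, within each group of duplicates, the row with the largest (or smallest) rowid and to delete all the others.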

SQL> delete from employee e2 where e2.rowid not in (
     select max(e1.rowid) from employee e1 where
     e1.emp_id=e2.emp_id and e1.emp_name=e2.emp_name and e1.salary=e2.salary); -- min(rowid) works here as well

-- Equivalent form: delete every row whose rowid is below the maximum of its group.
SQL> delete from employee e2 where e2.rowid < (
     select max(e1.rowid) from employee e1 where
     e1.emp_id=e2.emp_id and e1.emp_name=e2.emp_name and
     e1.salary=e2.salary);
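
To make the file, block, and row components of a rowid visible, the rows can be listed together with their decoded rowids. This sketch assumes the standard DBMS_ROWID package is accessible to the current user; the aliases file_no, block_no, and row_no are illustrative.

```sql
-- Show each row with its rowid and the storage location the rowid encodes.
select e.rowid,
       dbms_rowid.rowid_relative_fno(e.rowid) as file_no,
       dbms_rowid.rowid_block_number(e.rowid) as block_no,
       dbms_rowid.rowid_row_number(e.rowid)   as row_no,
       e.emp_id, e.emp_name, e.salary
from employee e
order by e.emp_id, e.rowid;
```

Rows that are identical in every column still appear with different rowid values, which is what the delete statements above rely on.
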

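(3) Also using rowid, but typically more efficient: the GROUP BY subquery is not correlated with the outer statement, so it does not have to be re-evaluated for each row of employee.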

SQL> delete from employee where rowid not in (
     select max(t1.rowid) from employee t1 group by
     t1.emp_id,t1.emp_name,t1.salary); -- min(rowid) works here as well

SQL> select * from employee;
    EMP_ID EMP_NAME                                     SALARY
---------- ---------------------------------------- ----------
         1 sunshine                                      10000
         3 xyz                                           30000
         2 semon                                         20000
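
A different technique, not one of the methods listed in this post, numbers the rows of each duplicate group with the analytic function row_number() and deletes every row after the first. It is a sketch under the same assumptions as above (Oracle, table employee); the aliases rid and rn are illustrative.

```sql
-- Keep the row with the smallest rowid in each group and delete the rest.
delete from employee
where rowid in (
    select rid
    from (
        select rowid as rid,
               row_number() over (
                   partition by emp_id, emp_name, salary
                   order by rowid
               ) as rn
        from employee
    )
    where rn > 1
);
```

Like GROUP BY in method (3), PARTITION BY places rows with NULL in the same column into one group. The correlated comparisons in method (2) use =, which never matches NULL, so duplicates that contain NULL values are left in place by that method.
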

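CREATE PROCEDURE [K_FIND_CF] -- finds duplicate values in a table (completed 2005-07-18)
AS

-- Drop the result table T_find if it is left over from an earlier run
if exists (select * from sysobjects where id = object_id(N'[dbo].[T_find]')
and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[T_find]

CREATE TABLE [dbo].[T_find] (
[cfz] [char](16) -- holds each duplicated value that is found
)
select * from GJ_JBXX order by gjbh ------ change to the table and column you want to check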

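-- Read the total row count into @zshu (the comparison loop below does not use it)
declare sum1 scroll cursor
for
select count(*) from GJ_JBXX ------ change to the table you want to check
open sum1
declare @zshu int
fetch first from sum1 into @zshu
close sum1
deallocate sum1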

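-- Only adjacent rows are compared, so both cursors must read the values in sorted order
declare find1 scroll cursor
for
select gjbh from GJ_JBXX order by gjbh ------ change to the table and column you want to check
open find1

declare find2 scroll cursor
for --update -- (disabled) would allow the cursor's result set to be modified
select gjbh from GJ_JBXX order by gjbh ------- change to the table and column you want to check
open find2

declare @yb1 char(16)
declare @yb2 char(16)
fetch first from find2 into @yb2 -- find2 stays one row ahead of find1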

while(@@fetch_status<>-1) -- @@fetch_status is 0 after a successful fetch and -1 once a fetch moves past the last row
begin
    fetch next from find1 into @yb1
    fetch next from find2 into @yb2
    if (@yb1=@yb2) and (@@fetch_status<>-1) -- this check is required; otherwise the last record would be reported as a duplicate
        insert into dbo.T_find(cfz) values(@yb2)

end
close find1
deallocate find1
close find2
deallocate find2
select * from T_find
GO
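
For comparison, the same duplicate check can be written as a single set-based query. This is a sketch only, assuming the same placeholder table GJ_JBXX and column gjbh used in the procedure; the alias occurrences is illustrative.

```sql
-- One row per duplicated gjbh value, with the number of times it occurs.
select gjbh, count(*) as occurrences
from GJ_JBXX
group by gjbh
having count(*) > 1
order by gjbh;
```

The cursor procedure stores a value once for every adjacent matching pair, so a value that occurs three times is written to T_find twice, whereas this query returns each duplicated value once.
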