如何删除SQL Server中的重复行?

本文翻译自:How to delete duplicate rows in SQL Server?

How can I delete duplicate rows where no unique row id exists? 如何删除不存在unique row id 重复行

My table is 我的桌子是

col1  col2 col3 col4 col5 col6 col7
john  1    1    1    1    1    1 
john  1    1    1    1    1    1
sally 2    2    2    2    2    2
sally 2    2    2    2    2    2

I want to be left with the following after the duplicate removal: 在重复删除后,我想保留以下内容:

john  1    1    1    1    1    1
sally 2    2    2    2    2    2

I've tried a few queries but I think they depend on having a row id as I don't get the desired result. 我已经尝试了一些查询,但是我认为它们依赖于行ID,因为我没有得到期望的结果。 For example: 例如:

DELETE
FROM table
WHERE col1 IN (
    SELECT id
    FROM table
    GROUP BY id
    HAVING (COUNT(col1) > 1)
)

#1楼

参考:https://stackoom.com/question/1FAEA/如何删除SQL-Server中的重复行


#2楼

I like CTEs and ROW_NUMBER as the two combined allow us to see which rows are deleted (or updated), therefore just change the DELETE FROM CTE... to SELECT * FROM CTE : 我喜欢CTE和ROW_NUMBER因为两者结合使我们能够看到删除(或更新)了哪些行,因此只需将DELETE FROM CTE...更改为SELECT * FROM CTE

WITH CTE AS(
   SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7],
       RN = ROW_NUMBER()OVER(PARTITION BY col1 ORDER BY col1)
   FROM dbo.Table1
)
DELETE FROM CTE WHERE RN > 1

DEMO (result is different; I assume that it's due to a typo on your part) 演示 (结果有所不同;我认为这是由于您的错字引起的)

COL1    COL2    COL3    COL4    COL5    COL6    COL7
john    1        1       1       1       1       1
sally   2        2       2       2       2       2

This example determines duplicates by a single column col1 because of the PARTITION BY col1 . 由于PARTITION BY col1本示例通过单个列col1确定重复项。 If you want to include multiple columns simply add them to the PARTITION BY : 如果要包括多个列,只需将它们添加到PARTITION BY

ROW_NUMBER()OVER(PARTITION BY Col1, Col2, ... ORDER BY OrderColumn)

#3楼

Microsoft has a vey ry neat guide on how to remove duplicates. Microsoft有一个关于如何删除重复项的简洁指南。 Check out http://support.microsoft.com/kb/139444 查看http://support.microsoft.com/kb/139444

In brief, here is the easiest way to delete duplicates when you have just a few rows to delete: 简而言之,当您只有几行要删除时,这是删除重复项的最简单方法:

SET rowcount 1;
DELETE FROM t1 WHERE myprimarykey=1;

myprimarykey is the identifier for the row. myprimarykey是该行的标识符。

I set rowcount to 1 because I only had two rows that were duplicated. 我将rowcount设置为1,因为只有两行重复。 If I had had 3 rows duplicated then I would have set rowcount to 2 so that it deletes the first two that it sees and only leaves one in table t1. 如果我重复了3行,则应将rowcount设置为2,以便它删除它看到的前两行,而只在表t1中保留一行。

Hope it helps anyone 希望它能帮助任何人


#4楼

Another way of removing dublicated rows without loosing information in one step is like following: 另一种在不丢失信息的情况下删除重复行的方法如下:

delete from dublicated_table t1 (nolock)
join (
    select t2.dublicated_field
    , min(len(t2.field_kept)) as min_field_kept
    from dublicated_table t2 (nolock)
    group by t2.dublicated_field having COUNT(*)>1
) t3 
on t1.dublicated_field=t3.dublicated_field 
    and len(t1.field_kept)=t3.min_field_kept

#5楼

DELETE from search
where id not in (
   select min(id) from search
   group by url
   having count(*)=1

   union

   SELECT min(id) FROM search
   group by url
   having count(*) > 1
)

#6楼

If you have no references, like foreign keys, you can do this. 如果没有引用(例如外键),则可以执行此操作。 I do it a lot when testing proofs of concept and the test data gets duplicated. 测试概念证明和测试数据重复时,我会做很多事情。

SELECT DISTINCT [col1],[col2],[col3],[col4],[col5],[col6],[col7]

INTO [newTable]

Go into the object explorer and delete the old table. 进入对象资源管理器并删除旧表。

Rename the new table with the old table's name. 用旧表的名称重命名新表。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值