Translated from: Finding duplicate values in a SQL table
It's easy to find duplicates with one field:
SELECT name, COUNT(email)
FROM users
GROUP BY email
HAVING COUNT(email) > 1
So if we have a table:
ID NAME EMAIL
1 John asd@asd.com
2 Sam asd@asd.com
3 Tom asd@asd.com
4 Bob bob@asd.com
5 Tom asd@asd.com
This query will give us John, Sam, Tom, Tom because they all have the same email.
However, what I want is to get duplicates with the same email and name.
That is, I want to get "Tom", "Tom".
The reason I need this: I made a mistake and allowed duplicate name and email values to be inserted. Now I need to remove/change the duplicates, so I need to find them first.
#1
Reference: https://stackoom.com/question/At25/在SQL表中查找重复值
#2
Try the following:
-- RowNum > 1 keeps every row after the first in each (Name, Age) group.
SELECT *
FROM (
    SELECT Id, Name, Age, Comments,
           ROW_NUMBER() OVER (PARTITION BY Name, Age ORDER BY Id) AS RowNum
    FROM Customers
) AS B
WHERE RowNum > 1
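The answer above uses its own table and columns (Customers, Name, Age). As a rough sketch of the same ROW_NUMBER idea applied to the question's users table, the following runs against an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite, the helper name extra_copies, and ordering by id are assumptions made for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows from the question; the in-memory SQLite database is an assumption.
ROWS = [
    (1, "John", "asd@asd.com"),
    (2, "Sam", "asd@asd.com"),
    (3, "Tom", "asd@asd.com"),
    (4, "Bob", "bob@asd.com"),
    (5, "Tom", "asd@asd.com"),
]


def extra_copies(rows):
    """Return every row except the lowest-id row of each (name, email) group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    query = """
        SELECT id, name, email
        FROM (
            SELECT id, name, email,
                   ROW_NUMBER() OVER (PARTITION BY name, email ORDER BY id) AS rn
            FROM users
        ) AS numbered
        WHERE rn > 1
        ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(extra_copies(ROWS))  # [(5, 'Tom', 'asd@asd.com')]
```

Note that this returns only the surplus copies (the second Tom), which is convenient when the next step is to delete or change them; it does not list the first row of each duplicated group.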
#3
If you work with Oracle, this approach is preferable:
create table my_users(id number, name varchar2(100), email varchar2(100));
insert into my_users values (1, 'John', 'asd@asd.com');
insert into my_users values (2, 'Sam', 'asd@asd.com');
insert into my_users values (3, 'Tom', 'asd@asd.com');
insert into my_users values (4, 'Bob', 'bob@asd.com');
insert into my_users values (5, 'Tom', 'asd@asd.com');
commit;
select *
from my_users
where rowid not in (select min(rowid) from my_users group by name, email);
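Since the question's stated goal is to remove the duplicates once found, the same NOT IN (SELECT MIN(rowid) ...) pattern can be turned into a DELETE. The sketch below tries this on SQLite rather than Oracle, which is an assumption: SQLite also exposes an implicit rowid, but it is not the same thing as Oracle's ROWID. The helper name remove_duplicates is hypothetical.

```python
import sqlite3

# Same sample data as the answer above.
ROWS = [
    (1, "John", "asd@asd.com"),
    (2, "Sam", "asd@asd.com"),
    (3, "Tom", "asd@asd.com"),
    (4, "Bob", "bob@asd.com"),
    (5, "Tom", "asd@asd.com"),
]


def remove_duplicates(rows):
    """Delete all but the first-inserted row of each (name, email) group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_users (id INTEGER, name TEXT, email TEXT)")
    conn.executemany("INSERT INTO my_users VALUES (?, ?, ?)", rows)
    # Keep the row with the smallest rowid in each group, delete the rest.
    conn.execute(
        """
        DELETE FROM my_users
        WHERE rowid NOT IN (
            SELECT MIN(rowid) FROM my_users GROUP BY name, email
        )
        """
    )
    remaining = conn.execute(
        "SELECT id, name, email FROM my_users ORDER BY id"
    ).fetchall()
    conn.close()
    return remaining


print(remove_duplicates(ROWS))
```

After the delete, only ids 1 through 4 remain. On a real table it is worth running the SELECT form first and checking the rows it returns before switching to DELETE.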
#4
To check whether your table contains any duplicate rows, I used the query below:
create table my_table(id int, name varchar(100), email varchar(100));
insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (2, 'Aman', 'aman@rms.com');
insert into my_table values (3, 'Tom', 'tom@rms.com');
insert into my_table values (4, 'Raj', 'raj@rms.com');
-- If Total_Rows is greater than Distinct_Rows, the table contains fully duplicated rows.
SELECT COUNT(1) AS Total_Rows FROM my_table;
SELECT COUNT(1) AS Distinct_Rows FROM (SELECT DISTINCT * FROM my_table) abc;
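The comparison of the two counts can be automated. The following is a minimal sketch that loads the same five rows into an in-memory SQLite database (an assumption; the answer does not name a database) and returns both counts; the function name count_rows is hypothetical.

```python
import sqlite3

# The five rows inserted in the answer above; the first two are identical.
ROWS = [
    (1, "shekh", "shekh@rms.com"),
    (1, "shekh", "shekh@rms.com"),
    (2, "Aman", "aman@rms.com"),
    (3, "Tom", "tom@rms.com"),
    (4, "Raj", "raj@rms.com"),
]


def count_rows(rows):
    """Return (total_rows, distinct_rows) for the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT, email TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    total = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
    distinct = conn.execute(
        "SELECT COUNT(*) FROM (SELECT DISTINCT * FROM my_table)"
    ).fetchone()[0]
    conn.close()
    return total, distinct


total, distinct = count_rows(ROWS)
print(total, distinct)  # 5 4
print("has duplicates:", total > distinct)  # has duplicates: True
```

This only detects rows that are identical in every column. It says that duplicates exist but not which rows they are, so it works best as a quick check before one of the other queries in this thread.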
#5
Try this code:
WITH CTE AS
(
    SELECT Id, Name, Age, Comments,
           RN = ROW_NUMBER() OVER (PARTITION BY Name, Age ORDER BY ccn)
    FROM ccnmaster
)
-- RN > 1 keeps only the extra copies within each (Name, Age) group.
SELECT * FROM CTE WHERE RN > 1
#6
SELECT
name, email, COUNT(*)
FROM
users
GROUP BY
name, email
HAVING
COUNT(*) > 1
Simply group on both of the columns.
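The grouped query returns one line per duplicated (name, email) pair together with its count. If the individual rows are needed, as in the question's "Tom", "Tom", one option is to join the grouped result back to the table. The sketch below does this on an in-memory SQLite database; SQLite, the join-back step, and the helper name duplicate_rows are assumptions for illustration and are not part of the answer.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    (1, "John", "asd@asd.com"),
    (2, "Sam", "asd@asd.com"),
    (3, "Tom", "asd@asd.com"),
    (4, "Bob", "bob@asd.com"),
    (5, "Tom", "asd@asd.com"),
]


def duplicate_rows(rows):
    """Return every row whose (name, email) pair occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    query = """
        SELECT u.id, u.name, u.email
        FROM users AS u
        JOIN (
            SELECT name, email
            FROM users
            GROUP BY name, email
            HAVING COUNT(*) > 1
        ) AS d
          ON u.name = d.name AND u.email = d.email
        ORDER BY u.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(duplicate_rows(ROWS))
# [(3, 'Tom', 'asd@asd.com'), (5, 'Tom', 'asd@asd.com')]
```

Because the join compares with =, rows where name or email is NULL would not be matched by this sketch.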
Note: the older ANSI standard requires all non-aggregated columns to appear in the GROUP BY, but this has changed with the idea of "functional dependency":
"In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation."
Support is not consistent:
- Recent PostgreSQL supports it.
- SQL Server (as of SQL Server 2017) still requires all non-aggregated columns in the GROUP BY.
- MySQL is unpredictable and you need sql_mode=only_full_group_by:
  - GROUP BY lname ORDER BY showing wrong results;
  - Which is the least expensive aggregate function in the absence of ANY() (see comments in accepted answer).
- Oracle isn't mainstream enough (warning: humour, I don't know about Oracle).