MySQL-处理重复项

MySQL-处理重复项 (MySQL - Handling Duplicates)

Generally, tables or result sets sometimes contain duplicate records. Most of the times it is allowed but sometimes it is required to stop duplicate records. It is required to identify duplicate records and remove them from the table. This chapter will describe how to prevent the occurrence of duplicate records in a table and how to remove the already existing duplicate records.

通常,表或结果集有时包含重复的记录。 在大多数情况下,它是允许的,但有时需要停止重复的记录。 需要标识重复的记录并将其从表中删除。 本章将介绍如何防止表中出现重复记录,以及如何删除已经存在的重复记录。

防止表中出现重复项 (Preventing Duplicates from Occurring in a Table)

You can use a PRIMARY KEY or a UNIQUE Index on a table with the appropriate fields to stop duplicate records.

您可以在具有适当字段的表上使用PRIMARY KEYUNIQUE索引来停止重复记录。

Let us take an example – The following table contains no such index or primary key, so it would allow duplicate records for first_name and last_name.

让我们举个例子–下表不包含此类索引或主键,因此它将允许重复记录first_namelast_name


CREATE TABLE person_tbl (
   first_name CHAR(20),
   last_name CHAR(20),
   sex CHAR(10)
);

To prevent multiple records with the same first and last name values from being created in this table, add a PRIMARY KEY to its definition. When you do this, it is also necessary to declare the indexed columns to be NOT NULL, because a PRIMARY KEY does not allow NULL values −

为了防止在此表中创建具有相同的名字和姓氏值的多个记录,请在其定义中添加一个PRIMARY KEY 。 这样做时,还必须声明索引列为NOT NULL ,因为PRIMARY KEY不允许NULL值-


CREATE TABLE person_tbl (
   first_name CHAR(20) NOT NULL,
   last_name CHAR(20) NOT NULL,
   sex CHAR(10),
   PRIMARY KEY (last_name, first_name)
);

The presence of a unique index in a table normally causes an error to occur if you insert a record into the table that duplicates an existing record in the column or columns that define the index.

如果在表中插入一条记录,该记录与定义索引的一个或多个列中的现有记录重复,则表中存在唯一索引通常会导致错误发生。

Use the INSERT IGNORE command rather than the INSERT command. If a record doesn't duplicate an existing record, then MySQL inserts it as usual. If the record is a duplicate, then the IGNORE keyword tells MySQL to discard it silently without generating an error.

使用INSERT IGNORE命令而不是INSERT命令。 如果一条记录与现有记录不重复,则MySQL照常插入它。 如果记录是重复的,则IGNORE关键字告诉MySQL静默丢弃它而不会产生错误。

The following example does not error out and at the same time it will not insert duplicate records as well.

下面的示例不会出错,同时也不会插入重复的记录。


mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
   -> VALUES( 'Jay', 'Thomas');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
   -> VALUES( 'Jay', 'Thomas');
Query OK, 0 rows affected (0.00 sec)

Use the REPLACE command rather than the INSERT command. If the record is new, it is inserted just as with INSERT. If it is a duplicate, the new record replaces the old one.

使用REPLACE命令而不是INSERT命令。 如果记录是新记录,则与INSERT一样插入。 如果重复,则新记录将替换旧记录。


mysql> REPLACE INTO person_tbl (last_name, first_name)
   -> VALUES( 'Ajay', 'Kumar');
Query OK, 1 row affected (0.00 sec)

mysql> REPLACE INTO person_tbl (last_name, first_name)
   -> VALUES( 'Ajay', 'Kumar');
Query OK, 2 rows affected (0.00 sec)

The INSERT IGNORE and REPLACE commands should be chosen as per the duplicate-handling behavior you want to effect. The INSERT IGNORE command keeps the first set of the duplicated records and discards the remaining. The REPLACE command keeps the last set of duplicates and erases out any earlier ones.

应该根据要执行的重复处理行为来选择INSERT IGNORE和REPLACE命令。 INSERT IGNORE命令保留第一组重复的记录,并丢弃其余的记录。 REPLACE命令保留最后一组重复项,并清除所有较早的重复项。

Another way to enforce uniqueness is to add a UNIQUE index rather than a PRIMARY KEY to a table.

强制唯一性的另一种方法是向表中添加UNIQUE索引而不是PRIMARY KEY。


CREATE TABLE person_tbl (
   first_name CHAR(20) NOT NULL,
   last_name CHAR(20) NOT NULL,
   sex CHAR(10)
   UNIQUE (last_name, first_name)
);

计数和识别重复项 (Counting and Identifying Duplicates)

Following is the query to count duplicate records with first_name and last_name in a table.

以下是对表中具有first_name和last_name的重复记录进行计数的查询。


mysql> SELECT COUNT(*) as repetitions, last_name, first_name
   -> FROM person_tbl
   -> GROUP BY last_name, first_name
   -> HAVING repetitions > 1;

This query will return a list of all the duplicate records in the person_tbl table. In general, to identify sets of values that are duplicated, follow the steps given below.

该查询将返回person_tbl表中所有重复记录的列表。 通常,要标识重复的值集,请执行以下步骤。

  • Determine which columns contain the values that may be duplicated.

    确定哪些列包含可能重复的值。

  • List those columns in the column selection list, along with the COUNT(*).

    在列选择列表中列出这些列,以及COUNT(*)

  • List the columns in the GROUP BY clause as well.

    还要列出GROUP BY子句中的列。

  • Add a HAVING clause that eliminates the unique values by requiring the group counts to be greater than one.

    添加一个HAVING子句,该子句通过要求组计数大于1来消除唯一值。

从查询结果中消除重复 (Eliminating Duplicates from a Query Result)

You can use the DISTINCT command along with the SELECT statement to find out unique records available in a table.

您可以将DISTINCT命令与SELECT语句一起使用,以查找表中可用的唯一记录。


mysql> SELECT DISTINCT last_name, first_name
   -> FROM person_tbl
   -> ORDER BY last_name;

An alternative to the DISTINCT command is to add a GROUP BY clause that names the columns you are selecting. This has the effect of removing duplicates and selecting only the unique combinations of values in the specified columns.

DISTINCT命令的替代方法是添加一个GROUP BY子句,该子句命名您选择的列。 这具有删除重复项并仅选择指定列中值的唯一组合的效果。


mysql> SELECT last_name, first_name
   -> FROM person_tbl
   -> GROUP BY (last_name, first_name);

使用表替换删除重复项 (Removing Duplicates Using Table Replacement)

If you have duplicate records in a table and you want to remove all the duplicate records from that table, then follow the procedure given below.

如果表中有重复的记录,并且要从该表中删除所有重复的记录,请按照以下步骤进行操作。


mysql> CREATE TABLE tmp SELECT last_name, first_name, sex
   -> FROM person_tbl;
   -> GROUP BY (last_name, first_name);

mysql> DROP TABLE person_tbl;
mysql> ALTER TABLE tmp RENAME TO person_tbl;

An easy way of removing duplicate records from a table is to add an INDEX or a PRIMARY KEY to that table. Even if this table is already available, you can use this technique to remove the duplicate records and you will be safe in future as well.

从表中删除重复记录的一种简单方法是将INDEX或PRIMARY KEY添加到该表。 即使此表已经可用,您也可以使用此技术删除重复的记录,以后也将很安全。


mysql> ALTER IGNORE TABLE person_tbl
   -> ADD PRIMARY KEY (last_name, first_name);

翻译自: https://www.tutorialspoint.com/mysql/mysql-handling-duplicates.htm

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值