MySQL重复数据删除处理

一 背景

​ 在实际业务开发过程中,经常会出现需要删除脏数据的情况,其中由于没有严格控制并发导致的重复数据删除尤为典型。处理脏数据的方式一般有两种,一是通过sql处理,二是通过代码处理。对于复杂的脏数据,代码处理是最推荐的,通过代码的逻辑,可以准确地控制不同情况,然后针对性测试。但很多时候脏数据处理是一次性的,如果开发投入精力过多,明显是一件性价比不高的事情。所以大多数情况下,开发会偏向于通过sql方式来修复数据,只需要写一个sql到线上环境跑一下(前提是可以写得出来,哈哈),都不需要上线代码,就可以解决根本性问题。本次针对重复性脏数据情况,根据不同需求场景,来通过sql方式处理脏数据。

二 准备

准备一张热腾腾的数据表:

CREATE TABLE `user_group` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `uid` int(11) NOT NULL,
  `gid` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;

插入数据:

insert  into `user_group`(`id`,`uid`,`gid`) values 
(1,1,1),
(2,1,1),
(3,1,2),
(4,1,3),
(5,2,1),
(6,2,2),
(7,2,2),
(8,2,2);

假定通过uid和gid联合作为唯一判断,那么重复的数据有:[1,2],[6,7,8]

三 处理

1 删除重复数据,只保留一条数据
1.1 保留id最大的那一条

删除的数据应为[1,6,7]

#### QUERY
SELECT a.* FROM  user_group AS a,
(
SELECT *,MAX(id) AS max_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id < b.max_id;
 
#### DELETE
DELETE a FROM  user_group AS a,
(
SELECT *,MAX(id) AS max_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id < b.max_id;
1.2 保留id最小的那一条

删除的数据应为[2,7,8]

#### QUERY
SELECT a.* FROM  user_group AS a,
(
SELECT *,MIN(id) AS min_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id > b.min_id;
 
#### DELETE
DELETE a FROM  user_group AS a,
(
SELECT *,MIN(id) AS min_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id > b.min_id;
2 删除重复数据,只删除一条数据
2.1 删除id最大的那一条

删除的数据应为[2,8]

#### QUERY
 SELECT a.* FROM  user_group AS a,
(
SELECT *,MAX(id) AS max_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id = b.max_id;
 
#### DELETE
DELETE a FROM  user_group AS a,
(
SELECT *,MAX(id) AS max_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id = b.max_id;
2.2 删除id最小的那一条

删除的数据应为[1,6]

#### QUERY
SELECT a.* FROM  user_group AS a,
(
SELECT *,MIN(id) AS min_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id = b.min_id;
 
#### DELETE
DELETE a FROM  user_group AS a,
(
SELECT *,MIN(id) AS min_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id = b.min_id;

四 拓展

https://blog.csdn.net/yueliangdao0608/article/details/2306771

https://jiloc.com/46224.html

  • 4
    点赞
  • 13
    收藏
    觉得还不错? 一键收藏
  • 3
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 3
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值