多字段重复数据查询/去重

最新推荐文章于 2024-07-16 11:54:08 发布

醉眼望云烟

最新推荐文章于 2024-07-16 11:54:08 发布

阅读量1.7k

点赞数

分类专栏： MySQL 文章标签：数据库 sql mysql

本文链接：https://blog.csdn.net/qq_37161651/article/details/126349323

版权

MySQL 专栏收录该内容

3 篇文章 0 订阅

订阅专栏

初始数据

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for testDistinct
-- ----------------------------
DROP TABLE IF EXISTS `testDistinct`;
CREATE TABLE `testDistinct`  (
  `id` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL DEFAULT NULL,
  `name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL DEFAULT NULL,
  `age` int NULL DEFAULT NULL
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_bin ROW_FORMAT = Dynamic;

-- ----------------------------
-- Records of testDistinct
-- ----------------------------
INSERT INTO `testDistinct` VALUES ('1', 'xgt', 11);
INSERT INTO `testDistinct` VALUES ('1', 'xgt', 22);
INSERT INTO `testDistinct` VALUES ('2', 'xgt', 33);
INSERT INTO `testDistinct` VALUES ('2', 'wp', 44);
INSERT INTO `testDistinct` VALUES ('3', 'zyk', 55);
INSERT INTO `testDistinct` VALUES ('3', 'zyk', 66);
INSERT INTO `testDistinct` VALUES ('3', 'xgt', 77);
INSERT INTO `testDistinct` VALUES ('3', 'xh', 88);
INSERT INTO `testDistinct` VALUES ('4', 'xh', 99);
INSERT INTO `testDistinct` VALUES ('1', 'wp', 11);
INSERT INTO `testDistinct` VALUES ('2', 'wp', 11);
INSERT INTO `testDistinct` VALUES ('1', 'xgt', 11);

SET FOREIGN_KEY_CHECKS = 1;

在这里插入图片描述

查询重复数据

SELECT
count( 1 ) '数据量条数',
id,
NAME 
FROM
	testDistinct 
GROUP BY
	id,name
	having count(1) > 1
ORDER BY
	id;

如果是查询没有重复的数据，就选择count(1) < 2 的就行了；

SELECT count(1),id,name 
FROM testDistinct 
GROUP BY id,name
having count(1) < 2
ORDER BY id;

查询

新的表结构

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for testDistinct
-- ----------------------------
DROP TABLE IF EXISTS `testDistinct`;
CREATE TABLE `testDistinct`  (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL DEFAULT NULL,
  `age` int NULL DEFAULT NULL,
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_bin ROW_FORMAT = Dynamic;

-- ----------------------------
-- Records of testDistinct
-- ----------------------------
INSERT INTO `testDistinct` VALUES (1, 'xgt', 12);
INSERT INTO `testDistinct` VALUES (2, 'xgt', 12);
INSERT INTO `testDistinct` VALUES (3, 'wp', 12);
INSERT INTO `testDistinct` VALUES (4, 'wp', 33);
INSERT INTO `testDistinct` VALUES (5, 'xh', 12);
INSERT INTO `testDistinct` VALUES (6, 'zyk', 12);
INSERT INTO `testDistinct` VALUES (7, 'zyk', 12);
INSERT INTO `testDistinct` VALUES (8, 'zyk', 12);
INSERT INTO `testDistinct` VALUES (9, 'zyk', 22);
INSERT INTO `testDistinct` VALUES (10, 'xh', 5);
INSERT INTO `testDistinct` VALUES (11, 'xh', 5);
INSERT INTO `testDistinct` VALUES (12, 'xh', 5);
INSERT INTO `testDistinct` VALUES (13, 'xh', 5);
INSERT INTO `testDistinct` VALUES (14, NULL, NULL);

SET FOREIGN_KEY_CHECKS = 1;

在这里插入图片描述

去重查询

select 
id,name,age 
from testDistinct 
group by name,age 
order by id

在这里插入图片描述

去重查询（重复数据取最大ID的数据）

select max(id) id,name,age from testDistinct GROUP BY name,age

# 两种多此一举的操作SQL
# 用exists
select id,name,age
from testDistinct a where EXISTS 
(select max(id) id from testDistinct GROUP BY name,age having a.id = id);

# 用in
select id,name,age
from testDistinct where EXISTS (select max(id) id from testDistinct GROUP BY name,age);

在这里插入图片描述

删除掉重复数据，重复数据只保留ID最大的一条

delete a from testDistinct a
where not exists (
	select id from 
	(select max(b.id) id from testDistinct b group by b.name,b.age) c 
	where a.id = c.id
)

或者

delete a from testDistinct a
where a.id not in (
	select id from 
	( select max(b.id) id from testDistinct b group by name,age ) c
)

错误解决

ONLY_FULL_GROUP_BY

如果是mysql5.7+查询报错：

SELECT list is not in GROUP BY clause and contains nonaggregated column 'XXX' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

需要去掉ONLY_FULL_GROUP_BY模式

ONLY_FULL_GROUP_BY： 使用这个就是使用和oracle一样的group 规则, select的列都要在group中,或者本身是聚合列(SUM,AVG,MAX,MIN等) 才行

show session variables where variable_name = 'sql_mode';
#ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
SHOW GLOBAL VARIABLES where variable_name = 'sql_mode';
#ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION

set @@sql_mode='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'

select @@sql_mode from dual;

You can’t specify target table for update in FROM clause

意思是： 不能先select出同一表中的某些值，再update这个表(在同一语句中)->>使用已经存在的一行数据的字段更新另外一行数据的字段
需要中间嵌套一层select，就行了
例如：将id为1数据的name值改为id为6数据的name值

update testDistinct a
set a.name = (select name from testDistinct b where b.id = 6)
where a.id = 1

应该换种写法：↓

update testDistinct a , testDistinct b
set a.name = b.name 
where a.id = 1 and b.id = 6

Oracle没有这种问题；

醉眼望云烟

关注

0
点赞
踩
5

收藏

觉得还不错? 一键收藏
0
评论
多字段重复数据查询/去重

偶然遇到问题，需要多字段去重/查询/删除重复数据
复制链接

扫一扫

专栏目录