IN和EXISTS的区别和效率

djy180

已于 2024-01-12 22:42:18 修改

阅读量1.1k

点赞数 9

分类专栏： mysql 文章标签：数据库

于 2024-01-12 22:41:48 首次发布

本文链接：https://blog.csdn.net/djy180/article/details/135563128

版权

mysql 专栏收录该内容

3 篇文章 0 订阅

订阅专栏

本文探讨了SQL查询中in和exists操作在处理子查询时的效率差异，特别是在大数据表和小数据表之间的区别。通过实例和存储过程演示，强调了在不同场景下使用in和exists的最佳实践，以及如何利用索引来提升查询性能。

摘要由CSDN通过智能技术生成

in和exists

(有索引是前提，子查询表大用exists，子查询表小用in)

in是把外表和内表作hash连接，而exists是对外表作loop循环，每次loop循环再对内表进行查询。

select * from t1 where exists ( select null from t2 where y = x )
执行的过程相当于:
for x in ( select * from t1 )
   loop
      if ( exists ( select null from t2 where y = x.x )
      then 
         OUTPUT THE RECORD
      end if
end loop
          
 
Select * from T1 where x in ( select y from T2 )
执行的过程相当于:
select * 
  from t1, ( select distinct y from t2 ) t2
 where t1.x = t2.y;

如果查询的两个表大小相当，那么用in和exists差别不大；如果两个表中一个较小一个较大，则子查询表大的用exists，子查询表小的用in；

例如：表A(小表)，表B(大表)

select * from A where cc in(select cc from B)　　-->效率低，用到了A表上cc列的索引；

select * from A where exists(select cc from B where cc=A.cc)　　-->效率高，用到了B表上cc列的索引。

相反的：

select * from B where cc in(select cc from A)　　-->效率高，用到了B表上cc列的索引

select * from B where exists(select cc from A where cc=B.cc)　　-->效率低，用到了A表上cc列的索引。

in和exists案列测试

t_member表 10万数据；t_member_service 1万数据左右; t_member_service的member_id是普通索引

CREATE TABLE `t_member` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `sex` int(255) DEFAULT NULL,
  `password` varchar(255) DEFAULT NULL,
  `create_time` datetime DEFAULT NULL,
  `token` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_name` (`name`) USING BTREE COMMENT '普通索引'
) ENGINE=InnoDB AUTO_INCREMENT=126008 DEFAULT CHARSET=utf8mb4;

CREATE TABLE `t_member_service` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `member_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `m_mid` (`member_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=10001 DEFAULT CHARSET=utf8mb4;

存储过程造数据

--- 定义存储过程
 
DELIMITER $$
DROP PROCEDURE IF EXISTS `autoinsert`$$
CREATE PROCEDURE `autoinsert`(IN insertCount INT)
BEGIN
  DECLARE count INT DEFAULT 0;
   WHILE(count < insertCount)
    DO
        INSERT INTO `springboot1`.`t_member`( `name`, `sex`, `password`, `create_time`, `token`) 
select substring(MD5(RAND()),1,8),0,substring(MD5(RAND()),1,10),SYSDATE(),substring(MD5(RAND()),1,20) from dual;
        SET count = count+1;
    END WHILE;
    END$$
DELIMITER;


-- 调用存储过程
call autoinsert(100000);


--- 定义存储过程
 
DELIMITER $$
DROP PROCEDURE IF EXISTS `autoinsert`$$
CREATE PROCEDURE `autoinsert`(IN insertCount INT)
BEGIN
  DECLARE count INT DEFAULT 0;
   WHILE(count < insertCount)
    DO
        INSERT INTO `t_member_service`( `member_id`) 
select count+1 from dual;
        SET count = count+1;
    END WHILE;
    END$$
DELIMITER;


-- 调用存储过程
call autoinsert(10000);

注意定义完存储过程，再执行存储过程

explain select * from t_member t where  EXISTS 
(select 1 from t_member_service t0 where t0.member_id = t.id);

explain select * from t_member t where  t.id  in 
(select t0.member_id from t_member_service t0 where t0.member_id = t.id);

语句1：(耗时0.9s)