小表驱动大表优化-CSDN博客

本文链接：https://blog.csdn.net/lizc_lizc/article/details/86571758

小表驱动大表

类似循环嵌套

for(int i = 0; i < 2; i++) {
	for(int j = 0; j < 10000; j++) {
		......
	}
}

for(int i = 0; i < 10000; i++) {
	for(int j = 0; j < 2; j++) {
		......
	}
}

如果小的循环在外层，对于数据库连接来说就只连接2次，进行20000次操作，如果10000在外，则需要进行10000次数据库连接，从而浪费资源，增加消耗。这就是为什么要小表驱动大表。

数据准备

有一张user表，表中有30万条数据，有一张address表，表中有2条数据

mysql> select count(*) from user;
+----------+
| count(*) |
+----------+
|   301861 |
+----------+
1 row in set (0.13 sec)

mysql> select count(*) from address;
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

user表是大表，address是小表

案例演示

mysql> select * from user where id in(select user_id from address);
+----+--------+---------+
| id | name   | phone   |
+----+--------+---------+
|  1 | name_1 | phone_1 |
|  2 | name_2 | phone_2 |
+----+--------+---------+
2 rows in set (0.00 sec)

mysql> select * from user where exists(select 1 from address where user.id=address.user_id);
+----+--------+---------+
| id | name   | phone   |
+----+--------+---------+
|  1 | name_1 | phone_1 |
|  2 | name_2 | phone_2 |
+----+--------+---------+
2 rows in set (0.82 sec)

从上面的结果可以看出，当user表中的数据大于address中的数据时，使用in优于exists

查看两条SQL的执行计划

mysql> explain select * from user where id in(select user_id from address);
+----+-------------+---------+------------+--------+---------------+-------------+---------+----------------------+------+----------+-------------------------------------+
| id | select_type | table   | partitions | type   | possible_keys | key         | key_len | ref                  | rows | filtered | Extra                               |
+----+-------------+---------+------------+--------+---------------+-------------+---------+----------------------+------+----------+-------------------------------------+
|  1 | SIMPLE      | address | NULL       | index  | idx_test_id   | idx_test_id | 5       | NULL                 |    2 |   100.00 | Using where; Using index; LooseScan |
|  1 | SIMPLE      | user    | NULL       | eq_ref | PRIMARY       | PRIMARY     | 4       | test.address.user_id |    1 |   100.00 | NULL                                |
+----+-------------+---------+------------+--------+---------------+-------------+---------+----------------------+------+----------+-------------------------------------+
2 rows in set, 1 warning (0.00 sec)

mysql> explain select * from user where exists(select 1 from address where user.id=address.user_id);
+----+--------------------+---------+------------+------+---------------+-------------+---------+--------------+--------+----------+-------------+
| id | select_type        | table   | partitions | type | possible_keys | key         | key_len | ref          | rows   | filtered | Extra       |
+----+--------------------+---------+------------+------+---------------+-------------+---------+--------------+--------+----------+-------------+
|  1 | PRIMARY            | user    | NULL       | ALL  | NULL          | NULL        | NULL    | NULL         | 301420 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | address | NULL       | ref  | idx_test_id   | idx_test_id | 5       | test.user.id |      1 |   100.00 | Using index |
+----+--------------------+---------+------------+------+---------------+-------------+---------+--------------+--------+----------+-------------+
2 rows in set, 2 warnings (0.00 sec)

通过执行计划我们可以看出in与exists的区别

select * from user where id in(select user_id from address);
等价于
for select user_id from address
for select * from user where user.id = address.user_id

select * from user where exists(select 1 from address where user.id=address.user_id);
等价于
for select * from user
for select * from address where user.id = address.user_id