小表驱动大表
类似循环嵌套
for(int i = 0; i < 2; i++) {
for(int j = 0; j < 10000; j++) {
......
}
}
for(int i = 0; i < 10000; i++) {
for(int j = 0; j < 2; j++) {
......
}
}
如果小的循环在外层,对于数据库连接来说就只连接2次,进行20000次操作,如果10000在外,则需要进行10000次数据库连接,从而浪费资源,增加消耗。这就是为什么要小表驱动大表。
数据准备
有一张user表,表中有30万条数据,有一张address表,表中有2条数据
mysql> select count(*) from user;
+----------+
| count(*) |
+----------+
| 301861 |
+----------+
1 row in set (0.13 sec)
mysql> select count(*) from address;
+----------+
| count(*) |
+----------+
| 2 |
+----------+
1 row in set (0.00 sec)
user表是大表,address是小表
案例演示
mysql> select * from user where id in(select user_id from address);
+----+--------+---------+
| id | name | phone |
+----+--------+---------+
| 1 | name_1 | phone_1 |
| 2 | name_2 | phone_2 |
+----+--------+---------+
2 rows in set (0.00 sec)
mysql> select * from user where exists(select 1 from address where user.id=address.user_id);
+----+--------+---------+
| id | name | phone |
+----+--------+---------+
| 1 | name_1 | phone_1 |
| 2 | name_2 | phone_2 |
+----+--------+---------+
2 rows in set (0.82 sec)
从上面的结果可以看出,当user表中的数据大于address中的数据时,使用in优于exists
查看两条SQL的执行计划
mysql> explain select * from user where id in(select user_id from address);
+----+-------------+---------+------------+--------+---------------+-------------+---------+----------------------+------+----------+-------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+--------+---------------+-------------+---------+----------------------+------+----------+-------------------------------------+
| 1 | SIMPLE | address | NULL | index | idx_test_id | idx_test_id | 5 | NULL | 2 | 100.00 | Using where; Using index; LooseScan |
| 1 | SIMPLE | user | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.address.user_id | 1 | 100.00 | NULL |
+----+-------------+---------+------------+--------+---------------+-------------+---------+----------------------+------+----------+-------------------------------------+
2 rows in set, 1 warning (0.00 sec)
mysql> explain select * from user where exists(select 1 from address where user.id=address.user_id);
+----+--------------------+---------+------------+------+---------------+-------------+---------+--------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+---------+------------+------+---------------+-------------+---------+--------------+--------+----------+-------------+
| 1 | PRIMARY | user | NULL | ALL | NULL | NULL | NULL | NULL | 301420 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | address | NULL | ref | idx_test_id | idx_test_id | 5 | test.user.id | 1 | 100.00 | Using index |
+----+--------------------+---------+------------+------+---------------+-------------+---------+--------------+--------+----------+-------------+
2 rows in set, 2 warnings (0.00 sec)
通过执行计划我们可以看出in与exists的区别
select * from user where id in(select user_id from address);
等价于
for select user_id from address
for select * from user where user.id = address.user_id
select * from user where exists(select 1 from address where user.id=address.user_id);
等价于
for select * from user
for select * from address where user.id = address.user_id
总结
in后面应该跟小表,exists后面应该跟大表。