5.1 非相关子查询,多种优化情况
示例1: 聚集非相关子查询,没有消除子查询,但被优化为只执行一次
聚集函数操作在非相关子查询中,查询执行计划如下:
mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a1>(SELECT MIN(t2.a2) FROM t2);
+----+-------------+-------+------------+------+------------------------------+
| id | select_type | table | partitions | type | Extra |
+----+-------------+-------+------------+------+------------------------------+
| 1 | PRIMARY | t1 | NULL | ALL | Using where |
| 2 | SUBQUERY | NULL | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------------+------+------------------------------+
2 rows in set, 1 warning (0.00 sec)
从查询执行计划看,非相关子查询存在(列id值为2的行的列select_type的值为SUBQUERY),没有被消除且也没有必要消除(因为只执行一次即可得出结果值)。
示例2: IN谓词表达的非相关子查询
IN非相关子查询,查询执行计划如下:
mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a1 IN (SELECT a2 FROM t2 WHERE t2.a2>5);
+----+-------------+-------+------+------+-----------------------------+
| id | select_type | table | type | key | Extra |
+----+-------------+-------+------+------+-----------------------------+
| 1 | SIMPLE | t1 | ALL | NULL | Using where |
| 1 | SIMPLE | t2 | ref | i2 | Using index; FirstMatch(t1) |
+----+-------------+-------+------+------+-----------------------------+
2 rows in set, 1 warning (0.00 sec)从查询执行计划看,子查询不存在,表t1和t2直接做了连接,采用首次匹配策略(FirstMatch)把子查询上拉到父查询中用连接实现IN非相关子查询的优化。
另外一个IN非相关子查询,查询执行计划如下:
mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a1 IN (SELECT a2 FROM t2 WHERE t2.a2=5);
+----+-------------+-------+------+------+-----------------------------+
| id | select_type | table | type | key | Extra |
+----+-------------+-------+------+------+-----------------------------+
| 1 | SIMPLE | t1 | ref | i1 | NULL |
| 1 | SIMPLE | t2 | ref | i2 | Using index; FirstMatch(t1) |
+----+-------------+-------+------+------+-----------------------------+
2 rows in set, 1 warning (0.00 sec)查询后的语句变形为:
/* select#1 */ select `test`.`t1`.`a1` AS `a1`,`test`.`t1`.`b1` AS `b1`,`test`.`t1`.`c1` AS `c1`
from `test`.`t1` semi join (`test`.`t2`)
where ((`test`.`t1`.`a1` = 5) and (`test`.`t2`.`a2` = 5))从查询执行计划看,子查询不存在,表t1和t2直接做了半连接,把子查询上拉到父查询中用半连接实现IN操作。另外,由于子查询上拉,使得增加连接条件“a1=a2”,而原先的条件“a2=5”可以利用常量传递优化技术,使得“a1=a2=5”,所以查询执行计划中,两个索引扫描的条件分别为:a1 = 10、a2 = 5。
5.2 相关子查询,多种优化情况
再对比一个IN相关子查询,子查询没有别优化,查询执行计划如下:
mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a1 IN (SELECT a2 FROM t2 WHERE t1.a1=5);
+----+-------------+-------+------+------+-----------------------------+
| id | select_type | table | type | key | Extra |
+----+-------------+-------+------+------+-----------------------------+
| 1 | SIMPLE | t1 | ref | i1 | NULL |
| 1 | SIMPLE | t2 | ref | i2 | Using index; FirstMatch(t1) |
+----+-------------+-------+------+------+-----------------------------+
2 rows in set, 2 warnings (0.00 sec)查询后的语句变形为:
/* select#1 */ select `test`.`t1`.`a1` AS `a1`,`test`.`t1`.`b1` AS `b1`,`test`.`t1`.`c1` AS `c1`
from `test`.`t1` semi join (`test`.`t2`)
where ((`test`.`t1`.`a1` = 5) and (`test`.`t2`.`a2` = 5))从查询执行计划看,子查询不存在,表t1和t2直接做了半连接,把子查询上拉到父查询中用半连接实现IN操作。另外,由于子查询上拉,使得增加连接条件“a1=a2”,而原先的条件“a2=5”可以利用常量传递优化技术,使得“a1=a2=5”,所以查询执行计划中,两个索引扫描的条件分别为:a1 = 10、a2 = 5。这个比PostgreSQL不做优化要好。
另外一个相关子查询的例子,子查询被优化:
mysql> EXPLAIN SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t1.b1= t2.b2 AND t1.a1=5);
+----+--------------------+-------+------+------+-------------+
| id | select_type | table | type | key | Extra |
+----+--------------------+-------+------+------+-------------+
| 1 | PRIMARY | t1 | ALL | NULL | Using where |
| 2 | DEPENDENT SUBQUERY | t2 | ALL | NULL | Using where |
+----+--------------------+-------+------+------+-------------+
2 rows in set, 3 warnings (0.00 sec)查询后的语句变形为:
/* select#1 */ select `test`.`t1`.`a1` AS `a1`,`test`.`t1`.`b1` AS `b1`,`test`.`t1`.`c1` AS `c1`
from `test`.`t1`
where
exists(
/* select#2 */ select 1
from `test`.`t2`
where ((`test`.`t1`.`b1` = `test`.`t2`.`b2`) and (`test`.`t1`.`a1` = 5))
)从查询执行计划看,子查询存在。MySQL没有对此类的相关子查询进行优化。这一点不如PostgreSQL做得好。
从以上几个例子看,MySQL对子查询的优化,也并没有明确的规律区分是相关或非相关子查询。
所以,子查询的优化,两大开源的数据库,都没有明确区分相关或非相关的概念。但是对于聚集非相关子查询,都能提供子查询的一次性求解,从而优化此类子查询。