参考资料
派生表
优化器可以使用两种策略来处理派生表引用(这也适用于视图引用):
将派生表合并到外部查询中(5.7引入的优化策略 derived_merge);
将派生表物化为内部临时表,再用于外部查询。
什么是派生表?形如以下子查询结果作为表对象的就是派生表:
SELECT * FROM (SELECT * FROM t1) AS derived_t1;
SELECT * FROM t1 JOIN (SELECT t2.f1 FROM t2) AS derived_t2 ON t1.f2=derived_t2.f1WHERE t1.f1 > 0;
通过 derived_merge 策略,上面两个派生子查询执行时等同:
SELECT * FROM t1;
SELECT t1.*, t2.f1 FROM t1 JOIN t2 ON t1.f2=t2.f1 WHERE t1.f1 > 0;
限制
当子查询包含以下操作时,derived_merge 策略失效:
Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
DISTINCT
GROUP BY
HAVING
LIMIT
UNION or UNION ALL
Subqueries in the select list
Assignments to user variables
Refererences only to literal values (in this case, there is no underlying table)
如果 derived_merge 策略失效,将按以下策略执行查询:
在需要派生表之前,优化器会推迟派生表的物化,这能提高性能。举例:表 t1 与派生表进行联接,如果处理 t1 表时返回结果为空,则不需要进行下一步联接,这是可以完全跳过派生表的物化;
查询执行期间,优化器会给物化派生表添加索引,提升效率。
对于第二点,可以看这个例子:
mysql> explain select * from t1 join (select distinct a from t2) as derived_t2 on t1.a=derived_t2.a;
+----+-------------+------------+------------+-------+---------------+-------------+---------+----------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+-------------+---------+----------------+------+----------+-------------+
| 1 | PRIMARY | t1 | NULL | ALL | a | NULL | NULL | NULL | 100 | 100.00 | Using where |
| 1 | PRIMARY | | NULL | ref | | | 5 | join_test.t1.a | 10 | 100.00 | Using index |
| 2 | DERIVED | t2 | NULL | index | a | a | 5 | NULL | 1000 | 100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+-------------+---------+----------------+------+----------+-------------+
可以看到派生表确实是走索引的。不过也不是所有情况下都会给派生表添加索引,官档上原文:
The optimizer constructs an index over column f1 from derived_t2 if doing so would enable use of ref access for the lowest cost execution plan.