子查询的执行方式不同于oracle,它不会首先得到一个in 列表,然后通过这些值去进行过滤:
改进方法,重写查询或者先取出列表,然后再查询。
MySQL tries to “help” the subquery by
pushing a correlation into it from the outer table, which it thinks will let the subquery
find rows more efficiently. It rewrites the query as follows:
mysql> explain
-> SELECT * FROM sakila.film
-> WHERE film_id IN(
-> SELECT film_id FROM sakila.film_actor WHERE actor_id = 1);
+----+--------------------+------------+--------+------------------------+---------+---------+------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+--------+------------------------+---------+---------+------------+------+-------------+
| 1 | PRIMARY | film | ALL | NULL | NULL | NULL | NULL | 952 | Using where |
| 2 | DEPENDENT SUBQUERY | film_actor | eq_ref | PRIMARY,idx_fk_film_id | PRIMARY | 4 | const,func | 1 | Using index |
+----+--------------------+------------+--------+------------------------+---------+---------+------------+------+-------------+
2 rows in set (0.60 sec)
mysql> explain
-> SELECT * FROM sakila.film
-> WHERE EXISTS (
-> SELECT * FROM sakila.film_actor WHERE actor_id = 1
-> AND film_actor.film_id = film.film_id);
+----+--------------------+------------+--------+------------------------+---------+---------+---------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+--------+------------------------+---------+---------+---------------------------+------+-------------+
| 1 | PRIMARY | film | ALL | NULL | NULL | NULL | NULL | 952 | Using where |
| 2 | DEPENDENT SUBQUERY | film_actor | eq_ref | PRIMARY,idx_fk_film_id | PRIMARY | 4 | const,sakila.film.film_id | 1 | Using index |
+----+--------------------+------------+--------+------------------------+---------+---------+---------------------------+------+-------------+
2 rows in set (0.02 sec)
According to the EXPLAIN output, MySQL will table-scan the film table and execute
the subquery for each row it finds. This won’t cause a noticeable performance hit on
small tables, but if the outer table is very large, the performance will be extremely
bad. Fortunately, it’s easy to rewrite such a query as a JOIN:
mysql> SELECT film.* FROM sakila.film
-> INNER JOIN sakila.film_actor USING(film_id)
-> WHERE actor_id = 1;
Another good optimization is to manually generate the IN( ) list by executing the
subquery as a separate query with GROUP_CONCAT( ). Sometimes this can be faster than
a JOIN.
MySQL has been criticized thoroughly for this particular type of subquery execution
plan. Although it definitely needs to be fixed, the criticism often confuses two different
issues: execution order and caching. Executing the query from the inside out is
one way to optimize it; caching the inner query’s result is another. Rewriting the
query yourself lets you take control over both aspects. Future versions of MySQL
should be able to optimize this type of query much better, although this is no easy
task. There are very bad worst cases for any execution plan, including the inside-out
execution plan that some people think would be simple to optimize.