mysql8.0后引入了这个语法,在sql优化上有很大的助益,特别是相同的子查询重复引用的情况。
以下是一个对比。表ttt有很大量的数据
不使用with t as,重复引用的情况
select t1.f2,sum(t1.f3) from
-> (
-> select f2,f3,count(*) from ttt group by f2,f3
-> ) t1 left join
-> (select f2,f3,count(*) from ttt group by f2,f3) t2
-> on t1.f2=t2.f2
-> group by t1.f2;
+------+------------+
| f2 | sum(t1.f3) |
+------+------------+
| 3 | 0 |
| 1 | 0 |
| 4 | 0 |
+------+------------+
3 rows in set, 3 warnings (13.79 sec)
使用with t as
mysql> with tmp as (select f2,f3,count(*) from ttt group by f2,f3)
-> select t1.f2,sum(t1.f3) from
-> tmp t1 left join tmp t2
-> on t1.f2=t2.f2
-> group by t1.f2;
+------+------------+
| f2 | sum(t1.f3) |
+------+------------+
| 3 | 0 |
| 1 | 0 |
| 4 | 0 |
+------+------------+
3 rows in set, 3 warnings (6.80 sec)
可以看到时间相差接近一倍,因为聚合后数据行数较少,sql外层的时间可以忽略,sql内层的时间差基本就等于一次查询的时间。
对比执行计划
不使用with t
mysql> explain select t1.f2,sum(t1.f3) from
-> (
-> select f2,f3,count(*) from ttt group by f2,f3
-> ) t1 left join
-> (select f2,f3,count(*) from ttt group by f2,f3) t2
-> on t1.f2=t2.f2
-> group by t1.f2;
+----+-------------+------------+------------+------+---------------+-------------+---------+-------+----------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+-------------+---------+-------+----------+----------+-----------------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 12018185 | 100.00 | Using temporary |
| 1 | PRIMARY | <derived3> | NULL | ref | <auto_key0> | <auto_key0> | 5 | t1.f2 | 10 | 100.00 | NULL |
| 3 | DERIVED | ttt | NULL | ALL | NULL | NULL | NULL | NULL | 12018185 | 100.00 | Using temporary |
| 2 | DERIVED | ttt | NULL | ALL | NULL | NULL | NULL | NULL | 12018185 | 100.00 | Using temporary |
+----+-------------+------------+------------+------+---------------+-------------+---------+-------+----------+----------+-----------------+
4 rows in set, 1 warning (0.01 sec)
使用with t的
mysql> explain with tmp as (select f2,f3,count(*) from ttt group by f2,f3)
-> select t1.f2,sum(t1.f3) from
-> tmp t1 left join tmp t2
-> on t1.f2=t2.f2
-> group by t1.f2;
+----+-------------+------------+------------+------+---------------+-------------+---------+-------+----------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+-------------+---------+-------+----------+----------+-----------------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 12018185 | 100.00 | Using temporary |
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 5 | t1.f2 | 10 | 100.00 | NULL |
| 2 | DERIVED | ttt | NULL | ALL | NULL | NULL | NULL | NULL | 12018185 | 100.00 | Using temporary |
+----+-------------+------------+------------+------+---------------+-------------+---------+-------+----------+----------+-----------------+
3 rows in set, 1 warning (0.01 sec)
注意看DERIVED的select type,不使用with t的DERIVED类型的id有两个,说明执行了2次内层sql。
使用with t的DERIVED类型id只有一个,就证明只执行了1次内层sql。
学习原理,孵化思路;积累工具,下笔有道