1.优化GROUP BY 语句
默认情况下,MySQL对所有GROUP BY col1,col2,...的字段进行排序。这与在查询中指定ORDER BY col1,col2,...类似。
因此,如果显示包括一个包含相同列的order by 子句,则对MySQL的实际执行性能没什么影响。
如果查询包括group by 但用户想要避免排序结果的消耗,则可以指定order by null 禁止排序,如下面的例子:
mysql> explain select payment_date,sum(amount) from payment group by payment_date;
+----+-------------+---------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
| 1 | SIMPLE | payment | NULL | ALL | NULL | NULL | NULL | NULL | 16125 | 100.00 | Using temporary,Using filesort |
+----+-------------+---------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
1 row in set, 1 warning (0.04 sec)
mysql> explain select payment_date,sum(amount) from payment group by payment_date order by null;
+----+-------------+---------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
| 1 | SIMPLE | payment | NULL | ALL | NULL | NULL | NULL | NULL | 16125 | 100.00 | Using temporary |
+----+-------------+---------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
1 row in set, 1 warning (0.00 sec)
从上面的例子可以看出,第一个sql语句需要进行“Filesort”,而第二个SQL由于ORDER BY NULL 不需要进行“Filesort”,而上文提过Filesort 往往非常耗费时间。
但是,在我的MySQL 8.0中,两种方式查询没有区别。
2.优化嵌套查询
MySQL 4.1开始支持SQL的子查询。这个技术可以使用SELECT 语句来创建一个单例的查询结果,然后把这个结果作为开始过滤条件在用在另一个查询中。
使用子查询可以一次性的完成很多逻辑上需要多个步骤才能完成的SQL操作,同时也可以避免事务或者表死锁。并且写起来也很容易。但是,有些情况下,子查询可以被更有效率的连接(JION)替代。
在下面的例子中,要从客户表customer 中找到不在支付表payment 中的所有客户信息:
mysql> explain select * from customer where customer_id not in (select customer_id from payment);
+----+--------------------+----------+------------+----------------+--------------------+--------------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+----------+------------+----------------+--------------------+--------------------+---------+------+------+----------+-------------+
| 1 | PRIMARY | customer | NULL | ALL | NULL | NULL | NULL | NULL | 599 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | payment | NULL | index_subquery | idx_fk_customer_id | idx_fk_customer_id | 2 | func | 26 | 100.00 | Using index |
+----+--------------------+----------+------------+----------------+--------------------+--------------------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.10 sec)
如果使用连接 JOIN 来完成这个查询工作,速度将会快很多。尤其是当payment 表中对customer_id 建有索引,性能将会更好,j具体查询如下:
mysql> explain select * from customer a left join payment b on a.customer_id = b.customer_id where b.customer_id is null;
+----+-------------+-------+------------+------+--------------------+--------------------+---------+----------------------+------+----------+-------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+--------------------+--------------------+---------+----------------------+------+----------+-------------------------+
| 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 599 | 100.00 | NULL |
| 1 | SIMPLE | b | NULL | ref | idx_fk_customer_id | idx_fk_customer_id | 2 | sakila.a.customer_id | 26 | 100.00 | Using where; Not exists |
+----+-------------+-------+------------+------+--------------------+--------------------+---------+----------------------+------+----------+-------------------------+
2 rows in set, 1 warning (0.00 sec)
从执行计划中可以看出查询关联的类型从 index_subquery 调整为了ref ,在MySQL 5.5以下版本(包括5.5),子查询的效率还是不如关联查询JOIN.
连接JOIN 之所以更有效率一些,是因为MySQL 不需要再内存中创建临时表来完成这个逻辑上需要两个步骤的查询工作。