extra主要有几种情况:Using where、Using index、Using index condition、Using filesort、Using temporary、Using join buffer(Block Nested Loop)
数据准备:
CREATE TABLE `t_blog` (
`id` int(11) NOT NULL auto_increment,
`title` varchar(50) default NULL,
`typeId` int(11) default NULL,
`a` int(11) default '0',
PRIMARY KEY (`id`),
KEY `index_1` (`title`,`typeId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
一、【Using where】使用全表扫描来执行查询
create table user (
id int primary key,
name varchar(20),
sex varchar(5),
index(name)
)engine=innodb;
数据说明:
用户表:id主键索引,name普通索引(非唯一),sex无索引;
四行记录:其中name普通索引存在重复记录lisi;
实验语句:
explain select * from user where sex='no';
结果说明:
Extra为Using where说明,SQL使用了where条件过滤数据。
需要注意的是:
(1)返回所有记录的SQL,不使用where条件过滤数据,大概率不符合预期,对于这类SQL往往需要进行优化;
(2)使用了where条件的SQL,并不代表不需要优化,往往需要配合explain结果中的type(连接类型)来综合判断;
本例虽然Extra字段说明使用了where条件过滤,但type属性是ALL,表示需要扫描全部数据,仍有优化空间。
常见的优化方法为,在where过滤属性上添加索引。
画外音:本例中,sex字段区分度不高,添加索引对性能提升有限。
二、【Using index】 查询的列被索引覆盖
mysql> EXPLAIN select title from t_blog;
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | t_blog | index | NULL | index_1 | 158 | NULL | 7 | Using index |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
1 row in set
结果说明:
Extra为Using index说明,SQL所需要返回的所有列数据均在一棵索引树上,而无需访问实际的行记录。(只查询id/name字段,只会扫描索引文件而不会扫描表的所有数据行)
注意点:当只出现Using index,没出现Using where时,表示索引用于读取数据,比如:
mysql> EXPLAIN select title from t_blog;
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | t_blog | index | NULL | index_1 | 158 | NULL | 7 | Using index |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
1 row in set
当Using index 和 Using where同时出现时,表示索引用于查找动作,言外之意:查询列被索引覆盖,并且where筛选条件是索引列之一的但不是索引的前导列
比如:
mysql> EXPLAIN select title from t_blog where title = 'java';
+----+-------------+--------+------+---------------+---------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+---------+---------+-------+------+--------------------------+
| 1 | SIMPLE | t_blog | ref | index_1 | index_1 | 153 | const | 1 | Using where; Using index |
+----+-------------+--------+------+---------------+---------+---------+-------+------+--------------------------+
1 row in set
三、【Using index condition】 查询使用了索引,没有被索引覆盖,需要回表查询数据
画外音:该SQL语句与上一个SQL语句不同的地方在于,被查询的列,多了一个sex字段。
结果说明:
Extra为Using index condition说明,确实命中了索引,但不是所有的列数据都在索引树上,还需要访问实际的行记录。
四、【Using filesort】使用文件排序,排序的时候没有使用索引
Using filesort通常出现在order by,当试图对一个不是索引的字段进行排序时,mysql就会自动对该字段进行排序,这个过程就称为“文件排序”
mysql> EXPLAIN select * from t_blog order by title;
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------+
| 1 | SIMPLE | t_blog | index | NULL | index_1 | 158 | NULL | 7 | |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------+
1 row in set
已知title是index_1索引中的第一列索引,所以单独使用时索引生效,在排序时根据索引排序,不会产生文件排序。
mysql> EXPLAIN select * from t_blog order by typeId;
+----+-------------+--------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+------+---------+------+------+----------------+
| 1 | SIMPLE | t_blog | ALL | NULL | NULL | NULL | NULL | 7 | Using filesort |
+----+-------------+--------+------+---------------+------+---------+------+------+----------------+
1 row in set
虽然typeId是index_1索引的第二列,但由于缺失第一列,所以索引失效。在排序时无法根据索引排序,故mysql会自动进行排序,产生文件排序。
mysql> EXPLAIN select * from t_blog order by a;
+----+-------------+--------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+------+---------+------+------+----------------+
| 1 | SIMPLE | t_blog | ALL | NULL | NULL | NULL | NULL | 7 | Using filesort |
+----+-------------+--------+------+---------------+------+---------+------+------+----------------+
1 row in set
字段a上没有任何索引,所以在排序时无法根据索引排序,因此产生文件排序。
Using filesort出现的情况:排序时无法根据索引进行排序,mysql优化器只能自己进行排序,这种情况会大大降低性能,不可取。[千万注意 where条件中有用到索引,但未必是分组、排序用到的,两者是有区别的,]
五、【Using temporary】 使用了临时表,没有使用索引
表示在查询过程中产生了临时表用于保存中间结果。mysql在对查询结果进行排序时会使用临时表,常见于group by。
group by的实质是先排序后分组,同order by一样,group by和索引息息相关。
试图对一个没有索引的字段进行分组,会产生临时表:
mysql> EXPLAIN select title from t_blog group by typeId;
+----+-------------+--------+-------+---------------+---------+---------+------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+---------+---------+------+------+----------------------------------------------+
| 1 | SIMPLE | t_blog | index | NULL | index_1 | 158 | NULL | 7 | Using index; Using temporary; Using filesort |
+----+-------------+--------+-------+---------------+---------+---------+------+------+----------------------------------------------+
1 row in set
对一个有索引的字段进行分组就不会产生临时表:
mysql> EXPLAIN select title from t_blog group by title,typeId;
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | t_blog | index | NULL | index_1 | 158 | NULL | 7 | Using index |
+----+-------------+--------+-------+---------------+---------+---------+------+------+-------------+
1 row in set
当order by子句和group by子句的字段相同时不会产生临时表:
mysql> explain select * from t_blog b left join t_type t on b.typeId = t.id group by b.id order by b.id;
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+-------+
| 1 | SIMPLE | b | index | NULL | PRIMARY | 4 | NULL | 7 | |
| 1 | SIMPLE | t | eq_ref | PRIMARY | PRIMARY | 4 | blog.b.typeId | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+-------+
2 rows in set
当order by子句和group by子句的字段不同时就会产生临时表:
mysql> explain select * from t_blog b left join t_type t on b.typeId = t.id group by b.id order by b.title;
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+-----------------+
| 1 | SIMPLE | b | index | NULL | index_1 | 158 | NULL | 7 | Using temporary |
| 1 | SIMPLE | t | eq_ref | PRIMARY | PRIMARY | 4 | blog.b.typeId | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+-----------------+
2 rows in set
当时用left join时,若order by子句和group by子句都来自于从表时会产生临时表:
mysql> explain select * from t_blog b left join t_type t on b.typeId = t.id group by t.id order by t.id;
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+---------------------------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 7 | Using temporary; Using filesort |
| 1 | SIMPLE | t | eq_ref | PRIMARY | PRIMARY | 4 | blog.b.typeId | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+---------------------------------+
2 rows in set
mysql> explain select * from t_blog b left join t_type t on b.typeId = t.id group by t.id order by t.name;
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+---------------------------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 7 | Using temporary; Using filesort |
| 1 | SIMPLE | t | eq_ref | PRIMARY | PRIMARY | 4 | blog.b.typeId | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+---------------------------------+
2 rows in set
出现Using temporary意味着产生了临时表存储中间结果并且最后删掉了该临时表,这个过程很消耗性能。
六、【Using join buffer (Block Nested Loop)】
重点:join的时候被驱动表不能利用索引,为了加快访问速度利用了join buffer
实验语句:
explain select * from user where id in (select id from user where sex='no');
结果说明:
Extra为Using join buffer (Block Nested Loop)说明,需要进行嵌套循环计算。
画外音:内层和外层的type均为ALL,rows均为4,需要循环进行4*4次计算。
这类SQL语句性能往往也较低,需要进行优化。
典型的,两个关联表join,关联字段均未建立索引,就会出现这种情况。常见的优化方案是,在关联字段上添加索引,避免每次嵌套循环计算。
NLJ BNL算法补充
Nested Loop Join算法
NLJ 算法:将驱动表/外部表的结果集作为循环基础数据,然后循环从该结果集每次一条获取数据作为下一个表的过滤条件查询数据,然后合并结果。如果有多表join,则将前面的表的结果集作为循环数据,取到每行再到联接的下一个表中循环匹配,获取结果集返回给客户端。
Block Nested-Loop Join算法
BNL 算法:将外层循环的行/结果集存入join buffer, 内层循环的每一行与整个buffer中的记录做比较,从而减少内层循环的次数.
举例来说,外层循环的结果集是100行,使用NLJ 算法需要扫描内部表100次,如果使用BNL算法,先把对Outer Loop表(外部表)每次读取的10行记录放到join buffer,然后在InnerLoop表(内部表)中直接匹配这10行数据,内存循环就可以一次与这10行进行比较, 这样只需要比较10次,对内部表的扫描减少了9/10。所以BNL算法就能够显著减少内层循环表扫描的次数.
结尾:
explain是SQL优化中最常用的工具,搞定type和Extra,explain也就基本搞定了。