为了验证上一篇文章的6个sql,创建了如下表
CREATE TABLE `l_f_test` (
`id` bigint(18) NOT NULL AUTO_INCREMENT,
`last_name` char(30) NOT NULL,
`first_name` char(30) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_l_f` (`last_name`,`first_name`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8mb4;
使用该存储过程:
CREATE DEFINER=`root`@`%` PROCEDURE `batch_insert_l_f_test`(in num int)
BEGIN
DECLARE i INT;
declare str1 char(2);
declare str2 char(2);
DECLARE str char(62) default 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
set i = 0;
WHILE i < num DO
set str1 = concat(substring(str,1+floor(rand()*61),1),substring(str,1+floor(rand()*61),1));
set str2=concat(substring(str,1+floor(rand()*61),1),substring(str,1+floor(rand()*61),1));
INSERT INTO `account`.`l_f_test`(`last_name`, `first_name`) VALUES (str1, str2);
set i = i + 1;
END WHILE;
END
插入了100万的数据:
mysql> select count(*) from l_f_test;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.29 sec)
表整体上看起来如下:
mysql> select * from l_f_test order by id limit 10;
+----+-----------+------------+
| id | last_name | first_name |
+----+-----------+------------+
| 1 | 4s | QJ |
| 2 | ZB | Uz |
| 3 | Y3 | cB |
| 4 | it | ng |
| 5 | 2k | im |
| 6 | Ky | lT |
| 7 | iD | aH |
| 8 | Vl | LG |
| 9 | Wy | L1 |
| 10 | F7 | sM |
+----+-----------+------------+
10 rows in set (0.00 sec)
1、SELECT * FROM test WHERE last_name='jones';
mysql> explain SELECT * FROM l_f_test WHERE last_name='js';
+----+-------------+----------+------+---------------+---------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+---------+---------+-------+------+--------------------------+
| 1 | SIMPLE | l_f_test | ref | idx_l_f | idx_l_f | 120 | const | 1099 | Using where; Using index |
+----+-------------+----------+------+---------------+---------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)
命中了索引idx_l_f;
2、SELECT * FROM testWHERE last_name='Jones' AND first_name='John';
mysql> explain SELECT * FROM l_f_test WHERE last_name='es' AND first_name='Jo';
+----+-------------+----------+------+---------------+---------+---------+-------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+---------+---------+-------------+------+--------------------------+
| 1 | SIMPLE | l_f_test | ref | idx_l_f | idx_l_f | 241 | const,const | 1 | Using where; Using index |
+----+-------------+----------+------+---------------+---------+---------+-------------+------+--------------------------+
1 row in set (0.00 sec)
命中了索引idx_l_f;
3、SELECT * FROM test WHERE last_name='Jones' AND (first_name='John' OR first_name='Jon');
mysql> explain SELECT * FROM l_f_test WHERE last_name='es' AND (first_name='hn' OR first_name='Jn');
+----+-------------+----------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+------+--------------------------+
| 1 | SIMPLE | l_f_test | range | idx_l_f | idx_l_f | 241 | NULL | 2 | Using where; Using index |
+----+-------------+----------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
命中了索引idx_l_f;
4、SELECT * FROM test WHERE last_name='Jones' AND first_name >='M' AND first_name < 'N';
mysql> explain SELECT * FROM l_f_test WHERE last_name='es' AND first_name >='M' AND first_name < 'N';
+----+-------------+----------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+------+--------------------------+
| 1 | SIMPLE | l_f_test | range | idx_l_f | idx_l_f | 241 | NULL | 5 | Using where; Using index |
+----+-------------+----------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
命中了索引idx_l_f;
5、SELECT * FROM test WHERE first_name='John';
mysql> explain SELECT * FROM l_f_test WHERE first_name='John';
+----+-------------+----------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | SIMPLE | l_f_test | index | NULL | idx_l_f | 241 | NULL | 994328 | Using where; Using index |
+----+-------------+----------+-------+---------------+---------+---------+------+--------+--------------------------+
1 row in set (0.01 sec)
匪夷所思,这个sql居然命中了索引,按照的说法它是不会命中的,这个问题我思考了很久,接下来是我的理解:按照我们的认知,这个sql是无法命中idx_l_f索引的,顺着思路,如果mysql不使用idx_l_f索引它该如何查询这个sql,不用想,肯定是遍历聚簇索引的叶子结点;如眼前所见,mysql使用了idx_l_f索引,此时它又会怎么样查询这个sql呢,我想也没有办法遍历该索引的非叶子结点,那就只能是遍历该索引的叶子结点。那为什么mysql选择了遍历idx_l_f索引的叶子结点而不是聚簇索引的叶子结点呢?或许是因为虽然该索引的最左前缀不是first_name,但是该索引的叶子结点各个数据页中的数据行可以满足当last_name一样时,first_name是有序的,也就是说,当last_name一样时,可以根据数据页的页目录快速定位到first_name的值为John的记录,因为有可能连续多个数据页的last_name都是一样的,此时定位first_name=John就会特别快,因此最终mysql的优化器选择了idx_l_f索引。用一句话总结就是:对于聚簇索引和idx_l_f索引,要遍历同样多的数据页,那还是选择一个有一定顺序的idx_l_f索引吧。
顺藤摸瓜,我又测试了如下sql:
mysql> explain SELECT * FROM l_f_test WHERE first_name='John' order by id;
+----+-------------+----------+-------+---------------+---------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+--------+-------------+
| 1 | SIMPLE | l_f_test | index | NULL | PRIMARY | 8 | NULL | 994328 | Using where |
+----+-------------+----------+-------+---------------+---------+---------+------+--------+-------------+
1 row in set (0.00 sec)
该sql只是比刚才的sql多了一个order by id就命中了聚簇索引,因为聚簇索引的叶子结点是天然按照id来排序的,大概是因为,mysql认为排序的成本要更高,其实这个例子也给了我们一个sql优化的思路,有时候我们发现自己写的sql就是不命中某个索引导致效率很低,那我们就可以通过改写sql来引导优化器来使用该索引,需要知道的是,优化器不是万能的,它有时候也会犯错,去选择一个不太合适的索引。
6、SELECT * FROM l_f_test WHERE last_name='Jones' OR first_name='John';
explain SELECT * FROM l_f_test WHERE last_name='Jones' OR first_name='John';
+----+-------------+----------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | SIMPLE | l_f_test | index | idx_l_f | idx_l_f | 241 | NULL | 994328 | Using where; Using index |
+----+-------------+----------+-------+---------------+---------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)
这个sql也和我们想象的不太一样,命中了idx_l_f索引,原因和上一个sql一样。
最后的话:以上仅仅是自己对于mysql理解,难免会有很多错误,望批评指正。