物理查询优化（一）：索引的使用（案例分析）

数据库内核

已于 2024-02-28 09:45:31 修改

阅读量602

点赞数 29

分类专栏：查询优化 MySQL 文章标签：数据库数据结构 mysql

于 2024-02-27 21:24:27 首次发布

本文链接：https://blog.csdn.net/weixin_47156401/article/details/136332019

版权

MySQL 同时被 2 个专栏收录

52 篇文章 1 订阅

订阅专栏

查询优化

4 篇文章 0 订阅

订阅专栏

注意：本文主要是围绕 MySQL 展开，测试用例都是基于 MySQL 8.0.32；

如何利用索引

如果某个列上存在索引，并不意味着索引能够有效使用。查询优化器使用索引原则如下：

索引列作为条件出现在 where、having、on 子句中；
索引列是被连接的表（内表）对象的列且存在于连接条件中；

注：除了上述两种情况外，还有一些特殊情况可以使用索引，如：排序操作、在索引列上求 min、max 值等。

创建表如下：

mysql> create table a(a1 int, a2 varchar(9), a3 int, key (a1));

mysql> insert into a values(1, '1', 1), (2, '2', 2), (3, '3', 3), (4, '4', 4), (5, '5', 5), (6, '6', 6); 

mysql> show create table a;
+-------+---------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                         |
+-------+---------------------------------------------------------------+
| a     | CREATE TABLE `a` (
  `a1` int DEFAULT NULL,
  `a2` varchar(9) COLLATE utf8mb4_general_ci DEFAULT NULL,
  `a3` int DEFAULT NULL,
  KEY `a1` (`a1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci |
+-------+--------------------------------------------------------------+
1 row in set (0.00 sec)

下面对索引使用情况举例说明：

1）对表做查询，没有索引列作为过滤条件（即 where 条件中没有索引列的过滤条件），只能顺序扫描；

2）对表做查询，有索引列作为过滤条件，可走索引扫描，例如：


mysql> explain analyze select * from a where a1 = 1;
+-----------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                             |
+-----------------------------------------------------------------------------------------------------+
| -> Index lookup on a using a1 (a1=1)  (cost=0.35 rows=1) (actual time=0.109..0.140 rows=1 loops=1)
 |
+-----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

3）对表做查询，有列对象作为过滤条件，但索引列被运算符“-”处理，查询优化器不能在执行前进行取反运算，不可利用索引扫描，只能做顺序扫描，例如：

mysql> explain select * from a where -a1 = -2;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    6 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain analyze select * from a where -a1 = -2;
+--------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                  |
+--------------------------------------------------------------------------------------------------------+
| -> Filter: (-(a.a1) = <cache>(-(2)))  (cost=0.85 rows=6) (actual time=0.165..0.303 rows=1 loops=1)
    -> Table scan on a  (cost=0.85 rows=6) (actual time=0.118..0.288 rows=7 loops=1)
 |
+--------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

4）对表做查询，有列对象作为过滤条件，且目标列没有超出索引列，可做只读索引扫描（即覆盖索引扫描），这种扫描方式比单纯的索引扫描的效率更高（因为无需回表操作），例如：

mysql> explain analyze select a1 from a where a1 = 1;
+--------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                      |
+--------------------------------------------------------------------------------------------------------------+
| -> Covering index lookup on a using a1 (a1=1)  (cost=0.35 rows=1) (actual time=0.089..0.120 rows=1 loops=1)
 |
+--------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

5）对表做查询，有索引存在，但选择条件不包括索引列对象，只能使用顺序扫描，例如：

mysql> explain select * from a where a3 = 1;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    6 |    16.67 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain analyze select * from a where a3 = 1;
+-------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                   |
+-------------------------------------------------------------------------------------+
| -> Filter: (a.a3 = 1)  (cost=0.85 rows=1) (actual time=0.148..0.377 rows=1 loops=1)
    -> Table scan on a  (cost=0.85 rows=6) (actual time=0.144..0.367 rows=7 loops=1)
 |
+-------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

6）对表做查询，有索引存在，选择条件包括索引列对象，可使用索引扫描，对选择条件中不存在索引的列作为过滤器被使用（即没有索引的条件在 server 层过滤），例如：

mysql> explain analyze select * from a where a1 = 1 and a3 = 1;
+----------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                       |
+----------------------------------------------------------------------------------------------------------------+
| -> Filter: (a.a3 = 1)  (cost=0.27 rows=0.2) (actual time=0.118..0.162 rows=1 loops=1)
    -> Index lookup on a using a1 (a1=1)  (cost=0.27 rows=1) (actual time=0.113..0.157 rows=1 loops=1)
 |
+----------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

7）对表做查询，有索引存在，选择条件包括索引列对象，但索引列对象位于一个表达式中，参与了运算，不是 “key=常量” 格式，则索引不可使用，只能是顺序扫描，例如：

// 索引列在表达式中
mysql> explain analyze select * from a where a1 + a3 = 2;
+----------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                            |
+----------------------------------------------------------------------------------------------------+
| -> Filter: ((a.a1 + a.a3) = 2)  (cost=0.85 rows=6) (actual time=0.153..0.332 rows=1 loops=1)
    -> Table scan on a  (cost=0.85 rows=6) (actual time=0.141..0.313 rows=6 loops=1)
 |
+----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

// 索引列参与运算，不可使用索引扫描
mysql> explain analyze select * from a where a1 = 2 - a3;
+----------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                            |
+----------------------------------------------------------------------------------------------------+
| -> Filter: (a.a1 = (2 - a.a3))  (cost=0.85 rows=1) (actual time=0.135..0.290 rows=1 loops=1)
    -> Table scan on a  (cost=0.85 rows=6) (actual time=0.130..0.280 rows=6 loops=1)
 |
+----------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
  
// 索引列不在 1 + 2 的表达式中，被独立使用，可使用索引扫描
mysql> explain analyze select * from a where a1 < 1 + 2;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index range scan on a using a1 over (NULL < a1 < 3), with index condition: (a.a1 < <cache>((1 + 2)))  (cost=1.16 rows=2) (actual time=0.133..0.219 rows=2 loops=1)
 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

8）对表做查询，有索引列对象作为过滤条件，操作符是范围操作符 > 或

9）对表做查询，有索引列对象作为过滤条件，操作符是范围操作符 <>，可做索引扫描（PG 不可做索引扫描），且可走 ICP下推；

mysql> explain analyze select * from a where a1 <> 1;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                              |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index range scan on a using a1 over (NULL < a1 < 1) OR (1 < a1), with index condition: (a.a1 <> 1)  (cost=3.21 rows=6) (actual time=0.153..0.389 rows=5 loops=1)
 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select * from a where a1 <> 1;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | a     | NULL       | range | a1            | a1   | 5       | NULL |    6 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

10）对表做查询，有索引列对象作为过滤条件，操作符是范围操作符 between and，可做索引扫描（且可走 ICP），例如：

mysql> explain select * from a where a1 between 1 and 3;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | a     | NULL       | range | a1            | a1   | 5       | NULL |    3 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain analyze select * from a where a1 between 1 and 3;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                            |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index range scan on a using a1 over (1 <= a1 <= 3), with index condition: (a.a1 between 1 and 3)  (cost=1.61 rows=3) (actual time=0.121..0.259 rows=3 loops=1)
 |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

对于索引列，索引可用的条件如下：

在 where、join/on、having 的条件中出现 “key 常量” 格式的条件子句（=、>、>=、、between and），但索引列不能参与带有变量的表达式运算；
索引列的值选择率越低，索引越有效，通常认为选择率小于 0.1 则索引扫描效果会好一些；

索引列的位置对索引使用的影响

在查询语句中，索引列出现在不同的位置，对索引的使用有着不同的影响。

对目标列、where 等条件子句的影响

1）索引列出现在目标列，可使用索引（PG 不能使用，但并不是全部不能使用），例如：

mysql> explain analyze select a1 from a;
+-----------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                             |
+-----------------------------------------------------------------------------------------------------+
| -> Covering index scan on a using a1  (cost=0.85 rows=6) (actual time=0.118..0.289 rows=6 loops=1)
 |
+-----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

此时，优化器自动选择走 a1 索引；

2）聚集函数 MIN/MAX 用在索引列上，出现在目标列，可使用索引，例如：

// 下述语句既没有走全表扫描也没有走索引是因为没有对数据进行额外的排序操作，此时 MySQL 能直接拿到最大值和最小值
mysql> explain select max(a1) from a;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                        |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Select tables optimized away |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain analyze select max(a1) from a;
+-------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                               |
+-------------------------------------------------------------------------------------------------------+
| -> Rows fetched before execution  (cost=0.00..0.00 rows=1) (actual time=0.000..0.000 rows=1 loops=1)
 |
+-------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

3）索引列出现在 where 子句中，可使用索引；

4）索引列出现在 join/on 子句中，作为连接条件，不一定使用索引，例如：

mysql> create table b(b1 int, b2 varchar(9), b3 int, key (b1));

mysql> insert into b values(1, '1', 1), (2, '2', 2), (3, '3', 3), (4, '4', 4), (5, '5', 5), (6, '6', 6); 

// 对表 a 进行全表扫描，然后 “常量传递”，所以对表 b 进行索引扫描
mysql> explain select a.*, b.* from a join b on a1 = b1;
+----+-------------+-------+------------+------+---------------+------+---------+----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref      | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+----------+------+----------+-------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | a1            | NULL | NULL    | NULL     |    6 |   100.00 | Using where |
|  1 | SIMPLE      | b     | NULL       | ref  | b1            | b1   | 5       | db1.a.a1 |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+------+---------------+------+---------+----------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

 mysql> explain analyze select a.*, b.* from a join b on a1 = b1;
+------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                     |
+------------------------------------------------------------------------------------------------------------+
| -> Nested loop inner join  (cost=2.95 rows=6) (actual time=0.261..0.913 rows=6 loops=1)
    -> Filter: (a.a1 is not null)  (cost=0.85 rows=6) (actual time=0.137..0.304 rows=7 loops=1)
        -> Table scan on a  (cost=0.85 rows=6) (actual time=0.133..0.297 rows=7 loops=1)
    -> Index lookup on b using b1 (b1=a.a1)  (cost=0.27 rows=1) (actual time=0.064..0.085 rows=1 loops=7)
 |
+------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

从上述查询执行计划来看，尽管连接条件 a1 = b1 中的 a1 列和 b1 列都是索引列，但在表 a 中使用了顺序扫描，在表 b 中使用了索引扫描。（代价估算决定哪种扫描方式最优，上述示例对表 a 做顺序扫描，然后把扫描到的 a1 值做 “常量传递” 优化，使得表 b 可以使用索引扫描）

5）索引列出现在 join/on 子句中，作为限制条件满足 “key 常量” 格式可用索引，例如：

mysql> explain select a.*, b.* from a join b on a1 = b1 and a1 = 1;
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | a     | NULL       | ref  | a1            | a1   | 5       | const |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | b     | NULL       | ref  | b1            | b1   | 5       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)

从查询执行计划来看，a1 和 b1 都走了索引，能走索引的原因是在连接条件中，出现了 a1 = 1 条件，使得查询优化器可以根据 ”常量传递“ 优化技术推知 b1 = 1，所以在表 a 和 b 上可以各自使用索引扫描。

6）索引列出现在 where 子句中，可用索引，例如：

mysql> explain select a.*, b.* from a join b on a1 = b1 where a1 = 1;
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | a     | NULL       | ref  | a1            | a1   | 5       | const |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | b     | NULL       | ref  | b1            | b1   | 5       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)

表 a 和表 b 可以使用索引的原因和第 5）条类似（”常量传递“优化）。

7）索引列出现在 where 子句中，但与子查询比较，格式上不满足 “key 常量”，不可用索引，例如：

mysql> explain select b1 from b where b1 in (select a1 from a);
+----+-------------+-------+------------+-------+---------------+------+---------+----------+------+----------+----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref      | rows | filtered | Extra                      |
+----+-------------+-------+------------+-------+---------------+------+---------+----------+------+----------+----------------------------+
|  1 | SIMPLE      | b     | NULL       | index | b1            | b1   | 5       | NULL     |    6 |   100.00 | Using where; Using index   |
|  1 | SIMPLE      | a     | NULL       | ref   | a1            | a1   | 5       | db1.b.b1 |    1 |   100.00 | Using index; FirstMatch(b) |
+----+-------------+-------+------------+-------+---------------+------+---------+----------+------+----------+----------------------------+
2 rows in set, 1 warning (0.01 sec)

mysql> explain analyze select b1 from b where b1 in (select a1 from a);
+---------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                                              |
+---------------------------------------------------------------------------------------------------------------------+
| -> Nested loop semijoin  (cost=2.95 rows=6) (actual time=0.241..0.688 rows=6 loops=1)
    -> Filter: (b.b1 is not null)  (cost=0.85 rows=6) (actual time=0.152..0.355 rows=6 loops=1)
        -> Covering index scan on b using b1  (cost=0.85 rows=6) (actual time=0.149..0.349 rows=6 loops=1)
    -> Covering index lookup on a using a1 (a1=b.b1)  (cost=0.27 rows=1) (actual time=0.054..0.054 rows=1 loops=6)
 |
+--------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

问题：为什么表 a 也是覆盖索引扫描？

答：对于 select a1 from a 语句，优化器会直接选择走 a1 索引进行全索引扫描；

对 group by 子句的影响

1）索引列出现在 group by 子句中，触发索引扫描（PG 可能不触发，待验证），例如：

mysql> explain analyze select a1 from a group by a1;
+-------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                           |
+-------------------------------------------------------------------------------------------------------------+
| -> Group (no aggregates)  (cost=1.45 rows=2) (actual time=0.155..0.308 rows=7 loops=1)
    -> Covering index scan on a using a1  (cost=0.85 rows=6) (actual time=0.112..0.289 rows=7 loops=1)
 |


+-------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

注意：Group (no aggregates)：是指上述查询只是简单的对数据进行分组，没有执行任何聚合操作；

2）where 子句出现索引列，且 group by 子句出现索引列，索引扫描被使用，例如：

mysql> explain analyze select a1 from a where a1 > 2 group by a1;
+------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                           |
+------------------------------------------------------------------------------------------------------------------------------+
| -> Group (no aggregates)  (cost=1.76 rows=2) (actual time=0.127..0.224 rows=5 loops=1)
    -> Filter: (a.a1 > 2)  (cost=1.26 rows=5) (actual time=0.087..0.212 rows=5 loops=1)
        -> Covering index range scan on a using a1 over (2 < a1)  (cost=1.26 rows=5) (actual time=0.076..0.199 rows=5 loops=1)
 |

+------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

3）where 子句中出现非索引列，且 group by 子句出现索引列，索引扫描被使用（PG 可能无法使用索引，待验证），例如：

mysql> explain analyze  select a1 from a where a2 > 2 group by a1;
+-------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                           |
+-------------------------------------------------------------------------------------------------+
| -> Group (no aggregates)  (cost=1.05 rows=1) (actual time=0.334..0.517 rows=5 loops=1)
    -> Filter: (a.a2 > '2')  (cost=0.85 rows=2) (actual time=0.273..0.502 rows=5 loops=1)
        -> Index scan on a using a1  (cost=0.85 rows=6) (actual time=0.152..0.490 rows=7 loops=1)
 |

+-------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

对 having 子句的影响

索引列出现在 having 子句中与出现在 where 子句中类似，是否能够使用索引，要看具体情况。

1）where 子句中出现非索引列，且 group by 和 having 子句出现索引列，索引扫描被使用，例如：

mysql> explain analyze select a1 from a where a3 > 2 group by a1 having a1 > 2;
+-----------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------+
| -> Filter: (a.a1 > 2)  (cost=1.05 rows=1) (actual time=0.316..0.507 rows=5 loops=1)
    -> Group (no aggregates)  (cost=1.05 rows=1) (actual time=0.315..0.504 rows=5 loops=1)
        -> Filter: (a.a3 > 2)  (cost=0.85 rows=2) (actual time=0.257..0.491 rows=5 loops=1)
            -> Index scan on a using a1  (cost=0.85 rows=6) (actual time=0.147..0.483 rows=7 loops=1)
 |
+-----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select a1 from a where a3 > 2 group by a1 having a1 > 2;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | a     | NULL       | index | a1            | a1   | 5       | NULL |    6 |    33.33 | Using where |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql>  explain analyze select a1 from a having a1 > 2;;
+----------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                     |
+----------------------------------------------------------------------------------------------------+
| -> Filter: (a.a1 > 2)  (cost=0.85 rows=6) (actual time=0.530..0.638 rows=5 loops=1)
    -> Covering index scan on a using a1  (cost=0.85 rows=6) (actual time=0.468..0.628 rows=7 loops=1)
 |
+----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

对 order by 子句的影响

索引列出现在 order by 子句中，不可使用索引（PG 可能可用，待验证）。

1）order by 子句中出现索引列，不可使用索引扫描，例如：

mysql> create table c(c1 int, c2 varchar(9), c3 int, key (c1));

mysql> insert into c values(1, '2', 1), (5, '4', 6), (3, '1', 4), (6, '6', 2), (4, '5', 5), (2, '3', 3); 

mysql> explain analyze select * from c order by c1;
+------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                           |
+------------------------------------------------------------------------------------+
| -> Sort: c.c1  (cost=0.85 rows=6) (actual time=0.376..0.377 rows=6 loops=1)
    -> Table scan on c  (cost=0.85 rows=6) (actual time=0.169..0.324 rows=6 loops=1)
 |
+------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain analyze select * from c where c1 > 2 order by c1;
+--------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                          |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index range scan on c using c1 over (2 < c1), with index condition: (c.c1 > 2)  (cost=2.06 rows=4) (actual time=0.120..0.309 rows=4 loops=1)
 |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

注意：基于优化器的考量上述第一条语句并没有走索引扫描，所以需要对查询的结果再次进行排序，即 order by 无法被优化掉！但第二条语句中的 order by 可以被优化掉！

2）order by 子句中出现非索引列，不可使用索引扫描，例如：

mysql> explain analyze select * from a order by a3;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                           |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Sort: a.a3  (cost=0.85 rows=6) (actual time=0.347..0.348 rows=7 loops=1)
    -> Table scan on a  (cost=0.85 rows=6) (actual time=0.119..0.302 rows=7 loops=1)
 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

对 distinct 的影响

索引列出现在 distinct 子句管辖的范围中，与索引没有关联。

1）distinct 子句管辖范围内出现索引列，可使用索引扫描，例如：

mysql> explain analyze select distinct a1 from a;
+-------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                        |
+-------------------------------------------------------------------------------------------------------+
| -> Group (no aggregates)  (cost=1.45 rows=2) (actual time=0.193..0.341 rows=7 loops=1)
    -> Covering index scan on a using a1  (cost=0.85 rows=6) (actual time=0.149..0.322 rows=7 loops=1)
 |
+-------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

2）distinct 子句管辖范围内出现索引列，因 where 子句内使用索引列，故其可使用索引扫描，例如：

mysql> explain analyze select distinct a1 from a where a1 > 2;
+------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                        |
+------------------------------------------------------------------------------------------------------------------------------+
| -> Group (no aggregates)  (cost=1.76 rows=2) (actual time=0.121..0.218 rows=5 loops=1)
    -> Filter: (a.a1 > 2)  (cost=1.26 rows=5) (actual time=0.080..0.205 rows=5 loops=1)
        -> Covering index range scan on a using a1 over (2 < a1)  (cost=1.26 rows=5) (actual time=0.078..0.200 rows=5 loops=1)
 |
+------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

联合索引对索引使用的影响

准备数据：

mysql> create table d(d1 int, d2 varchar(9), d3 int, key (d1, d3));

mysql> insert into d values(1, '2', 1), (5, '4', 6), (3, '1', 4), (6, '6', 2), (4, '5', 5), (2, '3', 3); 

mysql> show create table d;
+-------+------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                              |
+-------+------------------------------------------------------------------+
| d     | CREATE TABLE `d` (
  `d1` int DEFAULT NULL,
  `d2` varchar(9) COLLATE utf8mb4_general_ci DEFAULT NULL,
  `d3` int DEFAULT NULL,
  KEY `d1` (`d1`,`d3`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci |
+-------+-----------------------------------------------------------------+
1 row in set (0.00 sec)

1）使用联合索引的全部索引键，可触发索引的使用，例如：

mysql> explain analyze select * from d where d1 = 1 and d3 = 1;
+-----------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                   |
+-----------------------------------------------------------------------------------------------------------+
| -> Index lookup on d using d1 (d1=1, d3=1)  (cost=0.35 rows=1) (actual time=0.135..0.167 rows=1 loops=1)
 |
+-----------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

2）使用联合索引的前缀部分索引键，如 “key_part_1 常量”，可触发索引的使用，例如：

mysql> explain analyze select * from d where d1 > 3;
+--------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                          |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index range scan on d using d1 over (3 < d1), with index condition: (d.d1 > 3)  (cost=1.61 rows=3) (actual time=0.129..0.272 rows=3 loops=1)
 |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

3）使用联合索引的非前缀部分索引键，如 “key_part_1 常量”，不可触发索引的使用，例如：

mysql> explain analyze select * from d where d3 > 3;
+---------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                   |
+---------------------------------------------------------------------------------------+
| -> Filter: (d.d3 > 3)  (cost=0.85 rows=2) (actual time=0.154..0.270 rows=3 loops=1)
    -> Table scan on d  (cost=0.85 rows=6) (actual time=0.117..0.263 rows=6 loops=1)
 |
+---------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

4）使用联合索引的全部索引键，但索引键不是 and 操作，不可触发索引的使用，例如：

mysql> explain analyze select * from d where d1 = 1 or d3 = 2;
+------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                   |
+------------------------------------------------------------------------------------------------------+
| -> Filter: ((d.d1 = 1) or (d.d3 = 2))  (cost=0.85 rows=2) (actual time=0.146..0.302 rows=2 loops=1)
    -> Table scan on d  (cost=0.85 rows=6) (actual time=0.143..0.293 rows=6 loops=1)
 |
+------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

触发联合索引的情况如下：

使用联合索引全部索引键，但索引键之间需要使用 and 连接；
使用联合索引部分键时，需满足最左匹配原则；

多个索引对索引使用的影响

准备数据：

mysql> create table e(e1 int unique not null, e2 varchar(9), e3 int, e4 int, key (e1, e3));

mysql> insert into e values(1, '2', 1, 6), (5, '4', 6, 4), (3, '1', 4, 1), (6, '6', 2, 5), (4, '5', 5, 2), (2, '3', 3, 3); 

mysql> show create table e;
+-------+-----------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                    |
+-------+-----------------------------------------------------------------------+
| e     | CREATE TABLE `e` (
  `e1` int NOT NULL,
  `e2` varchar(9) COLLATE utf8mb4_general_ci DEFAULT NULL,
  `e3` int DEFAULT NULL,
  `e4` int DEFAULT NULL,
  UNIQUE KEY `e1` (`e1`),
  KEY `e1_2` (`e1`,`e3`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci |


+-------+-----------------------------------------------------------------------+
1 row in set (0.00 sec)

1）where 条件子句出现两个可利用的索引，优选最简单的索引，例如：


mysql> explain analyze select * from e where e1 > 2 and e3 > 4;
+-------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                 |
+-------------------------------------------------------------------------------------------------------------------+
| -> Filter: ((e.e1 > 2) and (e.e3 > 4))  (cost=1.06 rows=1) (actual time=0.130..0.201 rows=2 loops=1)
    -> Index range scan on e using e1 over (2 < e1)  (cost=1.06 rows=4) (actual time=0.090..0.194 rows=4 loops=1)
 |

+-------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

从上述执行计划可知，选择索引 e1，因为 e1_2 是联合索引，比 e1 更为复杂；

2）where 条件子句出现两个可利用索引且索引键有重叠部分（e1 出现在两个索引中），优选最简单的索引，例如：

mysql> explain analyze select * from e where e1 > 2 and (e1 > 4 and e3 > 2);
+---------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                |
+---------------------------------------------------------------------------------------------------------------------+
| -> Filter: ((e.e1 > 2) and (e.e1 > 4) and (e.e3 > 2))  (cost=0.91 rows=1) (actual time=0.091..0.150 rows=1 loops=1)
    -> Index range scan on e using e1 over (4 < e1)  (cost=0.91 rows=2) (actual time=0.087..0.144 rows=2 loops=1)
 |
+---------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

从查询执行计划看，索引扫描选择了 e1 而不是联合索引 e1_2；

总结：where 条件子句出现多个可利用的索引时，优选最简单的索引，与索引字段先后顺序无关！

数据库内核

关注

29
点赞
踩
9

收藏

觉得还不错? 一键收藏
打赏
0
评论
物理查询优化（一）：索引的使用（案例分析）

从上述查询执行计划来看，尽管连接条件 a1 = b1 中的 a1 列和 b1 列都是索引列，但在表 a 中使用了顺序扫描，在表 b 中使用了索引扫描。从查询执行计划来看，a1 和 b1 都走了索引，能走索引的原因是在连接条件中，出现了 a1 = 1 条件，使得查询优化器可以根据 ”常量传递“ 优化技术推知 b1 = 1，所以在表 a 和 b 上可以各自使用索引扫描。9）对表做查询，有索引列对象作为过滤条件，操作符是范围操作符，可做索引扫描（PG 不可做索引扫描），且可走 ICP下推；
复制链接

扫一扫