MySQL查询开销计算原理和索引失效原因

最新推荐文章于 2025-06-08 15:16:10 发布

AirGo.

最新推荐文章于 2025-06-08 15:16:10 发布

阅读量1.8k

点赞数 8

CC 4.0 BY-SA版权

分类专栏： MySQL 文章标签： mysql 索引失效索引优化优化器查询开销计算

码字不易，转载请附原链，搬砖繁忙回复不及时见谅，技术交流请加QQ群：909211071，或关注微信公众号：程序猿AirGo

本文链接：https://blog.csdn.net/why444216978/article/details/104986241

MySQL 专栏收录该内容

35 篇文章

订阅专栏

码字不易，转载请附原链，搬砖繁忙回复不及时见谅，技术交流请加QQ群：909211071

抛出问题

今天运营同学说我们mis后台的一个列表页打不开了，经过排查每一条 sql 语句，几条都是一秒多，sql 大概都一样，拿出来其中一条：

select count(*) as count from t_price where day_time >= '2020-02-08' and status = 2;
+--------+
| count  |
+--------+
| 780711 |
+--------+
1 row in set (2.21 sec)

count(*) 看一下表里数据：

select count(*) from t_price;
+----------+
| count(*) |
+----------+
|  1815023 |
+----------+
1 row in set (0.26 sec)

才一百多万条数据，也不多啊，估计大概率是索引失效了，看一下联合索引的顺序：

CREATE TABLE `t_price` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'id',
  ......
  `created_time` int(11) NOT NULL DEFAULT '0' COMMENT '创建时间',
  `updated_time` int(11) NOT NULL DEFAULT '0' COMMENT '更新时间',
  PRIMARY KEY (`id`),
  KEY `product` (`day_time`,`status`,`product_id`) USING BTREE,
  KEY `market` (`day_time`,`status`,`market_id`) USING BTREE,
  KEY `day_time` (`day_time`) USING BTREE,
  KEY `status` (`status`) USING BTREE,
  KEY `province_id` (`province_id`) USING BTREE,
  KEY `customer_id` (`customer_id`) USING BTREE,
  KEY `dn_market` (`day_time`,`status`,`dn_market_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1799748 DEFAULT CHARSET=utf8 COMMENT='行情价格' |

没有任何问题啊，用 explain 看一下执行计划：

explain select count(*) as count from t_price where day_time >= '2020-02-08' and status = 2;
+----+-------------+---------+------+------------------------------------------+--------+---------+-------+--------+-------------+
| id | select_type | table   | type | possible_keys                            | key    | key_len | ref   | rows   | Extra       |
+----+-------------+---------+------+------------------------------------------+--------+---------+-------+--------+-------------+
|  1 | SIMPLE      | t_price | ref  | product,market,day_time,status,dn_market | status | 1       | const | 839884 | Using where |
+----+-------------+---------+------+------------------------------------------+--------+---------+-------+--------+-------------+
1 row in set (0.01 sec)

发现问题了，实际使用的索引是 status，而我们的语句希望使用的是 product ，可是为什么它明明能够满足联合索引 product 的最左原则，却使用单列索引 status 呢？

原理分析

突然想到《高性能MySQL》中的一句话：优化器会根据需要扫描的行数选择合适的索引，当行数相近时，优化器会选择开销最小的索引。那么何为开销最小呢？

首先记住一个公式：Cost = CPU Cost + IO Cost

I/O成本：从磁盘到内存这个加载的过程损耗的时间称之为I/O成本
CPU成本：读取以及检测记录是否满足对应到搜索条件、对结果进行排序等操作损耗的时间称之为CPU成本

页是磁盘和内存之间交互等基本单位，读取一个页花费的成本默认是1.0，读取及检测一条记录是否符合搜索条件的成本默认是0.2。

优化器的优化有两种，一种是基于成本的优化，另一种是基于索引统计数据的成本计算。

基于成本的优化

SQL执行的代价分为server层和engine层。server层主要是cpu的代价，而engine层主要是io的代价。5.7 引入了两个系统表mysql.server_cost和mysql.engine_cost来分别配置这两个层的代价，具体的值可以通过这两张表查询。

一条查询经历以下4个步骤：

根据搜索条件，找出所有可能使用的索引
计算全表扫描的代价
计算使用不同索引执行查询的代价
对比各种执行方案的代价，找出成本最低的那个

show table status like 't_price'\G;
*************************** 1. row ***************************
           Name: t_price
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1679769
 Avg_row_length: 631
    Data_length: 1061158912
Max_data_length: 0
   Index_length: 290586624
      Data_free: 7340032
 Auto_increment: 1827554
    Create_time: 2020-06-18 15:25:19
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment: 行情价格
1 row in set (0.00 sec)

计算全表扫描的代价

Data_length = 聚簇索引的页面数量 * 每个页面的大小

聚簇索引的页面数量 = 1061158912 / 16 / 1024 = 64768

记录行数：1690802

Cost = (64768 * 1.0 + 1.1) + (1690802 * 0.2 + 1.0)

使用二级索引的成本分析

where 查询：读取一个索引范围区间的 I/O 成本和读取一个页面是相同的
in区间查询：in列表大小 * 1.0

Cost = (索引范围区间数 * 1.0 + 1.0) + (读取二级索引记录数 * 0.2 + 回表检测的行数 * 0.2)

使用覆盖索引的成本分析

Cost = 区间数 * 1.0 + (读取二级索引记录数 * 0.2)

引出一个问题：当 in 查询条件很多的时候怎么办，比如3000个，那明显会无限拉大 I/O 开销，这种情况下需要用下面的计算方法

基于索引统计数据的成本计算

一个值的重复次数 ≈ 记录行数(rows) / 使用的索引基数(Cardinality)

需要回表的记录数：in条件个数 * 一个值的重复次数

Cost = I/O开销(和基于成本的计算方式相同) + 需要回表的记录数 * 0.2

计算方式选择方式

show variables like '%dive%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| eq_range_index_dive_limit | 10    |
+---------------------------+-------+
1 row in set (0.01 sec)

当 in 中的条件个数大于该配置值时，则采取基于索引统计数据计算 CPU 开销，否则用基于成本的计算。

回归问题并解决

回到我们的问题中来，status 的索引长度远小于 product 索引长度，所以优化器认为 status 索引 I/O 开销比 product 索引要小，而最终返回的记录数都是相同，所以综合起来，status 索引开销更低，优化器分析的结果如下：

              "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`t_price` FORCE INDEX (`product`)",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "range",
                      "rows": 826097,
                      "cost": 334857,
                      "chosen": true
                    }
                  ]
                },
                "cost_for_plan": 334857,
                "rows_for_plan": 826097,
                "chosen": true
              }

explain select count(*) as count from t_price force index (product) where day_time >= '2020-02-08' and status = 2;
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows   | Extra                    |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | t_price | range | product       | product | 36      | NULL | 839884 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)

explain select count(*) as count from t_price force index (product) where day_time >= '2020-02-08' and status = 2;
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows   | Extra                    |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | t_price | range | product       | product | 36      | NULL | 839884 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+

我们可以看到 status 索引的查询 type 为 ref，cost 为 335949；product 索引的查询 type 为 range，cost 为340441，所以优化器选择了 status 索引。

优化很简单，加 hint 即可：

select count(*) as count from t_price force index (product) where day_time >= '2020-02-08' and status = 2;
+--------+
| count  |
+--------+
| 780711 |
+--------+
1 row in set (0.50 sec)

多场景实验

优化完之后回想起之前也在网上查过索引失效的问题，不过文章之间都有相互矛盾的地方，本着“实验出真知”的原则，自己特意针对不同情况做了一些实验，将结果做个总结。

准备数据

CREATE TABLE `test` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(50) NOT NULL DEFAULT '',
  `age` int(11) NOT NULL DEFAULT 0,
  `nick` varchar(200) DEFAULT NULL,
  `status` int(11) NOT NULL DEFAULT 0 COMMENT '状态',
  PRIMARY KEY (`id`),
  KEY `name` (`name`) USING BTREE,
  KEY `age` (`age`) USING BTREE,
  KEY `nick` (`nick`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8

其中nick字段有索引，但是nick字段允许为NULL（用于验证判断null是否走索引的情况）

        for ($i = 1; $i <= 30000; $i++){
            $tmp = [
                'name' => 'abc' . rand(0, 10000),
                'age' => rand(0, 10000),
                'status' => $i % 2,
            ];
            if ($i % 5 != 0){
                $tmp['nick'] = 'xyz' . rand(0, 10000);
            }
            $table->insert($tmp);
        }

当字段允许为NULL时，is null和is not null

首先设置数据分布，这个数据比例一共测试了2次，这两次当查询 * 时都不会走索引，而只查询nick字段，也就是是满足覆盖索引条件时，可以走索引：

当 null 行数为5136时，查询 nick is not null 查询不走索引
当 null 行数为5148时，查询 nick is not null 查询不走索引

update test set nick = null where id >= 35150;
update test set nick = 'abc' where id < 35150;

mysql> select count(*) from test where nick is null;  
+----------+
| count(*) |
+----------+
|    24852 |
+----------+
1 row in set (0.01 sec)

mysql> select count(*) from test where nick is not null;
+----------+
| count(*) |
+----------+
|     5148 |
+----------+
1 row in set (0.03 sec)

查询所有字段时，索引失效：

mysql> explain select * from test where nick is not null;
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | ALL  | nick          | NULL | NULL    | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)

改成只查询 nick 字段，试一下满足覆盖索引的情况：

mysql> explain select nick from test where nick is not null;
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|  1 | SIMPLE      | test  | range | nick          | nick | 603     | NULL | 5146 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.02 sec)

我们再改变临界的一行，继续 select * 发现可以走索引了，结果如下：

update test set nick = null where id >= 35149;
update test set nick = 'abc' where id < 35149;

mysql> explain select * from test where nick is not null;
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | ALL  | nick          | NULL | NULL    | NULL | 29905 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.02 sec)

is null 在网上争议不大，所以我们不改变数据分布，直接测试 select * ... is null 的情况

mysql> explain select nick from test where nick is null;
+----+-------------+-------+------+---------------+------+---------+-------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref   | rows  | Extra                    |
+----+-------------+-------+------+---------------+------+---------+-------+-------+--------------------------+
|  1 | SIMPLE      | test  | ref  | nick          | nick | 603     | const | 14952 | Using where; Using index |
+----+-------------+-------+------+---------------+------+---------+-------+-------+--------------------------+
1 row in set (0.01 sec)

所以结论：当满足普通索引时，is null 可以走索引，is not null 在对比cost后，也大概率会走索引，并不是网上大多数文章说的那样不走索引；而满足覆盖索引条件时，也会走索引。

当使用!=条件时（这里需要注意了啊，网上很多文章一口咬定!=不会走索引）

实验过程和上面一样，也是测试了2次，下面直接贴其中一个过程和结果：

update test set name = '' where id >= 35150;
update test set name = 'abc' where id < 35150;

mysql> select count(*) from test where name != ''; 
+----------+
| count(*) |
+----------+
|     5148 |
+----------+
1 row in set (0.04 sec)

mysql> select count(*) from test where name = '';
+----------+
| count(*) |
+----------+
|    24852 |
+----------+
1 row in set (0.01 sec)

mysql> explain select * from test where name != '';
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | ALL  | name          | NULL | NULL    | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.02 sec)

mysql> explain select name from test where name != '';
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|  1 | SIMPLE      | test  | range | name          | name | 152     | NULL | 5148 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.01 sec)

结论：!= 可以走索引，!= 在对比cost后，也大概率会走索引，但并不是网上大多数文章说的那样不走索引；而满足覆盖索引条件时，也会走索引。

使用or的情况

当我们status字段没有索引的时候，发现是全表扫描

explain select * from test where age = 2644 or status = 1;

我们为status字段加上索引之后，发现用到了age和status两个索引

explain select * from test where age = 2644 or status = 1;

结论：or是否走索引取决于or前后的两个字段是否都建立了索引。

not in情况

not in 我们分2种情况测试，分别是使用主键索引和使用普通索引，每一个里边又分别测试了查询所有字段、不满足覆盖索引的字段，以及满足索引的字段。

mysql> explain select * from test where age not in (9837);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | ALL  | age           | NULL | NULL    | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.03 sec)

mysql> explain select name from test where age not in (9837);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | ALL  | age           | NULL | NULL    | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)

mysql> explain select age from test where age not in (9837);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows  | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | test  | range | age           | age  | 4       | NULL | 15389 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.08 sec)

mysql> explain select * from test where id not in (9837);
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows  | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | range | PRIMARY       | PRIMARY | 4       | NULL | 14920 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
1 row in set (0.01 sec)

mysql> explain select name from test where id not in (9837);
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows  | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | range | PRIMARY       | PRIMARY | 4       | NULL | 14920 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
1 row in set (0.01 sec)

mysql> explain select age from test where id not in (9837);
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows  | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | range | PRIMARY       | PRIMARY | 4       | NULL | 14920 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
1 row in set (0.01 sec)

结论：使用主键索引查询任何字段都可以走索引，满足覆盖索引时可以走索引，这两种功能情况都是因为无需二次回表。

having情况

having 我们同样也分两种情况测试，一种使用主键索引，一种是普通索引。

使用普通索引

mysql> explain select name from test where age HAVING (2644);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | ALL  | NULL          | NULL | NULL    | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)

mysql> explain select age from test where age HAVING (2644);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows  | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | test  | index | NULL          | age  | 4       | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)

使用主键索引

mysql> explain select status from test where id HAVING (12345);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | test  | ALL  | NULL          | NULL | NULL    | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)

mysql> explain select name from test where id HAVING (12345);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows  | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | test  | index | NULL          | name | 152     | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)

mysql> explain select age from test where id HAVING (12345);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows  | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | test  | index | NULL          | age  | 4       | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)

mysql> explain select id from test where id HAVING (12345);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows  | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | test  | index | NULL          | age  | 4       | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)

结论：使用having时，当使用普通索引时，只有满足覆盖索引时才会走索引，否则不会走索引；当使用主键索引时，只有返回建立索引的字段时才会走索引，否则不会走索引。