MySQL 中的索引

呦，又写BUG呢

已于 2024-08-06 19:10:33 修改

阅读量358

点赞数

分类专栏：数据库文章标签： mysql 数据库

于 2023-09-24 21:34:31 首次发布

本文链接：https://blog.csdn.net/qq_43686863/article/details/133242907

版权

数据库专栏收录该内容

13 篇文章 0 订阅

订阅专栏

文章目录

一、索引的创建
二、聚簇索引与非聚簇索引
三、联合索引和索引下推
四、B+ 树索引
五、索引失效

在这里插入图片描述

一、索引的创建

创建索引的方式包括两种：

隐式创建：数据库一般会在创建 PRIMARY KEY 和 UNIQUE 约束列时自动创建索引。
显示创建：使用 CREAT INDEX 语句创建，建立之后由数据库负责使用和维护索引。

下面展示了 MySQL 中隐式创建的索引和显示创建的索引，不过需要注意的是索引查询时显示的索引类型为 BTREE，但实际上 MySQL 默认使用的是 B+ 树。

DROP TABLE IF EXISTS student;

CREATE TABLE student
(
    student_id   INT PRIMARY KEY,
    student_name VARCHAR(50) UNIQUE,
    gender       ENUM ('Male', 'Female') DEFAULT 'Male',
    major        VARCHAR(50) NOT NULL,
    birthdate    DATE        NOT NULL
);

INSERT INTO student (student_id, student_name, gender, major, birthdate)
VALUES (1, 'Alice', 'Female', 'Computer Science', '2000-05-15'),
       (2, 'Andrew', 'Male', 'Engineering', '1999-09-20'),
       (3, 'Maria', 'Female', 'Mathematics', '2001-03-10'),
       (4, 'Samantha', 'Female', 'Physics', '2002-01-25'),
       (5, 'Michael', 'Male', 'Biology', '1998-11-05'),
       (6, 'Jessica', 'Female', 'History', '1997-06-30'),
       (7, 'William', 'Male', 'Geology', '1996-04-15'),
       (8, 'John', 'Male', 'Computer Science', '1995-08-08');

SHOW INDEXES FROM student;
# +-------+----------+------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
# |Table  |Non_unique|Key_name    |Seq_in_index|Column_name |Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|Index_comment|Visible|Expression|
# +-------+----------+------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
# |student|0         |PRIMARY     |1           |student_id  |A        |4          |null    |null  |    |BTREE     |       |             |YES    |null      |
# |student|0         |student_name|1           |student_name|A        |4          |null    |null  |YES |BTREE     |       |             |YES    |null      |
# +-------+----------+------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+

# 在 birthdate 字段上创建唯一索引
CREATE UNIQUE INDEX idx_major ON student (birthdate);

# 在 birthdate 字段上创建普通索引
CREATE INDEX idx_birthdate ON student (birthdate);
# +-------+----------+-------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
# |Table  |Non_unique|Key_name     |Seq_in_index|Column_name |Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|Index_comment|Visible|Expression|
# +-------+----------+-------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
# |student|0         |PRIMARY      |1           |student_id  |A        |4          |null    |null  |    |BTREE     |       |             |YES    |null      |
# |student|0         |idx_major    |1           |birthdate   |A        |8          |null    |null  |    |BTREE     |       |             |YES    |null      |
# |student|0         |student_name |1           |student_name|A        |4          |null    |null  |YES |BTREE     |       |             |YES    |null      |
# |student|1         |idx_birthdate|1           |birthdate   |A        |8          |null    |null  |    |BTREE     |       |             |YES    |null      |
# +-------+----------+-------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+

SHOW TABLE STATUS LIKE 'student';
# +-------+------+-------+----------+----+--------------+-----------+---------------+------------+---------+--------------+-------------------+-------------------+----------+------------------+--------+--------------+-------+
# |Name   |Engine|Version|Row_format|Rows|Avg_row_length|Data_length|Max_data_length|Index_length|Data_free|Auto_increment|Create_time        |Update_time        |Check_time|Collation         |Checksum|Create_options|Comment|
# +-------+------+-------+----------+----+--------------+-----------+---------------+------------+---------+--------------+-------------------+-------------------+----------+------------------+--------+--------------+-------+
# |student|InnoDB|10     |Dynamic   |8   |2048          |16384      |0              |16384       |0        |null          |2023-09-24 21:09:40|2023-09-24 21:09:04|null      |utf8mb4_0900_ai_ci|null    |              |       |
# +-------+------+-------+----------+----+--------------+-----------+---------------+------------+---------+--------------+-------------------+-------------------+----------+------------------+--------+--------------+-------+

二、聚簇索引与非聚簇索引

聚簇索引：聚簇索引（clustered index）也叫聚集索引、聚类索引，一张表有且仅有一个聚簇索引。

默认情况下，MySQL 在建表时会根据不同的场景选择不同的列建立聚簇索引：

聚簇索引默认是主键索引，会在定义主键时自动创建聚簇索引。
如果表中没定义主键，那么就选取首个非空且唯一的索引列来建立聚簇索引。
如果前两条都不满足，那么数据库就创建一个隐藏的 row-id 作为聚簇索引。

对于一个主键聚簇索引来说，叶子结点中按照主键顺序依次存放着整张表的全部记录。这个特性决定了索引组织表中的数据本身就是索引的一部分，每张表也只能拥有一个聚簇索引。

非聚簇索引：非聚簇索引（secondary index）也叫辅助索引、二级索引，一张表可以有多个非聚簇索引。

辅助索引不是建立在记录上的，而是建立在上述的聚簇索引上的。对于一个非聚簇索引，叶子结点中只存放当前关键字以及对应的聚簇索引关键字（见下例），当我们要查找整条记录时，需要在访问非聚簇索引的叶结点后继续访问聚簇索引查找完整信息，这个过程也叫做回表，总计需要查找两次索引。

不过，如果非聚簇索引存在索引覆盖也能避免回表。索引覆盖指的是一个索引覆盖或者说包含了所有需要查询的字段，即需要查询的字段正好是索引的字段，那么直接根据该索引就可以查到数据，无需回表查询。

基于上面示例中的 student 表，如果 student_id 聚簇索引采用 B+ 树来实现，那么这个聚簇索引的四个叶子结点中会按 student_id 递增依次存放对应记录。当我们通过 student_id 聚簇索引查询住址时，会直接在叶子结点中查找成功。如果我们再在 studnet_name 字段上建立一个非聚簇索引，叶子结点只会存放 studnet_name 与 student_id ，不会存储完整记录。当我们通过 studnet_name 非聚簇索引查询住址时，在非聚簇索引中根据 studnet_name 查找到 student_id 后，要继续根据 student_id 回表查找地址。

$\begin{aligned} student\_id（主键聚簇索引）：8\begin{cases} 8\begin{cases} [(7, William, Male, ...), (8, John, Male, ...)]\\\\ [(5, Michael, Male, ...), (6, Jessica, Female, ...)] \end{cases}\\\\ 4\begin{cases} [(3, Maria, Female, ...), (4, Samantha, Female, ...)]\\\\ [(1, Alice, Female, ...), (2, Andrew, Male, ...)] \end{cases}\\ \end{cases} \end{aligned}\\\\ \begin{aligned} student\_name（非聚簇索引）：William\begin{cases} William\begin{cases} [(Samantha, 4), (William, 7)]\\\\ [(Maria, 3), (Michael, 5)], \end{cases}\\\\ John\begin{cases} [(Jessica, 6), (John, 8)]\\\\ [(Alice, 1), ( Andrew, 2)] \end{cases} \end{cases} \end{aligned}$

三、联合索引和索引下推

联合索引即由多个列组成的索引，而索引下推是一种基于联合索引的查询优化策略。

在下面的例子中，需要查询计算机专业所有女生，如果不采用索引下推，因为最左前缀匹配原则，首先会根据辅助索引 idx_major_gender 匹配到四条 major = 'Computer Science' 的记录，然后根据这四条记录中的主键值依次回表获取对应的四条完整记录，最后通过 gender = 'Female' 对这四条记录进行后过滤，累计需要四次回表。

如果采用索引下推，gender 字段的过滤操作也会在联合索引中完成，即在辅助索引中完全匹配所有 major = 'Computer Science AND gender = 'Female' 的记录，这样只需要回表两次。可以说索引下推是对联合索引更充分的利用。

此外，我们可以使用 EXPLAIN 命令来查看查询执行计划，Using index condition 表明索引下推发挥了作用，Using where 则表明采用的是上述的后过滤。

DROP TABLE IF EXISTS student;

CREATE TABLE student
(
    student_id   INT PRIMARY KEY,
    student_name VARCHAR(50) UNIQUE,
    gender       ENUM ('Male', 'Female') DEFAULT 'Male',
    major        VARCHAR(50) NOT NULL,
    birthdate    DATE        NOT NULL
);

INSERT INTO student (student_id, student_name, gender, major, birthdate)
VALUES (1, 'Alice', 'Female', 'Computer Science', '2000-05-15'),
       (2, 'Andrew', 'Male', 'Engineering', '1999-09-20'),
       (3, 'Maria', 'Female', 'Mathematics', '2001-03-10'),
       (4, 'Samantha', 'Female', 'Physics', '2002-01-25'),
       (5, 'Michael', 'Male', 'Biology', '1998-11-05'),
       (6, 'Jessica', 'Female', 'Computer Science', '1997-06-30'),
       (7, 'William', 'Male', 'Computer Science', '1996-04-15'),
       (8, 'John', 'Male', 'Computer Science', '1995-08-08');

EXPLAIN
SELECT student_id, student_name
FROM student
WHERE major = 'Computer Science'
  AND gender = 'Female';
# +--+-----------+-------+----------+----+-------------+----+-------+----+----+--------+-----------+
# |id|select_type|table  |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra      |
# +--+-----------+-------+----------+----+-------------+----+-------+----+----+--------+-----------+
# |1 |SIMPLE     |student|null      |ALL |null         |null|null   |null|8   |12.5    |Using where|
# +--+-----------+-------+----------+----+-------------+----+-------+----+----+--------+-----------+

# 建立在 major 上的普通索引
CREATE INDEX idx_major ON student (major);

EXPLAIN
SELECT student_id, student_name
FROM student
WHERE major = 'Computer Science'
  AND gender = 'Female';
# +--+-----------+-------+----------+----+-------------+---------+-------+-----+----+--------+-----------+
# |id|select_type|table  |partitions|type|possible_keys|key      |key_len|ref  |rows|filtered|Extra      |
# +--+-----------+-------+----------+----+-------------+---------+-------+-----+----+--------+-----------+
# |1 |SIMPLE     |student|null      |ref |idx_major    |idx_major|202    |const|4   |50      |Using where|
# +--+-----------+-------+----------+----+-------------+---------+-------+-----+----+--------+-----------+

# 建立在 major 和 gender 上的联合索引
CREATE INDEX idx_major_gender ON student (major, gender);

EXPLAIN
SELECT student_id, student_name
FROM student
WHERE major = 'Computer Science'
  AND gender = 'Female';
# +--+-----------+-------+----------+----+--------------------------+----------------+-------+-----------+----+--------+---------------------+
# |id|select_type|table  |partitions|type|possible_keys             |key             |key_len|ref        |rows|filtered|Extra                |
# +--+-----------+-------+----------+----+--------------------------+----------------+-------+-----------+----+--------+---------------------+
# |1 |SIMPLE     |student|null      |ref |idx_major,idx_major_gender|idx_major_gender|204    |const,const|2   |100     |Using index condition|
# +--+-----------+-------+----------+----+--------------------------+----------------+-------+-----------+----+--------+---------------------+

四、B+ 树索引

下面是对 $m$ 阶 B+ 树的一些说明，不同资料上的定义不尽相同，这里参照维基百科对于 B+ 树的定义：

每个结点至多有 $m$ 棵子树， $m - 1$ 个关键字。
除根结点外所有内部结点（非叶节点）至少有 $\lfloor m/2 \rfloor$ 棵子树， $\lfloor m/2 \rfloor - 1$ 个关键字。
所有叶节点都在相同的高度上（绝对平衡），且结构为 $[P_0 k_0 P_1 k_1 \dots P_{n-1} K_{n-1} P_n]$ （键值交错）。
叶结点本身按关键字大小从小到大链接。

B+ 树相比于 B 树的优势：

B+ 树只有叶子节点同时存放键和数据，内部节点只存放键，所以同样大小的磁盘页上可以容纳更多键值对。在相同的数据量下，B+ 树通常具有更扁平的树结构，这意味着在查找操作中需要更少的磁盘 I/O 操作，因此查找效率更高。
B+ 树的叶子节点之间以双向链表的形式顺序连接，便于范围查找和遍历。

五、索引失效

索引失效主要以以下几种情况：

使用联合索引时未遵循最左前缀法则。
在索引字段上进行了计算或者调用了函数。如：SELECT * FROM tb_user WHERE SUBSTRING(address, 1, 2) = '上海';
字符串字段没有加引号，发生隐式转换。如：SELECT * FROM tb_user WHERE phone = 12345678910;。
采用模糊查询，且为头部模糊匹配。如：SELECT * FROM tb_user WHERE address like '%海';
查询条件中使用 OR，且 OR 前后任意一个条件字段没有索引。
由于数据分布等原因，使用索引后仍需要回表查询大量数据，此时全表扫描效率高于使用索引。

最左前缀匹配法则：在使用联合索引时，数据库会根据联合索引中的字段顺序，从左到右依次到查询条件中去匹配，索引字段顺序与条件字段顺序必须完全一致。如果在匹配过程中发生了跳跃就会使后面的字段索引失效。但如果查询条件由 AND 连接，优化器会自动调整顺序以避免索引失效，此时的索引失效主要考虑字段缺失。

最左前缀示例：

create index index_a_b_c on wide_table (a, b, c);

-- 索引生效
explain select * from wide_table where a = '1' and b = '1';
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
# |id|select_type|table     |partitions|type|possible_keys|key        |key_len|ref        |rows|filtered|Extra|
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
# |1 |SIMPLE     |wide_table|null      |ref |index_a_b_c  |index_a_b_c|2046   |const,const|1   |100     |null |
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+

-- 索引生效
explain select * from wide_table where b = '1' and a = '1';
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
# |id|select_type|table     |partitions|type|possible_keys|key        |key_len|ref        |rows|filtered|Extra|
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
# |1 |SIMPLE     |wide_table|null      |ref |index_a_b_c  |index_a_b_c|2046   |const,const|1   |100     |null |
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+

-- 索引生效
explain select * from wide_table where b > '1' and a = '1';
# +--+-----------+----------+----------+-----+-------------+-----------+-------+----+----+--------+---------------------+
# |id|select_type|table     |partitions|type |possible_keys|key        |key_len|ref |rows|filtered|Extra                |
# +--+-----------+----------+----------+-----+-------------+-----------+-------+----+----+--------+---------------------+
# |1 |SIMPLE     |wide_table|null      |range|index_a_b_c  |index_a_b_c|2046   |null|1   |100     |Using index condition|
# +--+-----------+----------+----------+-----+-------------+-----------+-------+----+----+--------+---------------------+

-- 未走索引
explain select * from wide_table where b = '1';
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----------+
# |id|select_type|table     |partitions|type|possible_keys|key |key_len|ref |rows |filtered|Extra      |
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----------+
# |1 |SIMPLE     |wide_table|null      |ALL |null         |null|null   |null|99551|10      |Using where|
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----------+

-- 索引部分失效
explain select * from wide_table where a = '1' and c = '1';
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
# |id|select_type|table     |partitions|type|possible_keys|key        |key_len|ref  |rows|filtered|Extra                |
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
# |1 |SIMPLE     |wide_table|null      |ref |index_a_b_c  |index_a_b_c|1023   |const|1   |10      |Using index condition|
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------

-- 索引部分失效
explain select * from wide_table where c = '1' and a = '1';
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
# |id|select_type|table     |partitions|type|possible_keys|key        |key_len|ref  |rows|filtered|Extra                |
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
# |1 |SIMPLE     |wide_table|null      |ref |index_a_b_c  |index_a_b_c|1023   |const|1   |10      |Using index condition|
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+