High-Performance Index Usage in MySQL

Article link: https://blog.csdn.net/weixin_43197795/article/details/107935734

High-Performance Indexes

Limitations of B-Tree Indexes

create table peoples (
    last_name varchar(50) not null,
    first_name varchar(50) not null,
    dob date not null,
    gender enum('f','m') not null,
    key(last_name, first_name, dob)
);

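If a lookup does not start from the leftmost column of the index, the index cannot be used for it.
Columns in the index cannot be skipped.
If the query has a range condition on one column, the columns to its right cannot be used to narrow the index lookup.
The WHERE clause is not the only thing that determines which index is chosen; the columns being selected also have an influence, though with lower priority.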
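
The rules above can be tried against the peoples table defined earlier. The following is only a sketch: the literal values are made up for illustration, and the comments describe what the optimizer is generally expected to do, which EXPLAIN can confirm on real data.

```sql
-- Starts from the leftmost column: the index can be used
explain select * from peoples where last_name = 'Allen';

-- All three index columns are matched in order
explain select * from peoples
where last_name = 'Allen' and first_name = 'Cuba' and dob = '1960-01-01';

-- Leftmost column last_name is missing: no index lookup is possible
explain select * from peoples where first_name = 'Cuba';

-- first_name is skipped, so only last_name narrows the lookup
explain select * from peoples
where last_name = 'Allen' and dob = '1960-01-01';

-- Range on last_name: first_name to its right no longer narrows the lookup
explain select * from peoples
where last_name like 'A%' and first_name = 'Cuba';
```

In the EXPLAIN output, the key and key_len columns indicate whether the index was chosen and how many of its columns were actually used.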

In short, column order matters a great deal in a B-Tree index!

Limitations of Hash Indexes

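A hash index contains only hash values and row pointers, not the column values themselves, so it cannot be used to avoid reading the rows.
Hash index entries are not stored in index-value order, so the index cannot be used for sorting.
Hash indexes do not support lookups that match only part of the indexed columns.
Hash indexes support only equality comparisons (=, IN(), <=>) and no range queries of any kind.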

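The <=> operator is the NULL-safe equality comparison: it returns 1 when both operands are NULL and 0 when only one of them is, so col <=> NULL can be used in place of col IS NULL.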
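
As a sketch of these limits, the example below assumes a hypothetical MEMORY table named testhash (the MEMORY engine supports explicit hash indexes). The table, its columns, and the values are illustrative and do not come from the original article.

```sql
create table testhash (
    fname varchar(50) not null,
    lname varchar(50) not null,
    key using hash (fname)
) engine=MEMORY;

-- Equality lookups can use the hash index
select lname from testhash where fname = 'Peter';
select lname from testhash where fname in ('Peter', 'Mary');

-- A range condition cannot use the hash index
select lname from testhash where fname > 'P';

-- NULL-safe equality compared with ordinary equality
select 1 <=> 1, null <=> null, 1 <=> null, null = null;
-- returns 1, 1, 0, NULL
```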

Choosing a Prefix Index Length

-- xxx stands for the table that holds the idcard column
select count(distinct left(idcard, 5)) / count(*) from xxx;
select count(distinct left(idcard, 6)) / count(*) from xxx;
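
A common way to use these ratios is to compare them with the selectivity of the full column and pick the shortest prefix that comes close. The sketch below assumes xxx is a placeholder for the real table name; the index name idx_idcard_prefix and the chosen length 6 are hypothetical.

```sql
-- Selectivity of the full column is the upper bound for any prefix
select count(distinct idcard) / count(*)          as full_sel,
       count(distinct left(idcard, 5)) / count(*) as sel5,
       count(distinct left(idcard, 6)) / count(*) as sel6
from xxx;

-- Once a prefix length gets close enough to full_sel, index that prefix
alter table xxx add key idx_idcard_prefix (idcard(6));
```

Keep in mind that a prefix index cannot serve as a covering index for the column, and it cannot be used for ORDER BY or GROUP BY on it.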

Choosing Column Order in a Composite Index

When sorting and grouping do not need to be considered, putting the most selective column first is usually a good choice.
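
Selectivity can be estimated with the same distinct-count ratio used for prefix indexes. The query below is a minimal sketch against the peoples table from earlier; the column aliases are made up for illustration.

```sql
-- The column with the higher ratio is the more selective one
select count(distinct last_name) / count(*)  as last_name_sel,
       count(distinct first_name) / count(*) as first_name_sel,
       count(*)                              as total_rows
from peoples;
```

These are averages over the whole table, so a column that looks selective overall can still perform poorly for a few very common values.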

Efficient Index Examples

Deferred Join

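-- Assume peoples has a primary key id and an index key(age, name)
select * from peoples where age=20 and `name` like '张三%';
select * from peoples t1 join (select id from peoples where age=20 and `name` like '张三%') t2 on t1.id=t2.id;
-- In the second form the subquery is satisfied by a covering index; full rows are fetched only for the matching ids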

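How much this helps depends on the number of rows the WHERE clause returns. It is not guaranteed to be faster, so each case needs its own analysis.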
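
One way to do that analysis is to compare the two plans with EXPLAIN. This is a sketch under the same assumptions as the example above (a primary key id and key(age, name) on peoples); the actual plan depends on the data and the MySQL version.

```sql
-- Plain query: each index match needs a lookup of the full row
explain select * from peoples where age = 20 and `name` like '张三%';

-- Deferred join: the plan row for the subquery is expected to show
-- "Using index" in Extra, meaning it was answered from the index alone
explain select * from peoples t1
join (select id from peoples where age = 20 and `name` like '张三%') t2
  on t1.id = t2.id;
```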

Using Indexes in WHERE and ORDER BY

create index idx on table1 (c1, c2, c3, c4);
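# The leftmost-prefix rule must hold; the ORDER BY columns must follow the index column order and share one direction (asc and desc cannot be mixed); once the WHERE clause contains a range condition, the index columns to its right can no longer be used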

# Sorting uses the index
select id from table1 order by c1, c2, c3, c4;
select id from table1 where c1 = 1 and c2 = 2 order by c3, c4;
select id from table1 where c2 = 2 and c1 = 1 order by c3, c4;
select id from table1 where c1 = 1 and c2 = 2 order by c3 desc, c4 desc;
select id from table1 where c1 = 1 and c2 = 2 order by c3;
select id from table1 where c1 = 1 and c2 = 2 order by c3 desc;
# Sorting does not use the index
select id from table1 order by c2, c3;
select id from table1 where c1 = 1 and c2 = 2 order by c3 asc , c4 desc;
select id from table1 where c1 = 1 and c2 >= 2 order by c3;
select id from table1 where c1 = 1 and c2 = 2 order by c4;
select id from table1 where c2 = 2 order by c1;

Summary: Conditions for Sorting with an Index

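If there is a WHERE clause, its conditions on the index columns must all be equality comparisons and must form a leftmost prefix; the order in which they are written does not matter.
The ORDER BY columns must follow the index column order exactly and share the same direction (asc and desc cannot be mixed).
In a join, the ORDER BY columns must all come from the first table, and the two conditions above must also hold before the index can be used for sorting.
With or without an index, the order in which WHERE conditions are written does not affect query efficiency.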
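
Whether a particular query sorts through the index can be checked with EXPLAIN. The sketch below reuses table1 and idx from the example above; the comments state the generally expected outcome, not a guaranteed one.

```sql
-- Equality on c1, c2 and ORDER BY on the next index columns:
-- Extra is expected not to contain "Using filesort"
explain select id from table1 where c1 = 1 and c2 = 2 order by c3, c4;

-- Range condition on c2 breaks the ordering for c3:
-- Extra is expected to contain "Using filesort"
explain select id from table1 where c1 = 1 and c2 >= 2 order by c3;
```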

Index Case Studies

1. Design indexes to support several filter combinations rather than creating a new index for every query

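-- e.g. suppose the following index exists
create index idx_sex_country_age on people(sex, country, age);
-- To find 20-year-olds living in China, is a new index idx_country_age needed? No: listing every sex value with IN() satisfies the leftmost column, so idx_sex_country_age can be used
select * from people where sex in (0, 1) and country='China' and age = 20;
-- Keep the list of values used for this trick short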

2. Put range-queried columns as late as possible in the index

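The age column is often queried by range, so age should come as late as possible in the index, letting the optimizer use as many index columns as it can.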

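Prefer IN where it is possible: IN is an equality comparison, so it is not subject to the third B-Tree limitation (columns to the right of a range condition cannot be used). The number of values in the IN list still has to be kept in mind.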
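
As an illustration, a small range over a discrete column can be rewritten as an IN list. This sketch reuses the people table and idx_sex_country_age from above; the age values are made up.

```sql
-- Range condition: any index column after age could not be used for lookup
select * from people
where sex in (0, 1) and country = 'China' and age between 18 and 20;

-- Same rows expressed as multiple equality values
select * from people
where sex in (0, 1) and country = 'China' and age in (18, 19, 20);
```

Each IN list multiplies the number of combinations the optimizer has to evaluate (here 2 x 3 = 6), which is why long lists should be avoided.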

3. Avoid multiple range conditions

Maintaining Indexes and Tables

Checking Tables

check table xxx;
repair table xxx; -- not supported by InnoDB
alter table xxx engine=INNODB; -- a no-op ALTER that rebuilds the InnoDB table

Updating Index Statistics

-- Show detailed information about a table's indexes
show index from xxxx;
-- or
select * from information_schema.STATISTICS where TABLE_NAME='xxxx';

-- Update index statistics; automatic statistics updates are usually enabled by default
analyze table xxxx;
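
For InnoDB tables, the automatic update mentioned above is controlled by server variables. The sketch below assumes an InnoDB table and uses xxxx as the same placeholder table name; it only inspects settings and refreshes the estimates.

```sql
-- Check whether persistent statistics and automatic recalculation are on
show variables like 'innodb_stats_persistent';
show variables like 'innodb_stats_auto_recalc';

-- Refresh the statistics manually, then read the Cardinality column
analyze table xxxx;
show index from xxxx;
```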

Reducing Index and Data Fragmentation

optimize table xxxx;
-- For engines that do not support OPTIMIZE TABLE, rebuild the table instead
alter table xxx engine=INNODB; -- a no-op ALTER that rebuilds the InnoDB table

Index Usage Frequency

# List all indexes that have not been used
select * from sys.schema_unused_indexes;
# Show overall index usage
show status like '%Handler_read%';

Reading the Index Usage Counters

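Handler_read_key counts the requests to read a row through an index key. A high value suggests that queries are making good use of indexes; a very low value means the indexes are rarely used, so they are contributing little to performance.
Handler_read_rnd_next counts the requests to read the next row in the data file. A high value means queries are inefficient and an index should probably be added. It climbs when many table scans are running, which usually indicates that the table is not indexed properly or that queries are not written to take advantage of the indexes.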
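
To see how a single query affects these counters, the session values can be reset and read back. This is a sketch: the query on peoples is only illustrative, and FLUSH STATUS normally requires the RELOAD privilege.

```sql
-- Reset the session status counters
flush status;

-- Run the query under investigation (no leftmost column, so a scan is likely)
select * from peoples where first_name = 'Cuba';

-- A large Handler_read_rnd_next next to a small Handler_read_key
-- suggests the query scanned the table instead of using an index
show session status like 'Handler_read%';
```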

Summary

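Single-row access is slow. If the server reads a block from storage only to fetch one row from it, most of that work is wasted. It is best when each block read contains as many of the needed rows as possible, and indexes can be used to create this locality of reference.
Accessing ranges of data in order is fast, for two reasons. First, it does not require repeated disk seeks. Second, if the server can read the data in the order it is needed, no extra sort is required, and GROUP BY queries no longer have to sort rows and gather them into groups before computing aggregates.
Covering indexes are fast: no lookup back into the table is needed, which avoids a large amount of single-row access (random disk I/O).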