文章目录
- 1. SQL慢原因、MySQL瓶颈
- 2. 索引 - 排好序的快速查找数据结构
我的MySQL版本是5.7.29
1. SQL慢原因、MySQL瓶颈
2. 索引 - 排好序的快速查找数据结构
实际上索引也是一张表 - 保存了索引字段、指向数据记录的地址
2.0 基础知识 - 每条SQL只能使用一个索引
默认开启 主键索引和当前查询使用的索引合并
- 通俗理解索引: 查询英文单词cat,肯定是在字典索引页中查找到C开头的一块,然后在查找ca开头的单词,最后从中检索到cat单词。如果没有索引页。那么查找cat单词,你则需要从第一页开始查找是否有cat单词。
- 通常所说的索引: 一般是B+树(多路搜索树)索引 - 叶子节点才存储真实信息
- 索引劣势: 一旦建立索引,我们需要维护数据库文件、以及对应的索引文件,频繁的变动数据库数据(增删改),都要同时更新索引文件与数据库文件的一 一对应( 但利远大于弊 )
- 覆盖索引: 不包含where,则selec必须只能包含索引列
- 复合索引: 只能依次从左到右使用
- 建议:
1. 一张表建议不要超过5个索引、
2. 范围查询会导致部分索引列失效、
3. 表连接时小表驱动大表(后面的大表会使用到索引进行筛选行数据 - 加快检索效率)
2.1 索引原理 - 简单解析
图片来自:https://blog.csdn.net/v_JULY_v/article/details/6530142
B树 - 每检索一个节点,都有关键字对应的数据地址信息 - 缩小数据入内存的次数 - 最后搜索的终点不一定是叶子节点 - 访问磁盘的次数比B+树多。因为同大小的节点存储的索引信息比B+树少
每个节点存储的信息
B+树 - 非叶子节点只存储关键字以及子节点指针信息 - 最后找到叶子节点才是数据库数据地址信息 - 并且叶子节点有序且用链表相连起来 - 故与B树相比非叶子节点存储更小的信息,只具有查找下一个节点的作用而已 - 最后搜索的终点都是叶子节点
单列索引 - B+树检索
复合索引 - B+树的检索
叶子节点存储 - 真实数据加上对应的对应数据项的硬盘地址
2.2 索引分类
2.3 创建索引情况 - 适合、不适合
字段值越是唯一,创建索引后查询的效率越高。公式 ( distinct(字段)/总记录数 - 越接近1,值的唯一性越高,创建索引后查询效率越高 )
2.3.1 索引失效
2.3.2 案例
1. 全值匹配 - 与索引列顺序必须一致,且不能跳过索引列
explain select * from member where name ='6190c108b6' ;
explain select * from member where name ='6190c108b6' and year = 57 ;
explain select * from member where name ='6190c108b6' and year = 57 and nickname='6a54008181' ;
2. 最佳左前缀原则 - 必须按索引列顺序进行where
explain select * from member where year = 57 ;
explain select * from member where nickname='6a54008181' ;
explain select * from member where name ='6190c108b6' and nickname='6a54008181' ;
3. where时不在索引列上做任何操作,否则索引失效
explain select * from member where name like'619%';
explain select * from member where name like concat(substr('6190c108b6', 1, 3), '%');
explain select * from member where substr(name, 1, 3) like '619%';
4. 覆盖索引:selec不能出现索引列
explain select * from member;
# 覆盖索引
explain select name from member;
explain select name,year from member;
explain select name,year,nickname from member;
5. 使用!=、<>、is not null、or都会导致索引失效 - is null会使用到索引下推
explain select * from member where name != '6190c108b6';
explain select * from member where name <> '6190c108b6';
explain select * from member where name is not null;
explain select * from member where name like '619%' or year > 25;
# 使用到索引下推ICP
explain select * from member where name is null;
6. like模糊匹配,尽量避免通配符在开头被使用
explain select *from member where name like '%619%';
explain select *from member where name like '619%';
explain select *from member where name like '61%9';
# 解决开头是通配符,索引失效的方法 - select时不要使用*,而是使用当前索引的索引列 - select不用遵从索引列顺序 - 一旦非当前索引列,索引列会失效
explain select id,name,year,nickname from member where name like '%619%'
explain select year from member where name like '%619%'
explain select year,nickname from member where name like '%619%'
#索引失效
explain select name,hobby from member where name like '%619%'
# select主键列也可使得索引不失效
explain select id from member where name like '%619%'
# 在添加一个 idx_name_hobby 索引 - 这样也可以使索引不失效 - 因为name已经架好梯子
explain select id,hobby from member where name like '%619%';
6. 字符串不加单引号,导致索引列隐式调用函数转换成字符、从而导致索引失效
explain select * from member where hobby = 1;
# 上面等价于:
# ① explain select * from member where hobby+0 = 1;
# ② explain select * from member where CONVERT(hobby, SIGNED) = 1;
explain select * from member where hobby = '1';
7. order by排序按前面已架好的梯子进行排序字段 - 否则失效"
explain select * from member where name = '6190c108b6' order by year;
explain select * from member where name = '6190c108b6' and nickname = '6a54008181' order by year;
# 没有用到索引排序
explain select * from member where name = '6190c108b6' order by nickname;
explain select * from member where name = '6190c108b6' order by year,nickname;
# 没有用到索引排序
explain select * from member where name = '6190c108b6' order by nickname,year;
# 没有用到索引排序 - 只有name用到索引排序,后面的二级排序需要数据行入内存进行快速排序
explain select * from member where name = '6190c108b6' order by name,nickname;
explain select * from member where name = '6190c108b6' order by name,year;
7. order by时,前面的where是范围查询的那个索引字段失效,则where塔好的路从范围查询索引字段不起作用,order by排序时需要重新按最左前缀进行排序字段 - 否则索引排序失效
# 强迫使用索引进行检索、排序数据
explain select * from member force index(idx_name_year_nickname) where name like '%619%' order by name;
# 不强迫使用索引、如果要用到索引,只能select索引列才可以
explain select id,name,year,nickname from member where name like '%619%' order by name;
explain select id,name,year,nickname from member where name like '%619%' order by name,year;
explain select id,name,year,nickname from member where name like '%619%' order by year;
explain select * from member where name like '619%' order by year;
explain select * from member where name like '619%' order by name,year;
# 如果order by 没有使用索引排序,可以强制使用索引排序
explain select * from member force index(idx_year_name) where year >50 order by year;
2.4 Explain、show Warnings - 分析SQL - 模拟优化器执行SQL语句
语法 - 下面两个语句必须同时选中运行
# 查看SQL语句使用索引的情况
explain SQL;
# 查看被优化器优化后的SQL语句
show warnings;
2.4.1 id - 表的读取顺序 - 越大越优先读取 - 相同则从上至下读取
表连接
2.4.2 select_type - 查询类型
explain select *from food where id = 1
union
select *from food where id = 2;
explain
select student.name, (select name from class where id = student.class_id) className
from student,food where student.food_id = food.id
2.4.3 type - 联接类型
1. 经常出现的type类型性能:system > const > eq_ref > ref > range > index > all
2. 保证每表查询至少达到range级别,最好能达到ref级别 - All级别尽量的消除
从最好到最坏查询性能由上至下排列
explain
select *from student where id = 1;
# name字段有创建索引 - 如果name不是索引则不会是ref,而是全表检索All
explain select *
from student,food
where student.food_id = food.id and student.name = 'lrc';
explain
select *from student where id between 1 and 10;
这两种情况不知道怎么复现 - 复现不了
# unique_subquery
value IN (SELECT 主键/唯一键列 FROM single_table WHERE 筛选表达式)
# index_subquery
value IN (SELECT 索引列 FROM single_table WHERE some_expr)
2.4.4 ref - 筛选值取自哪个表的那列数据
explain select stu.name, (select id from food where food.id = stu.food_id)
from student stu
2.4.5 extra - 查询时的详细信息
想要查询快 - 尽量的解决掉 using filesort、using temporary 字符串出现
2.4.5.1 Using index condition - 索引下推 - 子节点进行判断是否符合在进行IO查找
- 减少IO请求访问基表的次数,优势就是在找到索引以及准备访问数据库时在进行一次where筛选。
- 索引下堆的只是索引列字段哦,非索引列字段是不可以下推的
- 触发条件:①非聚集索引(主键、外键索引)
ICP的执行流程
2.4.5.2 案例
not exists
explain SELECT * from student LEFT JOIN test ON student.food_id=test.id
WHERE test.id is null;
/*
在这种情况下,MySQL使用t1.id的值扫描t1并查找t2中的行。如果MySQL在t2中发现一个匹配的行,它知道t2.id绝不会为NULL,并且不再扫描t2内有相同的id值的行。换句话说,对于t1的每个行,MySQL只需要在t2中查找一次,无论t2内实际有多少匹配的行。*/
using index condition - 索引下堆的只是索引列字段哦,非索引列字段是不可以下推的
# 索引列只使用到name
explain select * from member where name = 'lrc' and nickname like '%f%';
# 索引列使用到name、nickname
explain select * from member where name = 'lrc' and nickname like 'f%';
2.4.6 key_len - 索引长度
2.4.7 explain案例
explain select id from student;
explain select * from student;
explain
select * from student
left join class on class_id = class.id
left join food on food_id = food.id
where student.id >=2;
# 非索引列
explain select count(year) from student;
# 主键索引列
explain select count(id) from student;
1. 如果没有where筛选语句,要使用到索引,则select的列必须是索引列,只要含有非索引列都会导致索引失效
# member有 主键索引id,普通复索引idx_name_nickname(name, nickname))
explain select * from member where name = 'lrc';
explain select * from member where name = 'lrc' and nickname = 'cc1';
explain select * from member where nickname = 'cc1';
explain select name,nickname from member;
explain select nickname from member;
explain select name,nickname,year from member;
2. 外键索引失效情况 - where仅是外键情况且select只能外键列(外键索引才起效)
explain select * from student where food_id >1;
explain select food_id from student where food_id >1;
explain select * from student where name like 'l%'
explain select food_id,name,nickname from student where food_id >1;
# name列有创索引,但依然不起作用
explain select food_id,name from student;
3. MySQL根据情况是否使用索引进行完全加载表数据
# 背景Member表有100条数据,Class表有5条数据,food表5条数据
explain select * from member left join class on class_id = class.id;
explain select * from class left join member on class_id = class.id;
select * from
class left join member on member.class_id = class.id
left join food on member.food_id = food.id;
2.4.8 查询优化
如果max_length_for_sort_data设得太高,导致数据行不能在sort_buffer_size一次性在运存中排好序,则会将排序数据拆分成多个临时文件。这对单路排序是很不好的,因为要频繁的读取临时文件。很大可能单路排序的临时文件多于双路排序的。因为单路排序是整条数据行记录读入内存,而双路排序只将排序列的值读入内存。双路排序这从根本上减少了临时文件的生成 。
2.4.8.1 双路排序、单路排序
如果需要输出的行记录数据大于max_length_for_sort_data,则使用双路进行排序输出
双路排序 - 排序内存中行记录地址、需要排序的列值
取出符合where筛选条件的行记录地址以及需要被排序的列值,在内存中进行排序。排好序,根据行记录地址在从数据库取出完成的行记录 - 使用运行内存少,但访问磁盘频繁 - 最少都要访问两次数据库表
单路排序 - 整行记录的列值,而不是需要排序的列值
取出符合where筛选条件的完整行记录,在内存中进行排序。排好序,直接读取在内存排好序的行记录即可 - 使用运行内存大,但访问磁盘次数少 - 运存充足访问一次数据库表即可
2.4.8.2 order by案例
explain select * from member where year > 20 order by year;
# 需要强制使用索引进行排序才能生效
explain select * from member force index(idx_year_name_nickname) where year > 20 order by year;
# 排序字段没有遵守最左前缀原则
explain select * from member force index(idx_year_name_nickname) where year > 20 order by name;
explain select * from member force index(idx_year_name_nickname) where year > 20 order by year,name;
explain select * from member force index(idx_year_name_nickname) where year > 20 order by year,name,nickname;
explain select *from member force index(idx_year_name_nickname) order by year;
explain select *from member force index(idx_year_name_nickname) order by year desc, name desc;
# 无效,排序列的升、降序不一致,故会文件排序而不是索引排序
explain select *from member force index(idx_year_name_nickname) order by year desc, name asc;
2.4.8.3 group by案例
explain select class_id,max(year) from member group by class_id;
2.4.8.4 in、exists案例
# 主查询的数据多于子查询的数据,则使用in
# 因为只要加载子表的记录入内存比较即可,字表记录数较少
select id from A where id in (select * from B)
# 子查询的数据多于主查询的数据,则使用exists - 只返回true\false而不是子表的记录数
# 因为子查询只要检索N次主查询的记录数
select id from A where exists (select 1 from B where B.id = A.id)