Mysql -SQL 优化及分析

最新推荐文章于 2023-05-25 17:08:59 发布

PowerMan_Frank

最新推荐文章于 2023-05-25 17:08:59 发布

阅读量114

点赞数

分类专栏：数据库/MySql

本文链接：https://blog.csdn.net/fuwenshen/article/details/108667446

版权

数据库/MySql 专栏收录该内容

5 篇文章 0 订阅

订阅专栏

1.SQL优化
原因：性能低、执行时间太长、等待时间太长、SQL语句欠佳（连接查询）、索引失效、服务器参数设置不合理（缓冲、线程数）
SQL ：
编写过程：
select …from …join …on …where …group by …having …order by …limit …
解析过程：
from … on… join …where …group by …having …select …order by limit …

2.SQL性能问题
分析SQL的执行计划 : explain ，可以模拟SQL优化器执行SQL语句，知道自己编写的SQL状况
查询执行计划： explain +SQL语句 explain select * from tb ;

id : 编号
select_type ：查询类型
table ：表
type ：索引类型
possible_keys ：预测用到的索引不准
key ：实际使用的索引
key_len ：实际使用索引的长度
ref :表之间的引用关系
rows ：通过索引查询到的数据个数
Extra:额外的优化信息

(1)id: id值相同，从上往下顺序执行。
表的执行顺序因数量的个数改变而改变的原因：笛卡儿积
对内存来说，数据量越小越好,内存占用比较小,数据小的表优先查询

id值不同：id值越大越优先查询 (在嵌套子查询时，先查内层再查外层)

id值有相同，又有不同： id值越大越优先；id值相同，从上往下顺序执行

(2)select_type:查询类型
PRIMARY:包含子查询SQL中的主查询（最外层）
SUBQUERY：包含子查询SQL中的子查询（非最外层）
simple:简单查询（不包含子查询、union）
derived:衍生查询(使用到了临时表)
a.在from子查询中只有一张表
explain select cr.cname from ( select * from course where tid in (1,2) ) cr ;
主查询涉及到了衍生表，deriver2 2代表id为2的course来的
b.在from子查询中，如果有table1 union table2 ，则table1 就是derived,table2就是union
explain select cr.cname from ( select * from course where tid = 1 union select * from course where tid = 2 ) cr ;
union result :表示哪些表之间存在union查询

(3)type:索引类型、类型
system>const>eq_ref>ref>range>index>all 要对type进行优化的前提：有索引

system（忽略）: 只有一条数据的系统表；或衍生表只有一条数据的主查询
explain select * from (select * from test01 )t where tid =1 ;

const:仅仅能查到一条数据的SQL ,用于Primary key 或unique索引（类型与索引类型有关）
explain select tid from test01 where tid =1 ;

eq_ref:唯一性索引：对于每个索引键的查询，返回匹配唯一行数据（有且只有1个，不能多、不能0）
select … from …where name = … .常见于唯一索引和主键索引。
添加索引：
alter table teacherCard add constraint pk_tcid primary key(tcid);
alter table teacher add constraint uk_tcid unique index(tcid) ;

explain select t.tcid from teacher t,teacherCard tc where t.tcid = tc.tcid ;

如果teacher表的数据个数和连接查询的数据个数一致（3条数据），就有可能满足eq_ref级别；否则无法满足。

ref：非唯一性索引，对于每个索引键的查询，返回匹配的所有行（0，多）
insert into teacher values(4,‘tz’,4) ;
insert into teacherCard values(4,‘tz222’);

测试：
alter table teacher add index index_name (tname) ;
explain select * from teacher where tname = ‘tz’;

range：检索指定范围的行 ,where后面是一个范围查询(between ,> < >=, 特殊:in有时候会失效，从而转为无索引all)
alter table teacher add index tid_index (tid) ;
explain select t.* from teacher t where t.tid in (1,2) ;
explain select t.* from teacher t where t.tid ❤️ ;

index：查询全部索引中数据
explain select tid from teacher ; --tid 是索引，只需要扫描索引表，不需要所有表中的所有数据

all：查询全部表中的数据
explain select cid from course ; --cid不是索引，需要全表所有，即需要所有表中的所有数据

（4）possible_keys ：可能用到的索引，是一种预测，不准。
如果 possible_key/key是NULL，说明没用索引

（5） key ：实际使用到的索引

（6）key_len ：索引的长度 ;
作用：用于判断复合索引是否被完全使用（a,b,c）。
alter table test_kl add index index_name(name) ; --name可以为null
explain select * from test_kl where name =’’ ; – key_len :60
utf8：1个字符占3个字节

alter table test_kl add index index_name1(name1) ;
explain select * from test_kl where name1 =’’ ;
–索引字段可以为Null时,会使用1个字节用于标识

增加一个复合索引
alter table test_kl add index name_name1_index (name,name1) ;

explain select * from test_kl where name1 = '' ; 
explain select * from test_kl where name = '' ; --60

可变长度varchar(20)
alter table test_kl add column name2 varchar(20) ; --可以为Null
alter table test_kl add index name2_index (name2) ;

explain select * from test_kl where name2 = ‘’ ; --63
20*3=60 + 1(null) +2(用2个字节标识可变长度) =63

（7）Extra：
(a).using filesort ：性能消耗大；需要“额外”的一次排序（查询）。常见order by 语句中。
排序：先查询

explain select * from test02 where a1 =’’ order by a1 ;
如果排序和查找是同一个字段，不会出现using filesort

explain select * from test02 where a1 =’’ order by a2 ; --using filesort
如果排序和查找不是同一个字段，则会出现using filesort；
避免using filesort： where哪些字段，就order by那些字段

复合索引：不能跨列（最佳左前缀）
drop index idx_a1 on test02;
drop index idx_a2 on test02;
drop index idx_a3 on test02;

alter table test02 add index idx_a1_a2_a3 (a1,a2,a3) ;
explain select *from test02 where a1=’’ order by a3 ; --using filesort
explain select *from test02 where a2=’’ order by a3 ; --using filesort
explain select *from test02 where a1=’’ order by a2 ;
避免： where和order by 按照复合索引的顺序使用，不要跨列或无序使用。

(b). using temporary:性能损耗大 ，用到了临时表。一般出现在group by 语句中。
explain select a1 from test02 where a1 in ('1','2','3') group by a1 ;
explain select a1 from test02 where a1 in ('1','2','3') group by a2 ; --using temporary
避免：查询那些列，就根据那些列 group by 

(c). using index :性能提升; 只从索引文件中获取数据 （不需要回表查询）
	只要使用到的列 全部都在索引中，就是索引覆盖using index

例如：test02表中有一个复合索引(a1,a2,a3)
	explain select a1,a2 from test02 where a1='' or a2= '' ; --using index   

(d).using where （需要回表查询）
	假设age是索引列
	但查询语句select age,name from ...where age =...,回原表查Name，会显示using where.

(e). impossible where ： where子句永远为false
	explain select * from test02 where a1='x' and a1='y'  ;

6.优化案例
（1）单表优化
explain select bid from book where typeid in(2,3) and authorid=1 order by typeid desc ;
alter table book add index idx_bta (bid,typeid,authorid);
优化：加索引（根据SQL实际解析的顺序）
删掉之前的索引，防止干扰。
drop index idx_bta on book;

alter table book add index idx_tab (typeid,authorid,bid);

再次优化,因为范围查询in有时会失效，因此交换 索引的顺序，将typeid in(2,3) 放到最后。
drop index idx_tab on book;
alter table book add index idx_atb (authorid,typeid,bid);
explain select bid from book where  authorid=1 and  typeid in(2,3) order by typeid desc ;

--结论 1.保持索引的定义和使用的顺序一致性  2.索引逐步优化  3.将含In的范围查询 放到where条件的最后，防止失效。

本例中同时出现了Using where（需要回原表）; Using index（不需要回原表）：因为含in的范围查询已经使该typeid索引失效，因此相当于没有typeid这个索引，所以需要回原表（using where）；
没有了In，就不会出现using where
explain select bid from book where  authorid=1 and  typeid =3 order by typeid desc ;

（2）两表优化

左连接：
explain select *from teacher2 t left outer join course2 c
on t.cid=c.cid where c.cname=‘java’;

索引加在那张表上？  
-小表驱动大表原则，小表放左边，大表放右边 
小表：10
大表：300
where   小表.x 10 = 大表.y 300;  小表每执行一次，大表都要执行300次	
	大表.y 300=小表.x 10

–这里相当于2个for循环，最终都会循环3000次；但是对于双层循环来说：建议将数据小的循环放外层；数据大的循环放内层，外层数据越小，程序速度越快,
所以将数据量小的表放左边，一般情况对于左外连接，给左表加索引；右外连接，给右表加索引

alter table teacher2 add index index_teacher2_cid(cid) ;
alter table course2 add index index_course2_cname(cname);

Using join buffer:extra中的一个选项，作用：Mysql引擎使用了 连接缓存。

（3）三张表优化
1.小表驱动大表  2.索引建立在经常查询的字段上

7.避免索引失效的一些原则
（1）复合索引(a,b,c)
a.复合索引，不要跨列或无序使用

b.复合索引，尽量使用全索引匹配

（2）不要在索引上进行任何操作（计算、函数、类型转换），否则索引失效
	select ..where A.x = .. ;  --假设A.x是索引
	不要：select ..where A.x*3 = .. ;
	explain select * from book where authorid = 1 and typeid = 2 ;--用到了at2个索引
	explain select * from book where authorid = 1 and typeid*2 = 2 ;--用到了a1个索引
	explain select * from book where authorid*2 = 1 and typeid*2 = 2 ;----用到了0个索引
	explain select * from book where authorid*2 = 1 and typeid = 2 ;----用到了0个索引,因为对于复合索引，如果左边失效，右侧全部失效。(a,b,c)，例如如果 b失效，则b c同时失效。

（3）复合索引不能使用不等于（!=  <>）或is null (is not null)，否则自身以及右侧所有全部失效。
	复合索引中如果有>，则自身和右侧索引全部失效。
explain select * from book where authorid != 1 and typeid =2 ;
	

sql优化也是概率事件，概率情况(< > =)：因为服务层中有SQL优化器，可能会影响到优化
drop index idx_typeid on book;
drop index idx_authroid on book;
alter table book add index idx_book_at (authorid,typeid);
explain select * from book where authorid = 1 and typeid =2 ;--复合索引at全部使用
explain select * from book where authorid > 1 and typeid =2 ; --复合索引中如果有>，则自身和右侧索引全部失效。
explain select * from book where authorid = 1 and typeid >2 ;--复合索引at全部使用
更明显的概率情况
explain select * from book where authorid < 1 and typeid =2 ;--复合索引at只用到了1个索引
explain select * from book where authorid < 4 and typeid =2 ;--复合索引全部失效

--由于SQL优化器等原因  该结论不是100%正确，一般而言， 范围查询（> <  in），之后的索引会失效

（4）尽量使用索引覆盖（using index）
		复合索引（a,b,c）
select a,b,c from xx..where a=  .. and b =.. ;

(5) like尽量以“常量”开头，不要以'%'开头，否则索引失效
select * from xx where name like '%x%' ; --name索引失效

explain select * from teacher  where tname like '%x%'; --tname索引失效

explain select * from teacher  where tname like 'x%';

explain select tname from teacher  where tname like '%x%'; --如果必须使用like '%x%'进行模糊查询，可以使用索引覆盖 挽救一部分。


（6）尽量不要使用类型转换（显示、隐式），否则索引失效
explain select * from teacher where tname = 'abc' ;
explain select * from teacher where tname = 123 ;//程序底层将 123 -> '123'，即进行了类型转换，因此索引失效

（7）尽量不要使用or，否则索引失效
explain select * from teacher where tname ='' or tcid >1 ; --将or左侧的tname 失效。

8.一些其他的优化方法
（1）
exist和in
select …from table where exist (子查询) ;
select …from table where 字段 in (子查询) ;

如果主查询的数据集大，则使用In   ,效率高。
如果子查询的数据集大，则使用exist,效率高。	

（2）order by 优化
using filesort 底层有两种算法：双路排序、单路排序 （根据IO的次数区分）
MySQL4.1之前 默认使用 双路排序；双路：扫描2次磁盘（1：从磁盘读取排序字段 ,对排序字段进行排序（在buffer中进行的排序）   2：扫描其他字段 ）
	--IO较消耗性能
MySQL4.1之后 默认使用 单路排序  ： 只读取一次（全部字段），在buffer中进行排序。但种单路排序 不一定真的是“单路|1次IO”，有可能多次IO
            原因：如果数据量特别大，就无法 将所有字段的数据 一次性读取完毕，因此 会进行“分片读取、多次读取”。
	注意：单路排序 比双路排序 会占用更多的buffer。
		单路排序在使用时，如果数据大，可以考虑调大buffer的容量大小：  set max_length_for_sort_data = 1024  单位byte

如果max_length_for_sort_data值太低，则mysql会自动从 单路->双路   （太低：需要排序的列的总大小超过了max_length_for_sort_data定义的字节数）

提高order by查询的策略：
a.选择使用单路、双路 ；调整buffer的容量大小；
b.避免select * ...  
c.复合索引 不要跨列使用 ，避免using filesort
d.保证全部的排序字段 排序的一致性（都是升序 或 降序）