mysql 语句优化实例_MySQL优化实例及知识点

weixin_39700215

于 2021-01-19 22:46:41 发布

阅读量96

点赞数

文章标签： mysql 语句优化实例

本文链接：https://blog.csdn.net/weixin_39700215/article/details/113289552

版权

MySQL优化案例

1.单表优化

2.两表优化

3.三表优化

1.单表优化

create table book

(

bid int(4) primary key,

name varchar(20) not null,

authorid int(4) not null,

publicid int(4) not null,

typeid int(4) not null

);

insert into book values(1,'tjava',1,1,2) ;

insert into book values(2,'tc',2,1,2) ;

insert into book values(3,'wx',3,2,1) ;

insert into book values(4,'math',4,2,3) ;

commit;

查询authorid=1且 typeid为2或3的bid

explain select bid from book where typeid in(2,3) and authorid=1 order by typeid desc ;

(a,b,c)

(a,b)

优化：加索引

alter table book add index idx_bta (bid,typeid,authorid);

索引一旦进行升级优化，需要将之前废弃的索引删掉，防止干扰。

drop index idx_bta on book;

根据SQL实际解析的顺序，调整索引的顺序：

alter table book add index idx_tab (typeid,authorid,bid); --虽然可以回表查询bid，但是将bid放到索引中可以提升使用using index ;

再次优化(之前是index级别)：思路。因为范围查询in有时会实现，因此交换索引的顺序，将typeid in(2,3) 放到最后。

drop index idx_tab on book;

alter table book add index idx_atb (authorid,typeid,bid);

explain select bid from book where authorid=1 and typeid in(2,3) order by typeid desc ;

--小结：

a.最佳做前缀，保持索引的定义和使用的顺序一致性

b.索引需要逐步优化

c.将含In的范围查询放到where条件的最后，防止失效。

本例中同时出现了Using where(需要回原表);

Using index(不需要回原表)：原因，where authorid=1 and typeid in(2,3)中authorid在索引(authorid,typeid,bid)中，因此不需要回原表(直接在索引表中能查到)；而typeid虽然也在索引(authorid,typeid,bid)中，但是含in的范围查询已经使该typeid索引失效，因此相当于没有typeid这个索引，所以需要回原表(using where)；

例如以下没有了In，则不会出现using where

explain select bid from book where authorid=1 and typeid =3 order by typeid desc ;

还可以通过key_len证明In可以使索引失效。

2.两表优化

create table teacher2

(

tid int(4) primary key,

cid int(4) not null

);

insert into teacher2 values(1,2);

insert into teacher2 values(2,1);

insert into teacher2 values(3,3);

create table course2

(

cid int(4) ,

cname varchar(20)

);

insert into course2 values(1,'java');

insert into course2 values(2,'python');

insert into course2 values(3,'kotlin');

commit;

左连接：

explain select *from teacher2 t left outer join course2 c

on t.cid=c.cid where c.cname='java';

索引往哪张表加？ -小表驱动大表

-索引建立经常使用的字段上 (本题 t.cid=c.cid可知，t.cid字段使用频繁，因此给该字段加索引) [一般情况对于左外连接，给左表加索引；右外连接，给右表加索引]

小表：10

大表：300

where 小表.x 10 = 大表.y 300; --循环了几次？10

大表.y 300=小表.x 10--循环了300次

小表:10

大表:300

select ...where 小表.x10=大表.x300 ;

for(int i=0;i

{

for(int j=0;j

{

...

}

select ...where 大表.x300=小表.x10 ;

for(int i=0;i

{

for(int j=0;j

{

...

}

--以上2个FOR循环，最终都会循环3000次；但是对于双层循环来说：一般建议将数据小的循环放外层；数据大的循环放内存。

--当编写 ..on t.cid=c.cid 时，将数据量小的表放左边(假设此时t表数据量小)

alter table teacher2 add index index_teacher2_cid(cid) ;

alter table course2 add index index_course2_cname(cname);

Using join buffer:extra中的一个选项，作用：Mysql引擎使用了连接缓存。

3.三张表优化A B C

a.小表驱动大表

b.索引建立在经常查询的字段上

示例：

create table test03

(

a1 int(4) not null,

a2 int(4) not null,

a3 int(4) not null,

a4 int(4) not null

);

alter table test03 add index idx_a1_a2_a3_4(a1,a2,a3,a4) ;

explain select a1,a2,a3,a4 from test03 where a1=1 and a2=2 and a3=3 and a4 =4 ; --推荐写法，因为索引的使用顺序(where后面的顺序) 和复合索引的顺序一致

explain select a1,a2,a3,a4 from test03 where a4=1 and a3=2 and a2=3 and a1 =4 ; --虽然编写的顺序和索引顺序不一致，但是 sql在真正执行前经过了SQL优化器的调整，结果与上条SQL是一致的。

--以上 2个SQL，使用了全部的复合索引

explain select a1,a2,a3,a4 from test03 where a1=1 and a2=2 and a4=4 order by a3;

--以上SQL用到了a1 a2两个索引，该两个字段不需要回表查询using index ;而a4因为跨列使用，造成了该索引失效，需要回表查询因此是using where；以上可以通过 key_len进行验证

explain select a1,a2,a3,a4 from test03 where a1=1 and a4=4 order by a3;

--以上SQL出现了 using filesort(文件内排序，“多了一次额外的查找/排序”) ：不要跨列使用( where和order by 拼起来【a1,a4,a3 跨列了】，不要跨列使用)

explain select a1,a2,a3,a4 from test03 where a1=1 and a4=4 order by a2 , a3; --不会using filesort

-总结：

i.如果 (a,b,c,d)复合索引和使用的顺序全部一致(且不跨列使用)，则复合索引全部使用。如果部分一致(且不跨列使用)，则使用部分索引。

select a,c where a = and b= and d=

ii.where和order by 拼起来，不要跨列使用

using temporary:需要额外再多使用一张表. 一般出现在group by语句中；已经有表了，但不适用，必须再来一张表。

解析过程：

from .. on.. join ..where ..group by ....having ...select dinstinct ..order by limit ...

(1). explain select * from test03 where a2=2 and a4=4 group by a2,a4 ;--没有using temporary

(2)explain select * from test03 where a2=2 and a4=4 group by a3 ; --有using temporary【a3的原因(因为通过索引a2,a4后还要再来一张表排序a3)】

避免索引失效的一些原则

(1)复合索引

a.复合索引，不要跨列或无序使用(最佳左前缀)

(a,b,c)

b.复合索引，尽量使用全索引匹配

(a,b,c)

(2)不要在索引上进行任何操作(计算、函数、类型转换)，否则索引失效

select ..where A.x = .. ; --假设A.x是索引

不要：select ..where A.x*3 = .. ;

explain select * from book where authorid = 1 and typeid = 2 ;--用到了at2个索引

explain select * from book where authorid = 1 and typeid*2 = 2 ;--用到了a1个索引

explain select * from book where authorid*2 = 1 and typeid*2 = 2 ;----用到了0个索引

explain select * from book where authorid*2 = 1 and typeid = 2 ;----用到了0个索引,原因：对于复合索引，如果左边失效，右侧全部失效。

(a,b,c)，例如如果 b失效，则b c同时失效。

drop index idx_atb on book ;

alter table book add index idx_authroid (authorid) ;

alter table book add index idx_typeid (typeid) ;

explain select * from book where authorid*2 = 1 and typeid = 2 ;

(3)复合索引不能使用不等于(!= <>)或is null (is not null)，否则自身以及右侧所有全部失效。

复合索引中如果有>，则自身和右侧索引全部失效。

explain select * from book where authorid = 1 and typeid =2 ;

-- SQL优化，是一种概率层面的优化。至于是否实际使用了我们的优化，需要通过explain进行推测。

explain select * from book where authorid != 1 and typeid =2 ;

explain select * from book where authorid != 1 and typeid !=2 ;

体验概率情况(< > =)：原因是服务层中有SQL优化器，可能会影响我们的优化。

drop index idx_typeid on book;

drop index idx_authroid on book;

alter table book add index idx_book_at (authorid,typeid);

explain select * from book where authorid = 1 and typeid =2 ;--复合索引at全部使用

explain select * from book where authorid > 1 and typeid =2 ; --复合索引中如果有>，则自身和右侧索引全部失效。

explain select * from book where authorid = 1 and typeid >2 ;--复合索引at全部使用

----明显的概率问题---

explain select * from book where authorid < 1 and typeid =2 ;--复合索引at只用到了1个索引

explain select * from book where authorid < 4 and typeid =2 ;--复合索引全部失效

--我们学习索引优化，是一个大部分情况适用的结论，但由于SQL优化器等原因该结论不是100%正确。

--一般而言，范围查询(> < in)，之后的索引失效。

(4)补救。尽量使用索引覆盖(using index)

(a,b,c)

select a,b,c from xx..where a= .. and b =.. ;

(5)like尽量以“常量”开头，不要以'%'开头，否则索引失效

select * from xx where name like '%x%' ; --name索引失效

explain select * from teacher where tname like '%x%'; --tname索引失效

explain select * from teacher where tname like 'x%';

explain select tname from teacher where tname like '%x%'; --如果必须使用like '%x%'进行模糊查询，可以使用索引覆盖挽救一部分。[tname字段上有索引，查询tname字段.]

(6)尽量不要使用类型转换(显示、隐式)，否则索引失效

explain select * from teacher where tname = 'abc' ;

explain select * from teacher where tname = 123 ;//程序底层将 123 -> '123'，即进行了类型转换，因此索引失效

(7)尽量不要使用or，否则索引失效

explain select * from teacher where tname =''" or tcid >1 ; --将or左侧的tname 失效。

一些其他的优化方法

(1)exist和in

select ..from table where exist (子查询) ;

select ..from table where 字段 in (子查询) ;

如果主查询的数据集大，则使用In ,效率高。

如果子查询的数据集大，则使用exist,效率高。

exist语法：将主查询的结果，放到子查需结果中进行条件校验(看子查询是否有数据，如果有数据则校验成功) ，

如果复合校验，则保留数据；

select tname from teacher where exists (select * from teacher) ;

--等价于select tname from teacher

select tname from teacher where exists (select * from teacher where tid =9999) ;

in:

select ..from table where tid in (1,3,5) ;

(2)order by 优化

using filesort 有两种算法：双路排序、单路排序 (根据IO的次数)

MySQL4.1之前默认使用双路排序；双路：扫描2次磁盘(1：从磁盘读取排序字段 ,对排序字段进行排序(在buffer中进行的排序) 2：扫描其他字段 )

--IO较消耗性能

MySQL4.1之后默认使用单路排序：只读取一次(全部字段)，在buffer中进行排序。但这种单路排序会有一定的隐患 (不一定真的是“单路|1次IO”，有可能多次IO)。原因：如果数据量特别大，则无法将所有字段的数据一次性读取完毕，因此会进行“分片读取、多次读取”。

注意：单路排序比双路排序会占用更多的buffer。

单路排序在使用时，如果数据大，可以考虑调大buffer的容量大小： set max_length_for_sort_data = 1024 单位byte

查看： show variables like '%max_length_for_sort_data%'

如果max_length_for_sort_data值太低，则mysql会自动从单路->双路 (太低：需要排序的列的总大小超过了max_length_for_sort_data定义的字节数)

提高order by查询的策略：

a.选择使用单路、双路；调整buffer的容量大小；

b.避免select * ...

c.复合索引不要跨列使用，避免using filesort

d.保证全部的排序字段排序的一致性(都是升序或降序)

SQL排查 - 慢查询日志

MySQL提供的一种日志记录，用于记录MySQL种响应时间超过阈值的SQL语句 (long_query_time，默认10秒)

慢查询日志默认是关闭的；建议：开发调优是打开，而最终部署时关闭。

检查是否开启了慢查询日志： show variables like '%slow_query_log%' ;

临时开启【数据库服务重新启动就会失效】：

set global slow_query_log = 1 ; --在内存种开启

exit

service mysql restart

永久开启：

/etc/my.cnf 中追加配置：

vi /etc/my.cnf

[mysqld]

slow_query_log=1

slow_query_log_file=/var/lib/mysql/localhost-slow.log

慢查询阈值：

show variables like '%long_query_time%' ;

临时设置阈值：

set global long_query_time = 5 ; --设置完毕后，重新登陆后起效 (不需要重启服务)

永久设置阈值：

/etc/my.cnf 中追加配置：

vi /etc/my.cnf

[mysqld]

long_query_time=3

select sleep(4)【查询休息4秒】;

select sleep(5);

select sleep(3);

--查询超过阈值的SQL： show global status like '%slow_queries%' ;

(1)慢查询的sql被记录在了日志中，因此可以通过日志查看具体的慢SQL。

cat /var/lib/mysql/localhost-slow.log

(2)通过mysqldumpslow工具查看慢SQL,可以通过一些过滤条件快速查找出需要定位的慢SQL

mysqldumpslow --help

s：排序方式

r:逆序

l:锁定时间

g:正则匹配模式

--获取返回记录最多的3个SQL

mysqldumpslow -s r -t 3 /var/lib/mysql/localhost-slow.log

--获取访问次数最多的3个SQL

mysqldumpslow -s c -t 3 /var/lib/mysql/localhost-slow.log

--按照时间排序，前10条包含left join查询语句的SQL

mysqldumpslow -s t -t 10 -g "left join" /var/lib/mysql/localhost-slow.log

语法：

mysqldumpslow 各种参数慢查询日志的文件

分析海量数据

a.模拟海量数据

存储过程(无return)/存储函数(有return)

create database testdata ;

use testdata

create table dept

(

dno int(5) primary key default 0,

dname varchar(20) not null default '',

loc varchar(30) default ''

)engine=innodb default charset=utf8;

create table emp

(

eid int(5) primary key,

ename varchar(20) not null default '',

job varchar(20) not null default '',

deptno int(5) not null default 0

)engine=innodb default charset=utf8;

通过存储函数插入海量数据：

创建存储函数：

randstring(6) ->aXiayx 用于模拟员工名称

delimiter $

create function randstring(n int) returns varchar(255)

begin

declare all_str varchar(100) default 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' ;

declare return_str varchar(255) default '' ;

declare i int default 0 ;

while i

set return_str = concat( return_str,substring(all_str, FLOOR(1+rand()*52) ,1) );

set i=i+1 ;

end while ;

return return_str;

end $

FLOOR()函数向下取整 FLOOR(4.2)就是4

自定义randstring(n)函数返回一个n位的字符串

--如果报错：You have an error in your SQL syntax，说明SQL语句语法有错，需要修改SQL语句；

如果报错This function has none of DETERMINISTIC, NO SQL, or READS SQL DATA in its declaration and binary logging is enabled (you *might* want to use the less safe log_bin_trust_function_creators variable)

是因为存储过程/存储函数在创建时与之前的开启慢查询日志冲突了

解决冲突：

临时解决( 开启log_bin_trust_function_creators )

show variables like '%log_bin_trust_function_creators%';

set global log_bin_trust_function_creators = 1;

永久解决：

/etc/my.cnf

[mysqld]

log_bin_trust_function_creators = 1

--产生随机整数

create function ran_num() returns int(5)

begin