Alibaba Cloud Database Challenge: a winning "SQL Optimization Master" case

老叶茶馆_

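Overview

Author: Tian Shuaimeng (田帅萌)

Outstanding graduate of the 9th Zhishutang (知数堂) MySQL DBA class, currently a teaching assistant at Zhishutang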

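I. Preface

In July 2017, Alibaba Cloud held the first season of its first Database Challenge, the "Slow SQL Performance Optimization" contest. With generous help from Mr. Ye (叶老师) of Zhishutang, I made it through every round and earned the title "SQL Optimization Master".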

Award announcement:

https://yq.aliyun.com/roundtable/56333?spm=5176.100239.blogcont179173.12.WzLpKF

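Working through this challenge, with Mr. Ye's guidance along the way, deepened my understanding of SQL optimization.

Here I share my optimization process in the hope that it gives you some ideas for SQL tuning, so that we can learn from each other.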

II. Optimization Process

1. Before optimization

Original SQL

select a.seller_id,a.seller_name,b.user_name,c.state  
from  a,b,c
where a.seller_name=b.seller_name and 
b.user_id=c.user_id and 
c.user_id=17 and
a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL - 600 MINUTE) 
AND DATE_ADD(NOW(), INTERVAL 600 MINUTE)  
order by a.gmt_create

Original table structure

create table a(
id int auto_increment,
seller_id bigint,
seller_name varchar(100) collate utf8_bin ,
gmt_create varchar(30),
primary key(id)) character set utf8;

create table b (
id int auto_increment,
seller_name varchar(100),
user_id varchar(50),
user_name varchar(100),
sales bigint,
gmt_create varchar(30),
primary key(id)) character set utf8;

create table c (
id int auto_increment,
user_id varchar(50),
order_id  varchar(100),
state bigint,
gmt_create varchar(30),
primary key(id)) character set utf8;

2. Execution plan before optimization

explain select a.seller_id,a.seller_name,b.user_name,c.state  from  a,b,c
where  a.seller_name=b.seller_name  and    b.user_id=c.user_id   
and  c.user_id=17 and
a.gmt_create BETWEEN DATE_ADD(NOW(), 
INTERVAL - 600 MINUTE) AND  DATE_ADD(NOW(), INTERVAL 600 MINUTE)
order  by  a.gmt_create
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 16109
     filtered: 11.11
        Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
   partitions: NULL
         type: ALL         
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 16174         
     filtered: 100.00
        Extra: Using where; Using join buffer (Block Nested Loop)
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
   partitions: NULL
         type: ALL         
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 359382         
     filtered: 1.00
        Extra: Using where; Using join buffer (Block Nested Loop)

3. After optimization

First, here is the execution plan of the final optimized SQL:

mysql> explain select a.seller_id, a.seller_name,b.user_name,
c.state from a left join b
on (a.seller_name=b.seller_name)
left join c on (b.user_id=c.user_id)
where c.user_id='17'
and a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL - 600 MINUTE)
AND DATE_ADD(NOW(), INTERVAL 600 MINUTE);
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
   partitions: NULL
         type: ref         
possible_keys: i_seller_name,i_user_id
          key: i_user_id
      key_len: 3
          ref: const
         rows: 1         
     filtered: 100.00
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
   partitions: NULL
         type: ref         
possible_keys: i_user_id
          key: i_user_id          
      key_len: 3     
          ref: const
         rows: 1         
     filtered: 100.00
        Extra: Using index condition
*************************** 3. row ***************************     
           id: 1
  select_type: SIMPLE
        table: a
   partitions: NULL
         type: ref         
possible_keys: i_seller_name
          key: i_seller_name          
      key_len: 25      
          ref: test1.b.seller_name
         rows: 1         
     filtered: 11.11
        Extra: Using where

After optimization, this SQL returns its result in milliseconds (the original post showed a profiling screenshot at this point).
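
For readers who want to reproduce the timing themselves, here is a minimal sketch using MySQL's session profiling, which is deprecated in recent versions but still works in 5.6 and 5.7. It assumes the optimized tables a, b, and c exist in the current schema. The query number passed to SHOW PROFILE is an assumption: it depends on your session and should be read from the SHOW PROFILES output.

```sql
-- Enable statement profiling for this session only.
SET profiling = 1;

-- Run the final optimized query whose plan is shown above.
SELECT a.seller_id, a.seller_name, b.user_name, c.state
FROM a
LEFT JOIN b ON (a.seller_name = b.seller_name)
LEFT JOIN c ON (b.user_id = c.user_id)
WHERE c.user_id = '17'
  AND a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL -600 MINUTE)
                       AND DATE_ADD(NOW(), INTERVAL 600 MINUTE);

-- List the profiled statements with their total duration in seconds.
SHOW PROFILES;

-- Stage-by-stage breakdown; replace 1 with the Query_ID reported by SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;

-- Turn profiling back off.
SET profiling = 0;
```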

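4. Optimization approach

Hardware & system environment

Disk: SSD (PCIe)

Memory: 16 GB

CPU: 8 cores

Operating system: CentOS 7 with the xfs file system

Some kernel parameter adjustments: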

vm.swappiness = 5 # 5-10 is recommended
I/O scheduler: choose either deadline or noop

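MySQL version

MySQL 5.6 or later is recommended, ideally MySQL 5.7.

The MySQL 5.6 optimizer added features such as ICP (Index Condition Pushdown), MRR (Multi-Range Read), and BKA (Batched Key Access), and 5.7 brings further performance improvements.

MySQL parameter tuning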

innodb_buffer_pool_size # 50% - 70% of physical memory
innodb_flush_log_at_trx_commit = 1
innodb_max_dirty_pages_pct = 50 # no higher than 50 is recommended
innodb_io_capacity = 5000 # for SSD

# The contest requires the query cache (QC) to be disabled
query_cache_size = 0
query_cache_type = 0
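
As a quick check that these settings are actually in effect on a running server, something like the following could be used. It is only a sketch: it reads the values without changing them. The query cache variables exist in 5.6 and 5.7 but were removed in MySQL 8.0, where they simply will not appear in the output. On the 16 GB machine described above, 50% to 70% works out to roughly 8 to 11 GB for the buffer pool.

```sql
-- Show the current global values of the parameters discussed above.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN (
  'innodb_buffer_pool_size',
  'innodb_flush_log_at_trx_commit',
  'innodb_max_dirty_pages_pct',
  'innodb_io_capacity',
  'query_cache_size',
  'query_cache_type'
);

-- innodb_buffer_pool_size is reported in bytes; convert it to GiB for readability.
SELECT @@global.innodb_buffer_pool_size / 1024 / 1024 / 1024 AS buffer_pool_gib;
```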

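For the remaining standard parameters, see Lao Ye's my.cnf generator:

http://imysql.com/my-cnf-wizard.html

Shuaimeng's tip: for more background on the basic tuning advice above, see Mr. Ye's MySQL DBA optimization course or Zhishutang's open classes.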

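Detailed SQL tuning process

First, the original execution plan shows that all three tables are read with a full table scan (type = ALL), so the first step is to index the join columns and the columns used in the WHERE clause.

1. Add indexes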

alter table a add index i_seller_name(seller_name);
alter table a add index i_seller_id(seller_id);
alter table b add index i_seller_name(seller_name);
alter table b add index i_user_id(user_id);
alter table c add index i_user_id(user_id);
alter table c add index i_state(state);
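
As a sketch of an equivalent form, the two indexes on each table can be added in one ALTER TABLE per table and then confirmed with SHOW INDEX. These statements are meant to be run instead of the ones above, not in addition to them, because the index names would otherwise collide.

```sql
-- One ALTER per table, adding both indexes at once.
ALTER TABLE a
  ADD INDEX i_seller_name (seller_name),
  ADD INDEX i_seller_id (seller_id);

ALTER TABLE b
  ADD INDEX i_seller_name (seller_name),
  ADD INDEX i_user_id (user_id);

ALTER TABLE c
  ADD INDEX i_user_id (user_id),
  ADD INDEX i_state (state);

-- Confirm that the indexes exist and inspect their cardinality estimates.
SHOW INDEX FROM a;
SHOW INDEX FROM b;
SHOW INDEX FROM c;
```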

With the indexes in place, here is the new execution plan:

explain select  a.seller_id,
a.seller_name,b.user_name ,c.state from a  
left join b on (a.seller_name=b.seller_name)   
left join c on( b.user_id=c.user_id )  where c.user_id='17'  
and  a.gmt_create BETWEEN DATE_ADD(NOW(), 
INTERVAL - 600 MINUTE) AND  
DATE_ADD(NOW(), INTERVAL 600 MINUTE)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
   partitions: NULL
         type: ref
possible_keys: i_user_id
          key: i_user_id
      key_len: 53
          ref: const
         rows: 1 
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
   partitions: NULL
         type: ref
possible_keys: i_user_id
          key: i_user_id
      key_len: 53      
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
   partitions: NULL
         type: ref
possible_keys: i_seller_name
          key: i_seller_name
      key_len: 303
          ref: func
         rows: 947
     filtered: 11.11
        Extra: Using index condition; Using where

Notice that the key_len values for all three tables in this plan are far too large: 53 bytes at the smallest and 303 bytes at the largest.
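
One more detail of the rewritten query is worth spelling out: the constant is now quoted ('17' rather than 17). In the original schema user_id is varchar(50), and MySQL cannot use an index on a string column for a lookup when that column is compared with a number. The sketch below uses a hypothetical scratch table, t_conv_demo, to compare the two forms. The table name and its rows are illustrative only and are not part of the contest data.

```sql
-- Hypothetical scratch table mirroring the original varchar user_id column.
CREATE TABLE t_conv_demo (
  id INT AUTO_INCREMENT PRIMARY KEY,
  user_id VARCHAR(50),
  KEY i_user_id (user_id)
);

INSERT INTO t_conv_demo (user_id) VALUES ('17'), ('21'), ('22'), ('33');

-- Numeric literal: the string column must be converted for every row,
-- so i_user_id cannot be used for a ref lookup.
EXPLAIN SELECT id FROM t_conv_demo WHERE user_id = 17;

-- String literal: same type as the column, so a ref lookup on i_user_id is possible.
EXPLAIN SELECT id FROM t_conv_demo WHERE user_id = '17';

-- Clean up the scratch table.
DROP TABLE t_conv_demo;
```

Comparing the type and key columns of the two plans shows the difference. Once user_id becomes a numeric column, as in the optimized schema, either form of the constant works.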

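2. Change the character set and column data types

The tables use the utf8 character set (up to 3 bytes per character). Since they store no Chinese text, latin1 (1 byte per character) is sufficient.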
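
To make the byte counts concrete, here is a rough sketch of how key_len is commonly derived for a single-column index: the maximum number of characters times the maximum bytes per character, plus 2 bytes of length information for varchar, plus 1 byte if the column is nullable. The column definitions come from this article. The formula is a rule of thumb, not a full specification.

```sql
SELECT
  100 * 3 + 2 + 1 AS varchar100_utf8_nullable,    -- 303, as shown for a.seller_name
  100 * 1 + 2 + 1 AS varchar100_latin1_nullable,  -- 103 for the same column in latin1
  2 + 1           AS smallint_nullable;           -- 3, as shown for user_id in the final plan
```

Switching the character set alone cuts the varchar(100) key to about a third of its size, and choosing a numeric type where one fits shrinks it much further.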

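Beyond that, checking the column types in the three tables shows that some varchar(100) columns never hold values anywhere near that long, some columns that actually store datetime values are declared varchar(30), and some that would fit in bigint/int are declared varchar. Each of these was changed to a more suitable type.

An important effect of changing the table character set and tightening the column types is a smaller index key_len, which means fewer bytes compared on each join lookup and less memory used.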
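
The article shows only the resulting table definitions, so the following is just one possible way to get there for table a, assuming the data has been backed up first. It checks the real data before shrinking anything, then converts the character set and column types. Under a strict sql_mode the MODIFY step fails rather than silently truncating if a value does not fit, which is the safer outcome. Tables b and c would follow the same pattern.

```sql
-- Check the actual data before shrinking column types.
SELECT MAX(CHAR_LENGTH(seller_name)) AS max_seller_name_len,
       MAX(seller_id)                AS max_seller_id
FROM a;

-- Convert the table and its character columns from utf8 to latin1.
ALTER TABLE a CONVERT TO CHARACTER SET latin1;

-- Tighten the column types to match the optimized definition of table a.
ALTER TABLE a
  MODIFY seller_id   INT      DEFAULT NULL,
  MODIFY seller_name CHAR(8)  DEFAULT NULL,
  MODIFY gmt_create  DATETIME DEFAULT NULL;

-- Verify the result.
SHOW CREATE TABLE a;
```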

Table structure after optimization

CREATE TABLE `a` (
  `id` int NOT NULL AUTO_INCREMENT,
  `seller_id` int(6) DEFAULT NULL,
  `seller_name` char(8) DEFAULT NULL,
  `gmt_create` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `i_seller_id` (`seller_id`),
  KEY `i_seller_name` (`seller_name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `b` (
  `id` int NOT NULL AUTO_INCREMENT,
  `seller_name` char(8) DEFAULT NULL,
  `user_id` smallint(5) DEFAULT NULL,
  `user_name` char(10) DEFAULT NULL,
  `sales` int(11) DEFAULT NULL,
  `gmt_create` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `i_seller_name` (`seller_name`),
  KEY `i_user_id` (`user_id`),
  KEY `i_user_name` (`user_name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `c` (
  `id` int NOT NULL AUTO_INCREMENT,
  `user_id` smallint(5) DEFAULT NULL,
  `order_id` char(10) DEFAULT NULL,
  `state` int(11) DEFAULT NULL,
  `gmt_create` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `i_user_id` (`user_id`),
  KEY `i_state` (`state`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

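That is my winning case from the Alibaba Cloud Database Challenge. Thanks to Mr. Ye for his pointers and help during the contest, and many thanks to Zhishutang for teaching SQL optimization skills!

Finally, once you have mastered a few standard SQL optimization routines, you too can handle the vast majority of SQL optimization work!

Appendix: initial data for the three tables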

insert into a (seller_id,seller_name,gmt_create) values (100000,'uniqla','2017-01-01');
insert into a (seller_id,seller_name,gmt_create) values (100001,'uniqlb','2017-02-01');
insert into a (seller_id,seller_name,gmt_create) values (100002,'uniqlc','2017-03-01');
insert into a (seller_id,seller_name,gmt_create) values (100003,'uniqld','2017-04-01');
... repeat the inserts N times

insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqla','1','a',1,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlb','2','b',3,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlc','3','c',1,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqld','4','d',4,now());
... repeat the inserts N times

insert into c (user_id,order_id,state,gmt_create) values( 21,1,0 ,now() );
insert into c (user_id,order_id,state,gmt_create)  values( 22,2,0 ,now() );
insert into c (user_id,order_id,state,gmt_create)  values( 33,3,0 ,now() );
insert into c (user_id,order_id,state,gmt_create)  values( 43,4,0 ,now() );
... repeat the inserts N times
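
The article does not say how the inserts were repeated. One common way to grow such test tables quickly, shown here only as an assumption and not as the author's method, is to let each table insert a copy of its own rows, so that every execution doubles the row count.

```sql
-- Each execution doubles the number of rows in the table.
INSERT INTO a (seller_id, seller_name, gmt_create)
SELECT seller_id, seller_name, gmt_create FROM a;

INSERT INTO b (seller_name, user_id, user_name, sales, gmt_create)
SELECT seller_name, user_id, user_name, sales, gmt_create FROM b;

INSERT INTO c (user_id, order_id, state, gmt_create)
SELECT user_id, order_id, state, gmt_create FROM c;

-- Check the resulting sizes.
SELECT (SELECT COUNT(*) FROM a) AS a_rows,
       (SELECT COUNT(*) FROM b) AS b_rows,
       (SELECT COUNT(*) FROM c) AS c_rows;
```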
