0 数据基础
数据大概200多w,用测试服务器进行测试。导出来的数据是一个堆表,没有主键,没有索引。有聚集索引的表叫聚集表。没有聚集索引的表叫堆表(heap table)。
select CustName,count(1) c
from WorkOrder
where CreateDate>'2016-5-1' and CreateDate<'2017-1-1'
group by CustName having c>100
order by c desc;
mysql> show index from WorkOrder; 查询索引方法1
Empty set (0.00 sec)
mysql> show keys from WorkOrder; 查询索引方法2
Empty set (0.00 sec)
1 堆表的情况
用执行计划分析下语句。
explain select CustName,count(1) c from WorkOrder
where CreateDate>'2016-5-1' and CreateDate<'2017-1-1'
group by CustName
having c>100
order by c desc;
+----+-------------+-----------+------+---------------+------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------+------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | WorkOrder | ALL | NULL | NULL | NULL | NULL | 2528727 | Using where; Using temporary; Using filesort |
+----+-------------+-----------+------+---------------+------+---------+------+---------+----------------------------------------------+
1 row in set
select_type
的值为SIMPLE
,表示简单的select
查询,不使用union
或子查询。
type
的值为ALL,表示要对表进行表扫描。
possible_keys
表示能使用哪个索引找到行记录。
key
表示MySQL决定使用的索引(键)。
key_len
表示MySQL决定使用索引的长度。
ref
表示使用哪个列和key
一起从表中选择行。
rows
表示MySQL认为它执行查询时必须检查的行数。
extra
表示查询的详情信息,用到where
,临时表,排序。
执行下该语句三次,发现执行了16.30 sec、16.34 sec、16.24 sec。
2 有索引的情况
建了四个索引,分别以custname
,CreateDate
建两个单列索引,另外两个是联合索引,只是最左边列不一样。
alter table WorkOrder add index ix_name(custname)
alter table WorkOrder add index ix_date(CreateDate)
alter table WorkOrder add index ix_namedate(custname,CreateDate)
alter table WorkOrder add index ix_datename(CreateDate,custname)
查询一下索引:
show keys from WorkOrder;
+-----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| WorkOrder | 1 | ix_name | 1 | CustName | A | 1264363 | NULL | NULL | YES | BTREE | | |
| WorkOrder | 1 | ix_date | 1 | CreateDate | A | 2528727 | NULL | NULL | | BTREE | | |
| WorkOrder | 1 | ix_namedate | 1 | CustName | A | 1264363 | NULL | NULL | YES | BTREE | | |
| WorkOrder | 1 | ix_namedate | 2 | CreateDate | A | 2528727 | NULL | NULL | | BTREE | | |
| WorkOrder | 1 | ix_datename | 1 | CreateDate | A | 2528727 | NULL | NULL | | BTREE | | |
| WorkOrder | 1 | ix_datename | 2 | CustName | A | 2528727 | NULL | NULL | YES | BTREE | | |
+-----------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
6 rows in set (0.00 sec)
执行计划分析下sql查询语句:
explain select CustName,count(1) c
from WorkOrder
where CreateDate>'2016-5-1' and CreateDate<'2017-1-1'
group by CustName
having c>100
order by c desc;
+----+-------------+-----------+-------+-----------------------------------------+-------------+---------+------+--------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+-----------------------------------------+-------------+---------+------+--------+-----------------------------------------------------------+
| 1 | SIMPLE | WorkOrder | range | ix_name,ix_date,ix_namedate,ix_datename | ix_datename | 4 | NULL | 824372 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-----------+-------+-----------------------------------------+-------------+---------+------+--------+-----------------------------------------------------------+
1 row in set (0.01 sec)
从执行计划可以看出,MySQL从四个索引中选取了ix_datename
这个索引,type
为range
表示索引范围扫描。rows
的数量值没堆表的1/3。
执行语句三次,时间是 8.64 sec、8.61sec、8.55 sec。
2.1 force index()
我建了三个索引,那么我想用下另外三个索引怎么办?
这里可以用force index()
,这个指令可以指定本次查询强制使用哪个索引,因为MySQL优化器的选择并不是最优的索引。
explain select CustName,count(1) c
from WorkOrder force index(ix_namedate)
where CreateDate>'2016-5-1' and CreateDate<'2017-1-1'
group by CustName
having c>100
order by c desc;
+----+-------------+-----------+-------+---------------------------------+-------------+---------+------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+---------------------------------+-------------+---------+------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | WorkOrder | index | ix_name,ix_namedate,ix_datename | ix_namedate | 307 | NULL | 2528727 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-----------+-------+---------------------------------+-------------+---------+------+---------+-----------------------------------------------------------+
选用另一个联合索引 ix_namedate
,这次type
变为index
,可以这样理解,根据索引的顺序进行全表扫描,比ALL效率要高些,rows
的值和堆表的值差不多。
执行语句三次,时间是 7.84 sec、7.92 sec、7.84 sec。
explain select CustName,count(1) c
from WorkOrder force index(ix_name)
where CreateDate>'2016-5-1' and CreateDate<'2017-1-1'
group by CustName
having c>100
order by c desc;
+----+-------------+-----------+-------+---------------------------------+---------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+---------------------------------+---------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | WorkOrder | index | ix_name,ix_namedate,ix_datename | ix_name | 303 | NULL | 2528727 | Using where; Using temporary; Using filesort |
+----+-------------+-----------+-------+---------------------------------+---------+---------+------+---------+----------------------------------------------+
1 row in set
选用另一个联合索引 ix_name
,这次type
是index
,可以这样理解,根据索引的顺序进行全表扫描,比ALL效率要高些,rows
的值和堆表的值差不多。
执行语句三次,时间是 1 min 28.17 sec、1 min 27.64 sec、1 min 27.58 sec。
explain select CustName,count(1) c
from WorkOrder force index(ix_date)
where CreateDate>'2016-5-1' and CreateDate<'2017-1-1'
group by CustName
having c>100
order by c desc;
+----+-------------+-----------+-------+-----------------------------------------+---------+---------+------+--------+-------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+-----------------------------------------+---------+---------+------+--------+-------------------------------------------------------------------+
| 1 | SIMPLE | WorkOrder | range | ix_name,ix_date,ix_namedate,ix_datename | ix_date | 4 | NULL | 921062 | Using index condition; Using MRR; Using temporary; Using filesort |
+----+-------------+-----------+-------+-----------------------------------------+---------+---------+------+--------+-------------------------------------------------------------------+
选用另一个联合索引 ix_date
,这次type
是range
,表示索引范围扫描,rows
的值是堆表的1/3多些 。
执行语句三次,时间是 9.55 sec、9.52 sec、9.39 sec。
第1次 | 第2次 | 第3次 | |
堆表 | 16.3 | 16.34 | 16.24 |
ix_datename | 8.64 | 8.61 | 8.55 |
ix_namedate | 7.84 | 7.92 | 7.84 |
ix_name | 88.17 | 87.64 | 87.58 |
ix_date | 9.55 | 9.52 | 9.39 |
2.2 ignore index()
假如我不想用索引了怎么办?
可以使用ignore index()
,这个指令可以强制MySQL在查询时,不使用某索引。
explain select CustName,count(1) c
from WorkOrder ignore index(ix_date)
where CreateDate>'2016-5-1' and CreateDate<'2017-1-1'
group by CustName
having c>100
order by c desc;
+----+-------------+-----------+-------+---------------------------------+-------------+---------+------+--------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+---------------------------------+-------------+---------+------+--------+-----------------------------------------------------------+
| 1 | SIMPLE | WorkOrder | range | ix_name,ix_namedate,ix_datename | ix_datename | 4 | NULL | 824372 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-----------+-------+---------------------------------+-------------+---------+------+--------+-----------------------------------------------------------+
explain select CustName,count(1) c
from WorkOrder ignore index(ix_date,ix_name,ix_namedate,ix_datename)
where CreateDate>'2016-5-1' and CreateDate<'2017-1-1'
group by CustName
having c>100
order by c desc;
+----+-------------+-----------+------+---------------------------------+------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------------------------+------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | WorkOrder | ALL | ix_name,ix_namedate,ix_datename | NULL | NULL | NULL | 2528727 | Using where; Using temporary; Using filesort |
+----+-------------+-----------+------+---------------------------------+------+---------+------+---------+----------------------------------------------+
- 上面第一个强制不使用
ix_date
索引,那么就MySQL就从剩下的三个索引中,选取他认为是最优的索引。 - 第二个时将4个索引都不使用,那么MySQL就进行全表扫描了。
3 总结
- MySQL的语句优化,没有绝对的正确,
explain
也只是给出个大致的方向,例如
key_len值小的,rows小的。 按理说,时间应该最短,效率最高。但是,实验中时间最少的却不是那个值最小的。 - 优化还需根据实际数据情况,例如,假如我
where
选取的时间范围变化,或者说CustName
的分布有些变化,可能跟刚才的实验,又会产生一定偏差; - 同样我还实验了,当给表加上主键时,整体的查询时间会缩短些。
4 附相关index命令
--删除主键
ALTER TABLE WorkOrder MODIFY id int(11); --1.先删除auto_increment
ALTER TABLE WorkOrder DROP PRIMARY KEY; --2.再删除主键
--删除索引
ALTER TABLE WorkOrder DROP index ix_datename;
Reference:https://www.baidu.com/link?url=BHAgDGO22duzUSLH_WuKoXBfQ6EOPM_21zKR4Jo-dUg9wHF0drPIqFRPaFCdfVsRyYZFPbiYk2KJEBtMqiwwxK&wd=&eqid=cd82753c0009bd26000000035fad1e6c