在一次的优化过程中,由于没有关注执行计划中type列,仅看key列来查看"使用到的索引",导致优化过程走了不少弯路。
以下面SQL为例:
SELECT wave_no,
SUM(IF(picking_qty IS NULL,0, picking_qty)) AS PICKED_QTY,
SUM(IF(differ_qty IS NULL,0, differ_qty)) AS PICKED_DIFFER_QTY,
SUM(IF(relocate_qty IS NULL,0, relocate_qty)) AS PICKED_RELOCATE_QTY
FROM picking_locate_d
WHERE yn= 0AND wave_no IN
('BC76361213164811','BC76361213164810',
...'BC76361213158692')
AND org_No= '661'AND distribute_No= '763'AND warehouse_No= '612'GROUP BY wave_no;
走索引查找的执行计划为:
+----+-------------+------------------+------------+-------+---------------+-------------+---------+------+-------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+-------+---------------+-------------+---------+------+-------+----------+------------------------------------+
| 1 | SIMPLE | picking_locate_d | NULL | range | idx_wave_no | idx_wave_no | 153 | NULL | 14238 | 0.01 | Using index condition; Using where |
+----+-------------+------------------+------------+-------+---------------+-------------+---------+------+-------+----------+------------------------------------+
走索引扫描执行计划为:
+----+-------------+------------------+------------+-------+---------------+-------------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+-------+---------------+-------------+---------+------+----------+----------+-------------+
| 1 | SIMPLE | picking_locate_d | NULL | index | idx_wave_no | idx_wave_no | 153 | NULL | 37660147 | 0.01 | Using where |
+----+-------------+------------------+------------+-------+---------------+-------------+---------+------+----------+----------+-------------+
上面两个执行计划都使用索引idx_wave_no,但:
第一个执行计划影响行数为14238,与IN查询中的值数量相同,其执行计划type列值为range,表示index range scan。
第二个执行计划影响行数为37660147,与整表数据量相同,其执行计划type列为index,表示index scan。
哪为啥索引查找是index range scan呢?通过MySQL trace工具查看,其中输出包含以下信息:
"chosen_range_access_summary": {"range_access_plan": {"type": "range_scan","index": "idx_wave_no","rows": 5,"ranges": ["BC76361213164810 <= wave_no <= BC76361213164810","BC76361213164811 <= wave_no <= BC76361213164811","BC76361213158692 <= wave_no <= BC76361213158692"]/*ranges*/}/*range_access_plan*/,"rows_for_plan": 5,"cost_for_plan": 9.01,"chosen": true}/*chosen_range_access_summary*/
其中查询中WHERE子句:
wave_no IN
('BC76361213164811','BC76361213164810','BC76361213158692')
由于idx_wave_no为非唯一索引,虽然是等值查询,仍需要从第一个等于指定值的索引记录开始扫描,直到第一个不等于指定值的索引记录,因为被称为范围扫描(Range Scan) :
"ranges": ["BC76361213164810 <= wave_no <= BC76361213164810","BC76361213164811 <= wave_no <= BC76361213164811","BC76361213158692 <= wave_no <= BC76361213158692"]
IN子句中的3个值被转换为3次INDEX RANGE SCAN。
对于全索引扫描(INDEX SCAN),通过MySQL trace工具查看,其中输出包含以下信息:
"considered_execution_plans": [
{"plan_prefix": [
]/*plan_prefix*/,"table": "`picking_locate_d`","best_access_path": {"considered_access_paths": [
{"rows_to_scan": 37660147,"access_type": "scan","resulting_rows": 3.77e7,"cost": 9.58e6,"chosen": true,"use_tmp_table": true}
]/*considered_access_paths*/}/*best_access_path*/,"condition_filtering_pct": 100,"rows_for_plan": 3.77e7,"cost_for_plan": 9.58e6,"sort_cost": 3.77e7,"new_cost_for_plan": 4.72e7,"chosen": true}
]/*considered_execution_plans*/
其中access_type=scan表明操作为INDEX SCAN,rows_to_scan=37660147表名扫描整个索引上37660147行记录。
通过DESC或EXPLAIN输出的执行计划中,Type列的可选值分别对应:
all: 全表扫描
index: 索引全扫描
range: 索引范围扫描,常用语=,between等操作ref: 使用非唯一索引扫描或唯一索引前缀扫描,返回单条记录,常出现在关联查询中
eq_ref: 类似ref,区别在于使用的是唯一索引,使用主键的关联查询const/system: 单条记录,系统会把匹配行中的其他列作为常数处理,如主键或唯一索引查询null: MySQL不访问任何表或索引,直接返回结果