MySQL 优化器⽣成执⾏计划的整个过程(optimizer trace)

qq_27502511

已于 2023-06-01 10:22:16 修改

阅读量729

点赞数 1

分类专栏： Mysql 文章标签： mysql 数学建模数据库

于 2023-05-31 17:32:13 首次发布

本文链接：https://blog.csdn.net/qq_27502511/article/details/130973745

版权

Mysql 专栏收录该内容

20 篇文章 0 订阅

订阅专栏

optimizer_trace是MySQL5.6及以上版本的一个特性，用于追踪查询优化的详细步骤。通过设置optimizer_trace为on，可以观察查询如何被优化，包括条件处理、索引选择和成本估算等。例如，一个包含多个搜索条件的查询，optimizer_trace会显示为何选择了特定的索引而不是其他选项或全表扫描，帮助理解MySQL的查询优化策略。

摘要由CSDN通过智能技术生成

optimizer trace

在MySQL 5.6以及之后的版本中才会有optimizer_trace。

mysql>	SHOW	VARIABLES	LIKE	'optimizer_trace';
+-----------------+--------------------------+
|	Variable_name  	|	Value                   	|
+-----------------+--------------------------+
|	optimizer_trace	|	enabled=off,one_line=off	|
+-----------------+--------------------------+
1	row	inset	(0.02	sec)

如果想打开这个功能，必须⾸先把enabled的值改为on，就像这样：

mysql>	SET	optimizer_trace="enabled=on";
Query	OK,	0	rows	affected	(0.00	sec)

完整的使⽤optimizer trace功能的步骤总结如下：

打开optimizer trace功能 (默认情况下它是关闭的):
SET optimizer_trace=“enabled=on”;
这⾥输⼊你⾃⼰的查询语句
SELECT …;
从OPTIMIZER_TRACE表中查看上⼀个查询的优化过程
SELECT * FROM information_schema.OPTIMIZER_TRACE;
可能你还要观察其他语句执⾏的优化过程，重复上边的第2、3步
…
当你停⽌查看语句的优化过程时，把optimizer trace功能关闭
SET optimizer_trace=“enabled=off”;

完整案例：

⼀个搜索条件⽐较多的查询语句，它的执⾏计划如下：

mysql>	EXPLAIN	SELECT	*	FROM	s1	WHERE
   	->    	key1	>	'z'	AND
   	->    	key2	<	1000000	AND
   	->    	key3	IN	('a',	'b',	'c')	AND
   	->    	common_field	=	'abc';
+----+-------------+-------+------------+-------+----------------------------+----------+---------+------+----
--+----------+------------------------------------+
|	id	|	select_type	|	table	|	partitions	|	type 	|	possible_keys             	|	key     	|	key_len	|	ref 	|
rows	|	filtered	|	Extra                             	|
+----+-------------+-------+------------+-------+----------------------------+----------+---------+------+----
--+----------+------------------------------------+
| 	1	|	SIMPLE     	|	s1   	|	NULL      	|	range	|	idx_key2,idx_key1,idx_key3	|	idx_key2	|	5      	|	NULL	|  
12	|    	0.42	|	Using	index	condition;	Using	where	|
+----+-------------+-------+------------+-------+----------------------------+----------+---------+------+----
--+----------+------------------------------------+
1	row	inset,	1	warning	(0.00	sec)

以看到该查询可能使⽤到的索引有3个，那么为什么优化器最终选择了idx_key2⽽不选择其他的索引或者直接全表扫描呢？这时候就可以通过otpimzer trace 功能来查看优化器的具体⼯作过程：

我们直接看⼀下通过查询OPTIMIZER_TRACE表得到的输出（我使⽤#后跟随注释的形式为⼤家解释了优化过程中的⼀些⽐较重要的点，⼤家重点关注⼀下）：
*************************** 1. row ***************************

分析的查询语句是什么
QUERY: SELECT * FROM s1 WHERE
key1 > ‘z’ AND
key2 < 1000000 AND
key3 IN (‘a’, ‘b’, ‘c’) AND
common_field = ‘abc’
优化的具体过程

TRACE:	{
 	"steps":	[
   	{
     	"join_preparation":	{    	#	prepare阶段
       	"select#":	1,
       	"steps":	[
         	{
           	"IN_uses_bisection":	true
         	},
         	{
           	"expanded_query":	"/*	select#1	*/	select	`s1`.`id`	AS	`id`,`s1`.`key1`	AS	`key1`,`s1`.`key2`	AS
`key2`,`s1`.`key3`	AS	`key3`,`s1`.`key_part1`	AS	`key_part1`,`s1`.`key_part2`	AS	`key_part2`,`s1`.`key_part3`
AS	`key_part3`,`s1`.`common_field`	AS	`common_field`	from	`s1`	where	((`s1`.`key1`	>	'z')	and	(`s1`.`key2`	<
1000000)	and	(`s1`.`key3`	in	('a','b','c'))	and	(`s1`.`common_field`	=	'abc'))"
         	}
       	]	/*	steps	*/
     	}	/*	join_preparation	*/
   	},
   	{
     	"join_optimization":	{   	#	optimize阶段
       	"select#":	1,
       	"steps":	[
         	{
           	"condition_processing":	{  	#	处理搜索条件
             	"condition":	"WHERE",
             	#	原始搜索条件
             	"original_condition":	"((`s1`.`key1`	>	'z')	and	(`s1`.`key2`	<	1000000)	and	(`s1`.`key3`	in
('a','b','c'))	and	(`s1`.`common_field`	=	'abc'))",
             	"steps":	[
               	{
                 	#	等值传递转换
                 	"transformation":	"equality_propagation",
                 	"resulting_condition":	"((`s1`.`key1`	>	'z')	and	(`s1`.`key2`	<	1000000)	and	(`s1`.`key3`	in
('a','b','c'))	and	(`s1`.`common_field`	=	'abc'))"
               	},
               	{
                 	#	常量传递转换   
                 	"transformation":	"constant_propagation",
                 	"resulting_condition":	"((`s1`.`key1`	>	'z')	and	(`s1`.`key2`	<	1000000)	and	(`s1`.`key3`	in
('a','b','c'))	and	(`s1`.`common_field`	=	'abc'))"
               	},
               	{
                 	#	去除没⽤的条件
                 	"transformation":	"trivial_condition_removal",
                 	"resulting_condition":	"((`s1`.`key1`	>	'z')	and	(`s1`.`key2`	<	1000000)	and	(`s1`.`key3`	in
('a','b','c'))	and	(`s1`.`common_field`	=	'abc'))"
               	}
             	]	/*	steps	*/
           	}	/*	condition_processing	*/
         	},
         	{
           	#	替换虚拟⽣成列
           	"substitute_generated_columns":	{
           	}	/*	substitute_generated_columns	*/
         	},
         	{
           	#	表的依赖信息
           	"table_dependencies":	[
             	{
               	"table":	"`s1`",
               	"row_may_be_null":	false,
               	"map_bit":	0,
               	"depends_on_map_bits":	[
               	]	/*	depends_on_map_bits	*/
             	}
           	]	/*	table_dependencies	*/
         	},
         	{
           	"ref_optimizer_key_uses":	[
           	]	/*	ref_optimizer_key_uses	*/
         	},
         	{
         
           	#	预估不同单表访问⽅法的访问成本
           	"rows_estimation":	[
             	{
               	"table":	"`s1`",
               	"range_analysis":	{
                 	"table_scan":	{  	#	全表扫描的⾏数以及成本
                   	"rows":	9688,
                   	"cost":	2036.7
                 	}	/*	table_scan	*/,
                 
                 	#	分析可能使⽤的索引
                 	"potential_range_indexes":	[
                   	{
                     	"index":	"PRIMARY",  	#	主键不可⽤
                     	"usable":	false,
                     	"cause":	"not_applicable"
                   	},
                   	{
                     	"index":	"idx_key2", 	#	idx_key2可能被使⽤
                     	"usable":	true,
                     	"key_parts":	[
                       	"key2"
                     	]	/*	key_parts	*/
                   	},
                   	{
                     	"index":	"idx_key1", 	#	idx_key1可能被使⽤
                     	"usable":	true,
                     	"key_parts":	[
                       	"key1",
                       	"id"
                     	]	/*	key_parts	*/
                   	},
                   	{
                     	"index":	"idx_key3", 	#	idx_key3可能被使⽤
                     	"usable":	true,
                     	"key_parts":	[
                       	"key3",
                       	"id"
                     	]	/*	key_parts	*/
                   	},
                   	{
                     	"index":	"idx_key_part", 	#	idx_keypart不可⽤
                     	"usable":	false,
                     	"cause":	"not_applicable"
                   	}
                 	]	/*	potential_range_indexes	*/,
                 	"setup_range_conditions":	[
                 	]	/*	setup_range_conditions	*/,
                 	"group_index_range":	{
                   	"chosen":	false,
                   	"cause":	"not_group_by_or_distinct"
                 	}	/*	group_index_range	*/,
                 
                 	#	分析各种可能使⽤的索引的成本
                 	"analyzing_range_alternatives":	{
                   	"range_scan_alternatives":	[
                     	{
                       	#	使⽤idx_key2的成本分析
                       	"index":	"idx_key2",
                       	#	使⽤idx_key2的范围区间
                       	"ranges":	[
                         	"NULL	<	key2	<	1000000"
                       	]	/*	ranges	*/,
                       	"index_dives_for_eq_ranges":	true,  	#	是否使⽤index	dive
                       	"rowid_ordered":	false,    	#	使⽤该索引获取的记录是否按照主键排序
                       	"using_mrr":	false,    	#	是否使⽤mrr
                       	"index_only":	false,   	#	是否是索引覆盖访问
                       	"rows":	12,    	#	使⽤该索引获取的记录条数
                       	"cost":	15.41, 	#	使⽤该索引的成本
                       	"chosen":	true 	#	是否选择该索引
                     	},
                     	{
                       	#	使⽤idx_key1的成本分析
                       	"index":	"idx_key1",
                       	#	使⽤idx_key1的范围区间
                       	"ranges":	[
                         	"z	<	key1"
                       	]	/*	ranges	*/,
                       	"index_dives_for_eq_ranges":	true,  	#	同上
                       	"rowid_ordered":	false,  	#	同上
                       	"using_mrr":	false,  	#	同上
                       	"index_only":	false,  	#	同上
                       	"rows":	266,  	#	同上
                       	"cost":	320.21,  	#	同上
                       	"chosen":	false,  	#	同上
                       	"cause":	"cost"  	#	因为成本太⼤所以不选择该索引
                     	},
                     	{
                       	#	使⽤idx_key3的成本分析
                       	"index":	"idx_key3",
                       	#	使⽤idx_key3的范围区间
                       	"ranges":	[
                         	"a	<=	key3	<=	a",
"b	<=	key3	<=	b",
"c	<=	key3	<=	c"
                       	]	/*	ranges	*/,
                       	"index_dives_for_eq_ranges":	true,  	#	同上
                       	"rowid_ordered":	false,  	#	同上
                       	"using_mrr":	false,  	#	同上
                       	"index_only":	false,  	#	同上
                       	"rows":	21,  	#	同上
                       	"cost":	28.21,  	#	同上
                       	"chosen":	false,  	#	同上
                       	"cause":	"cost"  	#	同上
                     	}
                   	]	/*	range_scan_alternatives	*/,
                   
                   	#	分析使⽤索引合并的成本
                   	"analyzing_roworder_intersect":	{
                     	"usable":	false,
                     	"cause":	"too_few_roworder_scans"
                   	}	/*	analyzing_roworder_intersect	*/
                 	}	/*	analyzing_range_alternatives	*/,
                 
                 	#	对于上述单表查询s1最优的访问⽅法
                 	"chosen_range_access_summary":	{
                   	"range_access_plan":	{
                     	"type":	"range_scan",
                     	"index":	"idx_key2",
                     	"rows":	12,
                     	"ranges":	[
                       	"NULL	<	key2	<	1000000"
                     	]	/*	ranges	*/
                   	}	/*	range_access_plan	*/,
                   	"rows_for_plan":	12,
                   	"cost_for_plan":	15.41,
                   	"chosen":	true
                 	}	/*	chosen_range_access_summary	*/
               	}	/*	range_analysis	*/
             	}
           	]	/*	rows_estimation	*/
         	},
         	{
           
           	#	分析各种可能的执⾏计划
           	#（对多表查询这可能有很多种不同的⽅案，单表查询的⽅案上边已经分析过了，直接选取idx_key2就好）
           	"considered_execution_plans":	[
             	{
               	"plan_prefix":	[
               	]	/*	plan_prefix	*/,
               	"table":	"`s1`",
               	"best_access_path":	{
                 	"considered_access_paths":	[
                   	{
                     	"rows_to_scan":	12,
                     	"access_type":	"range",
                     	"range_details":	{
                       	"used_index":	"idx_key2"
                     	}	/*	range_details	*/,
                     	"resulting_rows":	12,
                     	"cost":	17.81,
                     	"chosen":	true
                   	}
                 	]	/*	considered_access_paths	*/
               	}	/*	best_access_path	*/,
               	"condition_filtering_pct":	100,
               	"rows_for_plan":	12,
               	"cost_for_plan":	17.81,
               	"chosen":	true
             	}
           	]	/*	considered_execution_plans	*/
         	},
         	{
           	#	尝试给查询添加⼀些其他的查询条件
           	"attaching_conditions_to_tables":	{
             	"original_condition":	"((`s1`.`key1`	>	'z')	and	(`s1`.`key2`	<	1000000)	and	(`s1`.`key3`	in
('a','b','c'))	and	(`s1`.`common_field`	=	'abc'))",
             	"attached_conditions_computation":	[
             	]	/*	attached_conditions_computation	*/,
             	"attached_conditions_summary":	[
               	{
                 	"table":	"`s1`",
                 	"attached":	"((`s1`.`key1`	>	'z')	and	(`s1`.`key2`	<	1000000)	and	(`s1`.`key3`	in
('a','b','c'))	and	(`s1`.`common_field`	=	'abc'))"
               	}
             	]	/*	attached_conditions_summary	*/
           	}	/*	attaching_conditions_to_tables	*/
         	},
         	{
           	#	再稍稍的改进⼀下执⾏计划
           	"refine_plan":	[
             	{
               	"table":	"`s1`",
               	"pushed_index_condition":	"(`s1`.`key2`	<	1000000)",
               	"table_condition_attached":	"((`s1`.`key1`	>	'z')	and	(`s1`.`key3`	in	('a','b','c'))	and
(`s1`.`common_field`	=	'abc'))"
             	}
           	]	/*	refine_plan	*/
         	}
       	]	/*	steps	*/
     	}	/*	join_optimization	*/
   	},
   	{
     	"join_execution":	{   	#	execute阶段
       	"select#":	1,
       	"steps":	[
       	]	/*	steps	*/
     	}	/*	join_execution	*/
   	}
 	]	/*	steps	*/
}

因优化过程⽂本太多⽽丢弃的⽂本字节⼤⼩，值为0时表示并没有丢弃
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0
权限字段
INSUFFICIENT_PRIVILEGES: 0
1 row inset (0.00 sec)
这个输出的第⼀感觉就是这⽂本也太多了点⼉吧，其实这只是优化器执⾏过程中的⼀⼩部分。