Description
The EXPLAIN statement provides the logical and physical plans for an input statement. By default, it provides information about the physical plan only.
Syntax
EXPLAIN [ EXTENDED | CODEGEN | COST | FORMATTED ] statement
EXTENDED
Generates the parsed logical plan, analyzed logical plan, optimized logical plan, and physical plan. The parsed logical plan is an unresolved plan extracted from the query. The analyzed logical plan is produced by resolving unresolvedAttribute and unresolvedRelation into fully typed objects. The optimized logical plan is obtained by applying a set of optimization rules, and the physical plan is generated from it.
CODEGEN
Generates the code for the statement, if any, and a physical plan. The code shown is the executable Java code produced by Codegen.
COST
If plan node statistics are available, generates the optimized logical plan together with the related statistics.
FORMATTED
Generates two sections: a physical plan outline and node details. Splitting the output this way makes the physical plan easier to read and shows the details of each node.
statement
Specifies a SQL statement to be explained.
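As an illustration of the options above, the following sketch runs each mode against the same inline-table query used in this article. The table name `sales` in the last two statements is hypothetical and is only there to show how `ANALYZE TABLE ... COMPUTE STATISTICS` can collect table statistics for `COST` to report. Output is omitted because attribute IDs and plan details vary by Spark version and configuration.

```sql
-- Default: physical plan only
EXPLAIN select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k;

-- Physical plan plus the generated Java code, if any
EXPLAIN CODEGEN select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k;

-- Physical plan outline followed by per-node details
EXPLAIN FORMATTED select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k;

-- `sales` is a hypothetical table with columns k and v.
-- Collect table statistics first so that COST has them to report.
ANALYZE TABLE sales COMPUTE STATISTICS;
EXPLAIN COST select k, sum(v) from sales group by k;
```

Running the modes side by side on one query is usually the quickest way to see which sections each keyword adds.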
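In the Spark 3.0 major release, nearly 50% of the optimizations were in Spark SQL. Spark SQL has replaced Spark Core as the new-generation engine core, so all other sub-frameworks, such as MLlib, Streaming, and Graph, can share Spark SQL's performance optimizations and benefit from the Spark community's investment in Spark SQL.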
-- Using Extended
EXPLAIN EXTENDED select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k;
+----------------------------------------------------+
|                                                plan|
+----------------------------------------------------+
| == Parsed Logical Plan ==
 'Aggregate ['k], ['k, unresolvedalias('sum('v), None)]
 +- 'SubqueryAlias `t`
    +- 'UnresolvedInlineTable [k, v], [List(1, 2), List(1, 3)]

 == Analyzed Logical Plan ==
 k: int, sum(v): bigint
 Aggregate [k#47], [k#47, sum(cast(v#48 as bigint)) AS sum(v)#50L]
 +- SubqueryAlias `t`
    +- LocalRelation [k#47, v#48]

 == Optimized Logical Plan ==
 Aggregate [k#47], [k#47, sum(cast(v#48 as bigint)) AS sum(v)#50L]
 +- LocalRelation [k#47, v#48]

 == Physical Plan ==
 *(2) HashAggregate(keys=[k#47], functions=[sum(cast(v#48 as bigint))], output=[k#47, sum(v)#50L])
 +- Exchange hashpartitioning(k#47, 200), true, [id=#79]
    +- *(1) HashAggregate(keys=[k#47], functions=[partial_sum(cast(v#48 as bigint))], output=[k#47, sum#52L])
       +- *(1) LocalTableScan [k#47, v#48]
|
+----------------------------------------------------+
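The `200` in `Exchange hashpartitioning(k#47, 200)` matches the default value of `spark.sql.shuffle.partitions`. A minimal sketch, assuming a session in which that setting can be changed, for checking that the number in the plan follows the configuration. The value 8 is arbitrary, and with adaptive query execution enabled the plan may be laid out differently.

```sql
-- Lower the number of shuffle partitions for this session (8 is an arbitrary choice)
SET spark.sql.shuffle.partitions=8;

-- Re-run the default EXPLAIN and look at the Exchange hashpartitioning(...) node
EXPLAIN select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k;
```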