Spark SQL Architecture

[Figure: Spark SQL architecture]

Understanding the figure above through an execution plan

spark-sql (default)> explain extended
                   > select 
                   > a.key*(4+5),
                   > b.value 
                   > from
                   > aa a join aa b
                   > on a.key=b.key and a.key>10;

plan
== Parsed Logical Plan ==
'Project [unresolvedalias(('a.key * (4 + 5)), None), 'b.value]
+- 'Join Inner, (('a.key = 'b.key) && ('a.key > 10))
   :- 'SubqueryAlias a
   :  +- 'UnresolvedRelation `aa`
   +- 'SubqueryAlias b
      +- 'UnresolvedRelation `aa`

== Analyzed Logical Plan ==
(key * (4 + 5)): int, value: string
Project [(key#37 * (4 + 5)) AS (key * (4 + 5))#41, value#40]
+- Join Inner, ((key#37 = key#39) && (key#37 > 10))
   :- SubqueryAlias a
   :  +- SubqueryAlias aa
   :     +- CatalogRelation `default`.`aa`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [key#37, value#38]
   +- SubqueryAlias b
      +- SubqueryAlias aa
         +- CatalogRelation `default`.`aa`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [key#39, value#40]

== Optimized Logical Plan ==
Project [(key#37 * 9) AS (key * (4 + 5))#41, value#40]
+- Join Inner, (key#37 = key#39)
   :- Project [key#37]
   :  +- Filter (isnotnull(key#37) && (key#37 > 10))
   :     +- CatalogRelation `default`.`aa`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [key#37, value#38]
   +- Filter ((key#39 > 10) && isnotnull(key#39))
      +- CatalogRelation `default`.`aa`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [key#39, value#40]

== Physical Plan ==
*Project [(key#37 * 9) AS (key * (4 + 5))#41, value#40]
+- *SortMergeJoin [key#37], [key#39], Inner
   :- *Sort [key#37 ASC NULLS FIRST], false, 0
   :  +- Exchange hashpartitioning(key#37, 200)
   :     +- *Filter (isnotnull(key#37) && (key#37 > 10))
   :        +- HiveTableScan [key#37], CatalogRelation `default`.`aa`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [key#37, value#38]
   +- *Sort [key#39 ASC NULLS FIRST], false, 0
      +- Exchange hashpartitioning(key#39, 200)
         +- *Filter ((key#39 > 10) && isnotnull(key#39))
            +- HiveTableScan [key#39, value#40], CatalogRelation `default`.`aa`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [key#39, value#40]
Time taken: 1.218 seconds, Fetched 1 row(s)
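
Reading the four plans top to bottom: the parsed plan still contains unresolved names ('a.key, 'UnresolvedRelation `aa`); the analyzed plan binds them to the catalog table `default`.`aa` and gives each attribute an ID (key#37, value#40); the optimized plan rewrites (4 + 5) to 9, moves the key > 10 filter from the join condition down to both sides of the join, adds isnotnull checks, and keeps only key#37 on the left side; the physical plan picks SortMergeJoin with an Exchange hashpartitioning step on each side.

The first of those rewrites, constant folding, can be sketched in a few lines. This is a toy illustration only, not Catalyst's implementation: the `Lit`, `Col`, `BinOp`, `fold_constants`, and `show` names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Lit:
    """An integer literal such as 4 or 5."""
    value: int


@dataclass(frozen=True)
class Col:
    """A column reference such as key."""
    name: str


@dataclass(frozen=True)
class BinOp:
    """A binary expression; op is "+" or "*" in this toy model."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Col, BinOp]

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def fold_constants(expr: Expr) -> Expr:
    """Walk the tree bottom-up and evaluate any BinOp over two literals."""
    if isinstance(expr, BinOp):
        left = fold_constants(expr.left)
        right = fold_constants(expr.right)
        if isinstance(left, Lit) and isinstance(right, Lit):
            return Lit(OPS[expr.op](left.value, right.value))
        return BinOp(expr.op, left, right)
    return expr  # literals and columns are already as simple as they get


def show(expr: Expr) -> str:
    """Render an expression in roughly the style the plans use."""
    if isinstance(expr, Lit):
        return str(expr.value)
    if isinstance(expr, Col):
        return expr.name
    return f"({show(expr.left)} {expr.op} {show(expr.right)})"


# The projection from the analyzed plan: key * (4 + 5)
analyzed = BinOp("*", Col("key"), BinOp("+", Lit(4), Lit(5)))
optimized = fold_constants(analyzed)

print(show(analyzed))   # (key * (4 + 5))
print(show(optimized))  # (key * 9)
```

The column reference stops the fold from going further, which matches the optimized plan above: the projection becomes (key#37 * 9) while the output alias keeps the original text (key * (4 + 5)).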