Viewing HQL Execution Plans and an Explanation of the Key Steps

1. How to view an execution plan

Syntax: explain [extended] <HiveQL statement>;

/* Example: */
explain select count(distinct mobilename) from testtab_small;

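2. Basic elements of an execution plan

  1. The main stages and their dependencies, read from top to bottom
  2. The key information for each main stage, including:

| Key information | Keyword | Description |
| --- | --- | --- |
| Map or reduce operation | Map Operator Tree, Reduce Operator Tree | The map and reduce phases |
| Table scan | TableScan | The table being queried |
| Table size statistics | Statistics | Includes the row count and data size |
| Select operator | Select Operator | The columns to retrieve |
| Grouping and aggregation operator | Group By Operator | Needed for aggregations such as count() |
| Sorting | sort order | Whether the keys are sorted: + means sorted (ascending), empty means not sorted |
| Local task | Local Work, Local Tables, Map Local Operator Tree | Seen with map-side joins, when a small table takes part in the join and hive.auto.convert.join=true |
| Join operator | Join Operator | A join |
| Join condition | condition map | The join type and its inputs, e.g. Left Outer Join0 to 1 |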
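
The Statistics lines have the same shape in every plan in this article, which makes them easy to extract when comparing plans. The following is a minimal sketch in Python, assuming only the line format shown here (other Hive versions may print additional fields); `parse_statistics` is a hypothetical helper name, not part of Hive.

```python
import re

# Assumption: the Statistics line looks exactly like the ones in this article.
STATS_RE = re.compile(
    r"Statistics: Num rows: (\d+) Data size: (\d+) "
    r"Basic stats: (\w+) Column stats: (\w+)"
)


def parse_statistics(line):
    """Return the fields of one Statistics line as a dict, or None if no match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    rows, size, basic, column = match.groups()
    return {
        "num_rows": int(rows),
        "data_size": int(size),
        "basic_stats": basic,
        "column_stats": column,
    }


line = "Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE"
print(parse_statistics(line))
```

Lines that are not Statistics lines, such as `TableScan`, simply return `None`, so the helper can be mapped over a whole plan.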

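2. explain select count(distinct mobilename) from testtab_small

This is a complete execution plan example. The later examples mainly show what differs rather than the full plan; for example, the Stage-0 part is omitted.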

/* Overall steps; read from top to bottom */
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          /* the table being scanned */
          TableScan
            alias: testtab_small
            /* table size statistics: 13 rows, 629 bytes */
            Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              /* the retrieved column is mobilename */
              expressions: mobilename (type: string)
              outputColumnNames: mobilename
              Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                /* aggregation */
                aggregations: count(DISTINCT mobilename)
                keys: mobilename (type: string)
                mode: hash
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: string)
                  /* + means the keys are sorted */
                  sort order: +
                  Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
      Reduce Operator Tree:
        Group By Operator
          aggregations: count(DISTINCT KEY._col0:0._col0)
          mode: mergepartial
          outputColumnNames: _col0
          Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
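
The STAGE DEPENDENCIES header at the top of the plan can be read mechanically. The sketch below, in Python, turns it into a dependency map and an order in which the stages could run. It is only an illustration: it assumes the two line shapes that appear in this article ("is a root stage" and "depends on stages:"), and the function names are hypothetical. Real plans can contain other forms that this sketch ignores.

```python
import re


def parse_stage_dependencies(plan_text):
    """Map each stage name to the list of stages it depends on."""
    deps = {}
    for raw in plan_text.splitlines():
        line = raw.strip()
        root = re.fullmatch(r"(Stage-\d+) is a root stage", line)
        if root:
            deps[root.group(1)] = []
            continue
        dep = re.fullmatch(r"(Stage-\d+) depends on stages: (.+)", line)
        if dep:
            deps[dep.group(1)] = [s.strip() for s in dep.group(2).split(",")]
    return deps


def execution_order(deps):
    """Return the stages so that every stage comes after its dependencies."""
    ordered, done = [], set()

    def visit(stage):
        if stage in done:
            return
        for parent in deps.get(stage, []):
            visit(parent)
        done.add(stage)
        ordered.append(stage)

    for stage in deps:
        visit(stage)
    return ordered


PLAN_HEADER = """
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1
"""

deps = parse_stage_dependencies(PLAN_HEADER)
print(deps)
print(execution_order(deps))
```

For the plan above this gives Stage-1 followed by Stage-0, which matches reading the header from top to bottom.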

3. explain select count(mobilename) from testtab_small;

  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: testtab_small
            Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: mobilename (type: string)
              outputColumnNames: mobilename
              Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                aggregations: count(mobilename)
                mode: hash
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  sort order:
                  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col0 (type: bigint)
      Reduce Operator Tree:
        Group By Operator
          aggregations: count(VALUE._col0)
          mode: mergepartial
          outputColumnNames: _col0
          Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
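
Comparing this plan with the count(distinct mobilename) plan in section 2 shows two different shapes. For plain count, the map side (mode: hash) already reduces each mapper's input to one partial count, and the reduce side (mode: mergepartial) adds the partial counts up. For count(distinct), the map side groups by the key (keys: mobilename) and sends the keys sorted (sort order: +), and the reducer counts the distinct keys. The sketch below simulates both in plain Python. It illustrates the idea only and is not Hive's implementation; the function names and the sample rows are hypothetical.

```python
from itertools import groupby


def map_partial_count(rows):
    """Map side of count(col): one partial count of non-NULL values per mapper."""
    return sum(1 for value in rows if value is not None)


def reduce_merge_counts(partials):
    """Reduce side (mergepartial): add up the partial counts."""
    return sum(partials)


def map_distinct_keys(rows):
    """Map side of count(distinct col): emit each non-NULL key once per mapper."""
    return {value for value in rows if value is not None}


def reduce_count_distinct(key_sets):
    """Reduce side: keys arrive sorted, so each run of equal keys counts once."""
    shuffled = sorted(key for keys in key_sets for key in keys)
    return sum(1 for _key, _run in groupby(shuffled))


# Hypothetical input, split across two mappers.
split_a = ["nokia", "apple", "apple", None]
split_b = ["apple", "huawei"]

total = reduce_merge_counts([map_partial_count(split_a), map_partial_count(split_b)])
distinct = reduce_count_distinct([map_distinct_keys(split_a), map_distinct_keys(split_b)])
print(total, distinct)
```

The sorted shuffle is what lets the reducer detect duplicates that survived in different mappers, which is one way to read why the distinct plan carries a sort key while the plain count plan has an empty sort order.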

 
 
4. Global sort

explain select * from testtab_small order by mobilename;

  1. There is no aggregation operator
  2. Sorting already appears under the Reduce Output Operator in the Map Operator Tree (sort order: +)
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: testtab_small
            Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: mobilename (type: string), testrecordid (type: string)
              outputColumnNames: _col0, _col1
              Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                key expressions: _col0 (type: string)
                sort order: +
                Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
                value expressions: _col1 (type: string)
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: string), VALUE._col0 (type: string)
          outputColumnNames: _col0, _col1
          Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
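
The plan splits each row into a sort key (key expressions: _col0) and a payload (value expressions: _col1) on the map side, and the reduce-side Select Operator puts them back together from KEY.reducesinkkey0 and VALUE._col0. The sketch below mimics that flow in plain Python, assuming all mapper output goes to a single reducer, which is how a total order is obtained here. The function names and sample rows are hypothetical.

```python
def map_emit(rows):
    """Map side: emit (sort key, value) = (mobilename, testrecordid) per row."""
    return [(row["mobilename"], row["testrecordid"]) for row in rows]


def shuffle_sort(mapper_outputs):
    """Shuffle: merge every mapper's pairs and sort them by key (sort order: +)."""
    merged = [pair for output in mapper_outputs for pair in output]
    return sorted(merged, key=lambda pair: pair[0])


def reduce_select(sorted_pairs):
    """Reduce side: rebuild output rows from the key and the value."""
    return [{"_col0": key, "_col1": value} for key, value in sorted_pairs]


# Hypothetical rows, split across two mappers.
split_a = [
    {"mobilename": "nokia", "testrecordid": "r1"},
    {"mobilename": "apple", "testrecordid": "r2"},
]
split_b = [{"mobilename": "huawei", "testrecordid": "r3"}]

ordered_rows = reduce_select(shuffle_sort([map_emit(split_a), map_emit(split_b)]))
print([row["_col0"] for row in ordered_rows])
```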

 
 
5. Joining a small table: map-side join

hive.auto.convert.join=true /* default value */

hive> explain select a.testrecordid,a.mobilename,b.mobilename from testtab_small a left join testtab_small2 b on a.testrecordid=b.testrecordid;

  1. The total number of stages rises to 3
  2. A local task is added: Map Reduce Local Work. This relies on Hadoop's distributed cache mechanism
  3. Map Join Operator
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-3 depends on stages: Stage-4
  Stage-0 depends on stages: Stage-3

STAGE PLANS:
  Stage: Stage-4
    Map Reduce Local Work
      Alias -> Map Local Tables:
        $hdt$_1:b
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        $hdt$_1:b
          TableScan
            alias: b
            Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: mobilename (type: string), testrecordid (type: string)
              outputColumnNames: _col0, _col1
              Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                keys:
                  0 _col1 (type: string)
                  1 _col1 (type: string)

  Stage: Stage-3
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: a
            Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: mobilename (type: string), testrecordid (type: string)
              outputColumnNames: _col0, _col1
              Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
              Map Join Operator
                condition map:
                     Left Outer Join0 to 1
                keys:
                  0 _col1 (type: string)
                  1 _col1 (type: string)
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col1 (type: string), _col0 (type: string), _col2 (type: string)
                  outputColumnNames: _col0, _col1, _col2
                  Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work
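
The two stages of this plan correspond to a classic hash join: the local task reads the small table b and builds a hash table keyed on the join column (HashTable Sink Operator), and each mapper then streams table a and probes that hash table (Map Join Operator), so no reduce phase is needed. The Python sketch below shows that idea for the left outer join in the query. It is a simplified illustration, not Hive's code; the function names and sample rows are hypothetical, and rows with a NULL key are treated as never matching.

```python
from collections import defaultdict


def build_hash_table(small_rows):
    """Local task: index the small table b by its join key (testrecordid)."""
    table = defaultdict(list)
    for mobilename, testrecordid in small_rows:
        if testrecordid is not None:
            table[testrecordid].append(mobilename)
    return table


def map_join_left_outer(big_rows, hash_table):
    """Mapper: probe the hash table for each row of a; keep unmatched rows."""
    for mobilename, testrecordid in big_rows:
        matches = hash_table.get(testrecordid) if testrecordid is not None else None
        if matches:
            for b_mobilename in matches:
                yield (testrecordid, mobilename, b_mobilename)
        else:
            # Left outer join: the row of a survives with NULL for b's column.
            yield (testrecordid, mobilename, None)


# Hypothetical rows in (mobilename, testrecordid) order, as in the plan.
table_a = [("nokia", "r1"), ("apple", "r2")]
table_b = [("nokia-x", "r1")]

map_side_result = list(map_join_left_outer(table_a, build_hash_table(table_b)))
print(map_side_result)
```

The approach only works when the hash table fits in each mapper's memory, which is why the plan appears when a small table takes part in the join.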

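6. Reduce-side join

hive.auto.convert.join=false
The same statement as in section 5, now planned as a reduce-side join:

  1. The Map Operator Tree contains two TableScan operators at the same level, one for each data source feeding the mappers
  2. Under the Reduce Operator Tree there is a Join Operator: the join runs on the reduce side.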
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: a
            Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: mobilename (type: string), testrecordid (type: string)
              outputColumnNames: _col0, _col1
              Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                key expressions: _col1 (type: string)
                sort order: +
                Map-reduce partition columns: _col1 (type: string)
                Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
                value expressions: _col0 (type: string)
          TableScan
            alias: b
            Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: mobilename (type: string), testrecordid (type: string)
              outputColumnNames: _col0, _col1
              Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                key expressions: _col1 (type: string)
                sort order: +
                Map-reduce partition columns: _col1 (type: string)
                Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
                value expressions: _col0 (type: string)
      Reduce Operator Tree:
        Join Operator
          condition map:
               Left Outer Join0 to 1
          keys:
            0 _col1 (type: string)
            1 _col1 (type: string)
          outputColumnNames: _col0, _col1, _col2
          Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: _col1 (type: string), _col0 (type: string), _col2 (type: string)
            outputColumnNames: _col0, _col1, _col2
            Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
            File Output Operator
              compressed: false
              Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
              table:
                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
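
In this plan both tables are scanned by mappers, both emit the join column as the shuffle key (key expressions: _col1, Map-reduce partition columns: _col1), and the Join Operator on the reduce side combines the rows that share a key. The Python sketch below imitates that flow for the left outer join, tagging rows of a with 0 and rows of b with 1 in the spirit of "Left Outer Join0 to 1". It is an illustration under simplifying assumptions (one reducer, non-NULL keys), not Hive's implementation; the function names and sample rows are hypothetical.

```python
from collections import defaultdict


def map_tag(rows, tag):
    """Mapper: emit (join key, (source tag, mobilename)) for one input table."""
    return [(testrecordid, (tag, mobilename)) for mobilename, testrecordid in rows]


def shuffle(pairs):
    """Shuffle: group tagged values by join key, visiting keys in sorted order."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return sorted(groups.items(), key=lambda item: item[0])


def reduce_left_outer(groups):
    """Reducer: per key, pair every row of a (tag 0) with every row of b (tag 1)."""
    for key, tagged_values in groups:
        left = [name for tag, name in tagged_values if tag == 0]
        right = [name for tag, name in tagged_values if tag == 1]
        for a_name in left:
            # With no match in b, the row of a is kept with NULL for b's column.
            for b_name in right or [None]:
                yield (key, a_name, b_name)


# Hypothetical rows in (mobilename, testrecordid) order, as in the plan.
table_a = [("nokia", "r1"), ("apple", "r2")]
table_b = [("nokia-x", "r1"), ("orphan", "r9")]

pairs = map_tag(table_a, 0) + map_tag(table_b, 1)
reduce_side_result = list(reduce_left_outer(shuffle(pairs)))
print(reduce_side_result)
```

Unlike the map-side join, neither table has to fit in memory as a whole, at the cost of shuffling both inputs across the network.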