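Key information | Keyword | Description |
---|---|---|
Map or reduce operations | Map Operator Tree, Reduce Operator Tree | The map and reduce phases |
Table scan | TableScan | The table being queried |
Table size statistics | Statistics | Includes row count and data size |
Projection operator | Select Operator | The columns to retrieve |
Grouping/aggregation operator | Group By Operator | Needed for aggregates such as count() |
Sorting | sort order | Whether the key is sorted: + means sorted (ascending), empty means not sorted |
Local task | Local Work, Local Tables, Map Local Operator Tree | Appears with map-side joins, when a small table takes part in the join and hive.auto.convert.join=true |
Join operator | Join Operator | The join |
Join condition | condition map | The join condition, e.g. Left Outer Join0 to 1 |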
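
As a reading aid, the keyword table above can be turned into a small lookup. The following Python sketch tags each line of an EXPLAIN output with the matching category. The names `KEYWORDS` and `tag_plan_lines` are made up for this example and are not part of Hive, and the matching is a plain prefix check on the trimmed line, so treat it as an illustration rather than a real plan parser.

```python
# Hypothetical helper (not part of Hive): tag plan lines using the keyword table.
KEYWORDS = {
    "Map Operator Tree": "map phase",
    "Reduce Operator Tree": "reduce phase",
    "TableScan": "table scan",
    "Statistics": "table statistics",
    "Select Operator": "projection",
    "Group By Operator": "grouping/aggregation",
    "sort order": "sorting",
    "Map Reduce Local Work": "local task",
    "Map Join Operator": "map-side join",
    "Join Operator": "join",
    "condition map": "join condition",
}


def tag_plan_lines(plan_text):
    """Return (category, line) pairs for lines that start with a known keyword."""
    tagged = []
    for raw in plan_text.splitlines():
        line = raw.strip()
        for keyword, category in KEYWORDS.items():
            if line.startswith(keyword):
                tagged.append((category, line))
                break
    return tagged


if __name__ == "__main__":
    sample = """Map Operator Tree:
TableScan
alias: testtab_small
Select Operator
Reduce Output Operator
sort order: +
"""
    for category, line in tag_plan_lines(sample):
        print(f"{category:22} {line}")
```

Lines without a known keyword (such as `alias:` or `Reduce Output Operator`) are simply skipped.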
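2. explain select count(distinct mobilename) from testtab_small
A complete execution plan example. The later examples mainly show the differences rather than the whole plan; for instance, the Stage-0 part is removed.
/* Overall steps; read from top to bottom */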
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
Stage: Stage-1
Map Reduce
Map Operator Tree:
/* Table being scanned */
TableScan
alias: testtab_small
/* Table size statistics: 13 rows, 629 bytes? */
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Select Operator
/* The column being retrieved is mobilename */
expressions: mobilename (type: string)
outputColumnNames: mobilename
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Group By Operator
/* Aggregation */
aggregations: count(DISTINCT mobilename)
keys: mobilename (type: string)
mode: hash
outputColumnNames: _col0, _col1
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
/* + means the key needs to be sorted */
sort order: +
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Reduce Operator Tree:
Group By Operator
aggregations: count(DISTINCT KEY._col0:0._col0)
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
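
The STAGE DEPENDENCIES header of the plan above is the part to read first, since it gives the order in which stages run. A minimal Python sketch of extracting it follows. The function name `parse_stage_dependencies` is hypothetical, and the sketch only handles the two line shapes shown in this article ("is a root stage" and "depends on stages:"); other forms that Hive may print, such as conditional stages, are not covered.

```python
import re


def parse_stage_dependencies(plan_text):
    """Map each stage name to the stages it depends on (empty list for root stages)."""
    deps = {}
    for raw in plan_text.splitlines():
        line = raw.strip()
        root = re.fullmatch(r"(Stage-\d+) is a root stage", line)
        if root:
            deps[root.group(1)] = []
            continue
        dep = re.fullmatch(r"(Stage-\d+) depends on stages: (.+)", line)
        if dep:
            deps[dep.group(1)] = [s.strip() for s in dep.group(2).split(",")]
    return deps


if __name__ == "__main__":
    plan = """STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
"""
    print(parse_stage_dependencies(plan))
    # {'Stage-1': [], 'Stage-0': ['Stage-1']}
```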
3. explain select count(mobilename) from testtab_small;
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: testtab_small
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: mobilename (type: string)
outputColumnNames: mobilename
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count(mobilename)
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
sort order:
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: bigint)
Reduce Operator Tree:
Group By Operator
aggregations: count(VALUE._col0)
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
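
Compared with the count(distinct mobilename) plan in section 2, the visible difference here is in the Reduce Output Operator: there the plan lists a key expression with `sort order: +`, while here `sort order:` is empty and the partial count is carried as a value expression. The Python sketch below pulls the `sort order:` values out of a plan so the two cases can be compared side by side. The function name `sort_orders` is an assumption made for this example.

```python
def sort_orders(plan_text):
    """Return the value of every 'sort order:' line; '' means no sort key."""
    prefix = "sort order:"
    orders = []
    for raw in plan_text.splitlines():
        line = raw.strip()
        if line.startswith(prefix):
            orders.append(line[len(prefix):].strip())
    return orders


if __name__ == "__main__":
    distinct_plan = (
        "Reduce Output Operator\n"
        "key expressions: _col0 (type: string)\n"
        "sort order: +\n"
    )
    plain_plan = (
        "Reduce Output Operator\n"
        "sort order:\n"
        "value expressions: _col0 (type: bigint)\n"
    )
    print(sort_orders(distinct_plan))  # ['+']
    print(sort_orders(plain_plan))     # ['']
```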
4. Global sort
explain select * from testtab_small order by mobilename;
- No aggregation operator
- Sorting already appears in the Reduce Output Operator under the Map Operator Tree (sort order: +)
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: testtab_small
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: mobilename (type: string), testrecordid (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string)
Reduce Operator Tree:
Select Operator
expressions: KEY.reducesinkkey0 (type: string), VALUE._col0 (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
5. Joining a small table: map-side join
hive.auto.convert.join=true /* default value */
hive> explain select a.testrecordid,a.mobilename,b.mobilename from testtab_small a left join testtab_small2 b on a.testrecordid=b.testrecordid;
- The total number of stages increases to 3
- A local task, Map Reduce Local Work, is added; this is Hadoop's distributed cache technique at work
- A Map Join Operator appears
STAGE DEPENDENCIES:
Stage-4 is a root stage
Stage-3 depends on stages: Stage-4
Stage-0 depends on stages: Stage-3
STAGE PLANS:
Stage: Stage-4
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_1:b
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_1:b
TableScan
alias: b
Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: mobilename (type: string), testrecordid (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
HashTable Sink Operator
keys:
0 _col1 (type: string)
1 _col1 (type: string)
Stage: Stage-3
Map Reduce
Map Operator Tree:
TableScan
alias: a
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: mobilename (type: string), testrecordid (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Map Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col1 (type: string)
1 _col1 (type: string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: _col1 (type: string), _col0 (type: string), _col2 (type: string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Local Work:
Map Reduce Local Work
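6. Reduce-side join
hive.auto.convert.join=false
Execution plan for the same statement as in 5, this time with a reduce-side join:
- The Map Operator Tree contains two sibling TableScan operators at the same level, corresponding to the mapper's multiple data sources
- Join Operator under the Reduce Operator Tree: the join is performed on the reduce side.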
STAGE PLANS:
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: a
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: mobilename (type: string), testrecordid (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col1 (type: string)
sort order: +
Map-reduce partition columns: _col1 (type: string)
Statistics: Num rows: 13 Data size: 629 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: string)
TableScan
alias: b
Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: mobilename (type: string), testrecordid (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col1 (type: string)
sort order: +
Map-reduce partition columns: _col1 (type: string)
Statistics: Num rows: 13 Data size: 655 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: string)
Reduce Operator Tree:
Join Operator
condition map:
Left Outer Join0 to 1
keys:
0 _col1 (type: string)
1 _col1 (type: string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: _col1 (type: string), _col0 (type: string), _col2 (type: string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 14 Data size: 691 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
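
The plans in sections 5 and 6 differ mainly in which join operator appears: Map Join Operator (with Map Reduce Local Work) for the map-side join, and a plain Join Operator under the Reduce Operator Tree for the reduce-side join. A rough Python sketch of telling the two apart from the plan text follows. The function name `join_strategy` is hypothetical, and the check relies only on the operator names seen in this article, so it is a heuristic and not an official classification.

```python
def join_strategy(plan_text):
    """Guess the join strategy from operator names in a Hive EXPLAIN plan."""
    lines = [raw.strip() for raw in plan_text.splitlines()]
    # A Map Join Operator means the join runs in the mappers.
    if any(line.startswith("Map Join Operator") for line in lines):
        return "map-side join"
    # A plain Join Operator (seen under Reduce Operator Tree) means a shuffle join.
    if any(line.startswith("Join Operator") for line in lines):
        return "reduce-side join"
    return "no join"


if __name__ == "__main__":
    map_side = "Map Operator Tree:\nTableScan\nMap Join Operator\ncondition map:\n"
    reduce_side = "Reduce Operator Tree:\nJoin Operator\ncondition map:\n"
    print(join_strategy(map_side))
    print(join_strategy(reduce_side))
```

Running the same statement with hive.auto.convert.join set to true and then to false, and feeding both plans to such a check, is a quick way to confirm which path was actually chosen.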