Viewing the Execution Plan in Hive

鸭梨山大哎

Article link: https://blog.csdn.net/u010711495/article/details/114125779

Use EXPLAIN to view the execution plan of a query.
For example:

explain select deptno `dept`,
       year(hiredate) `year`,
       sum(sal)
from tb_emp
group by deptno, year(hiredate);

1. First, check how many stages there are

This example has two:

+------------------------------------+
|Explain                             |
+------------------------------------+
|STAGE DEPENDENCIES:                 |
|  Stage-1 is a root stage           |
|  Stage-0 depends on stages: Stage-1|
+------------------------------------+

Stage-0 depends on Stage-1, so Stage-1 runs first and Stage-0 runs after it.
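
The same reasoning can be automated when a plan has many stages. The following is a minimal sketch, not part of Hive: the helper name stage_order is hypothetical, and it only assumes the two line forms shown above ("is a root stage" and "depends on stages:"). It uses the standard-library graphlib module (Python 3.9+) to put the stages in an order that respects their dependencies.

```python
import re
from graphlib import TopologicalSorter


def stage_order(plan_text):
    """Return stage names ordered so that every stage follows its dependencies.

    Hypothetical helper: it only understands the two line forms that
    appear in the STAGE DEPENDENCIES example above.
    """
    graph = {}
    for raw in plan_text.splitlines():
        # Drop the table borders that some clients draw around the output.
        line = raw.strip().strip("|").strip()
        root = re.fullmatch(r"(Stage-\d+) is a root stage", line)
        dep = re.match(r"(Stage-\d+) depends on stages: (.+)$", line)
        if root:
            graph.setdefault(root.group(1), set())
        elif dep:
            parents = re.findall(r"Stage-\d+", dep.group(2))
            graph.setdefault(dep.group(1), set()).update(parents)
    # TopologicalSorter expects {node: predecessors} and yields predecessors first.
    return list(TopologicalSorter(graph).static_order())


plan = """
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1
"""
print(stage_order(plan))  # ['Stage-1', 'Stage-0']
```

For the two-stage plan in this article the result simply restates that Stage-1 comes before Stage-0; the sketch becomes more useful when a plan lists many stages with several dependencies.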

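2. Look at the map phase of Stage-1

The map phase mainly does the following:

Scans the table (TableScan)
Reports statistics on the table's data volume (Statistics)
Selects the fields to retrieve, listed under expressions
Computes partial aggregates (aggregations, with mode: hash) and passes them to the reducers through the Reduce Output Operator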

+-------------------------------------------------------------------------------------------------+
|Explain                                                                                          |
+-------------------------------------------------------------------------------------------------+
|    Map Reduce                                                                                   |
|      Map Operator Tree:                                                                         |
|          TableScan                                                                              |
|            alias: tb_emp                                                                        |
|            Statistics: Num rows: 6 Data size: 718 Basic stats: COMPLETE Column stats: NONE      |
|            Select Operator                                                                      |
|              expressions: deptno (type: int), year(hiredate) (type: int), sal (type: float)     |
|              outputColumnNames: _col0, _col1, _col2                                             |
|              Statistics: Num rows: 6 Data size: 718 Basic stats: COMPLETE Column stats: NONE    |
|              Group By Operator                                                                  |
|                aggregations: sum(_col2)                                                         |
|                keys: _col0 (type: int), _col1 (type: int)                                       |
|                mode: hash                                                                       |
|                outputColumnNames: _col0, _col1, _col2                                           |
|                Statistics: Num rows: 6 Data size: 718 Basic stats: COMPLETE Column stats: NONE  |
|                Reduce Output Operator                                                           |
|                  key expressions: _col0 (type: int), _col1 (type: int)                          |
|                  sort order: ++                                                                 |
|                  Map-reduce partition columns: _col0 (type: int), _col1 (type: int)             |
|                  Statistics: Num rows: 6 Data size: 718 Basic stats: COMPLETE Column stats: NONE|
|                  value expressions: _col2 (type: double)                                        |
+-------------------------------------------------------------------------------------------------+
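
To make the operator tree above more concrete, here is a rough Python sketch of what one map task conceptually does for this query. It is an illustration only, not how Hive is implemented: the function name map_side and the sample rows are invented, and the sort is shown inline even though in practice it happens during the shuffle.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows of tb_emp as (deptno, hiredate, sal); invented for illustration.
rows = [
    (10, date(1981, 6, 9), 2450.0),
    (10, date(1981, 11, 17), 5000.0),
    (20, date(1980, 12, 17), 800.0),
    (20, date(1981, 4, 2), 2975.0),
]


def map_side(rows):
    """Mimic Select Operator -> Group By Operator (mode: hash) -> Reduce Output Operator."""
    partial = defaultdict(float)
    for deptno, hiredate, sal in rows:       # TableScan: read every row
        key = (deptno, hiredate.year)        # Select Operator: _col0, _col1 (sal is _col2)
        partial[key] += sal                  # Group By Operator, mode: hash: partial sum(_col2)
    # Reduce Output Operator: emit key/value pairs; "sort order: ++" means
    # ascending on both key columns.
    return sorted(partial.items())


print(map_side(rows))
# [((10, 1981), 7450.0), ((20, 1980), 800.0), ((20, 1981), 2975.0)]
```

The point of mode: hash is that each map task already collapses rows sharing the same (deptno, year) key, so less data has to be sent to the reducers.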

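3. Look at the reduce phase

The reduce phase merges the partial aggregates (mode: mergepartial) and writes the result. The File Output Operator shows the input format, output format, and serde in use.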

+-------------------------------------------------------------------------------------------+
|Explain                                                                                    |
+-------------------------------------------------------------------------------------------+
|      Reduce Operator Tree:                                                                |
|        Group By Operator                                                                  |
|          aggregations: sum(VALUE._col0)                                                   |
|          keys: KEY._col0 (type: int), KEY._col1 (type: int)                               |
|          mode: mergepartial                                                               |
|          outputColumnNames: _col0, _col1, _col2                                           |
|          Statistics: Num rows: 3 Data size: 359 Basic stats: COMPLETE Column stats: NONE  |
|          File Output Operator                                                             |
|            compressed: false                                                              |
|            Statistics: Num rows: 3 Data size: 359 Basic stats: COMPLETE Column stats: NONE|
|            table:                                                                         |
|                input format: org.apache.hadoop.mapred.SequenceFileInputFormat             |
|                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat   |
|                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe                  |
+-------------------------------------------------------------------------------------------+
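
The mergepartial step can be pictured with a small Python sketch as well. This is a simplified model under stated assumptions, not Hive's actual code: the function name reduce_side and the two lists of partial sums are hypothetical, standing in for the output of two map tasks.

```python
from collections import defaultdict


def reduce_side(map_outputs):
    """Mimic Group By Operator (mode: mergepartial): add up partial sums per key."""
    merged = defaultdict(float)
    for output in map_outputs:
        for key, partial_sum in output:
            # sum(VALUE._col0) grouped by KEY._col0, KEY._col1
            merged[key] += partial_sum
    return dict(sorted(merged.items()))


# Hypothetical partial results from two map tasks, keyed by (deptno, year).
mapper_a = [((10, 1981), 7450.0), ((20, 1981), 2975.0)]
mapper_b = [((10, 1981), 1300.0), ((20, 1980), 800.0)]

print(reduce_side([mapper_a, mapper_b]))
# {(10, 1981): 8750.0, (20, 1980): 800.0, (20, 1981): 2975.0}
```

Each key ends up with a single final sum, which corresponds to the rows that the File Output Operator writes out.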
