java.lang.ArrayIndexOutOfBoundsException during Hive query execution [unresolved]

 


Table tmp_1

+-----------+-----------+-------------+
| tmp_1.id  | tmp_1.cl  | tmp_1.plat  |
+-----------+-----------+-------------+
| 293       | IOS_i01   | IOS         |
| 553       | IOS_i01   | IOS         |
| 559       | AND_a01   | AND         |
| 711       | AND_a01   | AND         |
+-----------+-----------+-------------+

The queries below simply join aggregated subqueries of this table on plat.

=============
Example 1
=============
select
    a.plat, a.cnt, b.cnt
from
    (select plat, count(*) cnt
     from tmp_1
     group by plat) a
left join
    (select split(cl,'_')[0] as plat, count(*) cnt
     from tmp_1
     group by split(cl,'_')[0]) b
on a.plat = b.plat;

Query result:
+-------------+---------+---------+--+
| a.plat      | a.cnt   | b.cnt   |
+-------------+---------+---------+--+
| ALIPAY      | 371     | 371     |
| AND         | 783199  | 783199  |
| IOS         | 659319  | 659319  |
| WECHAT      | 2054    | 2054    |
+-------------+---------+---------+--+
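
The two count columns agree in every row, which is what one would expect if the part of cl before the first underscore always equals plat. The following is a minimal Python check of that assumption on the four sample rows of tmp_1 listed earlier. It is only illustrative: the full table is not shown, and the names ROWS and cl_prefix are made up for this sketch.

```python
# The four sample rows of tmp_1 shown earlier: (id, cl, plat).
ROWS = [
    (293, "IOS_i01", "IOS"),
    (553, "IOS_i01", "IOS"),
    (559, "AND_a01", "AND"),
    (711, "AND_a01", "AND"),
]


def cl_prefix(cl):
    """Python equivalent of the HiveQL expression split(cl, '_')[0]."""
    return cl.split("_")[0]


# Rows where the prefix of cl differs from plat; such rows would make
# a.cnt and b.cnt diverge in Example 1.
mismatches = [row for row in ROWS if cl_prefix(row[1]) != row[2]]

print(mismatches)  # [] for the sample rows
```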

The query runs normally. Its execution plan is shown below.

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|                                                                                                           Explain                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| Plan not optimized by CBO.                                                                                                                                                                                                  |                                                                                                                                                                                                                            |
| Vertex dependency in root stage                                                                                                                                                                                             |
| Reducer 2 <- Map 1 (SIMPLE_EDGE), Reducer 4 (BROADCAST_EDGE)                                                                                                                                                                |
| Reducer 4 <- Map 3 (SIMPLE_EDGE)                                                                                                                                                                                            |                                                                                                                                                                                                                            |
| Stage-0                                                                                                                                                                                                                     |
|    Fetch Operator                                                                                                                                                                                                           |
|       limit:-1                                                                                                                                                                                                              |
|       Stage-1                                                                                                                                                                                                               |
|          Reducer 2                                                                                                                                                                                                          |
|          File Output Operator [FS_6313316]                                                                                                                                                                                  |
|             compressed:false                                                                                                                                                                                                |
|             Statistics:Num rows: 794718 Data size: 27599734 Basic stats: COMPLETE Column stats: NONE                                                                                                                        |
|             table:{"input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}  |
|             Select Operator [SEL_6313315]                                                                                                                                                                                   |
|                outputColumnNames:["_col0","_col1","_col2"]                                                                                                                                                                  |
|                Statistics:Num rows: 794718 Data size: 27599734 Basic stats: COMPLETE Column stats: NONE                                                                                                                     |
|                Map Join Operator [MAPJOIN_6313319]                                                                                                                                                                          |
|                |  condition map:[{"":"Left Outer Join0 to 1"}]                                                                                                                                                              |
|                |  HybridGraceHashJoin:true                                                                                                                                                                                  |
|                |  keys:{"Reducer 2":"_col0 (type: string)","Reducer 4":"_col0 (type: string)"}                                                                                                                              |
|                |  outputColumnNames:["_col0","_col1","_col3"]                                                                                                                                                               |
|                |  Statistics:Num rows: 794718 Data size: 27599734 Basic stats: COMPLETE Column stats: NONE                                                                                                                  |
|                |<-Reducer 4 [BROADCAST_EDGE]                                                                                                                                                                                |
|                |  Reduce Output Operator [RS_6313313]                                                                                                                                                                       |
|                |     key expressions:_col0 (type: string)                                                                                                                                                                   |
|                |     Map-reduce partition columns:_col0 (type: string)                                                                                                                                                      |
|                |     sort order:+                                                                                                                                                                                           |
|                |     Statistics:Num rows: 722471 Data size: 25090667 Basic stats: COMPLETE Column stats: NONE                                                                                                               |
|                |     value expressions:_col1 (type: bigint)                                                                                                                                                                 |
|                |     Group By Operator [GBY_6313310]                                                                                                                                                                        |
|                |     |  aggregations:["count(VALUE._col0)"]                                                                                                                                                                 |
|                |     |  keys:KEY._col0 (type: string)                                                                                                                                                                       |
|                |     |  outputColumnNames:["_col0","_col1"]                                                                                                                                                                 |
|                |     |  Statistics:Num rows: 722471 Data size: 25090667 Basic stats: COMPLETE Column stats: NONE                                                                                                            |
|                |     |<-Map 3 [SIMPLE_EDGE]                                                                                                                                                                                 |
|                |        Reduce Output Operator [RS_6313309]                                                                                                                                                                 |
|                |           key expressions:_col0 (type: string)                                                                                                                                                             |
|                |           Map-reduce partition columns:_col0 (type: string)                                                                                                                                                |
|                |           sort order:+                                                                                                                                                                                     |
|                |           Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                        |
|                |           value expressions:_col1 (type: bigint)                                                                                                                                                           |
|                |           Group By Operator [GBY_6313308]                                                                                                                                                                  |
|                |              aggregations:["count()"]                                                                                                                                                                      |
|                |              keys:split(cl, '_')[0] (type: string)                                                                                                                                              |
|                |              outputColumnNames:["_col0","_col1"]                                                                                                                                                           |
|                |              Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                     |
|                |              Select Operator [SEL_6313307]                                                                                                                                                                 |
|                |                 outputColumnNames:["cl"]                                                                                                                                                        |
|                |                 Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                  |
|                |                 TableScan [TS_6313306]                                                                                                                                                                     |
|                |                    alias:tmp_1                                                                                                                                                                             |
|                |                    Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                               |
|                |<-Group By Operator [GBY_6313304]                                                                                                                                                                           |
|                   |  aggregations:["count(VALUE._col0)"]                                                                                                                                                                    |
|                   |  keys:KEY._col0 (type: string)                                                                                                                                                                          |
|                   |  outputColumnNames:["_col0","_col1"]                                                                                                                                                                    |
|                   |  Statistics:Num rows: 722471 Data size: 25090667 Basic stats: COMPLETE Column stats: NONE                                                                                                               |
|                   |<-Map 1 [SIMPLE_EDGE]                                                                                                                                                                                    |
|                      Reduce Output Operator [RS_6313303]                                                                                                                                                                    |
|                         key expressions:_col0 (type: string)                                                                                                                                                                |
|                         Map-reduce partition columns:_col0 (type: string)                                                                                                                                                   |
|                         sort order:+                                                                                                                                                                                        |
|                         Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                           |
|                         value expressions:_col1 (type: bigint)                                                                                                                                                              |
|                         Group By Operator [GBY_6313302]                                                                                                                                                                     |
|                            aggregations:["count()"]                                                                                                                                                                         |
|                            keys:plat (type: string)                                                                                                                                                                         |
|                            outputColumnNames:["_col0","_col1"]                                                                                                                                                              |
|                            Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                        |
|                            Select Operator [SEL_6313301]                                                                                                                                                                    |
|                               outputColumnNames:["plat"]                                                                                                                                                                |
|                               Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                     |
|                               TableScan [TS_6313300]                                                                                                                                                                        |
|                                  alias:tmp_1                                                                                                                                                                                |
|                                  Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                  |
|                                                                                                                                                                                                                             |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+

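Example 2 still joins on plat, but adds one more left-joined subquery. This version fails with an array index out-of-bounds error, which is strange.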

=============
Example 2
=============
select
    a.plat, a.cnt, b.cnt, c.cnt
from
    (select plat, count(*) cnt
     from tmp_db.tmp_1
     group by plat) a
left join
    (select split(cl,'_')[0] as plat, count(*) cnt
     from tmp_db.tmp_1
     group by split(cl,'_')[0]) b
on a.plat = b.plat
left join
    (select split(cl,'_')[0] as plat, count(*) cnt
     from tmp_db.tmp_1
     group by split(cl,'_')[0]) c
on a.plat = c.plat
;
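
For reference, the result this query is meant to produce can be sketched in plain Python. The sketch models only the SQL semantics (three grouped counts, left-joined on plat) on the four sample rows of tmp_1. It does not reproduce or explain the Hive error, and the names ROWS and example2 are made up for illustration.

```python
from collections import Counter

# The four sample rows of tmp_1 shown earlier: (id, cl, plat).
ROWS = [
    (293, "IOS_i01", "IOS"),
    (553, "IOS_i01", "IOS"),
    (559, "AND_a01", "AND"),
    (711, "AND_a01", "AND"),
]


def example2(rows):
    # Subquery a: count(*) grouped by plat.
    a = Counter(plat for _id, _cl, plat in rows)
    # Subqueries b and c: count(*) grouped by split(cl, '_')[0].
    b = Counter(cl.split("_")[0] for _id, cl, _plat in rows)
    c = Counter(cl.split("_")[0] for _id, cl, _plat in rows)
    # a LEFT JOIN b LEFT JOIN c on plat. Each grouped side has at most one
    # row per key, so every key of a yields exactly one output row.
    # None stands in for SQL NULL when a key is missing on the right side.
    return sorted((plat, cnt, b.get(plat), c.get(plat)) for plat, cnt in a.items())


print(example2(ROWS))  # [('AND', 2, 2, 2), ('IOS', 2, 2, 2)]
```

On the real table one would therefore expect the four rows of Example 1 with one extra count column, rather than a runtime failure.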

Error:
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":"ALIPAY"},"value":{"_col0":27}}
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:408)
        ... 26 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":"ALIPAY"},"value":{"_col0":27}}
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
        ... 27 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:714)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
        ... 28 more

Execution plan for Example 2:

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|                                                                                                           Explain                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| Plan not optimized by CBO due to missing statistics. Please check log for more details.                                                                                                                                     |
|                                                                                                                                                                                                                             |
| Vertex dependency in root stage                                                                                                                                                                                             |
| Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (SIMPLE_EDGE), Map 5 (SIMPLE_EDGE)                                                                                                                                                  |
|                                                                                                                                                                                                                             |
| Stage-0                                                                                                                                                                                                                     |
|    Fetch Operator                                                                                                                                                                                                           |
|       limit:-1                                                                                                                                                                                                              |
|       Stage-1                                                                                                                                                                                                               |
|          Reducer 2                                                                                                                                                                                                          |
|          File Output Operator [FS_6313426]                                                                                                                                                                                  |
|             compressed:false                                                                                                                                                                                                |
|             Statistics:Num rows: 1589436 Data size: 55199468 Basic stats: COMPLETE Column stats: NONE                                                                                                                       |
|             table:{"input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}  |
|             Select Operator [SEL_6313425]                                                                                                                                                                                   |
|                outputColumnNames:["_col0","_col1","_col2","_col3"]                                                                                                                                                          |
|                Statistics:Num rows: 1589436 Data size: 55199468 Basic stats: COMPLETE Column stats: NONE                                                                                                                    |
|                Merge Join Operator [MERGEJOIN_6313431]                                                                                                                                                                      |
|                |  condition map:[{"":"Left Outer Join0 to 1"},{"":"Left Outer Join0 to 2"}]                                                                                                                                 |
|                |  keys:{"0":"_col0 (type: string)","1":"_col0 (type: string)","2":"_col0 (type: string)"}                                                                                                                   |
|                |  outputColumnNames:["_col0","_col1","_col3","_col5"]                                                                                                                                                       |
|                |  Statistics:Num rows: 1589436 Data size: 55199468 Basic stats: COMPLETE Column stats: NONE                                                                                                                 |
|                |                                                                                                                                                                                                            |
|                |<-Group By Operator [GBY_6313413]                                                                                                                                                                           |
|                |     aggregations:["count(VALUE._col0)"]                                                                                                                                                                    |
|                |     keys:KEY._col0 (type: string)                                                                                                                                                                          |
|                |     outputColumnNames:["_col0","_col1"]                                                                                                                                                                    |
|                |     Statistics:Num rows: 722471 Data size: 25090667 Basic stats: COMPLETE Column stats: NONE                                                                                                               |
|                |                                                                                                                                                                                                            |
|                |<-Group By Operator [GBY_6313419]                                                                                                                                                                           |
|                |     aggregations:["count(VALUE._col0)"]                                                                                                                                                                    |
|                |     keys:KEY._col0 (type: string)                                                                                                                                                                          |
|                |     outputColumnNames:["_col0","_col1"]                                                                                                                                                                    |
|                |     Statistics:Num rows: 722471 Data size: 25090667 Basic stats: COMPLETE Column stats: NONE                                                                                                               |
|                |<-Group By Operator [GBY_6313407]                                                                                                                                                                           |
|                   |  aggregations:["count(VALUE._col0)"]                                                                                                                                                                    |
|                   |  keys:KEY._col0 (type: string)                                                                                                                                                                          |
|                   |  outputColumnNames:["_col0","_col1"]                                                                                                                                                                    |
|                   |  Statistics:Num rows: 722471 Data size: 25090667 Basic stats: COMPLETE Column stats: NONE                                                                                                               |
|                   |<-Map 1 [SIMPLE_EDGE]                                                                                                                                                                                    |
|                   |  Reduce Output Operator [RS_6313406]                                                                                                                                                                    |
|                   |     key expressions:_col0 (type: string)                                                                                                                                                                |
|                   |     Map-reduce partition columns:_col0 (type: string)                                                                                                                                                   |
|                   |     sort order:+                                                                                                                                                                                        |
|                   |     Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                           |
|                   |     value expressions:_col1 (type: bigint)                                                                                                                                                              |
|                   |     Group By Operator [GBY_6313405]                                                                                                                                                                     |
|                   |        aggregations:["count()"]                                                                                                                                                                         |
|                   |        keys:plat (type: string)                                                                                                                                                                     |
|                   |        outputColumnNames:["_col0","_col1"]                                                                                                                                                              |
|                   |        Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                        |
|                   |        Select Operator [SEL_6313404]                                                                                                                                                                    |
|                   |           outputColumnNames:["plat"]                                                                                                                                                                |
|                   |           Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                     |
|                   |           TableScan [TS_6313403]                                                                                                                                                                        |
|                   |              alias:tmp_1                                                                                                                                                                                |
|                   |              Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                  |
|                   |<-Map 3 [SIMPLE_EDGE]                                                                                                                                                                                    |
|                   |  Reduce Output Operator [RS_6313412]                                                                                                                                                                    |
|                   |     key expressions:_col0 (type: string)                                                                                                                                                                |
|                   |     Map-reduce partition columns:_col0 (type: string)                                                                                                                                                   |
|                   |     sort order:+                                                                                                                                                                                        |
|                   |     Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                           |
|                   |     value expressions:_col1 (type: bigint)                                                                                                                                                              |
|                   |     Group By Operator [GBY_6313411]                                                                                                                                                                     |
|                   |        aggregations:["count()"]                                                                                                                                                                         |
|                   |        keys:split(cl, '_')[0] (type: string)                                                                                                                                                 |
|                   |        outputColumnNames:["_col0","_col1"]                                                                                                                                                              |
|                   |        Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                        |
|                   |        Select Operator [SEL_6313410]                                                                                                                                                                    |
|                   |           outputColumnNames:["cl"]                                                                                                                                                           |
|                   |           Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                     |
|                   |           TableScan [TS_6313409]                                                                                                                                                                        |
|                   |              alias:tmp_1                                                                                                                                                                                |
|                   |              Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                  |
|                   |<-Map 5 [SIMPLE_EDGE]                                                                                                                                                                                    |
|                      Reduce Output Operator [RS_6313418]                                                                                                                                                                    |
|                         key expressions:_col0 (type: string)                                                                                                                                                                |
|                         Map-reduce partition columns:_col0 (type: string)                                                                                                                                                   |
|                         sort order:+                                                                                                                                                                                        |
|                         Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                           |
|                         value expressions:_col1 (type: bigint)                                                                                                                                                              |
|                         Group By Operator [GBY_6313417]                                                                                                                                                                     |
|                            aggregations:["count()"]                                                                                                                                                                         |
|                            keys:split(cl, '_')[0] (type: string)                                                                                                                                                 |
|                            outputColumnNames:["_col0","_col1"]                                                                                                                                                              |
|                            Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                        |
|                            Select Operator [SEL_6313416]                                                                                                                                                                    |
|                               outputColumnNames:["cl"]                                                                                                                                                           |
|                               Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                     |
|                               TableScan [TS_6313415]                                                                                                                                                                        |
|                                  alias:tmp_1                                                                                                                                                                                |
|                                  Statistics:Num rows: 1444943 Data size: 50181369 Basic stats: COMPLETE Column stats: NONE                                                                                                  |
|                                                                                                                                                                                                                             |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+