Problems encountered while using Hive, part 2

I ran into a strange problem with the following query:

select m9.serial_id, m9.max_trade_time
  from (select m0.serial_id,
               m0.round_id,
               m0.max_trade_time,
               m0.bet_change_times
          from base_game_analyze_lhdb m0
         where m0.dt = '20151001') m9,
       (select c.serial_id,
               c.round_id,
               c.bet_change_times,
               max(c.max_trade_time) as max_time
          from base_game_analyze_lhdb c
         where c.dt = '20151001'
           and c.serial_id = '5009664253594'
           and c.round_id = '4'
         group by c.serial_id, c.round_id, c.bet_change_times) n5
 where m9.serial_id = n5.serial_id
   and m9.round_id = n5.round_id
   and m9.max_trade_time = n5.max_time;

The query fails with this error:

Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":"5009664253594","_col1":"4","_col3":""}
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":"5009664253594","_col1":"4","_col3":""}
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
	... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException
	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:403)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
	... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
	at java.lang.System.arraycopy(Native Method)
	at org.apache.hadoop.io.Text.set(Text.java:225)
	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:267)
	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:204)
	at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.populateCachedDistributionKeys(ReduceSinkOperator.java:433)
	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:342)
	... 13 more

However, if the HQL is changed slightly, as follows:

select m9.serial_id, m9.max_trade_time
  from (select m0.serial_id,
               m0.round_id,
               m0.max_trade_time,
               m0.bet_change_times
          from base_game_analyze_lhdb m0
         where m0.dt = '20151001') m9,
       (select c.serial_id,
               c.round_id,
               c.bet_change_times,
               max(c.max_trade_time) as max_time
          from base_game_analyze_lhdb c
         where c.dt = '20151001'
           and c.serial_id = '5009664253594'
           and c.round_id = '4'
         group by c.serial_id, c.round_id, c.bet_change_times) n5
 where m9.serial_id = n5.serial_id
   and m9.round_id = n5.round_id
   and m9.max_trade_time = n5.max_time
   and n5.bet_change_times = m9.bet_change_times;

With just this one extra condition (the last line, joining on bet_change_times), the query runs successfully.

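After some investigation, I found that for this kind of SQL, every group by column of n5 must be joined to the corresponding column of m9 for the query to return data; otherwise the result is empty. When n5 contains a lot of data, the query fails with the error above (ArrayIndexOutOfBoundsException).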
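
If bet_change_times really is not needed, one way to satisfy the rule above without the extra join condition is to drop that column from n5, so that every remaining group by column is already joined. The following is only a sketch under that assumption and was not run against the original cluster. Note that it changes the meaning slightly: max_time becomes the maximum over the whole (serial_id, round_id) pair rather than per bet_change_times value.

```sql
-- Sketch: n5 groups only by the columns that are actually joined to m9.
-- Assumption: bet_change_times is not needed in the result.
select m9.serial_id, m9.max_trade_time
  from (select m0.serial_id,
               m0.round_id,
               m0.max_trade_time
          from base_game_analyze_lhdb m0
         where m0.dt = '20151001') m9,
       (select c.serial_id,
               c.round_id,
               max(c.max_trade_time) as max_time
          from base_game_analyze_lhdb c
         where c.dt = '20151001'
           and c.serial_id = '5009664253594'
           and c.round_id = '4'
         group by c.serial_id, c.round_id) n5
 where m9.serial_id = n5.serial_id
   and m9.round_id = n5.round_id
   and m9.max_trade_time = n5.max_time;
```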

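I still do not understand the cause of the error, though. Judging from the data in the table and from the business logic, the added condition is optional. And even if it were required, leaving it out should simply produce a Cartesian product across the unjoined column, yet the query returns nothing.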
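
To narrow this down, it may help to check the aggregated side on its own and then compare the execution plans of the two variants. The statements below are a diagnostic sketch, not a fix: they only use the table and columns from the queries above, and explain prints the plan without running the job.

```sql
-- Step 1: run n5 alone to see how many groups it produces and whether
-- bet_change_times contains empty values for this serial_id / round_id.
select c.serial_id,
       c.round_id,
       c.bet_change_times,
       max(c.max_trade_time) as max_time
  from base_game_analyze_lhdb c
 where c.dt = '20151001'
   and c.serial_id = '5009664253594'
   and c.round_id = '4'
 group by c.serial_id, c.round_id, c.bet_change_times;

-- Step 2: print the plan of the failing query (no data is read).
explain
select m9.serial_id, m9.max_trade_time
  from (select m0.serial_id,
               m0.round_id,
               m0.max_trade_time,
               m0.bet_change_times
          from base_game_analyze_lhdb m0
         where m0.dt = '20151001') m9,
       (select c.serial_id,
               c.round_id,
               c.bet_change_times,
               max(c.max_trade_time) as max_time
          from base_game_analyze_lhdb c
         where c.dt = '20151001'
           and c.serial_id = '5009664253594'
           and c.round_id = '4'
         group by c.serial_id, c.round_id, c.bet_change_times) n5
 where m9.serial_id = n5.serial_id
   and m9.round_id = n5.round_id
   and m9.max_trade_time = n5.max_time;
```

Running explain on the working variant as well and comparing the two outputs shows whether the stages and the key columns of the Reduce Output Operator differ between them, which is where the stack trace above points.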


Noting it here for future reference!
