Hive error: Execution failed with exit status: 3

While running a join query on Hive, it failed at runtime with the following error:

2016-07-06 05:35:32     Processing rows:        1400000 Hashtable size: 1399999 Memory usage:   203699304       percentage:     0.396
2016-07-06 05:35:32     Processing rows:        1500000 Hashtable size: 1499999 Memory usage:   216443704       percentage:     0.42
2016-07-06 05:35:32     Processing rows:        1600000 Hashtable size: 1599999 Memory usage:   243416472       percentage:     0.473
2016-07-06 05:35:32     Processing rows:        1700000 Hashtable size: 1699999 Memory usage:   256160872       percentage:     0.498
2016-07-06 05:35:32     Processing rows:        1800000 Hashtable size: 1799999 Memory usage:   268905272       percentage:     0.522
2016-07-06 05:35:33     Processing rows:        1900000 Hashtable size: 1899999 Memory usage:   281649664       percentage:     0.547
2016-07-06 05:35:33     Processing rows:        2000000 Hashtable size: 1999999 Memory usage:   291845192       percentage:     0.567
Execution failed with exit status: 3
Obtaining error information

Cause: by default, Hive automatically treats the left table as the small table and loads it into memory. The /*+ MAPJOIN(tb2) */ hint was added here to force tb2 to be treated as the small table and held in memory, but it does not appear to work in this case.
Solution: set hive.auto.convert.join=false;
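
A minimal sketch of the fix in a Hive session is shown below; tb2 is the small table from the original hint, while tb1 and the join columns are placeholder names. The MAPJOIN hint is dropped, since the point is to let the join run as a plain shuffle join:

    -- Disable automatic mapjoin conversion so Hive does not try to build
    -- the hash table inside the memory-limited local task.
    set hive.auto.convert.join=false;

    -- Placeholder query: tb1 stands in for the large table.
    SELECT tb1.id, tb2.name
    FROM tb1
    JOIN tb2
      ON tb1.id = tb2.id;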

The official FAQ explains it as follows:

Execution failed with exit status: 3
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask

Hive converted a join into a locally running, faster 'mapjoin', but ran out of memory while doing so. Two bugs are responsible for this.

Bug 1)

Hive's metric for converting joins miscalculates the required amount of memory. This is especially true for compressed and ORC files, because Hive uses the on-disk file size as its metric, while compressed tables need considerably more memory for their uncompressed in-memory representation.

You can either decrease 'hive.smalltable.filesize' to tune the metric, or increase 'hive.mapred.local.mem' to allow the allocation of more memory for map tasks.
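
For illustration, both knobs can be set per session; the values below are placeholders, not recommendations (hive.smalltable.filesize is a byte threshold, and hive.mapred.local.mem is assumed here to be in MB, since it feeds the local task's heap size):

    -- Lower the threshold below which a table counts as 'small' enough
    -- for a mapjoin (placeholder: ~1 MB).
    set hive.smalltable.filesize=1000000;

    -- Or give the local task more memory (placeholder value; assumed MB).
    set hive.mapred.local.mem=2048;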

The latter option may lead to bug number two if you happen to have an affected Hadoop version.

Bug 2)

Hive/Hadoop ignores 'hive.mapred.local.mem'! (More precisely: a bug in Hadoop 2.2 where hadoop-env.cmd sets the -Xmx parameter multiple times, effectively overriding the user-set hive.mapred.local.mem value. See: https://issues.apache.org/jira/browse/HADOOP-10245)

There are 3 workarounds for this bug:

  • 1) Assign more memory to the local Hadoop JVM client (note: this is not mapred.map.memory), because the map-join child JVM inherits the parent JVM's settings:
    • In the Cloudera Manager home page, click on the "hive" service,
    • then on the Hive service page click on "Configuration",
    • Gateway base group --(expand)--> Resource Management -> Client Java Heap Size in Bytes -> 1 GB
  • 2) Reduce "hive.smalltable.filesize" to ~1 MB or below (the exact value depends on your cluster's settings for the local JVM).
  • 3) Turn off "hive.auto.convert.join" to prevent Hive from converting joins into mapjoins.

Workarounds 2) and 3) can be set in Big-Bench/engines/hive/conf/hiveSettings.sql.
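
As a sketch, the corresponding lines in that file could look like this (same placeholder values as above):

    -- Workaround 2): shrink the small-table threshold to ~1 MB
    set hive.smalltable.filesize=1000000;

    -- Workaround 3): disable automatic mapjoin conversion entirely
    set hive.auto.convert.join=false;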

