Common Hive Job Errors and How to Fix Them

豆沙糕

Article link: https://blog.csdn.net/qq_38646027/article/details/87880321

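Hive jobs sometimes fail partway through and are killed. This post collects the errors I run into most often at work, together with the fixes. These are quick notes taken as the problems came up, so there are no screenshots of the original failures, but the key error messages are included.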

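All error messages below were taken from the job logs on the YARN monitoring page (http://<hostname>:8088/cluster). The command line and other run logs often show only "return code 1" or "return code 2", which says nothing about the actual cause.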

1. [Fatal Error] total number of created files now is 100385, which exceeds 100000. Killing the job.

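Hive limits the total number of files a job may create; the default is 100,000. Hitting the limit usually means the job writes to too many partitions.
Fix: set the parameter below and append distribute by rand(). This forces a reduce stage and spreads rows evenly across the reducers. The number of reducers is roughly (total input size / 5 GB), and each reducer writes at most one file per partition it receives rows for, so the file count is bounded by reducers × partitions instead of mappers × partitions.
set hive.exec.reducers.bytes.per.reducer=5120000000; -- about 5 GB per reducer
insert overwrite table test2 partition(aaa)
select * from test1
distribute by rand(); -- append distribute by at the very end
If that still fails, remove the parameter and the distribute by clause and raise the limit itself: set hive.exec.max.created.files=200000; (this is unfriendly to Hadoop, because a large number of small files puts pressure on the NameNode)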
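
As a rough check on the numbers in the fix above, the following sketch estimates how many reducers and output files a job would produce. It assumes Hive sizes the reduce stage as the input size divided by hive.exec.reducers.bytes.per.reducer, rounded up and capped by hive.exec.reducers.max (1009 is used here as an assumed cap). The function name, the 1 TB input, and the 300 partitions are hypothetical values chosen for illustration.

```python
# Hypothetical helper for back-of-the-envelope estimates; not part of Hive.
def estimate_output_files(total_input_bytes, bytes_per_reducer, partitions,
                          max_reducers=1009):
    """Return (reducers, upper bound on files) for an insert that ends
    with `distribute by rand()`."""
    # Ceiling division without floating point.
    wanted = -(-total_input_bytes // bytes_per_reducer)
    reducers = max(1, min(max_reducers, wanted))
    # With rand() every reducer can receive rows for every partition,
    # so each reducer may write one file per partition.
    return reducers, reducers * partitions


reducers, max_files = estimate_output_files(
    total_input_bytes=1_000_000_000_000,  # assumed 1 TB of input
    bytes_per_reducer=5_120_000_000,      # value from the fix above
    partitions=300,                       # assumed number of partitions
)
print(f"reducers={reducers}, files<={max_files}")
```

If the bound still exceeds 100,000, a larger bytes-per-reducer value or fewer partitions per run brings it down before resorting to hive.exec.max.created.files.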

2. [Fatal error] occurred when node tried to create too many dynamic partitions.
The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode.
Maximum was set to: 100
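This means the number of dynamic partitions created on a single node exceeded the default limit of 100.
Fix: set hive.exec.max.dynamic.partitions.pernode=1000;
Note: hive.exec.max.dynamic.partitions.pernode defaults to 100
hive.exec.max.dynamic.partitions defaults to 1000, so raise it as well if the job creates more than 1000 partitions in total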

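3. Hive table data files are unevenly sized, i.e. the n data files under the table's HDFS directory differ greatly in size
Fix 1: set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set hive.merge.size.per.task=300000000; -- about 300 MB (the value is in bytes)
set hive.merge.smallfiles.avgsize=300000000; -- about 300 MB
Append at the end of the SQL (before the semicolon): distribute by rand() -- spreads rows evenly
Fix 2: set mapred.reduce.tasks=50; -- each reducer writes one data file, so the table ends up with 50 data files
Append at the end of the SQL: distribute by rand()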
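
The reducer count in the second fix can be derived from the table size rather than guessed. The sketch below does that arithmetic; the function name, the 15 GB table size, and the 300 MB target are assumptions for illustration, not values from a real cluster.

```python
# Hypothetical helper: choose a reducer count so that each output file
# lands near a target size when rows are spread with `distribute by rand()`.
def reduce_tasks_for_target_size(table_bytes, target_file_bytes):
    # Ceiling division, with at least one reducer.
    return max(1, -(-table_bytes // target_file_bytes))


tasks = reduce_tasks_for_target_size(
    table_bytes=15_000_000_000,     # assumed table size (15 GB)
    target_file_bytes=300_000_000,  # assumed target of about 300 MB per file
)
print(f"set mapred.reduce.tasks={tasks};")
```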

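4. OOM: Java heap space

The Java heap ran out of memory. Add the parameters below, which lower the share of the reducer heap used to buffer shuffle data and the number of parallel fetch threads (the Hadoop defaults are 0.70 and 5):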
set mapreduce.reduce.shuffle.input.buffer.percent=0.3;
set mapreduce.reduce.shuffle.parallelcopies=3;
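
To see why the first parameter matters, the sketch below compares the in-memory shuffle buffer at the Hadoop default of 0.70 with the value 0.3 used above. It assumes the buffer is simply the reducer JVM heap multiplied by mapreduce.reduce.shuffle.input.buffer.percent; the 2 GiB heap and the function name are hypothetical.

```python
# Hypothetical helper, not a Hadoop API: size of the in-memory shuffle buffer.
def shuffle_buffer_bytes(reducer_heap_bytes, input_buffer_percent):
    return int(reducer_heap_bytes * input_buffer_percent)


heap = 2 * 1024**3  # assumed 2 GiB reducer heap, for illustration only
default_buffer = shuffle_buffer_bytes(heap, 0.70)  # Hadoop default
tuned_buffer = shuffle_buffer_bytes(heap, 0.3)     # value used above

mib = 1024**2
print(f"default: {default_buffer / mib:.0f} MiB, tuned: {tuned_buffer / mib:.0f} MiB")
```

The smaller buffer leaves more of the heap for the rest of the reduce task, at the cost of spilling shuffle data to disk sooner.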

5. OOM: GC overhead limit exceeded
Add: set hive.auto.convert.join=false;
This disables automatic map join conversion.

6. PriviledgedActionException as:hive (auth:SIMPLE) cause:java.io.IOException: java.io.IOException: unexpected end of stream
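This means a data file in the source table is corrupted or otherwise unreadable.
In the job's log on the YARN monitoring page, find "Click here for the full log." and click "here".

In the full log, look for the entry that names an HDFS directory and a data file under it; that data file is the one causing the problem.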

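There are two ways to fix it:
1. If the data in that file does not matter, delete the file and rerun the job.
2. If the data in the damaged file is needed, the repair depends on the situation.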

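In my case a bzip2-compressed file was damaged, so it could be repaired:
Step 1: download the file from HDFS.
Step 2: run the bzip2 recovery tool: bzip2recover test_s10_14_03.1542786910122.bz2
Step 3: recovery splits the file into n small bz2 files, one per block, named rec00001test_s10_14_03.1542786910122.bz2, rec00002test_s10_14_03.1542786910122.bz2, and so on. Move the damaged original out of the way, decompress all the pieces, concatenate them into one file, and compress it again. A piece that fails to decompress holds the damaged block and is left out.
mv test_s10_14_03.1542786910122.bz2 test_s10_14_03.1542786910122.bz2.damaged

bunzip2 rec*test_s10_14_03.1542786910122.bz2

cat rec*test_s10_14_03.1542786910122 > test_s10_14_03.1542786910122

bzip2 test_s10_14_03.1542786910122
Step 4: delete the damaged file on HDFS and upload the repaired compressed file in its place.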
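
The decompress, concatenate, and recompress step can also be scripted. The sketch below is one possible way to do it with Python's standard bz2 module, assuming the pieces written by bzip2recover are already in a local directory. The function name and the output file name are hypothetical, and pieces that cannot be decompressed are skipped and reported instead of aborting the run.

```python
import bz2
from pathlib import Path


# Hypothetical helper: merge the pieces produced by bzip2recover.
def merge_recovered_blocks(piece_paths, output_path):
    """Decompress each piece in name order, skip unreadable ones, and
    write the concatenated data as one new .bz2 file.
    Returns the list of skipped piece paths."""
    skipped = []
    with bz2.open(output_path, "wb") as out:
        # The zero-padded rec00001..., rec00002... names sort correctly as text.
        for piece in sorted(str(p) for p in piece_paths):
            try:
                data = bz2.decompress(Path(piece).read_bytes())
            except (OSError, ValueError, EOFError):
                # Invalid or truncated stream: this piece holds the damaged block.
                skipped.append(piece)
                continue
            out.write(data)
    return skipped


# Example call (not executed here):
# pieces = Path(".").glob("rec*test_s10_14_03.1542786910122.bz2")
# merge_recovered_blocks(pieces, "test_s10_14_03.1542786910122.fixed.bz2")
```

Any rows stored in a skipped block are lost, so the returned list is worth checking before the repaired file is uploaded.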

7. .generic.GenericUDFIf.evaluate(GenericUDFIf.java:128)
This indicates that a call to a custom UDF failed, almost always because of the argument passed in.

In the job's log on the YARN monitoring page, find "Click here for the full log." and click "here".

Scroll down to the detailed error: Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public java.lang.String cn.cmvideo.test.hive.udf.UDFtest.evaluate(java.lang.String) throws org.apache.hadoop.hive.ql.exec.UDFArgumentException on object cn.cmvideo.test.hive.udf.UDFtest@607cfcfa of class cn.cmvideo.test.hive.udf.UDFtest with arguments {null} of size 1
This pins the failure on the UDFtest() UDF. Check whether an empty string, a null, or some other special value makes the function fail.
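
When the full log is long, the failing UDF and its arguments can be pulled out with a short script. The sketch below is based only on the shape of the message quoted above; the regular expression and the function name are assumptions, and other Hive versions may word the message differently.

```python
import re

# Pattern modeled on the "Unable to execute method ..." message quoted above.
UDF_ERROR = re.compile(
    r"Unable to execute method .*? on object \S+ of class (?P<cls>\S+) "
    r"with arguments \{(?P<args>.*?)\} of size (?P<size>\d+)"
)


# Hypothetical helper: list (UDF class, arguments, argument count) per failure.
def find_udf_failures(log_text):
    return [
        (m["cls"], m["args"], int(m["size"]))
        for m in UDF_ERROR.finditer(log_text)
    ]


sample = (
    "Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: "
    "Unable to execute method public java.lang.String "
    "cn.cmvideo.test.hive.udf.UDFtest.evaluate(java.lang.String) throws "
    "org.apache.hadoop.hive.ql.exec.UDFArgumentException on object "
    "cn.cmvideo.test.hive.udf.UDFtest@607cfcfa of class "
    "cn.cmvideo.test.hive.udf.UDFtest with arguments {null} of size 1"
)
failures = find_udf_failures(sample)
print(failures)
```

An argument list of {null}, as in this log, points at null input rows as the first thing to check.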

Updated from time to time.