Hadoop error: Caused by: java.io.IOException: Stream closed

【1. Symptom】

Over the past few days, a handful of jobs on our offline computing platform failed. While investigating, we checked the platform logs and found the following key error information:

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Error: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.io.IOException: Stream closed
    at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
    at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.io.IOException: Stream closed
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2639)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:981)
    at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:2007)
    at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:479)
    at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:469)
    at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:188)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:578)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:578)
    at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:596)
    at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:295)
    at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:559)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:424)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:151)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1232)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:255)
    ... 11 more
Caused by: java.io.IOException: Stream closed
    at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:67)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:142)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.apache.xerces.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)
    at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2480)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2468)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2539)
    ... 37 more (state=08S01,code=1)
Closing: 0: jdbc:hive2://60.8.1.23:10000

【2. Diagnosis】

1. The error log points at Hadoop itself, so we went to the Apache JIRA, selected ALL ISSUES, and searched for the key error string java.io.IOException: Stream closed. The search returned a large number of records; comparing them against our log one by one, HADOOP-12404 turned out to match.


Our production environment runs Hadoop 2.7; this issue was fixed in Hadoop 2.8. The issue description, paraphrased:

When the Configuration class loads a resource from a URL, caching should be disabled on the JarURLConnection so that the underlying JarFile is not shared with other users.

The parse method of the Configuration class calls url.openStream to obtain the InputStream that DocumentBuilder then parses.
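For reference, here is a paraphrased sketch of that pre-fix code path; the class wrapper and comments are ours, and the exact body in Hadoop 2.7's Configuration.java differs slightly (the removed line in the patch further down shows the real call):

import java.io.IOException;
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Paraphrased shape of the pre-fix path: the resource is read straight from
// url.openStream(), which for a jar: URL is served by a shared, cached
// JarURLConnection.
class PreFixParseSketch {
    Document parse(DocumentBuilder builder, URL url)
            throws IOException, SAXException {
        if (url == null) {
            return null;
        }
        return builder.parse(url.openStream()); // the problematic read
    }
}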

According to the JDK source, the call chain is url.openStream => handler.openConnection.getInputStream => new JarURLConnection => JarURLConnection.connect => factory.get(getJarFileURL(), getUseCaches()) => URLJarFile.getInputStream => JarFile.getInputStream => ZipFile.getInputStream.
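To make the caching behavior concrete, here is a minimal stand-alone sketch (the jar path and entry name are made-up placeholders, not from the issue report): it shows that a jar: URL opens a JarURLConnection and that caching is on by default.

import java.net.JarURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class JarCacheDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical jar and entry, standing in for a job jar that
        // contains a Hadoop configuration file such as core-site.xml.
        URL url = new URL("jar:file:/tmp/demo.jar!/core-site.xml");

        URLConnection conn = url.openConnection();
        System.out.println(conn instanceof JarURLConnection); // true
        System.out.println(conn.getUseCaches());              // true: caching is on by default

        // With caching on, connections to the same jar URL share one JarFile;
        // closing that shared JarFile invalidates every InputStream handed
        // out from it, which is exactly the failure mode described next.
    }
}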

If URLConnection.getUseCaches() returns true (the default), the same URLJarFile is shared by all connections to the same URL. If that shared URLJarFile is closed by another user, then, as the JarURLConnection documentation specifies, every InputStream returned by URLJarFile.getInputStream is closed along with it.

That is why the exception tends to occur when the cluster is under heavy load: the more concurrent jobs read configuration resources out of the same jars, the more likely one of them closes the shared JarFile while another is still reading.
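The following stand-alone sketch (hypothetical paths, ours rather than the issue reporter's) reproduces the race deterministically in a single thread: one "reader" holds a stream from the cached connection while another party closes the shared JarFile, after which the first reader's stream fails.

import java.io.InputStream;
import java.net.JarURLConnection;
import java.net.URL;
import java.util.jar.JarFile;

public class StreamClosedRepro {
    public static void main(String[] args) throws Exception {
        // Hypothetical jar and entry; any existing jar with a compressed
        // (deflated) entry exhibits the same behavior.
        URL url = new URL("jar:file:/tmp/demo.jar!/core-site.xml");

        // "User A" starts reading the entry; with caching enabled, the
        // stream is backed by the shared, cached JarFile for this URL.
        InputStream in = url.openStream();

        // "User B" obtains the same cached JarFile and closes it.
        JarURLConnection conn = (JarURLConnection) url.openConnection();
        JarFile shared = conn.getJarFile();
        shared.close();

        // User A's read now throws java.io.IOException: Stream closed,
        // from InflaterInputStream.ensureOpen -- the top frame of the
        // stack trace in section 1.
        in.read();
    }
}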


Version 2.8 fixes the problem by setting useCaches to false on the JarURLConnection and reworking parse to read from the explicitly opened connection (the arguments passed to the inner parse overload change accordingly):


diff --git hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
index 0b45429..8801c6c 100644
--- hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
+++ hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
@@ -34,7 +34,9 @@
 import java.io.Writer;
 import java.lang.ref.WeakReference;
 import java.net.InetSocketAddress;
+import java.net.JarURLConnection;
 import java.net.URL;
+import java.net.URLConnection;
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.Collection;
@@ -2531,7 +2533,14 @@ private Document parse(DocumentBuilder builder, URL url)
     if (url == null) {
       return null;
     }
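(The page clipped the patch at this point. Based on the fix description above — open the connection explicitly, disable caching when it is a JarURLConnection, then parse from the connection's stream — the hunk continues roughly as follows; treat this tail as a reconstruction rather than a verbatim copy of the commit.)

-    return parse(builder, url.openStream(), url.toString());
+
+    URLConnection connection = url.openConnection();
+    if (connection instanceof JarURLConnection) {
+      // Disable caching for JarURLConnection to avoid sharing JarFile
+      // with other users.
+      connection.setUseCaches(false);
+    }
+    return parse(builder, connection.getInputStream(), url.toString());
   }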
