Hadoop 2.7.2 cluster: Job Submission failed with exception 'org.apache.hadoop.ipc.RemoteException' when running HQL

After the cluster had been running for close to 50 days (without a single restart in that period), a simple HQL query script failed with the error shown below. The eventual fix was to restart the cluster manually.

When trying to restart the cluster, sh stop-all.sh could not shut anything down; it only printed:

This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [192.168.1.190]
192.168.1.190: no namenode to stop
192.168.1.191: no datanode to stop
192.168.1.193: no datanode to stop
192.168.1.194: no datanode to stop
192.168.1.192: no datanode to stop
Stopping secondary namenodes [192.168.1.190]
192.168.1.190: no secondarynamenode to stop
stopping yarn daemons
no resourcemanager to stop
192.168.1.191: no nodemanager to stop
192.168.1.193: no nodemanager to stop
192.168.1.192: no nodemanager to stop
192.168.1.194: no nodemanager to stop
no proxyserver to stop

However, running jps on each slave node (DataNode) showed that all of the daemon processes were still alive; they simply could not be stopped through the scripts.

The root cause was that the slave nodes had lost communication with the master node, so the only option was to log in to each machine one by one and manually kill the corresponding processes, roughly as sketched below.
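For reference, a minimal sketch of that manual cleanup, assuming one logs in to each of the nodes listed in the stop output above; the <pid> placeholders stand for whatever jps actually prints on each machine:

# On each node in turn (192.168.1.190 through 192.168.1.194), list the surviving Hadoop daemons
jps

# Kill them by the PIDs that jps printed (DataNode/NodeManager on the slaves,
# NameNode/SecondaryNameNode/ResourceManager on the master)
kill <pid>
# Only fall back to kill -9 for a process that refuses to exit
kill -9 <pid>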

After restarting the cluster and running the HQL again, the error no longer appeared. As for what caused the communication between the master and slave nodes to break down, for now I can only attribute it to instability in the Hadoop cluster.
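For completeness, a sketch of the subsequent restart and a quick health check, run on the master node (192.168.1.190) and assuming the standard Hadoop 2.7.2 sbin/bin scripts are on the PATH:

# Bring HDFS and YARN back up
start-dfs.sh
start-yarn.sh

# Confirm that the NameNode sees all four DataNodes as live again
hdfs dfsadmin -report

# Confirm that the ResourceManager sees the four NodeManagers
yarn node -list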


=========== Error output ===========

Logging initialized using configuration in file:/opt/hive/apache-hive-1.2.1-bin/conf/hive-log4j.properties

OK
Time taken: 1.052 seconds
Query ID = hadoop_20161102163133_5e027a14-3452-4278-9057-e0a244a61952
Total jobs = 1
16/11/02 16:31:37 WARN conf.HiveConf: HiveConf of name hive.files.umask.value does not exist
Execution log at: /tmp/hadoop/hadoop_20161102163133_5e027a14-3452-4278-9057-e0a244a61952.log
2016-11-02 16:31:37     Starting to launch local task to process map join;      maximum memory = 508559360
2016-11-02 16:31:38     Dump the side-table for tag: 0 with group count: 103 into file: file:/tmp/hive/local/24a2fae4-017e-4555-a7c5-6bc9a13419e5/hive_2016-11-02_16-31-33_691_9110187415324308596-1/-local-10002/HashTable-Stage-4/MapJoin-mapfile00--.hashtable
2016-11-02 16:31:38     Uploaded 1 File to: file:/tmp/hive/local/24a2fae4-017e-4555-a7c5-6bc9a13419e5/hive_2016-11-02_16-31-33_691_9110187415324308596-1/-local-10002/HashTable-Stage-4/MapJoin-mapfile00--.hashtable (945102 bytes)
2016-11-02 16:31:38     End of local task; Time Taken: 1.215 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hadoop-yarn/staging/hadoop/.staging/job_1476427217749_1066/libjars/janino-2.7.6.jar could only be replicated to 0 nodes instead of minReplication (=1).  There are 4 datanode(s) running and 4 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1547)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:724)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

        at org.apache.hadoop.ipc.Client.call(Client.java:1475)
        at org.apache.hadoop.ipc.Client.call(Client.java:1412)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
Job Submission failed with exception 'org.apache.hadoop.ipc.RemoteException(File /tmp/hadoop-yarn/staging/hadoop/.staging/job_1476427217749_1066/libjars/janino-2.7.6.jar could only be replicated to 0 nodes instead of minReplication (=1).  There are 4 datanode(s) running and 4 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1547)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:724)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
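
The key line in the trace above is "could only be replicated to 0 nodes instead of minReplication (=1) ... 4 node(s) are excluded in this operation": the NameNode could not pick any of its four registered DataNodes as a write target for the staging jar. The following are generic diagnostics for that situation (not commands from the original session); they show how the NameNode currently views its DataNodes and whether safe mode is blocking block allocation:

# Live/dead DataNodes and their remaining capacity, as seen by the NameNode
hdfs dfsadmin -report

# Safe mode also rejects new block allocations; this should report OFF
hdfs dfsadmin -safemode get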
