Hive error: java.io.IOException: Could not find status of job:job_1470047186803_131111

Environment: Hadoop 2.7.2 + Hive 1.2.1

Recently, some Hive jobs on the cluster have started failing with the following error:

java.io.IOException: Could not find status of job:job_1470047186803_131111
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:295)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:549)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:437)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Ended Job = job_1470047186803_131111 with exception 'java.io.IOException(Could not find status of job:job_1470047186803_131111)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
The symptom: the job typically fails only after reaching 100% progress, and its information cannot be found on the JobHistory Server. Strangely, in some cases simply rerunning the job succeeds.

Looking at the ApplicationMaster (AM) logs of a failed job revealed the cause:

2016-08-04 14:27:59,012 INFO [Thread-68] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException): The maximum path component name limit of job_1470047186803_131111-1470292057380-ide-create++table+temp.tem...%28%27%E5%B7%B2%E5%8F%96%E6%B6%88%27%2C%27%E6%8B%92%E6%94%B6%E5%85%A5%E5%BA%93%27%2C%27%E9%A9%B3%E5%9B%9E%27%29%28Stage-1470292073175-1-0-SUCCEEDED-root.data_platform-1470292063756.jhist_tmp in directory /tmp/hadoop-yarn/staging/history/done_intermediate/ide is exceeded: limit=255 length=258
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxComponentLength(FSDirectory.java:911)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addLastINode(FSDirectory.java:976)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addINode(FSDirectory.java:838)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addFile(FSDirectory.java:426)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2575)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2450)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2334)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:623)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:397)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

How this works: when a job finishes, two files are generated for each job ID according to a fixed naming rule, a *.jhist file and a *.conf file. Together they record all of the job's execution information, and both must be written into the directory monitored by the JobHistory Server.
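
To make the naming rule concrete, here is a minimal sketch of how a done-file name like the one in the AM log could be assembled. This is an inference from the observed file name, not the actual Hadoop source; the class and method names are hypothetical:

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch of the .jhist naming rule, inferred from the file
// name seen in the AM log above; not the real Hadoop implementation.
public class JhistNameSketch {
    static String doneFileName(String jobId, long submitTime, String user,
                               String jobName, long finishTime, int maps,
                               int reduces, String status, String queue,
                               long startTime) throws UnsupportedEncodingException {
        // The Hive job name is URL-encoded before being embedded in the
        // file name; this is the step that inflates Chinese text.
        String encodedName = URLEncoder.encode(jobName, "UTF-8");
        return String.join("-",
                jobId, Long.toString(submitTime), user, encodedName,
                Long.toString(finishTime), Integer.toString(maps),
                Integer.toString(reduces), status, queue,
                Long.toString(startTime)) + ".jhist";
    }

    public static void main(String[] args) throws Exception {
        // Field values taken from the failing job in the log above.
        System.out.println(doneFileName(
                "job_1470047186803_131111", 1470292057380L, "ide",
                "create  table temp.tem... (Stage", 1470292073175L,
                1, 0, "SUCCEEDED", "root.data_platform", 1470292063756L));
    }
}

Every space in the job name becomes '+', every parenthesis or quote becomes a %XX escape, and the assembled name must fit within a single HDFS path component.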

The history file name generated for this job:

job_1470047186803_131111-1470292057380-ide-create++table+temp.tem...%28%27%E5%B7%B2%E5%8F%96%E6%B6%88%27%2C%27%E6%8B%92%E6%94%B6%E5%85%A5%E5%BA%93%27%2C%27%E9%A9%B3%E5%9B%9E%27%29%28Stage-1470292073175-1-0-SUCCEEDED-root.data_platform-1470292063756.jhist_tmp

The part of the name responsible for the excessive length is: create++table+temp.tem...%28%27%E5%B7%B2%E5%8F%96%E6%B6%88%27%2C%27%E6%8B%92%E6%94%B6%E5%85%A5%E5%BA%93%27%2C%27%E9%A9%B3%E5%9B%9E%27%29%28Stage-1470292073175-1-. This segment is Hive's job name, which by default is built by taking a snippet from the beginning and end of the HQL statement. If the HQL begins or ends with Chinese text (comments or, as here, string literals that decode to ('已取消','拒收入库','驳回')), that text is captured and then URL-encoded, making the history file name far longer than the maximum path-component length allowed by the NameNode (255 characters). The history file therefore cannot be written, the JobHistory Server never records the job, and Hive fails with the error shown at the top when it asks for the job's status.
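
To quantify the blow-up: each Chinese character occupies 3 bytes in UTF-8, and URL encoding expands every byte into a 3-character %XX escape, so one Chinese character contributes 9 characters to the file name. A quick demonstration using the literals decoded from the name above:

import java.net.URLEncoder;

public class EncodingBlowup {
    public static void main(String[] args) throws Exception {
        // The Chinese string literals decoded from the failing job's name.
        String snippet = "('已取消','拒收入库','驳回')";
        String encoded = URLEncoder.encode(snippet, "UTF-8");
        System.out.println(snippet.length());   // 19 characters
        System.out.println(encoded.length());   // 111 characters
        System.out.println(encoded);
        // Prints %28%27%E5%B7%B2%E5%8F%96%E6%B6%88%27%2C... -- the exact
        // sequence visible in the over-long file name in the AM log.
    }
}

A 19-character snippet becomes 111 characters after encoding, so even a short run of Chinese text at the head or tail of the HQL easily pushes the file name past 255 characters.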

A simple workaround is to shorten the portion of the query that Hive embeds in the job name:

set hive.jobname.length=10;

hive.jobname.length defaults to 50 in Hive 1.2; lowering it keeps the URL-encoded job name, and therefore the history file name, safely under the NameNode's 255-character component limit. To apply the fix for all sessions, set the property in hive-site.xml instead of per query.
