生产代码无任何变化,每天定时的distcp突然失败了。
WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:user (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: pathxx/aaa
报错之前有个警告,说B集群上没有对应的目录, 但是这个aaa是拷贝操作自己递归创建的
***
2018-05-06 23:05:30,560 ERROR [main] org.apache.hadoop.tools.util.RetriableCommand: Failure in Retriable command: Copying webhdfs://clusterA:10070/path to webhdfs://clusterB:50070/pathxx
java.io.IOException: Error writing request body to server
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3478)
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3461)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:272)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToFile(RetriableFileCopyCommand.java:191)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:124)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:100)
at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:304)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:267)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:52)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:user (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: pathxx/aaa
报错之前有个警告,说B集群上没有对应的目录, 但是这个aaa是拷贝操作自己递归创建的
***
2018-05-06 23:05:30,560 ERROR [main] org.apache.hadoop.tools.util.RetriableCommand: Failure in Retriable command: Copying webhdfs://clusterA:10070/path to webhdfs://clusterB:50070/pathxx
java.io.IOException: Error writing request body to server
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3478)
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3461)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:272)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToFile(RetriableFileCopyCommand.java:191)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:124)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:100)
at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:304)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:267)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:52)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
手动跑的时候把webhdfs 去掉了,然后突然爆了一个quota 异常;去hdfs和yarn上去找之前失败的记录,都没有提到qouta这个报错。
结论:
修改了对应的quota任务可以正常运行
应该是webhdfs 这个东没有 原有的报错栈打印出来