Hive job fails: file not found during computation
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: Could not get block locations. Source file "/user/hive/warehouse/dwd.db/dwd_mpi_patient_info/.hive-staging_hive_2020-07-15_12-51-27_860_3463437152013258885-1/_task_tmp.-ext-10000/ppi=2019-04-18/_tmp.000098_2" - Aborting...block==null
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:286)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:454)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:393)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Could not get block locations. Source file "/user/hive/warehouse/dwd.db/dwd_mpi_patient_info/.hive-staging_hive_2020-07-15_12-51-27_860_3463437152013258885-1/_task_tmp.-ext-10000/ppi=2019-04-18/_tmp.000098_2" - Aborting...block==null
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:198)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1058)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:686)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:700)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:700)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:277)
... 7 more
Caused by: java.io.IOException: Could not get block locations. Source file "/user/hive/warehouse/dwd.db/dwd_mpi_patient_info/.hive-staging_hive_2020-07-15_12-51-27_860_3463437152013258885-1/_task_tmp.-ext-10000/ppi=2019-04-18/_tmp.000098_2" - Aborting...block==null
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1477)
at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1256)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)
Task attempt attempt_1594715484213_1313_r_000098_2 is done from TaskUmbilicalProtocol's point of view. However, it stays in finishing state for too long
[2020-07-15 13:00:16.795]Container killed by the ApplicationMaster.
[2020-07-15 13:00:16.795]Sent signal OUTPUT_THREAD_DUMP (SIGQUIT) to pid 115208 as user ngariHZ for container container_1594715484213_1313_01_000230, result=success
[2020-07-15 13:00:16.809]Container killed on request. Exit code is 143
[2020-07-15 13:00:16.819]Container exited with a non-zero exit code 143.
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 26 Reduce: 99 Cumulative CPU: 903.72 sec HDFS Read: 6643488692 HDFS Write: 46424204 HDFS EC Read: 0 FAIL
Total MapReduce CPU Time Spent: 15 minutes 3 seconds 720 msec
Cause: mapred.task.timeout was set too short. As the log above shows, the task's status showed no change for roughly 200 seconds, so Hadoop killed the task and cleaned up its temporary directory; the subsequent steps then could no longer find the temporary data.
The fix is to adjust the parameter:

mapred.task.timeout (here set to 200000): the number of milliseconds before a task is terminated if it neither reads an input, writes an output, nor updates its status string. Increasing mapred.task.timeout to 10 minutes (600000 ms) resolves the issue.
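As a minimal sketch, the timeout can be raised per-session from the Hive CLI before rerunning the failing query, without touching cluster config (on Hadoop 2.x and later the canonical property name is mapreduce.task.timeout; the older mapred.task.timeout used in this log is deprecated but still maps to it):

```sql
-- Per-session override in the Hive CLI; applies only to queries
-- launched from this session.
SET mapred.task.timeout=600000;      -- 10 minutes, in milliseconds

-- On Hadoop 2.x+ the canonical property name is:
SET mapreduce.task.timeout=600000;
```

For a permanent, cluster-wide change, set the same property in mapred-site.xml and restart the affected services. Note that raising the timeout only masks a slow task; if the delay recurs, it is also worth checking DataNode health and reducer data skew, since the underlying error here came from the HDFS write pipeline.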