Hive job fails: file not found during computation
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: Could not get block locations. Source file "/user/hive/warehouse/dwd.db/dwd_mpi_patient_info/.hive-staging_hive_2020-07-15_12-51-27_860_3463437152013258885-1/_task_tmp.-ext-10000/ppi=2019-04-18/_tmp.000098_2" - Aborting...block==null
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:286)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:454)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:393)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Could not get block locations. Source file "/user/hive/warehouse/dwd.db/dwd_mpi_patient_info/.hive-staging_hive_2020-07-15_12-51-27_860_3463437152013258885-1/_task_tmp.-ext-10000/ppi=2019-04-18/_tmp.000098_2" - Aborting...block==null
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:198)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1058)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:686)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:700)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:700)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:277)
... 7 more
Caused by: java.io.IOException: Could not get block locations. Source file "/user/hive/warehouse/dwd.db/dwd_mpi_patient_info/.hive-staging_hive_2020-07-15_12-51-27_860_3463437152013258885-1/_task_tmp.-ext-10000/ppi=2019-04-18/_tmp.000098_2" - Aborting...block==null
at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1477)
at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1256)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)
Task attempt attempt_1594715484213_1313_r_000098_2 is done from TaskUmbilicalProtocol's point of view. However, it stays in finishing state for too long
[2020-07-15 13:00:16.795]Container killed by the ApplicationMaster.
[2020-07-15 13:00:16.795]Sent signal OUTPUT_THREAD_DUMP (SIGQUIT) to pid 115208 as user ngariHZ for container container_1594715484213_1313_01_000230, result=success
[2020-07-15 13:00:16.809]Container killed on request. Exit code is 143
[2020-07-15 13:00:16.819]Container exited with a non-zero exit code 143.
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 26 Reduce: 99 Cumulative CPU: 903.72 sec HDFS Read: 6643488692 HDFS Write: 46424204 HDFS EC Read: 0 FAIL
Total MapReduce CPU Time Spent: 15 minutes 3 seconds 720 msec
Cause: mapred.task.timeout was set too short. As the log above shows, the task's status showed no change for roughly 200 seconds, so Hadoop killed the task and cleaned up its temporary directory; the subsequent steps then could no longer find the temporary data.
The fix is to adjust the parameter:

mapred.task.timeout (here set to 200000): the number of milliseconds before a task is terminated if it neither reads an input, writes an output, nor updates its status string. Increasing mapred.task.timeout to 10 minutes (600000 ms) resolves the issue.
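As a minimal sketch, the timeout can be raised per-session from the Hive CLI before rerunning the failing query, without touching cluster config (on Hadoop 2.x and later the canonical property name is mapreduce.task.timeout; the older mapred.task.timeout used in this log is deprecated but still maps to it):

```sql
-- Per-session override in the Hive CLI; applies only to queries
-- launched from this session.
SET mapred.task.timeout=600000;      -- 10 minutes, in milliseconds

-- On Hadoop 2.x+ the canonical property name is:
SET mapreduce.task.timeout=600000;
```

For a permanent, cluster-wide change, set the same property in mapred-site.xml and restart the affected services. Note that raising the timeout only masks a slow task; if the delay recurs, it is also worth checking DataNode health and reducer data skew, since the underlying error here came from the HDFS write pipeline.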