Environment:
Ubuntu 16.04
Hadoop 2.7.3, pseudo-distributed mode
Problem:
When appending to HDFS files that were created via the API (/test1.txt and /test3.txt), `hdfs dfs -appendToFile` fails with DFSClient-related errors:
hadoop@node1:~$ hdfs dfs -appendToFile 2.txt /test1.txt
appendToFile: Failed to APPEND_FILE /test1.txt for DFSClient_NONMAPREDUCE_-20123622_1 on 192.168.134.128 because lease recovery is in progress. Try again later.
hadoop@node1:~$
hadoop@node1:~$ hdfs dfs -appendToFile 2.txt /test3.txt
20/09/30 19:55:21 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[192.168.134.128:50010,DS-f6e8c916-0ea6-42d1-b602-983e116d96d7,DISK]], original=[DatanodeInfoWithStorage[192.168.134.128:50010,DS-f6e8c916-0ea6-42d1-b602-983e116d96d7,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
Analysis:
The error message says that a datanode in the write pipeline went bad and no other datanode is available to replace it. It also points to the likely fix: adjusting the client-side setting 'dfs.client.block.write.replace-datanode-on-failure.policy'. The key lines are:
Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try.
The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
The description of this property in hdfs-default.xml reads:
When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature. Otherwise, users may experience an unusually high rate of pipeline failures since it is impossible to find new datanodes for replacement.
Since the test environment is a pseudo-distributed Hadoop setup with only a single datanode (fewer than 3 nodes), the documentation above recommends setting the failure policy to NEVER; the default is DEFAULT.
Fix:
Add the following property to hdfs-site.xml:
<property>
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>NEVER</value>
</property>
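As a side note, hdfs-default.xml also defines two related client-side switches: the replace-on-failure feature can be disabled outright, or the policy can be applied "best effort" (continue with the remaining datanodes if replacement fails instead of aborting the write). Neither is required for the fix above; the values shown below are the defaults:

```xml
<!-- Optional related settings from hdfs-default.xml (defaults shown). -->
<property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
</property>
<property>
    <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
    <value>false</value>
</property>
```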
Restart Hadoop:
stop-all.sh
start-all.sh
(In Hadoop 2.x these two scripts print a deprecation warning; restarting just HDFS with stop-dfs.sh / start-dfs.sh is sufficient here.)
Testing again, the append now succeeds:
hadoop@node1:~$ cat 2.txt
你好,我好,大家好!
hadoop@node1:~$ hdfs dfs -cat /test1.txt
你好啊,hello hdfs
hadoop@node1:~$ hdfs dfs -appendToFile 2.txt /test1.txt
hadoop@node1:~$ hdfs dfs -cat /test1.txt
你好啊,hello hdfs
你好,我好,大家好!
hadoop@node1:~$ hdfs dfs -cat /test3.txt
hello hdfs
你好啊!
hadoop@node1:~$ hdfs dfs -appendToFile 2.txt /test3.txt
hadoop@node1:~$ hdfs dfs -cat /test3.txt
hello hdfs
你好啊!
你好,我好,大家好!
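The appends were originally attempted through the Java API, and the same policy can be applied per-client in code instead of cluster-wide in hdfs-site.xml. A minimal sketch, assuming the Hadoop 2.7.x client jars are on the classpath; the NameNode URI hdfs://node1:9000 and the file path are illustrative and should be adjusted to your fs.defaultFS:

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same setting as in hdfs-site.xml, but scoped to this client only:
        // on a single-datanode cluster there is never a spare datanode to
        // swap into the pipeline, so replacement must not be attempted.
        conf.set("dfs.client.block.write.replace-datanode-on-failure.policy",
                 "NEVER");

        // NameNode URI is an assumption; use your own fs.defaultFS value.
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://node1:9000"), conf)) {
            // Append to an existing file (the path here is illustrative).
            try (FSDataOutputStream out = fs.append(new Path("/test1.txt"))) {
                out.write("appended from the Java API\n"
                        .getBytes(StandardCharsets.UTF_8));
            }
        }
    }
}
```

Setting the property on the client's Configuration only affects this one application, which can be preferable when you cannot change the cluster-wide defaults.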
Done. Enjoy!