Environment:
Ubuntu 16.04
Hadoop 2.7.3, pseudo-distributed mode
Problem:
When appending to HDFS files that were created via the API (/test1.txt and /test3.txt), `hdfs dfs -appendToFile` fails with DFSClient-related errors:
hadoop@node1:~$ hdfs dfs -appendToFile 2.txt /test1.txt
appendToFile: Failed to APPEND_FILE /test1.txt for DFSClient_NONMAPREDUCE_-20123622_1 on 192.168.134.128 because lease recovery is in progress. Try again later.
hadoop@node1:~$
hadoop@node1:~$ hdfs dfs -appendToFile 2.txt /test3.txt
20/09/30 19:55:21 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[192.168.134.128:50010,DS-f6e8c916-0ea6-42d1-b602-983e116d96d7,DISK]], original=[DatanodeInfoWithStorage[192.168.134.128:50010,DS-f6e8c916-0ea6-42d1-b602-983e116d96d7,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
Analysis:
The error message says that a datanode in the write pipeline went bad and no other datanode is available to replace it. It also points to the likely fix: adjusting the client-side setting 'dfs.client.block.write.replace-datanode-on-failure.policy'. The key lines are:
Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try.
The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
The description of this property in hdfs-default.xml reads:
When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature. Otherwise, users may experience an unusually high rate of pipeline failures since it is impossible to find new datanodes for replacement.
Since the test environment is a pseudo-distributed Hadoop setup with only a single datanode (fewer than 3 nodes), the documentation above recommends setting the failure policy to NEVER; the default is DEFAULT.
Fix:
Add the following property to hdfs-site.xml:
<property>
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>NEVER</value>
</property>
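As a side note, hdfs-default.xml also defines two related client-side switches: the replace-on-failure feature can be disabled outright, or the policy can be applied "best effort" (continue with the remaining datanodes if replacement fails instead of aborting the write). Neither is required for the fix above; the values shown below are the defaults:

```xml
<!-- Optional related settings from hdfs-default.xml (defaults shown). -->
<property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
</property>
<property>
    <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
    <value>false</value>
</property>
```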
Restart Hadoop:
stop-all.sh
start-all.sh
(In Hadoop 2.x these two scripts print a deprecation warning; restarting just HDFS with stop-dfs.sh / start-dfs.sh is sufficient here.)
Testing again, the append now succeeds:
hadoop@node1:~$ cat 2.txt
你好,我好,大家好!
hadoop@node1:~$ hdfs dfs -cat /test1.txt
你好啊,hello hdfs
hadoop@node1:~$ hdfs dfs -appendToFile 2.txt /test1.txt
hadoop@node1:~$ hdfs dfs -cat /test1.txt
你好啊,hello hdfs
你好,我好,大家好!
hadoop@node1:~$ hdfs dfs -cat /test3.txt
hello hdfs
你好啊!
hadoop@node1:~$ hdfs dfs -appendToFile 2.txt /test3.txt
hadoop@node1:~$ hdfs dfs -cat /test3.txt
hello hdfs
你好啊!
你好,我好,大家好!
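The appends were originally attempted through the Java API, and the same policy can be applied per-client in code instead of cluster-wide in hdfs-site.xml. A minimal sketch, assuming the Hadoop 2.7.x client jars are on the classpath; the NameNode URI hdfs://node1:9000 and the file path are illustrative and should be adjusted to your fs.defaultFS:

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same setting as in hdfs-site.xml, but scoped to this client only:
        // on a single-datanode cluster there is never a spare datanode to
        // swap into the pipeline, so replacement must not be attempted.
        conf.set("dfs.client.block.write.replace-datanode-on-failure.policy",
                 "NEVER");

        // NameNode URI is an assumption; use your own fs.defaultFS value.
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://node1:9000"), conf)) {
            // Append to an existing file (the path here is illustrative).
            try (FSDataOutputStream out = fs.append(new Path("/test1.txt"))) {
                out.write("appended from the Java API\n"
                        .getBytes(StandardCharsets.UTF_8));
            }
        }
    }
}
```

Setting the property on the client's Configuration only affects this one application, which can be preferable when you cannot change the cluster-wide defaults.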
Done. Enjoy!