[Exception] DataNode write failure when Spark writes to HBase: dfs.client.block.write.replace-datanode-on-failure.policy


wangweislk


Article link: https://blog.csdn.net/wangweislk/article/details/78890163


Problem description:

When a Spark Streaming job writes to HBase over a long period, the following exception appears:

2017-12-24 23:20:34  [ SparkListenerBus:540107357 ] - [ ERROR ]  Listener EventLoggingListener threw an exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[ip:50010,DS-d9caacf5-a95a-45ab-8231-95decdbe4889,DISK], DatanodeInfoWithStorage[ip:50010,DS-7e2e14d9-3d8b-412d-bf38-3d2930a83d2f,DISK]], original=[DatanodeInfoWithStorage[ip:50010,DS-d9caacf5-a95a-45ab-8231-95decdbe4889,DISK], DatanodeInfoWithStorage[ip:50010,DS-7e2e14d9-3d8b-412d-bf38-3d2930a83d2f,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1191)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1265)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1433)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1147)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:632)
2017-12-24 23:20:34  [ SparkListenerBus:540107357 ] - [ ERROR ]  Listener EventLoggingListener threw an exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[ip:50010,DS-d9caacf5-a95a-45ab-8231-95decdbe4889,DISK], DatanodeInfoWithStorage[ip:50010,DS-7e2e14d9-3d8b-412d-bf38-3d2930a83d2f,DISK]], original=[DatanodeInfoWithStorage[ip:50010,DS-d9caacf5-a95a-45ab-8231-95decdbe4889,DISK], DatanodeInfoWithStorage[ip:50010,DS-7e2e14d9-3d8b-412d-bf38-3d2930a83d2f,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1191)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1265)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1433)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1147)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:632)

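According to the stack trace, the write fails because of the policy the HDFS client uses to replace a failed DataNode in the write pipeline.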

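Tracing this into the DFSClient source shows that the problem occurs on the client side, while it writes a block through the pipeline. Two related parameters are involved: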
dfs.client.block.write.replace-datanode-on-failure.enable=true

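If a DataNode or network failure occurs in the write pipeline, DFSClient tries to remove the failed DataNode from the pipeline and continues writing with the remaining DataNodes. As a result, the number of DataNodes in the pipeline decreases. This feature adds new DataNodes to the pipeline to compensate. It is a site-wide property that enables or disables the feature. When the cluster is very small, for example 3 nodes or fewer, cluster administrators may want to set the policy to NEVER in the default configuration file, or disable the feature altogether. Otherwise, users may see an unusually high rate of pipeline failures, because no new DataNode can be found for replacement.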

dfs.client.block.write.replace-datanode-on-failure.policy=DEFAULT

This property takes effect only when dfs.client.block.write.replace-datanode-on-failure.enable is set to true:

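ALWAYS: always add a new DataNode when an existing DataNode is removed from the pipeline.

NEVER: never add a new DataNode.

DEFAULT: let r be the replication factor and n the number of DataNodes still in the pipeline. A new DataNode is added only if r >= 3 and either (1) floor(r/2) >= n, or (2) r > n and the block has been hflushed/appended.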
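The three policies can be read as one decision function. The following is a minimal sketch in Java that mirrors the rule as worded in the property description above. The names `ReplacePolicySketch`, `Policy`, and `shouldAddDatanode` are hypothetical and chosen for illustration; this is not the HDFS client's actual code.

```java
// Sketch only: models the documented replace-datanode-on-failure rule.
// All names here are illustrative assumptions, not Hadoop classes.
final class ReplacePolicySketch {

    enum Policy { ALWAYS, NEVER, DEFAULT }

    /**
     * @param enabled            value of ...replace-datanode-on-failure.enable
     * @param policy             value of ...replace-datanode-on-failure.policy
     * @param r                  replication factor of the file
     * @param n                  DataNodes still alive in the pipeline
     * @param hflushedOrAppended whether the block was hflushed or appended
     * @return true if the client would try to add a replacement DataNode
     */
    static boolean shouldAddDatanode(boolean enabled, Policy policy,
                                     int r, int n, boolean hflushedOrAppended) {
        if (!enabled) {
            return false; // the policy is ignored when the feature is disabled
        }
        switch (policy) {
            case ALWAYS:
                return true;
            case NEVER:
                return false;
            default:
                // r >= 3 and either floor(r/2) >= n, or (r > n and hflushed/appended).
                // Integer division floors for non-negative values.
                return r >= 3 && (r / 2 >= n || (r > n && hflushedOrAppended));
        }
    }

    public static void main(String[] args) {
        // Replication 3, two DataNodes left, block already hflushed.
        System.out.println(shouldAddDatanode(true, Policy.DEFAULT, 3, 2, true));
        // Same situation with the policy set to NEVER.
        System.out.println(shouldAddDatanode(true, Policy.NEVER, 3, 2, true));
    }
}
```

Assuming a replication factor of 3, the two DataNodes listed under `current` in the log correspond to r = 3 and n = 2. Under DEFAULT the client then asks for a replacement only if the block was hflushed or appended, and on a small cluster there may be no spare DataNode to provide one, which is consistent with the exception shown.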

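Solution: on a small cluster, where no spare DataNode is available for replacement, set the policy to NEVER in the client configuration (conf is assumed here to be the Hadoop Configuration object used by the job):

conf.set("dfs.client.block.write.replace-datanode-on-failure.policy","NEVER");
conf.set("dfs.client.block.write.replace-datanode-on-failure.enable","true");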
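If the Hadoop Configuration object is not directly reachable from the job code, the same two properties can be handed to Spark on the command line: Spark copies any property prefixed with `spark.hadoop.` into the Hadoop configuration it creates. The sketch below only builds the corresponding `--conf` arguments; the class `SparkHadoopConfArgs` and the method `toSubmitArgs` are hypothetical helpers, and whether this route fits a given deployment is an assumption to verify.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: turns HDFS client properties into spark-submit "--conf" arguments.
// The class and method names are illustrative assumptions.
final class SparkHadoopConfArgs {

    // Spark forwards "spark.hadoop.<key>" to the Hadoop configuration as "<key>".
    static final String PREFIX = "spark.hadoop.";

    static List<String> toSubmitArgs(Map<String, String> hadoopProps) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : hadoopProps.entrySet()) {
            args.add("--conf");
            args.add(PREFIX + e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    public static void main(String[] unused) {
        // LinkedHashMap keeps the insertion order of the two properties.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("dfs.client.block.write.replace-datanode-on-failure.enable", "true");
        props.put("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");
        // Prints the arguments to append to a spark-submit command.
        System.out.println(String.join(" ", toSubmitArgs(props)));
    }
}
```

Keeping the property names in one map avoids typos in these long keys when the same settings are applied to several jobs.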
