[Exception] DataNode write failure when Spark writes to HBase: dfs.client.block.write.replace-datanode-on-failure.policy

Problem description:

When Spark Streaming writes to HBase over a long period, the following exception appears:

2017-12-24 23:20:34  [ SparkListenerBus:540107357 ] - [ ERROR ]  Listener EventLoggingListener threw an exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[ip:50010,DS-d9caacf5-a95a-45ab-8231-95decdbe4889,DISK], DatanodeInfoWithStorage[ip:50010,DS-7e2e14d9-3d8b-412d-bf38-3d2930a83d2f,DISK]], original=[DatanodeInfoWithStorage[ip:50010,DS-d9caacf5-a95a-45ab-8231-95decdbe4889,DISK], DatanodeInfoWithStorage[ip:50010,DS-7e2e14d9-3d8b-412d-bf38-3d2930a83d2f,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1191)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1265)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1433)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1147)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:632)

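According to the stack trace, the write fails because of the DataNode replacement policy that the HDFS client applies to its write pipeline.

Tracing the error to its source in DFSClient shows that the problem occurs while the client writes a block through the pipeline. Two related parameters are involved: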
dfs.client.block.write.replace-datanode-on-failure.enable=true

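If a DataNode or network failure occurs in the write pipeline, DFSClient tries to remove the failed DataNode from the pipeline and then continues writing with the remaining DataNodes. As a result, the number of DataNodes in the pipeline decreases. This feature adds new DataNodes to the pipeline to compensate. It is a site-wide property that enables or disables the feature. When the cluster is very small, for example 3 nodes or fewer, cluster administrators may want to set the policy to NEVER in the default configuration file, or disable the feature altogether. Otherwise, users may see an unusually high rate of pipeline failures because no new DataNode can be found for replacement.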


dfs.client.block.write.replace-datanode-on-failure.policy=DEFAULT

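This property takes effect only when dfs.client.block.write.replace-datanode-on-failure.enable is set to true. It accepts three values:

ALWAYS: always add a new DataNode when an existing DataNode is removed from the pipeline.

NEVER: never add a new DataNode.

DEFAULT: let r be the replication factor and n the number of DataNodes still in the pipeline. A new DataNode is added only if r >= 3 and either (1) floor(r/2) >= n, or (2) r > n and the block has been hflushed/appended.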
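The three policies can be expressed as a small decision function. The following is only a sketch of the rule as described above, not Hadoop's actual implementation: the class name `ReplaceDatanodePolicySketch`, the enum `Policy`, and the method `shouldAddDatanode` are hypothetical names introduced for illustration.

```java
public class ReplaceDatanodePolicySketch {

    /** The three values accepted by the policy property. */
    public enum Policy { ALWAYS, NEVER, DEFAULT }

    /**
     * Decides whether the client would try to add a replacement DataNode.
     * Hypothetical helper that mirrors the documented rule only.
     *
     * @param r                  replication factor of the file
     * @param n                  DataNodes still alive in the pipeline
     * @param hflushedOrAppended whether the block was hflushed or appended
     */
    public static boolean shouldAddDatanode(Policy policy, int r, int n,
                                            boolean hflushedOrAppended) {
        switch (policy) {
            case ALWAYS:
                return true;
            case NEVER:
                return false;
            default:
                // DEFAULT: r >= 3 and (floor(r/2) >= n, or r > n with hflush/append).
                // Integer division already gives floor(r/2) for non-negative r.
                return r >= 3 && (r / 2 >= n || (r > n && hflushedOrAppended));
        }
    }

    public static void main(String[] args) {
        // r = 3, one DataNode left: floor(3/2) = 1 >= 1, so a replacement is requested.
        System.out.println(shouldAddDatanode(Policy.DEFAULT, 3, 1, false)); // true
        // r = 3, two DataNodes left, plain write: no replacement is requested.
        System.out.println(shouldAddDatanode(Policy.DEFAULT, 3, 2, false)); // false
        // Same pipeline, but the block was hflushed/appended: replacement is requested.
        System.out.println(shouldAddDatanode(Policy.DEFAULT, 3, 2, true));  // true
        // NEVER skips replacement regardless of the pipeline state.
        System.out.println(shouldAddDatanode(Policy.NEVER, 3, 1, true));    // false
    }
}
```

Under DEFAULT, a long-running writer that hflushes on a cluster with no spare DataNode falls into the third case: a replacement is requested but none can be found, which matches the error message shown in the log.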

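Following the guidance above, the fix applied here sets the policy to NEVER in the client configuration:

conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");
conf.set("dfs.client.block.write.replace-datanode-on-failure.enable", "true");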

