How to change how quickly the Namenode detects a failed Datanode

I was recently asked how to force the namenode to detect failed datanodes more quickly.

Lowering this detection time is very useful when testing, but can be quite risky on busy clusters.
In Hadoop 0.19, the parameter  heartbeat.recheck.interval  is the primary control.
This value, in milliseconds, is just under half of the period after which a Datanode is declared dead.

In hadoop 0.19, in FSNamesystem.java:

long heartbeatInterval = conf.getLong("dfs.heartbeat.interval", 3) * 1000; // 3 seconds
this.heartbeatRecheckInterval = conf.getInt( "heartbeat.recheck.interval", 5 * 60 * 1000); // 5 minutes
this.heartbeatExpireInterval = 2 * heartbeatRecheckInterval + 10 * heartbeatInterval;
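To see what these defaults mean in practice, here is a small stand-alone sketch (plain Java, no Hadoop dependency) that reproduces the expiry arithmetic above, first with the 0.19 defaults and then with a lowered recheck interval for testing:

```java
// Stand-alone reproduction of the FSNamesystem expiry arithmetic (no Hadoop dependency).
public class HeartbeatExpiry {
    // Mirrors FSNamesystem: expire = 2 * recheckInterval + 10 * heartbeatInterval
    static long expireIntervalMs(long recheckMs, long heartbeatMs) {
        return 2 * recheckMs + 10 * heartbeatMs;
    }

    public static void main(String[] args) {
        // Defaults: heartbeat.recheck.interval = 5 min, dfs.heartbeat.interval = 3 s
        long def = expireIntervalMs(5 * 60 * 1000, 3 * 1000);
        System.out.println("Default expiry: " + def + " ms");   // 630000 ms = 10.5 minutes

        // A hypothetical test-cluster setting: recheck every 15 s instead of 5 min
        long fast = expireIntervalMs(15 * 1000, 3 * 1000);
        System.out.println("Fast expiry:    " + fast + " ms");  // 60000 ms = 1 minute
    }
}
```

So with the stock configuration a dead datanode is only declared failed after 10.5 minutes; shrinking  heartbeat.recheck.interval  is what brings that down.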

To change the timeouts at your client/application level, the relevant parameters are read as:
this.socketTimeout = conf.getInt("dfs.socket.timeout",
HdfsConstants.READ_TIMEOUT);
this.datanodeWriteTimeout = conf.getInt("dfs.datanode.socket.write.timeout",
HdfsConstants.WRITE_TIMEOUT);

dfs.socket.timeout  controls the base timeout for read/connect operations against a datanode.

The constants are:
// Timeouts for communicating with DataNode for streaming writes/reads
public static int READ_TIMEOUT = 60 * 1000;
public static int WRITE_TIMEOUT = 8 * 60 * 1000;
public static int WRITE_TIMEOUT_EXTENSION = 5 * 1000; //for write pipeline



The write timeout for an individual dfs write operation is defined as
long writeTimeout = HdfsConstants.WRITE_TIMEOUT_EXTENSION * nodes.length +
datanodeWriteTimeout;
that is, the 8-minute base plus 5 seconds per datanode in the write pipeline.
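Plugging in the constants makes the scaling concrete; this sketch (plain Java, using the 0.19 constant values) computes the write timeout for a given pipeline length:

```java
// Sketch of the per-write timeout arithmetic, using the 0.19 HdfsConstants values.
public class WriteTimeoutCalc {
    static final long WRITE_TIMEOUT = 8 * 60 * 1000;       // 480000 ms base
    static final long WRITE_TIMEOUT_EXTENSION = 5 * 1000;  // 5 s per pipeline node

    static long writeTimeoutMs(int pipelineNodes) {
        return WRITE_TIMEOUT_EXTENSION * pipelineNodes + WRITE_TIMEOUT;
    }

    public static void main(String[] args) {
        // With the common replication factor of 3:
        System.out.println(writeTimeoutMs(3) + " ms"); // 495000 ms = 8 min 15 s
    }
}
```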

The socket timeout for a dfs operation is defined as:
int timeoutValue = 3000 * nodes.length + socketTimeout;
which is the 60-second base timeout plus 3 seconds per datanode in the pipeline.
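The same arithmetic for the read/connect side, again as a stand-alone sketch with the 0.19 default of READ_TIMEOUT (60 s) as the base:

```java
// Sketch of the read/connect timeout arithmetic: 3 s per pipeline node on top of
// the dfs.socket.timeout base (READ_TIMEOUT = 60 s by default in 0.19).
public class ReadTimeoutCalc {
    static int timeoutMs(int pipelineNodes, int socketTimeoutMs) {
        return 3000 * pipelineNodes + socketTimeoutMs;
    }

    public static void main(String[] args) {
        System.out.println(timeoutMs(3, 60 * 1000) + " ms"); // 69000 ms at replication 3
    }
}
```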