Environment: Hadoop 2.4.1 + ZooKeeper 3.4.0 + HA with automatic failover.
Everything still started normally two days ago, but today startup suddenly failed and the NameNode process shut itself down.
The log prints the following:
2014-08-14 19:20:03,388 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: NameNode1/172.16.168.134:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-08-14 19:20:03,392 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: NameNode2/172.16.168.144:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-08-14 19:20:03,392 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: DataNode2/172.16.186.84:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-08-14 19:20:04,399 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: NameNode1/172.16.168.134:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-08-14 19:20:04,399 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: NameNode2/172.16.168.144:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-08-14 19:20:04,400 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: DataNode2/172.16.186.84:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-08-14 19:20:05,401 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: NameNode1/172.16.168.134:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-08-14 19:20:05,412 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: NameNode2/172.16.168.144:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-08-14 19:20:12,445 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [172.16.168.134:8485, 172.16.168.144:8485, 172.16.177.183:8485, 172.16.186.84:8485], stream=null))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 3/4.
So the client keeps retrying the connections and failing, although one of the nodes does connect normally.
I checked the ConnectionRefused page on the Hadoop wiki, which lists many possible causes.
It turned out that my host's DNS server setting had been changed: a DHCP server had silently overwritten it.
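A quick way to confirm this kind of resolver problem is to check whether each JournalNode hostname from the log still resolves to the IP the NameNode expects. A minimal sketch, assuming a Linux host with `getent` available (the hostname/IP pairs are taken from the log above; `check_host` is just a hypothetical helper name):

```shell
# Verify that each JournalNode hostname resolves to the IP recorded
# in the NameNode log. A hijacked resolver shows up here immediately.

check_host() {
  host=$1; want=$2
  # getent follows /etc/nsswitch.conf (/etc/hosts first, then DNS),
  # so it reflects what the NameNode's JVM would actually see
  got=$(getent hosts "$host" | awk '{print $1; exit}')
  if [ "$got" = "$want" ]; then
    echo "OK   $host -> $got"
  else
    echo "BAD  $host -> ${got:-<no answer>} (expected $want)"
  fi
}

# Hostname/IP pairs from the retry messages in the log
check_host NameNode1 172.16.168.134
check_host NameNode2 172.16.168.144
check_host DataNode2 172.16.186.84
```

Any `BAD` line means the host either no longer resolves or resolves to the wrong address, which matches the endless connection retries seen above.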
Solution:
Edit /etc/resolv.conf and set the nameserver back:
nameserver 8.8.8.8
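After fixing resolv.conf, it is worth confirming both name resolution and JournalNode reachability before restarting the NameNode. A rough sketch, assuming `getent` and `nc` are installed (hostnames and port 8485 are taken from the log above; adjust to your cluster):

```shell
# Check that each JournalNode host resolves again and that its
# journal RPC port (8485, from the log above) accepts connections.
for h in NameNode1 NameNode2 DataNode2; do
  if ! getent hosts "$h" >/dev/null; then
    echo "$h: still does not resolve"
    continue
  fi
  ip=$(getent hosts "$h" | awk '{print $1; exit}')
  # nc -z only probes the port without sending data; -w 2 = 2s timeout
  if nc -z -w 2 "$ip" 8485; then
    echo "$h ($ip): port 8485 reachable"
  else
    echo "$h ($ip): port 8485 NOT reachable"
  fi
done
```

Once all three JournalNodes resolve and answer on 8485, the QJM quorum (3 of 4 here) can be reached and the NameNode should come up again.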