Common Errors When Configuring Hadoop HA (High-Availability Cluster) and How to Fix Them

盖世英雄来了

Tags: namenode startup error, hadoop setup

Article link: https://blog.csdn.net/qq_40513633/article/details/88933056

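Nothing is settled yet; any of us could be the dark horse.

While learning Hadoop, I ran into the following errors.

1. An error occurs when starting the NameNode on the second node.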

InconsistentFSStateException: Directory /opt/modules/hadoopha/hadoop-2.5.2/data/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

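Cause: in my case, the version information (the VERSION file) under the storage location defined in core-site.xml was inconsistent. For example, suppose the location is configured like this: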

<configuration>
  <property>
    <!-- HDFS address; in HA this points to the nameservice -->
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <property>
    <!-- base directory for Hadoop's temporary files -->
    <name>hadoop.tmp.dir</name>
    <value>/opt/modules/hadoopha/hadoop-2.5.2/data/tmp</value>
  </property>
</configuration>

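Go into the name directory under tmp (data/tmp/dfs/name) and delete the VERSION file, which sits in its current subdirectory. Then change to the hadoop-2.5.2/bin directory and reformat. Reformatting erases the existing HDFS metadata, so only do this on a cluster whose data you can afford to lose. In an HA setup the second NameNode is normally initialized with -bootstrapStandby (see item 5) instead of being formatted again.

./hdfs namenode -format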

Then restart the NameNode; the problem is resolved.
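
Before deleting anything, it can help to confirm what state the name directory is actually in, since the exception itself says the directory does not exist or is not accessible. The function below is only a sketch: `inspect_name_dir` is a hypothetical helper name, not a Hadoop command, and it assumes the usual layout in which the NameNode keeps its metadata in `<name dir>/current/VERSION`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): report the state of a NameNode
# storage directory before deciding whether to reformat it.
inspect_name_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
        echo "not accessible: $dir"
        return 1
    fi
    if [ ! -f "$dir/current/VERSION" ]; then
        echo "unformatted: $dir"
        return 1
    fi
    # VERSION is a plain properties file; the clusterID line should be the
    # same on both NameNodes of an HA pair.
    grep '^clusterID=' "$dir/current/VERSION"
}

# Example (path taken from the error message above):
# inspect_name_dir /opt/modules/hadoopha/hadoop-2.5.2/data/tmp/dfs/name
```

Running it on both NameNode hosts and comparing the output shows whether the second node was never initialized or carries a different clusterID.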

2. The NameNode fails to start with the following error

.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:

Check the following setting in hdfs-site.xml:

  <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/ns1</value>
  </property>

Then shut down the Hadoop cluster with sbin/stop-all.sh and restart the NameNode.
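
When checking this setting, it is easy to misread the semicolon-separated host list. The sketch below splits a `qjournal://` URI into one `host:port` per line so each JournalNode address can be checked individually. `qjournal_addresses` is a hypothetical helper name, and filling in 8485 for an entry without a port is an assumption based on the default JournalNode RPC port.

```shell
#!/bin/sh
# Hypothetical helper: print one host:port per line for a qjournal:// URI,
# assuming port 8485 when an entry names no port.
qjournal_addresses() {
    uri="$1"
    authority="${uri#qjournal://}"   # strip the scheme
    authority="${authority%%/*}"     # strip the /journal-id suffix
    # Entries are separated by ';' in the URI.
    echo "$authority" | tr ';' '\n' | while read -r entry; do
        case "$entry" in
            *:*) echo "$entry" ;;
            *)   echo "$entry:8485" ;;
        esac
    done
}

# Example with a URI whose last entry omits the port:
qjournal_addresses 'qjournal://hadoop1:8485;hadoop2:8485;hadoop3/ns1'
```

Each printed address should resolve from the NameNode host and have a JournalNode listening on it.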

3. An error occurs when switching a NameNode to Active

Operation failed: Failed on local exception: java.io.EOFException; Host Details : local host is：destination host is

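At first I deleted the Hadoop folder, unpacked it again, and reconfigured the environment variables, but the problem persisted.

The error was caused by formatting the node multiple times; I am not sure of the exact underlying reason.

Solution:

Go to the NameNode storage directory you configured, enter data/dfs/..., and delete the name folder.

Then reformat: bin/hdfs namenode -format

After formatting, restart the virtual machine with reboot and start the NameNode again. There is no need to format a second time; change the state directly with bin/hdfs haadmin -transitionToActive nn1

Check port 50070 (the NameNode web UI). Success!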
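
Besides the web UI, the state can also be checked from the command line with `haadmin -getServiceState`. The sketch below wraps the two `haadmin` calls so the transition is only attempted when needed. `ensure_active` and the `HDFS_CMD` variable are names invented for this example; `nn1` is the NameNode ID used above.

```shell
#!/bin/sh
# HDFS_CMD is an assumption of this sketch: the path to the hdfs launcher,
# relative to the Hadoop installation directory.
HDFS_CMD="${HDFS_CMD:-bin/hdfs}"

# Hypothetical helper: switch a NameNode to active only if it is not already.
ensure_active() {
    nn_id="$1"
    state=$("$HDFS_CMD" haadmin -getServiceState "$nn_id") || return 1
    if [ "$state" = "active" ]; then
        echo "$nn_id is already active"
        return 0
    fi
    "$HDFS_CMD" haadmin -transitionToActive "$nn_id" || return 1
    # Print the state after the transition so the result is visible.
    "$HDFS_CMD" haadmin -getServiceState "$nn_id"
}

# Example: ensure_active nn1
```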

4. Formatting the NameNode fails with a connection-refused error (shown as "拒绝连接" in the log below)

org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:
192.168.129.128:8485: Call From bigdata-senior01/192.168.129.128 to bigdata-senior01:8485 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
	at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
	at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
	at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:875)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:171)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:922)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1354)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473)
19/04/04 19:28:36 INFO ipc.Client: Retrying connect to server: bigdata-senior02/192.168.129.130:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:28:36 INFO ipc.Client: Retrying connect to server: bigdata-senior03/192.168.129.133:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:28:36 FATAL namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:
192.168.129.128:8485: Call From bigdata-senior01/192.168.129.128 to bigdata-senior01:8485 failed on connection exception: java.n

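I first deleted the data files under the temporary directory tmp in the Hadoop directory, then deleted the directory set as hadoop.tmp.dir in core-site.xml (I also cleared the logs, although I don't think that matters either way). After running bin/hdfs namenode -format again the error was still there. After some research I found that the JournalNodes must be started before formatting; the log above shows port 8485 refusing connections on all three hosts. Run this on each JournalNode host: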

sbin/hadoop-daemon.sh start journalnode

Then run the format command again; this time formatting succeeds.
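
To avoid hitting the same exception again, the JournalNode ports can be probed before formatting. This is a sketch under assumptions: it relies on `nc` (netcat) being installed, and `probe_port` and `unreachable_journalnodes` are helper names made up for this example.

```shell
#!/bin/sh
# Assumes nc (netcat) is installed; -z only tests whether the port accepts
# connections, -w 2 gives up after two seconds.
probe_port() {
    nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# Hypothetical helper: print every host:port that cannot be reached and
# return non-zero if there was at least one.
unreachable_journalnodes() {
    failed=0
    for addr in "$@"; do
        host="${addr%:*}"
        port="${addr##*:}"
        if ! probe_port "$host" "$port"; then
            echo "$addr"
            failed=1
        fi
    done
    return "$failed"
}

# Example, using the host names from the log above:
# unreachable_journalnodes bigdata-senior01:8485 bigdata-senior02:8485 \
#     bigdata-senior03:8485 && bin/hdfs namenode -format
```

Chaining the format command with `&&` means it only runs once every JournalNode answers on its port.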

5. An error occurs when initializing the second NameNode with -bootstrapStandby

:57:35 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
19/04/04 19:57:35 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
19/04/04 19:57:37 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:38 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:39 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:40 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:41 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:42 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:43 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:44 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:45 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:46 INFO ipc.Client: Retrying connect to server: bigdata-senior01/192.168.129.128:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/04/04 19:57:46 FATAL ha.BootstrapStandby: Unable to fetch namespace information from active NN at bigdata-senior01/192.168.129.128:8020: Call From bigdata-senior02/192.168.129.130 to bigdata-senior01:8020 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
19/04/04 19:57:46 INFO util.ExitUtil: Exiting with status 2
19/04/04 19:57:46 INFO namenode.NameNode: SHUTDOWN_MSG:

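First check that the firewall is turned off (it is usually configured to stay off at boot). The other likely cause is that the NameNode on the first machine is not running; the log above shows the connection to bigdata-senior01:8020 being refused.

So start the NameNode on the first node, then run the bootstrap on the second node: bin/hdfs namenode -bootstrapStandby

When you see "Storage directory /home/xxx/xxx/name has been successfully formatted", the bootstrap succeeded.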
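
Items 4 and 5 both come down to running the steps in the right order. The dry-run sketch below only prints that order and executes nothing. The host names are taken from the logs above; mapping `nn1` to bigdata-senior01 is an assumption, and `step` and `ha_first_start_plan` are names invented for this example.

```shell
#!/bin/sh
# Dry-run sketch: prints the first-start order for an HA pair instead of
# executing anything.
step() {
    # $1 = where to run it, remaining arguments = the command
    where="$1"; shift
    echo "[$where] $*"
}

ha_first_start_plan() {
    step "each JournalNode host" sbin/hadoop-daemon.sh start journalnode
    step bigdata-senior01 bin/hdfs namenode -format
    step bigdata-senior01 sbin/hadoop-daemon.sh start namenode
    step bigdata-senior02 bin/hdfs namenode -bootstrapStandby
    step bigdata-senior02 sbin/hadoop-daemon.sh start namenode
    # Assumes nn1 is the NameNode ID of bigdata-senior01.
    step bigdata-senior01 bin/hdfs haadmin -transitionToActive nn1
}

ha_first_start_plan
```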
