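Fixing a cluster where the NameNode starts normally but the DataNode does not
After starting a Hadoop 2.8.3 cluster, the NameNode came up but the DataNode process did not. The DataNode log showed the following error: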
java.io.IOException: Incompatible clusterIDs in /home/casliyang/hadoop2/hadoop-2.2.0/metadata/data: namenode clusterID = CID-2cc69ada-3730-4c79-8384-c725fa85859a; datanode clusterID = CID-3e649eb6-cdb3-4a0c-aad8-5948c66bf282
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:837)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:808)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:722)
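The exception message already contains both IDs, so they can be pulled out of the log for later comparison. The following is a minimal sketch, assuming the message has the shape shown above; the function name extract_cluster_ids and the regular expression are illustrative and not part of Hadoop.

```python
import re

# Matches the "Incompatible clusterIDs" message; \s* tolerates the line
# wrapping that log viewers or copy-paste sometimes introduce.
PATTERN = re.compile(
    r"Incompatible clusterIDs in (?P<dir>\S+):\s*"
    r"namenode clusterID\s*=\s*(?P<nn>CID-[0-9a-f-]+);\s*"
    r"datanode clusterID\s*=\s*(?P<dn>CID-[0-9a-f-]+)"
)


def extract_cluster_ids(log_text):
    """Return (data_dir, namenode_id, datanode_id), or None if not found."""
    match = PATTERN.search(log_text)
    if match is None:
        return None
    return match.group("dir"), match.group("nn"), match.group("dn")


LOG = (
    "java.io.IOException: Incompatible clusterIDs in "
    "/home/casliyang/hadoop2/hadoop-2.2.0/metadata/data: namenode clusterID\n"
    "= CID-2cc69ada-3730-4c79-8384-c725fa85859a; datanode clusterID\n"
    "= CID-3e649eb6-cdb3-4a0c-aad8-5948c66bf282"
)

if __name__ == "__main__":
    print(extract_cluster_ids(LOG))
```

The directory in the first field is the DataNode storage directory whose VERSION file carries the stale ID.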
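Some articles suggest deleting the data files, reformatting, and restarting the cluster. That approach is far too drastic and cannot be used in a production environment, so I followed the approach from another group of articles instead: editing the clusterID. The mismatch typically appears after the NameNode has been formatted again, which generates a new clusterID while the DataNode keeps the old one.
Step 1: Open hdfs-site.xml and find the directories where the NameNode and the DataNode store their metadata.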
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop-2.8.3/data/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop-2.8.3/data/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>hadoop01:50090</value>
</property>
</configuration>
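If you would rather read the two directories programmatically than by eye, something like the following could work. It is only a sketch: the helper name read_hdfs_dirs is made up for illustration, and SAMPLE is a trimmed copy of the configuration above.

```python
import xml.etree.ElementTree as ET


def read_hdfs_dirs(xml_text):
    """Return the NameNode and DataNode storage directories from hdfs-site.xml."""
    wanted = {"dfs.namenode.name.dir", "dfs.datanode.data.dir"}
    found = {}
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in wanted:
            # A value may list several directories separated by commas.
            found[name] = [v.strip() for v in value.split(",") if v.strip()]
    return found


SAMPLE = """<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop-2.8.3/data/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop-2.8.3/data/data</value>
</property>
</configuration>"""

if __name__ == "__main__":
    for name, dirs in read_hdfs_dirs(SAMPLE).items():
        print(name, dirs)
```

For a real file, read its contents first and pass the text in. Values written as file:// URIs would need the scheme stripped before being used as local paths; the sketch does not handle that.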
Step 2: Open the current/VERSION file under the NameNode directory:
#Fri Aug 17 23:37:02 CST 2018
namespaceID=1592472446
clusterID=CID-795fc547-449e-4762-833b-eb8b0d86f490
cTime=1534520222760
storageType=NAME_NODE
blockpoolID=BP-1907567558-192.168.59.121-1534520222760
layoutVersion=-63
Open the current/VERSION file under the DataNode directory:
#Sat Aug 18 00:21:38 CST 2018
storageID=DS-341c33ea-bcdf-42b2-9125-6bd79ca3970b
clusterID=CID-795fc547-449e-4762-833b-eb8b0d86f490
cTime=0
datanodeUuid=f9e861c6-2604-46f7-acac-00b543fb4c4a
storageType=DATA_NODE
layoutVersion=-57
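If the clusterID in the NameNode's VERSION file differs from the clusterID in the DataNode's VERSION file, and the two values are the same ones reported in the error message, then the mismatch is confirmed as the cause. (In the listings above the two values are already identical, which is the state you want to end up with.)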
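The comparison can also be scripted. The sketch below assumes the VERSION files are simple key=value text with # comment lines, as in the listings above; parse_version and compare_cluster_ids are hypothetical helper names.

```python
def parse_version(text):
    """Parse the key=value lines of a VERSION file into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the timestamp comment
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def compare_cluster_ids(namenode_text, datanode_text):
    """Return (namenode_id, datanode_id, ids_match)."""
    nn_id = parse_version(namenode_text).get("clusterID")
    dn_id = parse_version(datanode_text).get("clusterID")
    # A missing clusterID on either side never counts as a match.
    return nn_id, dn_id, nn_id is not None and nn_id == dn_id


NN_VERSION = """#Fri Aug 17 23:37:02 CST 2018
namespaceID=1592472446
clusterID=CID-795fc547-449e-4762-833b-eb8b0d86f490
storageType=NAME_NODE
"""

DN_VERSION = """#Sat Aug 18 00:21:38 CST 2018
storageID=DS-341c33ea-bcdf-42b2-9125-6bd79ca3970b
clusterID=CID-795fc547-449e-4762-833b-eb8b0d86f490
storageType=DATA_NODE
"""

if __name__ == "__main__":
    print(compare_cluster_ids(NN_VERSION, DN_VERSION))
```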
Next, change the DataNode's clusterID so that it matches the NameNode's clusterID, then restart the cluster.
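The edit itself is a one-line change and can be made in any text editor. If you want to script it, here is a rough sketch that keeps a backup copy first. The function set_cluster_id and the .bak suffix are my own choices, and the demo runs against a throwaway file with a placeholder old ID rather than a real DataNode directory.

```python
import shutil
import tempfile
from pathlib import Path


def set_cluster_id(version_path, new_cluster_id):
    """Rewrite the clusterID line of a VERSION file, keeping a .bak copy."""
    path = Path(version_path)
    lines = path.read_text().splitlines()
    if not any(line.startswith("clusterID=") for line in lines):
        raise ValueError(f"no clusterID entry in {path}")
    # Keep the original next to the file so the change can be reverted.
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)
    updated = [
        "clusterID=" + new_cluster_id if line.startswith("clusterID=") else line
        for line in lines
    ]
    path.write_text("\n".join(updated) + "\n")
    return backup


if __name__ == "__main__":
    # Demo on a temporary file; the old ID here is only a placeholder.
    with tempfile.TemporaryDirectory() as tmp:
        version = Path(tmp) / "VERSION"
        version.write_text(
            "storageID=DS-341c33ea-bcdf-42b2-9125-6bd79ca3970b\n"
            "clusterID=CID-3e649eb6-cdb3-4a0c-aad8-5948c66bf282\n"
            "storageType=DATA_NODE\n"
        )
        set_cluster_id(version, "CID-795fc547-449e-4762-833b-eb8b0d86f490")
        print(version.read_text())
```

Stop the DataNode before touching the file. If dfs.datanode.data.dir lists several directories, each one has its own current/VERSION file, so the change has to be repeated for each of them.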