Today, while starting HDFS, I ran hdfs namenode -format. After that, none of the datanodes came up on the next start. The datanode log shows the following:
2018-08-30 10:22:59,831 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /home/hadoop/hadoop/tmp/dfs/data: namenode clusterID = CID-7d9e3b94-040f-4021-8f37-394e3446574a; datanode clusterID = CID-ea1d2d26-d301-48a1-92cb-ca21e9b30a20
2018-08-30 10:22:59,832 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to hadoop1/192.168.25.100:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1338)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1304)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:226)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:867)
at java.lang.Thread.run(Thread.java:748)
The log shows that the startup failure is caused by the namenode and the datanode having different clusterIDs:
namenode clusterID = CID-7d9e3b94-040f-4021-8f37-394e3446574a
datanode clusterID = CID-ea1d2d26-d301-48a1-92cb-ca21e9b30a20
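
The same mismatch can be confirmed without starting the daemons by reading the clusterID line from each side's VERSION file. The following is a minimal sketch, not part of Hadoop itself: the helper name read_cluster_id is made up for illustration, and the example paths assume the default layout under hadoop.tmp.dir (dfs/name/current/VERSION on the namenode, dfs/data/current/VERSION on the datanode), which will differ if dfs.namenode.name.dir or dfs.datanode.data.dir are set explicitly.

```python
from pathlib import Path
from typing import Optional


def read_cluster_id(version_text: str) -> Optional[str]:
    """Return the clusterID value from the text of a VERSION file, or None."""
    for line in version_text.splitlines():
        line = line.strip()
        # VERSION is a Java properties file: skip comments and non key=value lines.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "clusterID":
            return value.strip()
    return None


def cluster_id_of(version_file: Path) -> Optional[str]:
    """Read a VERSION file from disk and return its clusterID."""
    return read_cluster_id(version_file.read_text())


# Example usage (assumed paths, adjust to your own configuration):
# name_id = cluster_id_of(Path("/home/hadoop/hadoop/tmp/dfs/name/current/VERSION"))
# data_id = cluster_id_of(Path("/home/hadoop/hadoop/tmp/dfs/data/current/VERSION"))
# print("match" if name_id == data_id else "mismatch", name_id, data_id)
```

If the two values differ, the datanode will refuse to register with the namenode, as in the log above.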
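Solution:
In essence, the fix is to make the clusterID in the datanode's VERSION file match the namenode's. Here I simply deleted all the folders and data under ${hadoop.tmp.dir}/dfs/data on the datanode, so that the datanode information is regenerated when HDFS starts, then ran -format on the namenode again, and finally ran start-dfs.sh. Note that deleting this directory discards every block stored on that datanode, so it is only acceptable on a cluster whose data you do not need to keep.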
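
An alternative to deleting the data directory is to edit the datanode's VERSION file so that its clusterID line carries the namenode's value. The sketch below only shows the text rewrite. The function name with_cluster_id is hypothetical, and the VERSION path in the usage comment assumes the default layout under hadoop.tmp.dir. Stop the datanode and back up the file before changing it. Because the namenode was reformatted, its old metadata is gone either way, so this only lets the datanode register again and does not recover old files.

```python
from pathlib import Path


def with_cluster_id(version_text: str, new_id: str) -> str:
    """Return VERSION file text with its clusterID line set to new_id."""
    out = []
    replaced = False
    for line in version_text.splitlines():
        if line.strip().startswith("clusterID="):
            out.append(f"clusterID={new_id}")
            replaced = True
        else:
            # Keep every other property and comment line unchanged.
            out.append(line)
    if not replaced:
        raise ValueError("no clusterID line found in VERSION text")
    return "\n".join(out) + "\n"


# Example usage (assumed path, and the namenode clusterID taken from the log):
# version = Path("/home/hadoop/hadoop/tmp/dfs/data/current/VERSION")
# new_text = with_cluster_id(version.read_text(),
#                            "CID-7d9e3b94-040f-4021-8f37-394e3446574a")
# version.write_text(new_text)
```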