If the namenode -format (format) operation is run more than once during installation, the datanode's clusterID will no longer match the namenode's clusterID, and the DataNode process will fail to come up after Hadoop starts.
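For reference, the format in question is the one-off initialization command run from the Hadoop install directory during setup; each time it runs, the namenode generates a brand-new clusterID, while any existing datanode data directory keeps the old one:

[root@centos151 hadoop-2.7.3]# bin/hdfs namenode -format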
Online tutorials rarely explain how to get at the namenode's VERSION file, and it genuinely took me a while to track down.
【How do I check the logs for this?】
The logs live under logs/ in the Hadoop installation directory.
Run ll in the logs directory to see which files are most recent; the error shows up in the latest datanode log.
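Something along these lines works; the log file name includes your user and hostname, so it will differ on other machines:

[root@centos151 hadoop-2.7.3]# cd logs
[root@centos151 logs]# ll -t | head
[root@centos151 logs]# tail -n 100 hadoop-root-datanode-centos151.log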
Here is an excerpt from hadoop-root-datanode-centos151.log1:
2021-03-24 19:20:49,360 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2021-03-24 19:20:49,657 INFO org.apache.hadoop.hdfs.server.common.Storage: Using 1 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=1, dataDirs=1)
2021-03-24 19:20:49,663 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /tmp/hadoop-root/dfs/data/in_use.lock acquired by nodename 3928@localhost
2021-03-24 19:20:49,669 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/tmp/hadoop-root/dfs/data/
java.io.IOException: Incompatible clusterIDs in /tmp/hadoop-root/dfs/data: namenode clusterID = CID-575eba8a-32a5-4295-932e-b7ca4cd91284; datanode clusterID = CID-94c1b110-34db-414b-8714-a298f4bc8d66
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:775)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:300)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:416)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:395)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:573)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1362)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1327)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802)
        at java.lang.Thread.run(Thread.java:745)
2021-03-24 19:20:49,671 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to centos151/127.0.0.1:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:574)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1362)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1327)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802)
        at java.lang.Thread.run(Thread.java:745)
2021-03-24 19:20:49,671 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to centos151/127.0.0.1:9000
2021-03-24 19:20:49,773 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2021-03-24 19:20:51,774 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2021-03-24 19:20:51,775 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2021-03-24 19:20:51,837 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
You can see the java.io.IOException error complaining about incompatible clusterIDs, which confirms the mismatch.
【Solution】
Change the datanode's clusterID so it matches the namenode's.
Go straight to /tmp/hadoop-root/dfs/name, enter the current folder, and you'll find the VERSION file there; open it and you can see the namenode's clusterID.
These directories are not under the Hadoop install directory, so ls there won't show them; they end up under /tmp because hadoop.tmp.dir defaults to /tmp/hadoop-${user.name} when it isn't overridden in core-site.xml.
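As an aside: since anything under /tmp may get cleaned up, a common practice is to point hadoop.tmp.dir at a persistent directory in core-site.xml. The path below is only an example, not what this walkthrough uses; the steps that follow stay with the default /tmp location.

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.7.3/data/tmp</value>
  </property>
</configuration>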
[root@centos151 hadoop-2.7.3]# cd /tmp/hadoop-root/dfs/name
[root@centos151 name]# ls
current in_use.lock
[root@centos151 name]# cd current/
Run cat VERSION here and you can see the namenode's clusterID; copy it.
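For reference, a namenode VERSION file looks roughly like the following; the clusterID line is the one that matters, and the other values here are only illustrative (yours will differ):

[root@centos151 current]# cat VERSION
#Wed Mar 24 19:20:00 CST 2021
namespaceID=1452370633
clusterID=CID-575eba8a-32a5-4295-932e-b7ca4cd91284
cTime=0
storageType=NAME_NODE
blockpoolID=BP-1032082077-127.0.0.1-1616584800000
layoutVersion=-63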
Then open the VERSION file under /tmp/hadoop-root/dfs/data/current and change the datanode's clusterID line to the namenode's clusterID.
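You can edit it with vi, or do the substitution in one line with sed. This is just a sketch using the namenode clusterID from the log above; substitute your own value and keep a backup of the file first:

[root@centos151 current]# cp /tmp/hadoop-root/dfs/data/current/VERSION /tmp/hadoop-root/dfs/data/current/VERSION.bak
[root@centos151 current]# sed -i 's/^clusterID=.*/clusterID=CID-575eba8a-32a5-4295-932e-b7ca4cd91284/' /tmp/hadoop-root/dfs/data/current/VERSION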
Then shut Hadoop down with sbin/stop-all.sh and bring the services back up with sbin/start-all.sh.
Run jps after the restart and you can see the DataNode is now there.
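On a single-node setup like this one, the output looks something like the following; the PIDs are illustrative and the exact daemon list depends on what you start:

[root@centos151 hadoop-2.7.3]# jps
3155 NameNode
3298 DataNode
3489 SecondaryNameNode
3652 ResourceManager
3756 NodeManager
4120 Jps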