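Problem: After formatting (initializing) the NameNode, the DataNode on hadoop102 did not start, even though the logs contain DataNode entries. The log reports: java.io.IOException: All specified directories have failed to load.
Cause: The first NameNode format ran into problems, so I went through the configuration files and corrected them. After changing the configuration I formatted the NameNode again without first deleting the existing NameNode and DataNode data. The new format gave the NameNode a new clusterID while the DataNode kept the old one, so the two clusterIDs no longer match.
Solution 1:
1. Find the NameNode's VERSION file in the following path: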
[xiaobai@hadoop102 current]$ pwd
/opt/module/hadoop-3.2.2/data/dfs/name/current
[xiaobai@hadoop102 current]$ vim VERSION
The NameNode's VERSION file looks like this:
#Fri Jun 11 23:59:05 CST 2021
namespaceID=643636441
clusterID=CID-b15b1f6e-7e10-46e4-b39f-be02812c6765
cTime=1623427145615
storageType=NAME_NODE
blockpoolID=BP-1094756810-192.168.10.102-1623427145615
layoutVersion=-65
2. Find the DataNode's VERSION file in the following path:
[xiaobai@hadoop102 current]$ pwd
/opt/module/hadoop-3.2.2/data/dfs/data/current
[xiaobai@hadoop102 current]$ vim VERSION
The DataNode's VERSION file looks like this:
#Fri Jun 11 23:44:16 CST 2021
storageID=DS-76a2b31c-db46-4364-8acc-ea16f2bae593
clusterID=CID-062f154c-b852-49ff-9558-e8d0f9ac95b3
cTime=0
datanodeUuid=aaa74fa5-917f-49e7-9303-7bb1eef5ce55
storageType=DATA_NODE
layoutVersion=-57
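Comparing the two files by eye is easy to get wrong, so the check can be scripted. The following is a minimal sketch, not part of Hadoop: the function names cluster_id and same_cluster_id are made up for this example, and it assumes each VERSION file contains exactly one line starting with clusterID=.

```shell
# Print the value of the clusterID property from a VERSION file.
cluster_id() {
  sed -n 's/^clusterID=//p' "$1"
}

# Compare the clusterID of two VERSION files.
# Returns 0 if both are non-empty and equal, non-zero otherwise.
same_cluster_id() {
  nn_id=$(cluster_id "$1")
  dn_id=$(cluster_id "$2")
  echo "NameNode clusterID: $nn_id"
  echo "DataNode clusterID: $dn_id"
  [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}

# Example call with the paths from this post:
# same_cluster_id /opt/module/hadoop-3.2.2/data/dfs/name/current/VERSION \
#                 /opt/module/hadoop-3.2.2/data/dfs/data/current/VERSION
```

With the two files shown above, the function would print both IDs and return a non-zero status, because CID-b15b1f6e... and CID-062f154c... differ.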
3. Replace the clusterID in the DataNode's VERSION file with the clusterID from the NameNode's VERSION file:
#Fri Jun 11 23:44:16 CST 2021
storageID=DS-76a2b31c-db46-4364-8acc-ea16f2bae593
clusterID=CID-b15b1f6e-7e10-46e4-b39f-be02812c6765
cTime=0
datanodeUuid=aaa74fa5-917f-49e7-9303-7bb1eef5ce55
storageType=DATA_NODE
layoutVersion=-57
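The manual edit in step 3 can also be done with a small script, which avoids copy-and-paste mistakes. This is only a sketch under a few assumptions: sync_cluster_id is a hypothetical helper name, both files hold a single clusterID= line, and HDFS is stopped while the file is changed. It keeps a .bak copy of the DataNode file before rewriting it.

```shell
# Copy the clusterID from the NameNode VERSION file ($1) into the
# DataNode VERSION file ($2). The original DataNode file is kept as "$2.bak".
sync_cluster_id() {
  nn_version=$1
  dn_version=$2
  new_id=$(sed -n 's/^clusterID=//p' "$nn_version")
  if [ -z "$new_id" ]; then
    echo "no clusterID found in $nn_version" >&2
    return 1
  fi
  cp "$dn_version" "$dn_version.bak"
  # Rewrite only the clusterID line; every other property is left untouched.
  sed "s/^clusterID=.*/clusterID=$new_id/" "$dn_version.bak" > "$dn_version"
}

# Example call with the paths from this post:
# sync_cluster_id /opt/module/hadoop-3.2.2/data/dfs/name/current/VERSION \
#                 /opt/module/hadoop-3.2.2/data/dfs/data/current/VERSION
```

If the cluster has several DataNodes with a stale clusterID, the same call would have to be repeated on each of them, using a copy of the NameNode's VERSION file.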
4. Restart HDFS so that the DataNode starts with the corrected clusterID:
[xiaobai@hadoop102 hadoop-3.2.2]$ sbin/start-dfs.sh
Check the running processes on each node:
[xiaobai@hadoop102 hadoop-3.2.2]$ jps
23653 DataNode
23866 Jps
21805 NameNode
[xiaobai@hadoop103 hadoop-3.2.2]$ jps
13912 DataNode
14617 Jps
[xiaobai@hadoop104 opt]$ jps
13639 SecondaryNameNode
13544 DataNode
13677 Jps
Solution 2:
First delete the NameNode and DataNode data, then format the NameNode again.
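A rough sketch of this cleanup follows. It assumes the layout used in this post, where all NameNode and DataNode data lives under the data directory of the Hadoop home; wipe_hdfs_state is a hypothetical helper name. Be aware that this destroys everything stored in HDFS, so it only suits a fresh or disposable cluster.

```shell
# Delete the HDFS data directory under a Hadoop home ($1).
# Refuses to run if the argument is empty or not a directory.
wipe_hdfs_state() {
  hadoop_home=$1
  if [ ! -d "$hadoop_home" ]; then
    echo "not a directory: $hadoop_home" >&2
    return 1
  fi
  rm -rf "$hadoop_home/data"
}

# Possible sequence (wipe on every node, format only on the NameNode host):
# sbin/stop-dfs.sh
# wipe_hdfs_state /opt/module/hadoop-3.2.2
# bin/hdfs namenode -format
# sbin/start-dfs.sh
```

Because the DataNode directories are empty afterwards, each DataNode picks up the new clusterID from the NameNode on its next start, and the mismatch cannot occur.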
Reformat the NameNode ==>