This setup has been tested end to end and works; handling for the most common errors is included at the end.
1. Download the packages
1.1. Hadoop 3.0.0: http://archive.apache.org/dist/hadoop/core/hadoop-3.0.0/
1.2. winutils: https://github.com/steveloughran/winutils
2. Extract the packages
With administrator privileges, extract hadoop-3.0.0.tar.gz to D:\hadoop
Extract winutils-master.zip to D:\hadoop
3. Configure environment variables
Set HADOOP_HOME to the Hadoop install directory
Add %HADOOP_HOME%\bin to PATH
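From an elevated PowerShell, the two variables can be set like this (a sketch; the paths assume the D:\hadoop layout used in this guide, and note that setx truncates values longer than 1024 characters, so the System Properties dialog is safer if your PATH is long):

```shell
# Point HADOOP_HOME at the extracted distribution
setx HADOOP_HOME "D:\hadoop\hadoop-3.0.0"
# Append the Hadoop bin directory to the user PATH
setx PATH "$env:PATH;D:\hadoop\hadoop-3.0.0\bin"
```

Open a new terminal afterwards; setx does not affect the current session.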
4. Prepare the data directories
This step can be skipped; they are created automatically when you run the format command
Create the directory D:\hadoop\hadoop-3.0.0\data\namenode
Create the directory D:\hadoop\hadoop-3.0.0\data\datanode
5. Edit the configuration files
5.1. Edit D:\hadoop\hadoop-3.0.0\etc\hadoop\core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/D:/hadoop/hadoop-3.0.0/data/tmp</value>
</property>
<!-- dfs.name.dir is a deprecated HDFS key; dfs.namenode.name.dir in hdfs-site.xml takes precedence -->
<property>
<name>dfs.name.dir</name>
<value>/D:/hadoop/hadoop-3.0.0/data/name</value>
</property>
<!-- fs.default.name is the deprecated alias of fs.defaultFS; both work in Hadoop 3 -->
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
5.2. Edit D:\hadoop\hadoop-3.0.0\etc\hadoop\hdfs-site.xml
<configuration>
<!-- A replication factor of 1 is enough for a single-node setup -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<!-- Note the forward-slash "/" path separator -->
<property>
<name>dfs.namenode.name.dir</name>
<value>/D:/hadoop/hadoop-3.0.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/D:/hadoop/hadoop-3.0.0/data/datanode</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>/D:/hadoop/hadoop-3.0.0/data/snn</value>
</property>
<property>
<name>fs.checkpoint.edits.dir</name>
<value>/D:/hadoop/hadoop-3.0.0/data/snn</value>
</property>
</configuration>
5.3. Edit D:\hadoop\hadoop-3.0.0\etc\hadoop\mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
5.4. Edit D:\hadoop\hadoop-3.0.0\etc\hadoop\yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
6. Set JAVA_HOME
Edit D:\hadoop\hadoop-3.0.0\etc\hadoop\hadoop-env.cmd, replacing
set JAVA_HOME=%JAVA_HOME%
with (substitute your own JDK install directory; avoid paths containing spaces)
@rem set JAVA_HOME=%JAVA_HOME%
set JAVA_HOME=D:\java\jdk8_181
7. Replace the bin directory
Replace D:\hadoop\hadoop-3.0.0\bin
with D:\hadoop\winutils-master\hadoop-3.0.0\bin
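The swap can also be done from PowerShell; a sketch that keeps the original bin as a backup (directory names as above):

```shell
# Keep the stock bin directory as a backup
Move-Item D:\hadoop\hadoop-3.0.0\bin D:\hadoop\hadoop-3.0.0\bin.bak
# Put the winutils build for Hadoop 3.0.0 in its place
Copy-Item -Recurse D:\hadoop\winutils-master\hadoop-3.0.0\bin D:\hadoop\hadoop-3.0.0\bin
```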
8. Copy hadoop.dll
Copy D:\hadoop\hadoop-3.0.0\bin\hadoop.dll to C:\Windows\System32
9. Format the NameNode
In D:\hadoop\hadoop-3.0.0\etc\hadoop, Shift + right-click, open Windows PowerShell, and run: hdfs namenode -format
10. Start the services
In D:\hadoop\hadoop-3.0.0\sbin, Shift + right-click, open Windows PowerShell, and run: ./start-all.cmd
11. Verify the services
11.1. YARN ResourceManager UI: http://127.0.0.1:8088/
11.2. HDFS NameNode UI: http://localhost:9870/
11.3. Verify HDFS operations
Run file-system commands against HDFS:
# hdfs dfs -mkdir /data
# hdfs dfs -ls /
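Beyond mkdir and ls, a quick round trip of a local file exercises both the NameNode and a DataNode; a sketch, where test.txt stands in for any file in the current directory:

```shell
# Upload a local file into the /data directory created above
hdfs dfs -put test.txt /data
# Read it back from HDFS; the file contents should be printed
hdfs dfs -cat /data/test.txt
```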
12. Troubleshooting
12.1. JAVA_HOME not set in hadoop-env.cmd
If step 6 is skipped, Hadoop defaults to a 32-bit VM even when a system-wide JAVA_HOME environment variable is set, and start-all.cmd fails with the following exception:
2021-01-28 17:23:16,522 ERROR namenode.NameNode: Failed to start namenode.
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:606)
at org.apache.hadoop.fs.FileUtil.canWrite(FileUtil.java:971)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:613)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:573)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:365)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:221)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1072)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:704)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:665)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:727)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:950)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:929)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1653)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1720)
2021-01-28 17:23:16,547 INFO util.ExitUtil: Exiting with status 1: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
2021-01-28 17:23:16,550 INFO namenode.NameNode: SHUTDOWN_MSG:
12.1.1. Fix
Set JAVA_HOME as in step 6, delete the data directory D:\hadoop\hadoop-3.0.0\data, then rerun, in order: hdfs namenode -format and ./start-all.cmd.
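In PowerShell, the recovery sequence looks roughly like this (a sketch; it assumes %HADOOP_HOME%\bin is on PATH per step 3):

```shell
# Remove the stale storage directories so the format starts clean
Remove-Item -Recurse -Force D:\hadoop\hadoop-3.0.0\data
# Re-initialize the HDFS metadata
hdfs namenode -format
# Restart the HDFS and YARN daemons
D:\hadoop\hadoop-3.0.0\sbin\start-all.cmd
```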
12.1.2. Reproducing the error
Before the fix, start-all.cmd fails and reports VM type = 32-bit:
Step 1: hdfs namenode -format
Step 2: ./start-all.cmd
After the fix, startup succeeds and reports VM type = 64-bit:
Step 1: hdfs namenode -format
Step 2: ./start-all.cmd
12.2. Formatting the NameNode repeatedly
Running hdfs namenode -format more than once causes the following exception the next time ./start-all.cmd starts the Hadoop services:
2021-01-29 11:17:26,489 WARN common.Storage: Failed to add storage directory [DISK]file:/D:/hadoop/hadoop-3.0.0/data/datanode
java.io.IOException: Incompatible clusterIDs in D:\hadoop\hadoop-3.0.0\data\datanode: namenode clusterID = CID-8d357e36-84d8-4826-88de-46eed7496841; datanode clusterID = CID-ea42610f-00fc-438a-9a85-180620d5c556
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:722)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:286)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:399)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:379)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:544)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1690)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1650)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:376)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
at java.lang.Thread.run(Thread.java:748)
2021-01-29 11:17:26,494 ERROR datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid 4b462ddc-e84c-4582-bd64-eb056507837e) service to localhost/127.0.0.1:9000. Exiting.
java.io.IOException: All specified directories have failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:545)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1690)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1650)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:376)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
at java.lang.Thread.run(Thread.java:748)
2021-01-29 11:17:26,500 WARN datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid 4b462ddc-e84c-4582-bd64-eb056507837e) service to localhost/127.0.0.1:9000
2021-01-29 11:17:26,602 INFO datanode.DataNode: Removed Block pool <registering> (Datanode Uuid 4b462ddc-e84c-4582-bd64-eb056507837e)
2021-01-29 11:17:28,603 WARN datanode.DataNode: Exiting Datanode
2021-01-29 11:17:28,609 INFO datanode.DataNode: SHUTDOWN_MSG:
12.2.1. Fix
Delete the data directory D:\hadoop\hadoop-3.0.0\data, then rerun, in order: hdfs namenode -format and ./start-all.cmd.
12.2.2. Reproducing the error
12.2.2.1. Run the format command repeatedly: hdfs namenode -format
12.2.2.2. Start the Hadoop services: ./start-all.cmd