Preparation:
- Create three virtual machines (master ×1, worker ×2); I'm using Ubuntu Server here.
- Install the JDK
# Install JDK 8
sudo apt-get install openjdk-8-jdk
# Find the installation path
sudo update-alternatives --config java
# Set JAVA_HOME
sudo vim ~/.bashrc
# Append the following lines at the end
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# Verify the installation
java -version
- Configure passwordless SSH login
Reference: Linux passwordless login study notes — https://blog.csdn.net/HongzhuoO/article/details/123451766
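The linked notes boil down to generating a key pair on the master and copying the public key to every node. A minimal sketch — the hostnames and user below are placeholders for your own, and OpenSSH (with `ssh-copy-id`) is assumed to be installed:

```shell
# On the master: generate an RSA key pair (default path, empty passphrase)
ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa
# Copy the public key to every node, including the master itself
ssh-copy-id user@hdfs_master
ssh-copy-id user@hdfs_name1
ssh-copy-id user@hdfs_name2
# Verify: this should run without a password prompt
ssh user@hdfs_name1 hostname
```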
- Download the Hadoop tarball and create a directory to store it.
- Extract it; I'm using version 3.3.1 here:
tar -xzvf hadoop-3.3.1-aarch64.tar.gz
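For reference, the release can be fetched from the Apache archive. A sketch, assuming the aarch64 build used above — pick the artifact matching your CPU architecture (plain `hadoop-3.3.1.tar.gz` for x86_64), and note that `~/hadoop_ss` matches the `HADOOP_HOME` configured in the next step:

```shell
mkdir -p ~/hadoop_ss && cd ~/hadoop_ss
# aarch64 binary build of Hadoop 3.3.1
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1-aarch64.tar.gz
tar -xzvf hadoop-3.3.1-aarch64.tar.gz
```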
- Edit the shell profile and append the Hadoop paths
# Edit the user's profile
sudo vim ~/.bashrc
# Append the following entries
export HADOOP_HOME=$HOME/hadoop_ss/hadoop-3.3.1
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
# After editing, log out and back in, or run:
. ~/.bashrc
- Create the storage directories under the Hadoop home directory
# Enter the Hadoop home directory
cd $HADOOP_HOME
# Create the dfs and tmp directories
mkdir dfs tmp
# Enter the newly created dfs directory
cd dfs
# Create the data and name directories
mkdir data name
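The same layout can be created in one command with `mkdir -p`. A sketch using a throwaway temp directory in place of `$HADOOP_HOME`:

```shell
HADOOP_DIR="$(mktemp -d)"   # stand-in for $HADOOP_HOME
# Create tmp, dfs/data, and dfs/name in one go (parents included)
mkdir -p "$HADOOP_DIR/tmp" "$HADOOP_DIR/dfs/data" "$HADOOP_DIR/dfs/name"
# List the created subdirectories of dfs
ls "$HADOOP_DIR/dfs"
```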
- Switch to the Hadoop configuration directory to begin configuring
# Switch to the Hadoop configuration directory
cd $HADOOP_HOME/etc/hadoop
- Edit the workers file (called slaves in Hadoop 2.x; workers in 3.x)
# Edit workers
vim workers
# Delete the default localhost entry and add all hostnames
# (the ones configured earlier in the hosts file), for example:
hdfs_master
hdfs_name1
hdfs_name2
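For reference, the hosts entries those names resolve through look like this. The IP addresses below are placeholders — substitute your VMs' actual addresses, and make sure the same entries exist in /etc/hosts on every node:

```
# /etc/hosts on every node (example addresses)
192.168.1.10 hdfs_master
192.168.1.11 hdfs_name1
192.168.1.12 hdfs_name2
```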
- Edit the hadoop-env.sh script and set the JAVA_HOME path
# Find the JAVA_HOME line, uncomment it, and set it to the host's actual
# Java home path, for example:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
- Edit the core-site.xml configuration file
# Add the following inside the configuration node (USERNAME is a placeholder
# for the account that owns the Hadoop directory):
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <!-- devhdfsmaster is the master node's hostname -->
        <value>hdfs://devhdfsmaster:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <!-- absolute path to Hadoop's tmp directory; if Hadoop lives in your
             home directory, include the home directory prefix -->
        <value>file:/home/USERNAME/hadoop_ss/hadoop-3.3.1/tmp</value>
    </property>
</configuration>
- Edit the hdfs-site.xml configuration file
# Modify the configuration node as shown (USERNAME and MASTER_HOSTNAME are
# placeholders for your own values):
<configuration>
    <!-- Disabling permission checks is for development/testing ONLY;
         never use this in production or pre-release environments -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <!-- path to the name directory created earlier; include the home
             directory prefix if Hadoop lives there -->
        <value>file:/home/USERNAME/hadoop_ss/hadoop-3.3.1/dfs/name</value>
    </property>
    <!-- note: the DataNode storage key is dfs.datanode.data.dir -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/USERNAME/hadoop_ss/hadoop-3.3.1/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <!-- number of replicas (here, the number of worker nodes) -->
        <value>2</value>
    </property>
    <property>
        <name>dfs.http.address</name>
        <value>MASTER_HOSTNAME:50070</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>MASTER_HOSTNAME:50090</value>
    </property>
</configuration>
- Edit the mapred-site.xml configuration file
<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property> <property> <name>mapreduce.jobhistory.address</name> <value>Master主机名:10020</value> </property> <property> <name>mapreduce.jobhistory.webapp.address</name> <value>Master主机名:19888</value> </property> </configuration>
- Edit the yarn-site.xml configuration file
<?xml version="1.0"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <configuration> <!-- Site specific YARN configuration properties --> <property> <name>yarn.resourcemanager.address</name> <value>Master主机名:8032</value> </property> <property> <name>yarn.resourcemanager.scheduler.address</name> <value>Master主机名:8030</value> </property> <property> <name>yarn.resourcemanager.resource-tracker.address</name> <value>Master主机名:8031</value> </property> <property> <name>yarn.resourcemanager.admin.address</name> <value>Master主机名:8033</value> </property> <property> <name>yarn.resourcemanager.webapp.address</name> <value>Master主机名:8088</value> </property> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> <property> <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name> <value>org.apache.hadoop.mapred.ShuffleHandler</value> </property> </configuration>
- Copy the configured Hadoop directory to the other nodes
scp -r <hadoop directory> <user>@<hostname>:
Don't forget: on every other node you must also edit the shell profile and append the Hadoop paths, as described in the earlier step.
- Fix the Hadoop directory permissions on every node
chmod -R a+w hadoop_ss/hadoop-3.3.1
- Format the NameNode
$HADOOP_HOME/bin/hdfs namenode -format
- Common commands
# Start all daemons
start-all.sh
# Stop all daemons
stop-all.sh
# Upload a photo to the /TestPhoto directory
hadoop fs -put test.png /TestPhoto
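After start-all.sh, a quick way to confirm the cluster came up — run these on the master; which daemons appear depends on your setup, but roughly NameNode/SecondaryNameNode/ResourceManager on the master and DataNode/NodeManager on the workers:

```shell
# List the running Java daemons on this node
jps
# Summarize HDFS capacity and the number of live DataNodes
hdfs dfsadmin -report
# List the root of HDFS
hadoop fs -ls /
```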
- Open the web console at http://<master IP>:50070
Notes:
If a web page or program hits upload/download errors or "host not found" problems, try copying the Hadoop servers' hosts entries to the machine running that program or web server (on Windows the hosts file lives at C:\Windows\System32\drivers\etc). This is still a workaround rather than a proper fix; a better solution is pending.