1. Environment Preparation
1.1 Prepare the cluster machines
IP | Hostname | Role | Notes |
---|---|---|---|
192.168.110.42 | hadoop-master | master | |
192.168.110.43 | hadoop-data01 | slave | |
192.168.110.44 | hadoop-data02 | slave | |
1.2 Disable the firewall and SELinux
Note: unless stated otherwise, run every step on all hosts.
systemctl stop firewalld.service && systemctl disable firewalld.service
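The heading also calls for disabling SELinux; the usual commands on CentOS/RHEL (the distribution is an assumption, since the text does not name one) are:
setenforce 0    # switch to permissive mode immediately
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config    # persist across reboots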
1.3 Edit /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.110.42 hadoop-master
192.168.110.43 hadoop-data01
192.168.110.44 hadoop-data02
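Every host needs the same entries; one way to avoid editing the file three times (a sketch, assuming root SSH access to the data nodes) is to edit it on the master and push it out:
scp /etc/hosts root@hadoop-data01:/etc/hosts
scp /etc/hosts root@hadoop-data02:/etc/hosts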
1.4 Set up the data directories (run on the master node)
Full path | Description |
---|---|
/opt/hadoop-2.9.2 | Hadoop installation directory |
/home/hadoop/hd_space/temp | Temporary directory |
/home/hadoop/hd_space/hdfs/name | NameNode storage for the HDFS namespace metadata |
/home/hadoop/hd_space/hdfs/data | Physical storage for HDFS data blocks on each DataNode |
/home/hadoop/hd_space/mapreduce/local | Local scratch directory used while MapReduce tasks run |
/home/hadoop/hd_space/mapreduce/system | MapReduce system directory in HDFS |
rm -rf /home/hadoop/hd_space
mkdir -p /home/hadoop/hd_space/temp
mkdir -p /home/hadoop/hd_space/hdfs/name
mkdir -p /home/hadoop/hd_space/hdfs/data
mkdir -p /home/hadoop/hd_space/mapreduce/local
mkdir -p /home/hadoop/hd_space/mapreduce/system
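The HDFS data and MapReduce local directories are also used on the data nodes, so it is safest to create the same tree there as well; a sketch, assuming root SSH access:
ssh root@hadoop-data01 "mkdir -p /home/hadoop/hd_space/{temp,hdfs/name,hdfs/data,mapreduce/local,mapreduce/system}"
ssh root@hadoop-data02 "mkdir -p /home/hadoop/hd_space/{temp,hdfs/name,hdfs/data,mapreduce/local,mapreduce/system}"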
1.5 Set up passwordless SSH login, from the master node to each data node and from every node to itself
ssh-keygen -t rsa
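Generating the key pair is only half the job; the public key must also be installed on every target node. A minimal sketch, assuming the default key path and root as the login user:
ssh-copy-id root@hadoop-master
ssh-copy-id root@hadoop-data01
ssh-copy-id root@hadoop-data02
Afterwards, ssh hadoop-data01 from the master should log in without prompting for a password.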
2. Configure the Java Environment
2.1 Upload jdk-8u161-linux-x64.tar.gz to the /root directory on each node, then extract it
tar zxvf jdk-8u161-linux-x64.tar.gz -C /usr/local
mv /usr/local/jdk1.8.0_161 /usr/local/jdk8    # the tarball extracts to jdk1.8.0_161, not jdk-8u161-linux-x64
2.2 Configure environment variables on each node
vim /etc/profile
export JAVA_HOME=/usr/local/jdk8
export JRE_HOME=/usr/local/jdk8/jre
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
2.3 Load and verify the environment variables
source /etc/profile
$ java -version
Output like the following confirms the installation succeeded:
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
3. Install and Configure Hadoop
3.1 Download and extract Hadoop
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
tar -zxvf hadoop-2.9.2.tar.gz -C /opt
3.2 Configure environment variables
$ vi /etc/profile
# set hadoop environment
export HADOOP_HOME=/opt/hadoop-2.9.2
export HADOOP_CONF_DIR=/opt/hadoop-2.9.2/etc/hadoop
export YARN_CONF_DIR=/opt/hadoop-2.9.2/etc/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
$ source /etc/profile
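As a quick sanity check that the variables took effect (the first line of output should name the version):
$ hadoop version
Hadoop 2.9.2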
3.3 Edit the Hadoop configuration
Make all of the following changes on the master node, then copy the configuration to the other nodes (the scp commands at the end of this section).
- Configure hadoop-env.sh
$ vim /opt/hadoop-2.9.2/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk8
- Configure yarn-env.sh
$ vim /opt/hadoop-2.9.2/etc/hadoop/yarn-env.sh
export JAVA_HOME=/usr/local/jdk8
- Configure core-site.xml
$ vim /opt/hadoop-2.9.2/etc/hadoop/core-site.xml
<!-- Add the following properties inside the <configuration> element -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hd_space/temp</value>
</property>
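For reference, every snippet in 3.3 goes inside the file's <configuration> element, so the finished core-site.xml looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hd_space/temp</value>
</property>
</configuration>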
- Configure hdfs-site.xml
$ vim /opt/hadoop-2.9.2/etc/hadoop/hdfs-site.xml
<!-- Add the following properties inside the <configuration> element -->
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/hd_space/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/hd_space/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<!-- this cluster has only two DataNodes, so replication cannot exceed 2 -->
<value>2</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop-data01:50090</value>
<description>The secondary namenode http server address and port.
</description>
</property>
- Configure mapred-site.xml
If this file does not exist, create it from the template:
cd /opt/hadoop-2.9.2/etc/hadoop
cp mapred-site.xml.template mapred-site.xml
$ vim /opt/hadoop-2.9.2/etc/hadoop/mapred-site.xml
<!-- Add the following properties inside the <configuration> element -->
<property>
<name>mapreduce.cluster.local.dir</name>
<value>/home/hadoop/hd_space/mapreduce/local</value>
</property>
<property>
<name>mapreduce.cluster.system.dir</name>
<value>/home/hadoop/hd_space/mapreduce/system</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-master:19888</value>
</property>
- Configure masters
# Replace localhost with the NameNode's hostname; if the file does not exist, create a new masters file
$ vim /opt/hadoop-2.9.2/etc/hadoop/masters
hadoop-master
- Configure slaves
# Remove localhost and list the hostnames of all DataNodes
$ vim /opt/hadoop-2.9.2/etc/hadoop/slaves
hadoop-data01
hadoop-data02
- Configure yarn-site.xml
$ vim /opt/hadoop-2.9.2/etc/hadoop/yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-master</value>
</property>
Copy the Hadoop directory, with all configuration files, from the master node to the data nodes:
scp -r /opt/hadoop-2.9.2 root@hadoop-data01:/opt
scp -r /opt/hadoop-2.9.2 root@hadoop-data02:/opt
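The /etc/profile changes from sections 2.2 and 3.2 also need to be in place on the data nodes (per the note in 1.2). To spot-check that the copy landed, something like:
ssh root@hadoop-data01 ls /opt/hadoop-2.9.2/etc/hadoop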
4. Start and Verify Hadoop
Run all of the following steps on the master node.
4.1 Initialize Hadoop (format the NameNode)
cd /opt/hadoop-2.9.2/bin
hdfs namenode -format    # "hadoop namenode -format" still works but is deprecated in 2.x
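On success, the output should contain a line like the following (path per dfs.namenode.name.dir above; exact wording may vary by version):
INFO common.Storage: Storage directory /home/hadoop/hd_space/hdfs/name has been successfully formatted.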
4.2 Start Hadoop
cd /opt/hadoop-2.9.2/sbin
start-all.sh
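start-all.sh is deprecated in Hadoop 2.x; the equivalent, which also lets you bring up the JobHistory server configured in mapred-site.xml, is:
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver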
4.3 Confirm that Hadoop started correctly
Browse HDFS via the NameNode web UI on the master node: http://192.168.110.42:50070/explorer.html#/
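The YARN ResourceManager web UI runs on the master at http://192.168.110.42:8088 (default port). Running jps on each node should show the daemons this layout implies (an expectation from the configuration above, not captured output):
jps    # on hadoop-master: NameNode, ResourceManager (plus JobHistoryServer if started)
jps    # on hadoop-data01: DataNode, NodeManager, SecondaryNameNode
jps    # on hadoop-data02: DataNode, NodeManager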