1. Prepare three servers
Install the JDK:
https://blog.csdn.net/qq_39680564/article/details/82768938
Set up passwordless SSH and set the hostnames:
https://blog.csdn.net/qq_39680564/article/details/89498678
Install a ZooKeeper cluster:
https://blog.csdn.net/qq_39680564/article/details/89500281
IP | Hostname | HDFS | MapReduce/YARN |
---|---|---|---|
192.168.1.159 | server1 | NameNode | ResourceManager |
192.168.1.198 | server2 | DataNode | NodeManager |
192.168.1.199 | server3 | DataNode | NodeManager |
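For the hostnames in this table to resolve, /etc/hosts on all three servers needs entries like the following (this should already be covered by the hostname guide linked above; shown here for reference):
192.168.1.159 server1
192.168.1.198 server2
192.168.1.199 server3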
2. Download and unpack Hadoop (on all three servers)
cd /opt/
wget http://archive.apache.org/dist/hadoop/core/hadoop-3.0.3/hadoop-3.0.3.tar.gz
tar -zxvf hadoop-3.0.3.tar.gz
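To confirm the archive unpacked correctly, run the bundled hadoop script by its full path (this assumes JAVA_HOME was already exported by the JDK guide linked above):
/opt/hadoop-3.0.3/bin/hadoop version
# the first line of output should read: Hadoop 3.0.3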
3. Configure the environment variables (on all three servers)
vim ~/.bashrc
Append the following:
export HADOOP_HOME=/opt/hadoop-3.0.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
Reload the file and check that the variable is set:
[root@server1 opt]# source ~/.bashrc
[root@server1 opt]# echo $HADOOP_HOME
/opt/hadoop-3.0.3
4. Edit the configuration files (on all three servers)
Except for steps 4.5, 4.6, and 4.7, perform every step in this section on all three servers.
4.1 Edit the hadoop-env.sh file
vim /opt/hadoop-3.0.3/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/opt/jdk-1.8
export HADOOP_HOME=/opt/hadoop-3.0.3
4.2 Edit the core-site.xml file
vim /opt/hadoop-3.0.3/etc/hadoop/core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://server1:9000</value>
</property>
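Each XML snippet in this section belongs inside the <configuration> element of its file, so the finished core-site.xml looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://server1:9000</value>
</property>
</configuration>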
4.3 Edit the yarn-site.xml file
vim /opt/hadoop-3.0.3/etc/hadoop/yarn-site.xml
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>server1:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>server1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>server1:8050</value>
</property>
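One caveat: this walkthrough never sets yarn.nodemanager.aux-services, which the MapReduce shuffle phase normally requires. If jobs later fail during shuffle, add the following to yarn-site.xml as well (my addition, not part of the original setup):
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>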
4.4 Edit the mapred-site.xml file
vim /opt/hadoop-3.0.3/etc/hadoop/mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
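To sanity-check that the XML parses and the values are picked up, query the effective configuration (hdfs getconf reads the same configuration directory):
hdfs getconf -confKey fs.defaultFS
# should print hdfs://server1:9000
hdfs getconf -namenodes
# should print server1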
4.5 Edit the hdfs-site.xml file (server1)
mkdir -p /data/hadoop/hadoop_data/hdfs/namenode
chown -R root:root /data/hadoop
vim /opt/hadoop-3.0.3/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/data/hadoop/hadoop_data/hdfs/namenode</value>
</property>
4.6 Edit the hdfs-site.xml file (server2)
mkdir -p /data/hadoop/hadoop_data/hdfs/datanode
chown -R root:root /data/hadoop
vim /opt/hadoop-3.0.3/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data/hadoop/hadoop_data/hdfs/datanode</value>
</property>
4.7 Edit the hdfs-site.xml file (server3)
mkdir -p /data/hadoop/hadoop_data/hdfs/datanode
chown -R root:root /data/hadoop
vim /opt/hadoop-3.0.3/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data/hadoop/hadoop_data/hdfs/datanode</value>
</property>
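Since 4.6 and 4.7 are identical, you can instead edit hdfs-site.xml once on server2 and copy it to server3 (the data directory on server3 still has to be created as shown above):
scp /opt/hadoop-3.0.3/etc/hadoop/hdfs-site.xml server3:/opt/hadoop-3.0.3/etc/hadoop/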
4.8 Edit the workers file
vim /opt/hadoop-3.0.3/etc/hadoop/workers
server2
server3
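start-dfs.sh reaches every host listed in workers over SSH, so verify passwordless login from server1 before continuing:
ssh server2 hostname
ssh server3 hostname
# each command should print the hostname without prompting for a password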
4.9 Edit the start-dfs.sh and stop-dfs.sh files
vim /opt/hadoop-3.0.3/sbin/start-dfs.sh
vim /opt/hadoop-3.0.3/sbin/stop-dfs.sh
Add the following at the top of each file:
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
4.10 Edit the start-yarn.sh and stop-yarn.sh files
vim /opt/hadoop-3.0.3/sbin/start-yarn.sh
vim /opt/hadoop-3.0.3/sbin/stop-yarn.sh
Add the following at the top of each file:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
5. Start the Hadoop cluster
5.1 Format the NameNode (on server1 only)
/opt/hadoop-3.0.3/bin/hdfs namenode -format
The format succeeded if the log ends with "Exiting with status 0".
5.2 Start HDFS (on server1)
/opt/hadoop-3.0.3/sbin/start-dfs.sh
[root@server1 hadoop-3.0.3]# /opt/hadoop-3.0.3/sbin/start-dfs.sh
Starting namenodes on [server1]
Last login: Thu Apr 25 09:05:00 CST 2019 from 192.168.1.168 pts/0
server1: Warning: Permanently added 'server1,192.168.1.159' (ECDSA) to the list of known hosts.
Starting datanodes
Last login: Thu Apr 25 10:48:37 CST 2019 on pts/0
server2: WARNING: /opt/hadoop-3.0.3/logs does not exist. Creating.
server3: WARNING: /opt/hadoop-3.0.3/logs does not exist. Creating.
Starting secondary namenodes [server1]
Last login: Thu Apr 25 10:48:40 CST 2019 on pts/0
2019-04-25 10:48:51,818 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
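The NativeCodeLoader warning is harmless; Hadoop falls back to its built-in Java implementations. To confirm that both DataNodes registered with the NameNode:
hdfs dfsadmin -report | grep -E 'Live datanodes|^Name:'
# expect "Live datanodes (2):" plus one Name: line per DataNode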
5.3 Start YARN (on server1)
/opt/hadoop-3.0.3/sbin/start-yarn.sh
[root@server1 hadoop-3.0.3]# /opt/hadoop-3.0.3/sbin/start-yarn.sh
Starting resourcemanager
Last login: Thu Apr 25 10:48:45 CST 2019 on pts/0
Starting nodemanagers
Last login: Thu Apr 25 10:58:25 CST 2019 on pts/0
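Likewise, confirm that both NodeManagers registered with the ResourceManager:
yarn node -list
# expect "Total Nodes:2" with server2 and server3 in RUNNING state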
5.4 Check the processes
Run jps on server1, server2, and server3 and compare against the expected set:
server1: NameNode, SecondaryNameNode, ResourceManager, QuorumPeerMain
server2: DataNode, NodeManager, QuorumPeerMain
server3: DataNode, NodeManager, QuorumPeerMain
QuorumPeerMain is the ZooKeeper process; ResourceManager and NodeManager are YARN processes; NameNode, SecondaryNameNode, and DataNode are HDFS processes.
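As a final smoke test, run one of the example jobs shipped inside the distribution (this jar path is the stock location in the Hadoop tarball); if it fails during shuffle, revisit the yarn.nodemanager.aux-services note in section 4.3:
hadoop jar /opt/hadoop-3.0.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.3.jar pi 2 10
# the job should end with a line like "Estimated value of Pi is ..."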
5.5 Access the web UIs
NameNode UI: http://192.168.1.159:9870/dfshealth.html#tab-overview
DataNode UI (server2): http://192.168.1.198:9864/datanode.html
DataNode UI (server3): http://192.168.1.199:9864/datanode.html
YARN ResourceManager UI: http://192.168.1.159:8088/cluster
NodeManager UI (server2): http://192.168.1.198:8042/node
NodeManager UI (server3): http://192.168.1.199:8042/node
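If a page fails to load, first check from the shell whether the port answers at all:
curl -s -o /dev/null -w '%{http_code}\n' http://192.168.1.159:9870/
# expect 200 (or a 3xx redirect) if the NameNode web UI is up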