title: 2. Setting up Hadoop HA
categories: Big Data learning
tags: [hadoop,HDFS,YARN]
Preparation
- 1. Prepare three virtual machines
- 2. Set up mutual passwordless SSH trust among the three VMs
- 3. Prepare the installation packages:
- jdk-8u161-linux-x64.tar.gz
- hadoop-2.6.0-cdh5.7.0.tar.gz
- zookeeper-3.4.12.tar.gz
- 4.Xshell5
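The mutual-trust step is normally done with SSH keys. A minimal sketch for one node, run as whichever user you will operate the cluster as; the throwaway key directory is only there so the sketch is safe to re-run, and the host names are the ones used throughout this guide:

```shell
# Generate a key pair, then push the public key to every node.
# A temporary directory is used here so this stays re-runnable;
# on a real node, accept the default ~/.ssh/id_rsa instead.
KEYDIR=$(mktemp -d)
ssh-keygen -t rsa -N "" -f "$KEYDIR/id_rsa" -q
# On a real cluster you would then run, for each host:
#   ssh-copy-id hadoop001
#   ssh-copy-id hadoop002
#   ssh-copy-id hadoop003
echo "key pair generated under $KEYDIR"
```

Once every node holds every other node's public key, ssh between them no longer prompts for a password, which the later scp steps rely on.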
Create the hadoop user
Create a hadoop user on all three virtual machines:
# useradd hadoop
Create the working directories on all three virtual machines:
# su - hadoop
$ mkdir app source software data tmp
Install the JDK
Create a directory to hold the JDK:
# mkdir -p /usr/java
# cd /usr/java
Upload the JDK tarball (rz is provided by the lrzsz package and pairs with Xshell's ZMODEM transfer):
# rz
Extract the JDK tarball, then remove the archive:
# tar -xzvf jdk-8u161-linux-x64.tar.gz
# rm -f jdk-8u161-linux-x64.tar.gz
Copy it to the other two machines:
# scp -r jdk1.8.0_161 root@hadoop002:/usr/java
# scp -r jdk1.8.0_161 root@hadoop003:/usr/java
On all three machines, set root as the owner of the JDK directory:
# chown -R root:root /usr/java
Configure the JDK environment variables
# vi /etc/profile
Append the following lines at the end of the file, then save:
#env
export JAVA_HOME=/usr/java/jdk1.8.0_161
export PATH=$JAVA_HOME/bin:$PATH
Make the configuration take effect:
# source /etc/profile
Check that the JDK was installed successfully:
# java -version
Repeat the same steps on the other two machines, and the JDK installation is complete.
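The reason $JAVA_HOME/bin is put in front of $PATH rather than appended is that earlier PATH entries win the lookup, so this JDK shadows any older system java. A quick way to see the ordering:

```shell
# Prepend the JDK bin directory, then show which entry is searched first.
JAVA_HOME=/usr/java/jdk1.8.0_161
PATH=$JAVA_HOME/bin:$PATH
echo "$PATH" | cut -d: -f1
# prints: /usr/java/jdk1.8.0_161/bin
```

The same prepend-over-append reasoning applies to the ZK_HOME and HADOOP_HOME exports later in this guide.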
Cluster plan
IP | HOST | Software | Processes |
---|---|---|---|
192.168.137.190 | hadoop001 | Hadoop, ZooKeeper | NameNode, DFSZKFailoverController, JournalNode, DataNode, ResourceManager, JobHistoryServer, NodeManager, QuorumPeerMain |
192.168.137.191 | hadoop002 | Hadoop, ZooKeeper | NameNode, DFSZKFailoverController, JournalNode, DataNode, ResourceManager, NodeManager, QuorumPeerMain |
192.168.137.192 | hadoop003 | Hadoop, ZooKeeper | JournalNode, DataNode, QuorumPeerMain, NodeManager |
Install ZooKeeper
Upload the tarball
Upload the ZooKeeper tarball into the software directory on hadoop001:
$ cd software
$ rz
Copy it to the other two machines:
$ scp ~/software/zookeeper-3.4.12.tar.gz hadoop@hadoop002:/home/hadoop/software/
$ scp ~/software/zookeeper-3.4.12.tar.gz hadoop@hadoop003:/home/hadoop/software/
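Pairs of scp commands like the two above (the same pattern recurs later for the Hadoop tarball and zoo.cfg) can be collapsed into a loop over the target hosts. A dry-run sketch; drop the leading echo to actually copy:

```shell
# Dry run: print the scp command that would run for each target host.
for h in hadoop002 hadoop003; do
  echo scp ~/software/zookeeper-3.4.12.tar.gz "hadoop@$h:/home/hadoop/software/"
done
```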
On all three machines, extract the ZooKeeper tarball into the app directory:
$ cd ~/software
$ tar -xzvf zookeeper-3.4.12.tar.gz -C ../app/
Configure ZK_HOME
$ vi ~/.bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
export ZK_HOME=/home/hadoop/app/zookeeper-3.4.12
export PATH=$ZK_HOME/bin:$PATH
$ source ~/.bash_profile
Edit the configuration
$ cd ~/app/zookeeper-3.4.12/conf
$ cp zoo_sample.cfg zoo.cfg
$ vi zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/hadoop/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
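In the server.N entries above, 2888 is the follower-to-leader port and 3888 the leader-election port. ZooKeeper only serves requests while a strict majority of the ensemble is alive, so this 3-node ensemble survives exactly one failed server:

```shell
# Majority-quorum arithmetic for an N-server ZooKeeper ensemble.
SERVERS=3
QUORUM=$(( SERVERS / 2 + 1 ))       # smallest strict majority
TOLERATED=$(( SERVERS - QUORUM ))   # how many servers may fail
echo "servers=$SERVERS quorum=$QUORUM tolerated_failures=$TOLERATED"
# prints: servers=3 quorum=2 tolerated_failures=1
```

This is also why ensembles are sized with an odd number of servers: going from 3 to 4 raises the quorum to 3 without tolerating any additional failures.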
Send it to the other nodes:
$ scp zoo.cfg hadoop@hadoop002:/home/hadoop/app/zookeeper-3.4.12/conf/
$ scp zoo.cfg hadoop@hadoop003:/home/hadoop/app/zookeeper-3.4.12/conf/
On all three nodes
Switch into the data directory and create a zookeeper directory (this is the dataDir configured in zoo.cfg):
$ cd ~/data
$ mkdir zookeeper
Enter the zookeeper directory and create a myid file in it:
$ cd zookeeper
$ touch myid
On each node separately
On hadoop001, write 1 into the myid file:
$ echo 1 >myid
On hadoop002, write 2 into the myid file:
$ echo 2 >myid
On hadoop003, write 3 into the myid file:
$ echo 3 >myid
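The three per-host echo commands can also be derived from the hostname, which removes the risk of writing the wrong id on the wrong box. A sketch that writes into a temporary directory so it is safe to run anywhere; on a real node DATADIR would be /home/hadoop/data/zookeeper and HOST would come from $(hostname):

```shell
# Derive the myid value from the host name, matching the server.N
# lines in zoo.cfg, and write it into the dataDir.
DATADIR=$(mktemp -d)   # real value: /home/hadoop/data/zookeeper
HOST=hadoop002         # real value: $(hostname)
case "$HOST" in
  hadoop001) ID=1 ;;
  hadoop002) ID=2 ;;
  hadoop003) ID=3 ;;
  *) echo "unknown host: $HOST" >&2; exit 1 ;;
esac
echo "$ID" > "$DATADIR/myid"
cat "$DATADIR/myid"
# prints: 2
```

Whichever way you write it, the number in myid must match that host's server.N line in zoo.cfg, or the server will fail to join the ensemble.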
Start the ensemble
Run this on all three nodes; you can then confirm the roles (one leader, two followers) with zkServer.sh status:
$ zkServer.sh start
Install Hadoop
Upload the Hadoop tarball
On hadoop001, as the hadoop user, upload the Hadoop tarball into the software directory:
$ cd ~/software
$ rz
Copy it to the other two machines:
$ scp hadoop-2.6.0-cdh5.7.0.tar.gz hadoop@hadoop002:/home/hadoop/software/
$ scp hadoop-2.6.0-cdh5.7.0.tar.gz hadoop@hadoop003:/home/hadoop/software/
Extract the tarball
Run on all three machines:
$ cd ~/software
$ tar -zxvf hadoop-2.6.0-cdh5.7.0.tar.gz -C ../app
Configure HADOOP_HOME
Configure HADOOP_HOME on hadoop001:
$ vi ~/.bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
export ZK_HOME=/home/hadoop/app/zookeeper-3.4.12
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZK_HOME/bin:$PATH
$ source ~/.bash_profile
Edit hadoop-env.sh, setting JAVA_HOME explicitly (e.g. export JAVA_HOME=/usr/java/jdk1.8.0_161):
$ vi /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop/hadoop-env.sh
Edit core-site.xml
$ vi /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- fs.defaultFS specifies the NameNode URI; for HA this is the nameservice name -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://zzfHadoop</value>
</property>
<!-- ============================== Trash mechanism ======================================= -->
<property>
<!-- How often (in minutes) the checkpointer running on the NameNode turns the Current trash directory into a checkpoint; default: 0, meaning it follows fs.trash.interval -->
<name>fs.trash.checkpoint.interval</name>
<value>0</value>
</property>
<property>
<!-- How many minutes a checkpoint under .Trash is kept before being deleted; the server-side setting takes precedence over the client's; default: 0, which disables the trash -->
<name>fs.trash.interval</name>
<value>1440</value>
</property>
<!-- Hadoop's base temporary directory; many other paths are derived from it. If the namenode and datanode storage locations are not configured in hdfs-site.xml, they default to subdirectories of this path -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
</property>
<!-- ZooKeeper session timeout, in milliseconds -->
<property