Hadoop高可用搭建
一、集群规划
四台主机,主机映射如下图
[root@mast conf]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.20.10.7 mast
172.20.10.4 node1
172.20.10.5 node2
172.20.10.6 node3
二、配置Zookeeper(mast节点上)
配置过程:
1.在mast节点上配置Zookeeper,配置完成后再分发到node1、node2、node3
2.解压Zookeeper安装包及前期准备
# 切换root账号
su
# 解压zookeeper
tar -zxvf zookeeper-3.4.5.tar.gz -C /export/server/
# 创建zookeeper的相关目录
mkdir /export/server/zookeeper-3.4.5/{zkdata,logs}
# 配置环境变量
vi /etc/profile
# 配置Zookeeper
export ZOOKEEPER_HOME=/export/server/zookeeper-3.4.5
export PATH=.:$PATH:$ZOOKEEPER_HOME/bin
# 分发hosts文件到各个节点(node1、node2、node3)
scp /etc/hosts node1:/etc/
scp /etc/hosts node2:/etc/
scp /etc/hosts node3:/etc/
3.配置zoo.cfg文件
[root@mast conf]# cat zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/export/server/zookeeper-3.4.5/zkdata
dataLogDir=/export/server/zookeeper-3.4.5/logs
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
# cluster
# server.{id}=host:peerPort:electionPort — {id} must equal the number stored
# in that host's ${dataDir}/myid file (node2 -> 3, node3 -> 4; see step 5)
# NOTE(review): a 2-server ensemble has no fault tolerance — quorum is 2,
# so losing either server stops the service; 3 or more servers recommended
server.3=node2:2888:3888
server.4=node3:2888:3888
4.将配置好的Zookeeper发送到其他节点
# 将环境变量发送到其他节点
scp /etc/profile node1:/etc/
scp /etc/profile node2:/etc/
scp /etc/profile node3:/etc/
# 将配置好的zookeeper发送到其他节点
scp -r zookeeper-3.4.5/ node1:/export/server/
scp -r zookeeper-3.4.5/ node2:/export/server/
scp -r zookeeper-3.4.5/ node3:/export/server/
# 配置zookeeper的myid配置文件
# cluster中的server.{num}=node2:2888:3888 需要与myid中的数字一致
5.创建myid文件
注意:只需要在server结点下面创建myid文件,并且文件的内容就为zoo.cfg里面的几号机器的数字,如下,在node2里面创建的myid内容为3。
vi /export/server/zookeeper-3.4.5/zkdata/myid 添加命令
[root@node2 zkdata]# cat myid
3
[root@node2 zkdata]#
三、配置Hadoop集群
1、修改配置hadoop-env.sh中JDK和Hadoop路径
# The java implementation to use.
export JAVA_HOME=/export/server/jdk1.8.0_171/
# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol. Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}
#export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
export HADOOP_CONF_DIR=/export/server/hadoop-2.7.3/etc/hadoop
2、修改core-site.xml
<configuration>
<!-- fs.default.name (deprecated) and fs.defaultFS are alternatives; fs.defaultFS is used here.
     The value is the HA nameservice id and must match dfs.nameservices in hdfs-site.xml -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://lh1</value>
<description>HDFS的URL,HA下配置</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/export/server/hadoop-2.7.3/tmp</value>
<description>节点上本地的hadoop临时文件夹</description>
</property>
<!-- ZooKeeper quorum used by the ZKFCs for automatic NameNode failover -->
<property>
<name>ha.zookeeper.quorum</name>
<value>node2:2181,node3:2181</value>
<description>指定HDFS HA配置</description>
</property>
</configuration>
3、修改hdfs-site.xml(配置有几个namenode)
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>数据块副本数,默认是3,应不大于datanode机器数量</description>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
<description>如果是true则检查权限,否则不检查(每一个人都可以存取文件)</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/export/server/hadoop-2.7.3/hdfs/name</value>
<description>namenode上存储hdfs名字空间元数据</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/export/server/hadoop-2.7.3/hdfs/data</value>
<description>datanode上数据块的物理存储位置</description>
</property>
<!-- HDFS HA settings below -->
<!-- nameservice id "lh1" must match fs.defaultFS in core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>lh1</value>
<description>hdfs的nameservice为lh1,需要和core-site.xml保持一致</description>
</property>
<property>
<name>dfs.ha.namenodes.lh1</name>
<value>nn1,nn2</value>
<description>lh1集群中两个namenode的名字</description>
</property>
<property>
<name>dfs.namenode.rpc-address.lh1.nn1</name>
<value>node1:9000</value>
<description>nn1的RPC通信地址</description>
</property>
<property>
<name>dfs.namenode.http-address.lh1.nn1</name>
<value>node1:50070</value>
<description>nn1的http通信地址</description>
</property>
<property>
<name>dfs.namenode.rpc-address.lh1.nn2</name>
<value>mast:9000</value>
<description>nn2的RPC通信地址</description>
</property>
<property>
<name>dfs.namenode.http-address.lh1.nn2</name>
<value>mast:50070</value>
<description>nn2的http通信地址</description>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node2:8485;node3:8485/lh1</value>
<description>指定NameNode的元数据在JournalNode上的存放位置</description>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/export/server/hadoop-2.7.3/hadoopdatas/journal</value>
<description>JournalNode上元数据和日志文件存放位置</description>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.lh1</name>
<value>true</value>
<description>开启Namenode失败自动切换</description>
</property>
<property>
<name>dfs.client.failover.proxy.provider.lh1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
<description>配置失败时切换实现方式</description>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
<description>隔离机制,多个机制换行分割,每个机制一行</description>
</property>
<!-- fixed: the key file was misspelled "id_ras"; sshfence would always fail
     with a nonexistent private key, blocking automatic failover -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
<description>sshfence隔离机制需要ssh免密登录</description>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
<description>sshfence隔离机制超时时间</description>
</property>
</configuration>
4、修改mapred-site.xml
<!-- Deprecated earlier version of mapred-site.xml, kept for reference only.
     The original text had a dangling comment terminator after </configuration>
     with no opening marker, and two root <configuration> elements in one file
     would be invalid XML; the legacy block is now properly commented out.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>指定mapreduce使用yarn框架</description>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>mast:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>mast:19888</value>
</property>
</configuration>
-->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>指定mapreduce使用yarn框架</description>
</property>
<!-- MapReduce JobHistory Server RPC address, default port 10020;
     0.0.0.0 binds every interface of whichever host runs the server -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>0.0.0.0:10020</value>
</property>
<!-- MapReduce JobHistory Server web UI address, default port 19888 -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>0.0.0.0:19888</value>
</property>
</configuration>
5、修改yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<!-- NOTE(review): twelve property names below were misspelled
     "yarn.resourcemaneger.*" in the original; YARN silently ignores unknown
     keys, so all those addresses and the recovery flags fell back to
     defaults. All names are corrected to "yarn.resourcemanager.*". -->
<!-- enable log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- keep aggregated logs for 7 days -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
<!-- log server URL must be a host clients can resolve, not 0.0.0.0;
     the JobHistory server is started on node1 (see startup section) -->
<property>
<name>yarn.log.server.url</name>
<value>http://node1:19888/jobhistory/logs</value>
</property>
<!-- enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- RM cluster id -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>lq</value>
</property>
<!-- logical ids of the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- hostname of each ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>node1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>mast</value>
</property>
<!-- automatic RM failover; the original used the nonexistent key
     "ha.automatic-failover.recover.enabled" -->
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- recover RM state after restart/failover -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- store RM state in ZooKeeper; recommended store when HA is enabled -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<!-- scheduler address of each RM -->
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>node1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>mast:8030</value>
</property>
<!-- NodeManagers report to the RM via this address -->
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>node1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>mast:8031</value>
</property>
<!-- clients submit applications via this address -->
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>node1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>mast:8032</value>
</property>
<!-- admin commands are sent to the RM via this address -->
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>node1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>mast:8033</value>
</property>
<!-- RM web UI address for viewing cluster state -->
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>node1:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>mast:8088</value>
</property>
<!-- ZooKeeper quorum used for RM leader election and state storage -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>node2:2181,node3:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
6、修改slaves
[root@node2 hadoop]# cat slaves
node2
node3
7、配置zookeeper环境变量(若第二步已在各节点配置过,此步可跳过)
vim /etc/profile
# 配置Zookeeper
export ZOOKEEPER_HOME=/export/server/zookeeper-3.4.5
export PATH=.:$PATH:$ZOOKEEPER_HOME/bin
8、将配置好的hadoop拷贝到其他节点
# 环境变量
scp /etc/profile node1:/etc/
scp /etc/profile node2:/etc/
scp /etc/profile node3:/etc/
# hadoop
scp -r /export/server/hadoop-2.7.3/ node1:/export/server/
scp -r /export/server/hadoop-2.7.3/ node2:/export/server/
scp -r /export/server/hadoop-2.7.3/ node3:/export/server/
四、启动Zookeeper集群
启动zookeeper(node2,node3)
# 启动
zkServer.sh start
# 验证是否启动成功,两台虚拟机中应为一台leader、一台follower
# leader由选举产生(达到法定数后由zxid/myid较大者当选),并非谁先启动谁是leader
zkServer.sh status
格式化ZKFC(在mast上执行)
hdfs zkfc -formatZK
启动journalnode(分别在node2,node3上执行)
hadoop-daemon.sh start journalnode
五、格式化HDFS(mast上执行)
# 1.格式化目录
hdfs namenode -format
# 2.将格式化之后的mast节点hadoop工作目录中的元数据目录复制到node1节点
scp -r /export/server/hadoop-2.7.3/hadoopdatas/ node1:/export/server/hadoop-2.7.3/
# 3. 初始化完毕之后可以关闭journalnode(分别在node2,node3上执行)(之后在ActiveNN上启动dfs会随之启动全部的journalnode)
hadoop-daemon.sh stop journalnode
六、启动集群
# mast上启动
start-all.sh
# or
start-dfs.sh
start-yarn.sh
#(分别在node2,node3上执行)节点启动Zookeeper
zkServer.sh start
# mast,node1上分别启动zkfc
hadoop-daemon.sh start zkfc
# mast,node1上启动resourcemanager
yarn-daemon.sh start resourcemanager
# node1上启动historyserver
mr-jobhistory-daemon.sh start historyserver
注意:如果mast不能够登录到网站,则说明namenode没有起来,则需要手动起来,所以需要在mast虚拟机上执行命令:
hadoop-daemon.sh start namenode
七、关闭集群
# mast关闭机器
stop-all.sh
# or
stop-dfs.sh
stop-yarn.sh
# node1 ResourceManager、historyserver关闭
yarn-daemon.sh stop resourcemanager
mr-jobhistory-daemon.sh stop historyserver
# mast,node1关闭zkfc
hadoop-daemon.sh stop zkfc
# (分别在node2,node3上执行)节点关闭Zookeeper
zkServer.sh stop
八、测试
在mast节点上执行
hadoop jar /export/server/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 10 10
如下图测试