Deploying a Hadoop High-Availability Cluster on CentOS 7

Introduction to Hadoop

Hadoop is a distributed computing framework developed under the Apache Foundation. It lets users write distributed programs without understanding the low-level details of distribution, harnessing the power of a cluster for high-speed computation and storage. Hadoop implements a distributed file system, HDFS (Hadoop Distributed File System). HDFS is highly fault tolerant and designed to run on low-cost hardware; it provides high-throughput access to application data, which suits applications with very large data sets. HDFS relaxes some POSIX requirements to allow streaming access to file system data. The two core pieces of Hadoop's design are HDFS and MapReduce: HDFS provides storage for massive data, while MapReduce provides computation over it.

Hadoop Framework Overview

Hadoop has two main parts:

  1. HDFS (Hadoop Distributed File System), responsible for distributed storage;
  2. YARN (Yet Another Resource Negotiator, introduced in Hadoop 2.0), responsible for cluster resource management and scheduling.

HDFS Architecture

  • Architecture diagram
    (figure omitted)
  1. Active NameNode
    The primary master; a cluster has only one Active NameNode at a time
    Manages the HDFS namespace
    Maintains metadata
    Manages replica configuration and placement (three replicas by default)
    Handles client read/write requests
  2. Standby NameNode
    Hot standby for the Active NameNode
    Can quickly take over as the new Active NameNode when the active one fails
    Periodically pulls the edits log and merges fsimage with edits to local disk
  3. JournalNode
    Accessible to both the Active and Standby NameNode; supports NameNode high availability
    The Active NameNode writes an operation log (edits) to the JournalNodes whenever the file system is modified
    The Standby NameNode replays the JournalNode edits so namespace updates are shared and kept in sync
  4. DataNode
    Slave worker node; a cluster normally runs many of them
    Stores data blocks and block checksums
    Serves client read/write requests
    Reports its status and the full list of local blocks to the NameNode via periodic heartbeats
    Provides its list of stored blocks to the NameNode at cluster startup
  5. Block
    The smallest fixed-size storage unit in HDFS (128 MB by default, configurable)
    Files written to HDFS are split into blocks (a file smaller than the block size does not occupy a whole block)
    With the default configuration, each block has three replicas
  6. Client
    Interacts with the NameNode to fetch file metadata
    Interacts with DataNodes to read or write data
    Administers HDFS
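The block-splitting rule above can be sketched with a little shell arithmetic (`block_count` is a hypothetical helper; 128 MB is the default block size mentioned above):

```shell
# Number of HDFS blocks a file occupies: ceil(file_size / block_size).
# A file smaller than one block still occupies only one (partial) block.
block_size_mb=128

block_count() {
    # $1: file size in MB
    echo $(( ($1 + block_size_mb - 1) / block_size_mb ))
}

block_count 300   # a 300 MB file -> 3 blocks (128 + 128 + 44)
block_count 1     # a 1 MB file   -> 1 block, occupying only 1 MB on disk
```

With the default replication factor of 3, the 300 MB file above ends up as 9 block replicas spread across the DataNodes.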

YARN Architecture

  • Architecture diagram:
    (figure omitted)
  1. ResourceManager
    One active master per cluster; standby instances are supported for high availability
    Handles client requests
    Starts/manages/monitors ApplicationMasters
    Monitors NodeManagers
    Allocates and schedules resources
  2. NodeManager
    One per node, usually co-located one-to-one with a DataNode on the same machine
    Periodically reports local resource usage to the ResourceManager
    Handles job requests from the ResourceManager and allocates Containers for jobs
    Handles requests from ApplicationMasters to start and stop Containers
  3. ApplicationMaster
    One per application; manages the application, requests resources, and schedules its tasks
    Negotiates with the ResourceManager for task resources
    Communicates with NodeManagers to start/stop tasks
    Monitors task status and handles failures
  4. Container
    An abstraction of the task execution environment, created only when a task is allocated
    Encapsulates the resources and environment the task runs with (node, memory, CPU)
    Launches the task
  • Although not shown in the diagrams, Hadoop high availability is built on ZooKeeper, e.g. NameNode HA and ResourceManager HA.

Deployment Environment

  1. OS: CentOS Linux release 7.5.1804 (Core)
  2. Hadoop: hadoop-2.7.3
  3. Zookeeper: zookeeper-3.4.10
  4. JDK: jdk1.8.0_171

Software Preparation

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

HA Cluster Deployment Plan

Host      Address        User    Processes                                                              Software
hadoop-1  192.168.10.51  hadoop  NameNode (Active), ResourceManager (Standby), ZKFC, JobHistoryServer   JDK, Hadoop
hadoop-2  192.168.10.52  hadoop  NameNode (Standby), ResourceManager (Active), ZKFC, WebProxyServer     JDK, Hadoop
hadoop-3  192.168.10.53  hadoop  DataNode, NodeManager, JournalNode, QuorumPeerMain                     JDK, Hadoop, Zookeeper
hadoop-4  192.168.10.54  hadoop  DataNode, NodeManager, JournalNode, QuorumPeerMain                     JDK, Hadoop, Zookeeper
hadoop-5  192.168.10.55  hadoop  DataNode, NodeManager, JournalNode, QuorumPeerMain                     JDK, Hadoop, Zookeeper

Notes on the plan:

  • HDFS HA normally consists of two NameNodes, one Active and one Standby.
  • The Active NameNode serves client requests; the Standby NameNode serves none and only mirrors the Active NameNode's state, so that it can take over quickly if the Active fails.
  • Hadoop 2.0 ships two official HDFS HA solutions, NFS and QJM; we use the simpler QJM. In this scheme the active and standby NameNodes share metadata through a group of JournalNodes, and a write is considered successful once it reaches a majority of JournalNodes, which is why an odd number of them is configured. A ZooKeeper ensemble is also deployed for ZKFC failover: when the Active NameNode dies, the Standby NameNode is automatically promoted to Active.
  • YARN's ResourceManager used to be a single point of failure as well; hadoop-2.4.1 solved this with two ResourceManagers, one Active and one Standby, whose state is coordinated through ZooKeeper.
  • MapReduce on YARN can run a JobHistoryServer to record finished jobs; without it only currently running jobs can be inspected.
  • ZooKeeper is responsible for electing the active NameNode in HDFS and the active ResourceManager in YARN.
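The "majority of JournalNodes" rule means an ensemble of N nodes tolerates floor((N-1)/2) failures, which is why odd sizes are used. A small sketch (`tolerated_failures` is a hypothetical helper, not a Hadoop command):

```shell
# A quorum system of n members needs a majority (n/2 + 1) of them alive;
# it therefore tolerates (n - 1) / 2 failures (integer division).
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

tolerated_failures 3   # 3 JournalNodes (this deployment) -> tolerates 1 failure
tolerated_failures 4   # an even 4th node adds no extra fault tolerance
tolerated_failures 5   # 5 nodes -> tolerates 2 failures
```

The same arithmetic applies to the 3-node ZooKeeper ensemble on hadoop-3/4/5: it stays available as long as any 2 of the 3 servers are up.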

Create the hadoop user and grant it sudo (run on all servers; hadoop-1 shown)

  1. Add the hadoop user
[root@hadoop-1 ~]# useradd hadoop
[root@hadoop-1 ~]# passwd hadoop
  2. Grant the hadoop user sudo privileges
[root@hadoop-1 ~]# cat /etc/sudoers | grep hadoop
hadoop  ALL=(ALL)       ALL

Synchronize server time (run on all servers; hadoop-1 shown)

  1. Install the ntpdate tool
    yum -y install ntp ntpdate
  2. Sync the system clock against network time
[hadoop@hadoop-1 ~]$ sudo ntpdate 0.asia.pool.ntp.org
26 Jul 15:21:55 ntpdate[1648]: adjust time server 211.233.40.78 offset -0.007385 sec
  3. Write the system time to the hardware clock
[hadoop@hadoop-1 ~]$ sudo hwclock --systohc
  4. Force the system time into CMOS so it is not lost on reboot
[hadoop@hadoop-1 root]$ sudo hwclock -w

Disable the firewall and SELinux (run on all servers; hadoop-1 shown)

[hadoop@hadoop-1 manager]$ sudo systemctl stop firewalld
[hadoop@hadoop-1 manager]$ sudo systemctl disable firewalld
[hadoop@hadoop-1 manager]$ sudo setenforce 0
[hadoop@hadoop-1 manager]$ sudo sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
[hadoop@hadoop-1 manager]$ sudo grep 'SELINUX=disabled' /etc/selinux/config
SELINUX=disabled

Edit the hosts file (run on all servers; hadoop-1 shown)

[hadoop@hadoop-1 ~]$ sudo cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.51 hadoop-1
192.168.10.52 hadoop-2
192.168.10.53 hadoop-3
192.168.10.54 hadoop-4
192.168.10.55 hadoop-5
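Since every node must carry the same five entries, it is worth checking that no hostname appears twice. A hedged sketch (`duplicate_hosts` is a hypothetical helper, run here against a temp copy rather than the real /etc/hosts):

```shell
# Flag any hostname that appears more than once in a hosts-format file.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
192.168.10.51 hadoop-1
192.168.10.52 hadoop-2
192.168.10.53 hadoop-3
192.168.10.54 hadoop-4
192.168.10.55 hadoop-5
EOF

duplicate_hosts() {
    # Print every hostname (columns 2..NF) that occurs more than once,
    # skipping comment lines.
    awk '!/^#/ && NF >= 2 { for (i = 2; i <= NF; i++) count[$i]++ }
         END { for (h in count) if (count[h] > 1) print h }' "$1"
}

duplicate_hosts "$hosts_file"   # prints nothing when every hostname is unique
```

On a real node you would point the function at /etc/hosts instead of the temp file.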

Configure passwordless SSH (on hadoop-1 and hadoop-2; hadoop-1 shown)

[hadoop@hadoop-1 ~]$ ssh-keygen -t rsa
[hadoop@hadoop-1 ~]$ for i in hadoop-1 hadoop-2 hadoop-3 hadoop-4 hadoop-5;do ssh-copy-id $i;done

Configure the Java environment (run on all servers; hadoop-1 shown)

  1. Upload the JDK package and extract it into the target directory
drwxr-xr-x. 8 hadoop hadoop 255 Mar 29  2018 jdk1.8.0_171
[hadoop@hadoop-1 java]$ pwd
/data/java
  2. Configure environment variables
[hadoop@hadoop-1 java]$ sudo cat /etc/profile.d/hadoop.sh 
[sudo] password for hadoop:
export JAVA_HOME=/data/java/jdk1.8.0_171
export JRE_HOME=/data/java/jdk1.8.0_171/jre
export CLASSPATH=./:/data/java/jdk1.8.0_171/lib:/data/java/jdk1.8.0_171/jre/lib
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:$JAVA_HOME/bin
  3. Verify the installation
[hadoop@hadoop-1 java]$ source /etc/profile.d/hadoop.sh 
[hadoop@hadoop-1 java]$ java -version
java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)

Zookeeper cluster installation (on hadoop-3, hadoop-4, hadoop-5; hadoop-3 shown)

  1. Upload the zookeeper package to the target servers
  2. Extract and rename it
[hadoop@hadoop-3 data]$ cd /data/software/
[hadoop@hadoop-3 software]$ tar -zxf zookeeper-3.4.10.tar.gz -C /data/
[hadoop@hadoop-3 software]$ cd /data/
[hadoop@hadoop-3 data]$ mv zookeeper-3.4.10/ zookeeper
  3. Edit the zookeeper configuration file
[hadoop@hadoop-3 ~]$ cd /data/zookeeper/conf/
[hadoop@hadoop-3 conf]$ mv zoo_sample.cfg zoo.cfg 
[hadoop@hadoop-3 conf]$ cat zoo.cfg 
tickTime=2000
# max ticks followers may take to connect and sync to the leader
initLimit=10
# max ticks a follower may lag behind the leader
syncLimit=5
# data directory
dataDir=/data/zookeeper/data
# transaction log directory
dataLogDir=/data/zookeeper/logs
# client port
clientPort=2181
# ensemble members and their quorum/election ports
server.1=hadoop-3:2888:3888
server.2=hadoop-4:2888:3888
server.3=hadoop-5:2888:3888
# tuning options below
# max client connections per IP (default 60; 0 means unlimited)
maxClientCnxns=0
# number of snapshots to retain on autopurge
autopurge.snapRetainCount=3
# purge interval in hours (the default 0 disables autopurge)
autopurge.purgeInterval=1
  4. Create the data and log directories
[hadoop@hadoop-3 conf]$ mkdir /data/zookeeper/{data,logs}
  5. Create a file named myid in the data directory containing the X from the corresponding server.X entry
# ensemble members and their quorum/election ports
server.1=hadoop-3:2888:3888
server.2=hadoop-4:2888:3888
server.3=hadoop-5:2888:3888

In the configuration above hadoop-3 is 1, hadoop-4 is 2, and hadoop-5 is 3, so:
On hadoop-3

echo "1" > /data/zookeeper/data/myid

On hadoop-4

echo "2" > /data/zookeeper/data/myid

On hadoop-5

echo "3" > /data/zookeeper/data/myid
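The three per-host writes above can also be derived mechanically from the server.X lines in zoo.cfg. A sketch (`myid_for_host` is a hypothetical helper, run here against a temp config rather than /data/zookeeper/conf/zoo.cfg; it assumes hostnames contain no dots):

```shell
# Derive each host's myid from the server.X=host:port:port lines in zoo.cfg.
conf=$(mktemp)
cat > "$conf" <<'EOF'
server.1=hadoop-3:2888:3888
server.2=hadoop-4:2888:3888
server.3=hadoop-5:2888:3888
EOF

myid_for_host() {
    # $1: config file, $2: hostname -> prints the X of the matching server.X line
    awk -F'[=.:]' -v h="$2" '$1 == "server" && $3 == h { print $2 }' "$1"
}

myid_for_host "$conf" hadoop-4   # -> 2
```

On each real node you could then run something like `myid_for_host /data/zookeeper/conf/zoo.cfg $(hostname) > /data/zookeeper/data/myid` instead of hand-typing the number.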
  6. Set zookeeper's owner and group to hadoop
[hadoop@hadoop-3 conf]$ sudo chown -R hadoop:hadoop /data/zookeeper/

Start and test the Zookeeper cluster (on hadoop-3, hadoop-4, hadoop-5)

  1. Start the zookeeper ensemble
[hadoop@hadoop-3 conf]$ cd /data/zookeeper/bin/
[hadoop@hadoop-3 bin]$ ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
  2. Check ensemble status

hadoop-3 status

[hadoop@hadoop-3 bin]$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/bin/../conf/zoo.cfg
Mode: follower

hadoop-4 status

[hadoop@hadoop-4 bin]$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/bin/../conf/zoo.cfg
Mode: leader

hadoop-5 status

[hadoop@hadoop-5 bin]$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/zookeeper/bin/../conf/zoo.cfg
Mode: follower

Hadoop cluster installation (all servers; hadoop-1 shown)

All steps run as the hadoop user; the configuration only needs to be edited on hadoop-1 and is copied to the other servers later

  1. Upload the hadoop package, extract, and rename it
[hadoop@hadoop-1 ~]$ tar -zxf /data/software/hadoop-2.7.3.tar.gz -C /data
[hadoop@hadoop-1 data]$ mv hadoop-2.7.3/ hadoop
  2. Edit hadoop-env.sh
[hadoop@hadoop-1 hadoop]$ grep -n 'export JAVA_HOME' hadoop-env.sh 
25:# export JAVA_HOME=${JAVA_HOME}
26:export JAVA_HOME=/data/java/jdk1.8.0_171
  3. Edit core-site.xml
[hadoop@hadoop-1 hadoop]$ cat core-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
 
<configuration>
        <!-- HDFS nameservice name: mycluster; must match the HA config in hdfs-site.xml -->
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://mycluster</value>
        </property>
 
        <!-- Base directory for temporary files -->
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/data/hadoop/data</value>
        </property>
 
        <!-- Minutes a deleted HDFS file stays in the trash before permanent removal; the default 0 disables the trash -->
        <property>
                <name>fs.trash.interval</name>
                <value>1440</value>
        </property>
 
        <!-- ZooKeeper quorum address, required for HA -->
        <property>
                <name>ha.zookeeper.quorum</name>
                <value>hadoop-3:2181,hadoop-4:2181,hadoop-5:2181</value>
        </property>
</configuration>
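A quick way to sanity-check a value in these *-site.xml files is to pull it out with a small awk function. A sketch (`get_property` is a hypothetical helper; it assumes the name and value sit on adjacent lines, as in the files in this post, and is run here against a temp file):

```shell
# Extract the <value> that follows a given <name> in a Hadoop site file.
site=$(mktemp)
cat > "$site" <<'EOF'
<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://mycluster</value>
        </property>
</configuration>
EOF

get_property() {
    # $1: file, $2: property name
    awk -v n="$2" '
        index($0, "<name>" n "</name>") { found = 1; next }
        found && match($0, /<value>.*<\/value>/) {
            # strip the 7-char <value> prefix and 8-char </value> suffix
            print substr($0, RSTART + 7, RLENGTH - 15); exit
        }' "$1"
}

get_property "$site" fs.defaultFS   # -> hdfs://mycluster
```

On a real node you would point it at /data/hadoop/etc/hadoop/core-site.xml.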
  4. Edit hdfs-site.xml
[root@hadoop-1 hadoop]# cat hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
	<!-- Directory where NameNode metadata is stored -->
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>/data/hadoop/data/namenode</value>
	</property>
 
	<!-- Directory where DataNode blocks are stored -->
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>/data/hadoop/data/datanode</value>
	</property>
 
	<!-- Number of replicas -->
	<property>
		<name>dfs.replication</name>
		<value>3</value>
	</property>
 
	<!-- Disable permission checking -->
	<property>
		<name>dfs.permissions.enabled</name>
		<value>false</value>
	</property>
 
	<!-- Enable WebHDFS (REST-based interface) -->
	<property>
		<name>dfs.webhdfs.enabled</name>
		<value>true</value>
	</property>
 
	<!-- // HDFS HA configuration below // -->
	<!-- Set the HDFS nameservice name to mycluster -->
	<property>
		<name>dfs.nameservices</name>
		<value>mycluster</value>
	</property>
 
	<!-- Name the two NameNodes of mycluster nn1 and nn2 -->
	<property>
		<name>dfs.ha.namenodes.mycluster</name>
		<value>nn1,nn2</value>
	</property>
 
	<!-- RPC ports for nn1 and nn2 -->
	<property>
		<name>dfs.namenode.rpc-address.mycluster.nn1</name>
		<value>hadoop-1:9000</value>
	</property>
	<property>
		<name>dfs.namenode.rpc-address.mycluster.nn2</name>
		<value>hadoop-2:9000</value>
	</property>
 
	<!-- HTTP ports for nn1 and nn2 -->
	<property>
		<name>dfs.namenode.http-address.mycluster.nn1</name>
		<value>hadoop-1:50070</value>
	</property>
	<property>
		<name>dfs.namenode.http-address.mycluster.nn2</name>
		<value>hadoop-2:50070</value>
	</property>
 
	<!-- JournalNode URI where the NameNodes share their edits -->
	<property>
		<name>dfs.namenode.shared.edits.dir</name>
		<value>qjournal://hadoop-3:8485;hadoop-4:8485;hadoop-5:8485/mycluster</value>
	</property>
 
	<!-- Local directory where JournalNodes store their edits -->
	<property>
		<name>dfs.journalnode.edits.dir</name>
		<value>/data/hadoop/journalnode/logs</value>
	</property>
 
	<!-- Java class HDFS clients use to locate the active NameNode -->
	<property>
		<name>dfs.client.failover.proxy.provider.mycluster</name>
		<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
 
	<!-- Use ssh as the fencing method -->
	<property>
		<name>dfs.ha.fencing.methods</name>
		<value>sshfence</value>
	</property>
 
	<!-- Location of the private key used for fencing -->
	<property>
		<name>dfs.ha.fencing.ssh.private-key-files</name>
		<value>/home/hadoop/.ssh/id_rsa</value>
	</property>
 
	<!-- Enable automatic failover -->
	<property>
		<name>dfs.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>
</configuration>
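One easy misconfiguration in hdfs-site.xml is declaring an nnX in dfs.ha.namenodes.&lt;nameservice&gt; without a matching rpc-address key. A hedged grep/sed check (`missing_rpc_keys` is a hypothetical helper, run here against a stripped-down temp file, not the real config):

```shell
# Check every NameNode ID listed in dfs.ha.namenodes.<ns> has an rpc-address key.
site=$(mktemp)
cat > "$site" <<'EOF'
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
EOF

missing_rpc_keys() {
    # $1: file, $2: nameservice ID -> prints nn IDs that lack an rpc-address key
    ids=$(sed -n "/dfs.ha.namenodes.$2/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}" "$1")
    for id in $(echo "$ids" | tr ',' ' '); do
        grep -q "dfs.namenode.rpc-address.$2.$id" "$1" || echo "$id"
    done
}

missing_rpc_keys "$site" mycluster   # prints nothing when the config is consistent
```

The same idea extends to the http-address keys; both sets must cover every declared NameNode ID.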
  5. Edit mapred-site.xml
[root@hadoop-1 hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@hadoop-1 hadoop]# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
	<!-- Run MapReduce on YARN -->
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
 
	<!-- JobHistory server RPC address -->
	<property>
		<name>mapreduce.jobhistory.address</name>
		<value>hadoop-1:10020</value>
	</property>
 
	<!-- JobHistory server HTTP address -->
	<property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>hadoop-1:19888</value>
	</property>
 
	<!-- Enable uber mode (an optimization for small jobs) -->
	<property>
		<name>mapreduce.job.ubertask.enable</name>
		<value>true</value>
	</property>
 
	<!-- Maximum number of maps for an uber job -->
	<property>
		<name>mapreduce.job.ubertask.maxmaps</name>
		<value>9</value>
	</property>
 
	<!-- Maximum number of reduces for an uber job -->
	<property>
		<name>mapreduce.job.ubertask.maxreduces</name>
		<value>5</value>
	</property>
</configuration>
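With the thresholds above, a job only runs as an uber task (all tasks in the ApplicationMaster's own JVM) if its map and reduce counts stay under the configured maxima. A simplified sketch of that decision (the real framework additionally checks the job's input size against the block size, which this ignores):

```shell
# Simplified uber-mode eligibility: maps <= maxmaps and reduces <= maxreduces.
# Values mirror the mapred-site.xml settings above.
maxmaps=9
maxreduces=5

uber_eligible() {
    # $1: number of map tasks, $2: number of reduce tasks -> "yes" or "no"
    if [ "$1" -le "$maxmaps" ] && [ "$2" -le "$maxreduces" ]; then
        echo yes
    else
        echo no
    fi
}

uber_eligible 4 1    # small job -> yes
uber_eligible 20 1   # too many maps -> no
```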
  6. Edit yarn-site.xml
[root@hadoop-1 hadoop]# cat yarn-site.xml 
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<configuration>
 
<!-- Site specific YARN configuration properties -->
 
	<!-- Auxiliary service run by the NodeManager; must be mapreduce_shuffle for MapReduce jobs -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
 
	<!-- Web Application Proxy address (shields YARN from direct attacks) -->
	<property>
		<name>yarn.web-proxy.address</name>
		<value>hadoop-2:8888</value>
	</property>
 
	<!-- Enable log aggregation -->
	<property>
		<name>yarn.log-aggregation-enable</name>
		<value>true</value>
	</property>
 
	<!-- Keep aggregated logs for 7 days (in seconds; -1 disables deletion) -->
	<property>
		<name>yarn.log-aggregation.retain-seconds</name>
		<value>604800</value>
	</property>
 
	<!-- Remote directory for aggregated logs -->
	<property>
		<name>yarn.nodemanager.remote-app-log-dir</name>
		<value>/data/hadoop/logs</value>
	</property>
 	
<!-- Memory available to NodeManager containers
	<property>
		<name>yarn.nodemanager.resource.memory-mb</name>
		<value>2048</value>
	</property>
	CPU vcores available to NodeManager containers
	<property>
		<name>yarn.nodemanager.resource.cpu-vcores</name>
		<value>2</value>
	</property> 
 -->
 
	<!-- // YARN HA configuration below // -->
	<!-- Enable YARN HA -->
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
		<value>true</value>
	</property>
 
	<!-- Enable automatic failover -->
	<property>
		<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>
 
	<!-- YARN HA cluster ID -->
	<property>
		<name>yarn.resourcemanager.cluster-id</name>
		<value>yarncluster</value>
	</property>
 
	<!-- IDs of the two ResourceManagers -->
	<property>
		<name>yarn.resourcemanager.ha.rm-ids</name>
		<value>rm1,rm2</value>
	</property>
 
	<!-- Hosts for rm1 and rm2 -->
	<property>
		<name>yarn.resourcemanager.hostname.rm1</name>
		<value>hadoop-2</value>
	</property>
	<property>
		<name>yarn.resourcemanager.hostname.rm2</name>
		<value>hadoop-1</value>
	</property>
 
	<!-- YARN web UI ports -->
	<property>
		<name>yarn.resourcemanager.webapp.address.rm1</name>
		<value>hadoop-2:8088</value>
	</property>	
	<property>
		<name>yarn.resourcemanager.webapp.address.rm2</name>
		<value>hadoop-1:8088</value>
	</property>
 
	<!-- ZooKeeper quorum address -->
	<property>
		<name>yarn.resourcemanager.zk-address</name>
		<value>hadoop-3:2181,hadoop-4:2181,hadoop-5:2181</value>
	</property>
 
	<!-- ZooKeeper znode path under which ResourceManager state is stored -->
	<property>
		<name>yarn.resourcemanager.zk-state-store.parent-path</name>
		<value>/data/zookeeper/data/rmstore</value>
	</property>
 
	<!-- Enable ResourceManager restart/recovery -->
	<property>
		<name>yarn.resourcemanager.recovery.enabled</name>
		<value>true</value>
	</property>
 
	<!-- Persist ResourceManager state to ZooKeeper -->
	<property>
		<name>yarn.resourcemanager.store.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
	</property>
 
	<!-- Enable NodeManager restart/recovery -->
	<property>
		<name>yarn.nodemanager.recovery.enabled</name>
		<value>true</value>
	</property>
 
	<!-- NodeManager IPC port -->
	<property>
		<name>yarn.nodemanager.address</name>
		<value>0.0.0.0:45454</value>
	</property>
</configuration>
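The 604800 in yarn.log-aggregation.retain-seconds is simply 7 days expressed in seconds. Deriving such values instead of hard-coding magic numbers avoids off-by-a-day mistakes (`days_to_seconds` is a hypothetical helper):

```shell
# Convert a retention window in days to the seconds value YARN expects.
days_to_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

days_to_seconds 7    # -> 604800, the value used in yarn-site.xml above
days_to_seconds 30   # -> 2592000, for a 30-day retention policy
```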
  7. Edit the slaves file
[root@hadoop-1 hadoop]# cat slaves 
hadoop-3
hadoop-4
hadoop-5
  8. Create the directories referenced by the configuration
[hadoop@hadoop-1 ~]$ mkdir /data/hadoop/{data,logs}
[hadoop@hadoop-1 ~]$ mkdir /data/hadoop/data/{namenode,datanode}
[hadoop@hadoop-1 ~]$ mkdir /data/hadoop/journalnode/logs -p
[hadoop@hadoop-1 ~]$ mkdir /data/zookeeper/data/rmstore -p
  9. Copy the whole hadoop directory to the other servers
for i in hadoop-2 hadoop-3 hadoop-4 hadoop-5;do scp -r /data/hadoop hadoop@$i:/data/;done
  10. Configure hadoop environment variables
    On hadoop-1 and hadoop-2:
[hadoop@hadoop-1 hadoop]$ sudo cat /etc/profile.d/hadoop.sh 
export JAVA_HOME=/data/java/jdk1.8.0_171
export JRE_HOME=/data/java/jdk1.8.0_171/jre
export CLASSPATH=./:/data/java/jdk1.8.0_171/lib:/data/java/jdk1.8.0_171/jre/lib
export HADOOP_HOME=/data/hadoop
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

On hadoop-3, hadoop-4, hadoop-5:

[hadoop@hadoop-3 ~]$ sudo cat /etc/profile.d/hadoop.sh 
[sudo] password for hadoop:
export JAVA_HOME=/data/java/jdk1.8.0_171
export JRE_HOME=/data/java/jdk1.8.0_171/jre
export CLASSPATH=./:/data/java/jdk1.8.0_171/lib:/data/java/jdk1.8.0_171/jre/lib
export HADOOP_HOME=/data/hadoop
export ZOOKEEPER_HOME=/data/zookeeper
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin
  11. Format ZKFC (on hadoop-1)
[hadoop@hadoop-1 ~]$ hdfs zkfc -formatZK
  12. Start the journalnodes (on hadoop-3, hadoop-4, hadoop-5)
[hadoop@hadoop-3 ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /data/hadoop/logs/hadoop-hadoop-journalnode-hadoop-3.out
[hadoop@hadoop-3 ~]$ jps
1651 QuorumPeerMain
1812 JournalNode
1866 Jps

[hadoop@hadoop-4 ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /data/hadoop/logs/hadoop-hadoop-journalnode-hadoop-4.out
[hadoop@hadoop-4 ~]$ jps
1973 Jps
1758 QuorumPeerMain
1919 JournalNode

[hadoop@hadoop-5 manager]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /data/hadoop/logs/hadoop-hadoop-journalnode-hadoop-5.out
[hadoop@hadoop-5 manager]$ jps
1814 JournalNode
1868 Jps
1663 QuorumPeerMain
  13. Format HDFS (on hadoop-1)
[hadoop@hadoop-1 ~]$ hdfs namenode -format
  14. Copy the formatted metadata directory from hadoop-1 to hadoop-2
[hadoop@hadoop-1 ~]$ scp -r /data/hadoop/data hadoop@hadoop-2:/data/hadoop/
  15. After initialization the journalnodes can be stopped (on hadoop-3, hadoop-4, hadoop-5)
hadoop-daemon.sh stop journalnode

Starting the Hadoop cluster

Startup steps:

  1. Start the zookeeper cluster (on hadoop-3, hadoop-4, hadoop-5)
    If zookeeper was started during initialization and never stopped, there is no need to start it again here
zkServer.sh start
  2. Start HDFS (on hadoop-1)
    This command starts a NameNode and ZKFC on hadoop-1 and hadoop-2, and a DataNode and JournalNode on hadoop-3, hadoop-4, and hadoop-5, as shown below.
[hadoop@hadoop-1 ~]$ start-dfs.sh
Starting namenodes on [hadoop-1 hadoop-2]
hadoop-1: starting namenode, logging to /data/hadoop/logs/hadoop-hadoop-namenode-hadoop-1.out
hadoop-2: starting namenode, logging to /data/hadoop/logs/hadoop-hadoop-namenode-hadoop-2.out
hadoop-4: starting datanode, logging to /data/hadoop/logs/hadoop-hadoop-datanode-hadoop-4.out
hadoop-5: starting datanode, logging to /data/hadoop/logs/hadoop-hadoop-datanode-hadoop-5.out
hadoop-3: starting datanode, logging to /data/hadoop/logs/hadoop-hadoop-datanode-hadoop-3.out
Starting journal nodes [hadoop-3 hadoop-4 hadoop-5]
hadoop-3: starting journalnode, logging to /data/hadoop/logs/hadoop-hadoop-journalnode-hadoop-3.out
hadoop-5: starting journalnode, logging to /data/hadoop/logs/hadoop-hadoop-journalnode-hadoop-5.out
hadoop-4: starting journalnode, logging to /data/hadoop/logs/hadoop-hadoop-journalnode-hadoop-4.out
Starting ZK Failover Controllers on NN hosts [hadoop-1 hadoop-2]
hadoop-2: starting zkfc, logging to /data/hadoop/logs/hadoop-hadoop-zkfc-hadoop-2.out
hadoop-1: starting zkfc, logging to /data/hadoop/logs/hadoop-hadoop-zkfc-hadoop-1.out
  3. Start YARN (on hadoop-2)
    This command starts the ResourceManager on hadoop-2 and a NodeManager on hadoop-3, hadoop-4, and hadoop-5.
[hadoop@hadoop-2 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /data/hadoop/logs/yarn-hadoop-resourcemanager-hadoop-2.out
hadoop-4: starting nodemanager, logging to /data/hadoop/logs/yarn-hadoop-nodemanager-hadoop-4.out
hadoop-5: starting nodemanager, logging to /data/hadoop/logs/yarn-hadoop-nodemanager-hadoop-5.out
hadoop-3: starting nodemanager, logging to /data/hadoop/logs/yarn-hadoop-nodemanager-hadoop-3.out
  4. Start the second ResourceManager (on hadoop-1, for failover)
[hadoop@hadoop-1 ~]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /data/hadoop/logs/yarn-hadoop-resourcemanager-hadoop-1.out
  5. Start the YARN web proxy (on hadoop-2)
    Note: the proxyserver acts as a firewall and improves the security of access to the cluster
[hadoop@hadoop-2 ~]$ yarn-daemon.sh start proxyserver
starting proxyserver, logging to /data/hadoop/logs/yarn-hadoop-proxyserver-hadoop-2.out
  6. Start the YARN job history service (on hadoop-1)
    Note: yarn-daemon.sh start historyserver is deprecated. The CDH build also seems to have an issue where the mapreduce.jobhistory.address and mapreduce.jobhistory.webapp.address settings in mapred-site.xml take no effect; the actual ports are 10200 and 8188, and the history service can be started on any node without extra configuration.
# Method 1:
[hadoop@hadoop-1 ~]$ mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /data/hadoop/logs/mapred-hadoop-historyserver-hadoop-1.out

# Method 2:
[hadoop@hadoop-1 ~]$ yarn-daemon.sh start historyserver
  7. Check cluster processes
[hadoop@hadoop-1 manager]$ jps
2594 ResourceManager
2152 NameNode
2697 JobHistoryServer
4441 Jps
2794 ApplicationHistoryServer
2478 DFSZKFailoverController

[hadoop@hadoop-2 manager]$ jps
1937 ResourceManager
1666 NameNode
1778 DFSZKFailoverController
2243 WebAppProxyServer
2819 Jps

[hadoop@hadoop-3 manager]$ jps
2146 NodeManager
1651 QuorumPeerMain
2037 JournalNode
2390 Jps
1935 DataNode

[hadoop@hadoop-4 manager]$ jps
2258 NodeManager
2148 JournalNode
2525 Jps
1758 QuorumPeerMain
2046 DataNode

[hadoop@hadoop-5 manager]$ jps
1939 DataNode
2149 NodeManager
2405 Jps
2041 JournalNode
1663 QuorumPeerMain
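Rather than eyeballing each jps listing, the expected daemon set for a role can be diffed against the actual output. A sketch (`missing_daemons` is a hypothetical helper, run here against canned output resembling hadoop-3's listing above; on a real node you would pipe `jps` in):

```shell
# Report expected daemons that are missing from a jps listing read on stdin.
missing_daemons() {
    # $1: comma-separated expected daemon names
    actual=$(awk '{ print $2 }')
    for d in $(echo "$1" | tr ',' ' '); do
        echo "$actual" | grep -qx "$d" || echo "$d"
    done
}

# Canned output resembling hadoop-3's listing above
jps_output="2146 NodeManager
1651 QuorumPeerMain
2037 JournalNode
1935 DataNode"

echo "$jps_output" | missing_daemons "DataNode,NodeManager,JournalNode,QuorumPeerMain"
# prints nothing when every expected daemon is running
```

On hadoop-1 the expected set would instead be NameNode, ResourceManager, DFSZKFailoverController, and JobHistoryServer, matching the listing above.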

Verification via the web UI

  1. HDFS
    hadoop-1: http://192.168.10.51:50070 shows this NameNode in standby state
    (screenshot omitted)
    hadoop-2: http://192.168.10.52:50070 shows this NameNode in active state
    (screenshot omitted)
  2. YARN
    hadoop-1: http://192.168.10.51:8088
    The page is not directly accessible and automatically redirects to hadoop-2:8088
    hadoop-2: http://192.168.10.52:8088
    (screenshot omitted)

Reference (thanks to the author)

https://blog.csdn.net/u010993514/article/details/83009822
