Big Data Platform Setup (2): Hadoop HA Cluster Setup

# Preface

     This chapter covers building the ZooKeeper cluster and the Hadoop HA cluster.

1. Choosing a Hadoop distribution

    1. At the moment there are three main free Hadoop distributions (all from foreign vendors): Apache (the original version, on which all other distributions are based), Cloudera's distribution (Cloudera's Distribution Including Apache Hadoop, CDH), and Hortonworks (Hortonworks Data Platform, HDP). In China the vast majority of users choose CDH.
    2. The paragraph above is quoted from the web. Since CDH is the most widely used, I decided to build the cluster on CDH, but installed from CDH tar packages rather than through Cloudera Manager (CM).
    3. I am not using the plain Apache release because the jars shipped with CDH are more stable and the component versions are already matched for you (see the differences below for more reasons).
    4. I am not installing CDH through CM because it is too automated and opaque for my taste, so the final choice is a CDH 5 tar-package installation.

2. Differences between CDH and the original Apache release

     1. CDH's versioning is very clear, e.g. cdh3, cdh4, and cdh5, whereas Apache versions are much messier; CDH also improves on Apache Hadoop in compatibility, security, and stability.
     2. CDH applies the latest bug fixes and feature patches, usually ahead of the corresponding Apache Hadoop release, so it updates faster than the official Apache line.
     3. Security: CDH supports Kerberos authentication, while stock Apache Hadoop relies on simple username matching.
     4. CDH's documentation is clear; many users of the Apache release still read CDH's installation and upgrade docs.
     5. CDH can be installed from Yum/Apt packages, tar packages (the method used here), RPM packages, or Cloudera Manager; Apache Hadoop only ships tar packages.

3. Choosing a CDH version

| Hadoop ecosystem: CDH 5.9.3 tarballs |
| ------------------------------------ |
| jdk-8u161-linux-x64.tar.gz |
| zookeeper-3.4.5-cdh5.9.3.tar.gz |
| hadoop-2.6.0-cdh5.9.3.tar.gz |
| hive-1.1.0-cdh5.9.3.tar.gz |
| sqoop2-1.99.5-cdh5.9.3.tar.gz |
| hbase-1.2.0-cdh5.9.3.tar.gz |
| ... |

Make sure to download the cdh5.9.3 builds from the official archive: http://archive.cloudera.com/cdh5/cdh/5/
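A minimal download sketch, assuming the tarballs sit directly under the archive URL above and are staged into a hypothetical /home/hadoop/soft directory (the JDK must be downloaded from Oracle separately):

# run on hadoop201; the staging directory is an assumption
mkdir -p /home/hadoop/soft && cd /home/hadoop/soft
BASE=http://archive.cloudera.com/cdh5/cdh/5
wget $BASE/zookeeper-3.4.5-cdh5.9.3.tar.gz
wget $BASE/hadoop-2.6.0-cdh5.9.3.tar.gz
wget $BASE/hive-1.1.0-cdh5.9.3.tar.gz
wget $BASE/sqoop2-1.99.5-cdh5.9.3.tar.gz
wget $BASE/hbase-1.2.0-cdh5.9.3.tar.gz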

4. Cluster plan

| Hostname  | IP            | Installed software | Processes |
| --------- | ------------- | ------------------ | --------- |
| hadoop201 | 192.168.8.201 | jdk, hadoop        | NameNode, JournalNode, DFSZKFailoverController (zkfc), ResourceManager |
| hadoop202 | 192.168.8.202 | jdk, hadoop        | NameNode, JournalNode, DFSZKFailoverController (zkfc), ResourceManager |
| hadoop203 | 192.168.8.203 | jdk, hadoop, zk    | DataNode, NodeManager, QuorumPeerMain |
| hadoop204 | 192.168.8.204 | jdk, hadoop, zk    | DataNode, NodeManager, QuorumPeerMain |
| hadoop205 | 192.168.8.205 | jdk, hadoop, zk    | DataNode, NodeManager, QuorumPeerMain |
  DFSZKFailoverController: monitors and manages a NameNode; it must run on the same node as that NameNode.
  JournalNode: stores NameNode state, including the edits files.
  QuorumPeerMain: the ZooKeeper process. Why ZooKeeper? Mainly to use its leader election, automatic failover, and heartbeat monitoring to guarantee high availability.
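Every node must be able to resolve the others by hostname, so /etc/hosts (or DNS) on each machine should contain the mappings from the planning table; a sketch of the entries to append on every node:

# /etc/hosts on every node (IPs and hostnames from the planning table)
192.168.8.201 hadoop201
192.168.8.202 hadoop202
192.168.8.203 hadoop203
192.168.8.204 hadoop204
192.168.8.205 hadoop205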

5. HDFS HA architecture

(figure: HDFS HA architecture diagram)

   Following the official documentation, the QJM (Quorum Journal Manager) based HA scheme is used for this installation:

(figure: QJM-based HA deployment, from the official docs)

6. Setting up the ZooKeeper cluster

    1. Install on 203: unpack the ZooKeeper tarball and locate zoo.cfg; there is normally a zoo_sample.cfg that just needs to be renamed.
    2. vim zoo.cfg; the main changes are the data directory and the communication/election ports of the three cluster nodes:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
# data directory
dataDir=/home/hadoop/zookeeper/data
# transaction log directory (optional)
#dataLogDir=/home/hadoop/zookeeper/zkdatalog
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
# server.<id>=<hostname>:<peer sync/communication port>:<leader election port>
server.3=hadoop203:2888:3888
server.4=hadoop204:2888:3888
server.5=hadoop205:2888:3888


    3. scp the configured zookeeper directory to 204 and 205.
    4. On each of the three nodes, cd into the data directory /home/hadoop/zookeeper/data and create a myid file containing that node's id (3, 4, or 5), matching the ids configured in zoo.cfg.
    5. Start ZooKeeper: /home/hadoop/zookeeper/bin/zkServer.sh start
     6. Verify: jps should show the QuorumPeerMain process on each node, and zkServer.sh status should report one leader and two followers; that means the cluster is up. If in doubt, kill one ZooKeeper instance and watch the roles change. (The commands are collected in the sketch below.)
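Steps 3-6 condensed into a command sketch, assuming ZooKeeper is unpacked to /home/hadoop/zookeeper as in the zoo.cfg above:

# on hadoop203: push the configured directory to the other two nodes
scp -r /home/hadoop/zookeeper hadoop204:/home/hadoop/
scp -r /home/hadoop/zookeeper hadoop205:/home/hadoop/

# on each node: write its own id into myid (3 on 203, 4 on 204, 5 on 205)
mkdir -p /home/hadoop/zookeeper/data
echo 3 > /home/hadoop/zookeeper/data/myid    # use 4 / 5 on the other nodes

# start and verify on every node
/home/hadoop/zookeeper/bin/zkServer.sh start
/home/hadoop/zookeeper/bin/zkServer.sh status   # expect one leader, two followers
jps                                             # QuorumPeerMain should appear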


7. Setting up the Hadoop cluster

    1. Install the NppFTP plugin for Notepad++, so the configuration files on the servers can be opened and edited directly in Notepad++ (purely for convenience).
    2. Download and unpack the CDH 5.9.3 build of Hadoop, install it on 201, and set the environment variables.
    3. Configure HDFS: four files in total, following the official documentation.
     hadoop-env.sh: mainly set JAVA_HOME (a sketch of this and the environment variables from step 2 follows).
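A sketch of the relevant settings, assuming the JDK is unpacked to /home/hadoop/jdk1.8.0_161 and Hadoop to /home/hadoop/hadoop (both paths are assumptions; adjust to your layout):

# etc/hadoop/hadoop-env.sh: point Hadoop at the JDK explicitly
export JAVA_HOME=/home/hadoop/jdk1.8.0_161

# ~/.bash_profile on each node: environment variables mentioned in step 2
export JAVA_HOME=/home/hadoop/jdk1.8.0_161
export HADOOP_HOME=/home/hadoop/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin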
     core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<!-- Set the HDFS nameservice name to mycluster -->
	<property>
	  <name>fs.defaultFS</name>
	  <value>hdfs://mycluster</value>
	</property>
	<!-- Hadoop temporary directory -->
	<property>  
		<name>hadoop.tmp.dir</name>  
		<value>/home/hadoop/hadoop/tmp</value>  
	</property>
	<!-- Local directory for JournalNode state (also set in hdfs-site.xml below; the hdfs-site.xml value takes precedence for the HDFS daemons) -->
	<property>
	  <name>dfs.journalnode.edits.dir</name>
	  <value>/home/hadoop/journalnodeTmp</value>
	</property>
	<!-- ZooKeeper quorum used for automatic NameNode failover -->
	<property>
	   <name>ha.zookeeper.quorum</name>
	   <value>hadoop203:2181,hadoop204:2181,hadoop205:2181</value>
	 </property>
</configuration>

     hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<!-- Name of the nameservice: mycluster -->
	<property>
	  <name>dfs.nameservices</name>
	  <value>mycluster</value>
	</property>
	<!-- NameNodes under this nameservice -->
	<property>
	  <name>dfs.ha.namenodes.mycluster</name>
	  <value>nn1,nn2</value>
	</property>
	<!-- NameNode RPC addresses -->
	<property>
	  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
	  <value>hadoop201:9000</value>
	</property>
	<property>
	  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
	  <value>hadoop202:9000</value>
	</property>
	<!-- NameNode HTTP (web UI) addresses -->
	<property>
	  <name>dfs.namenode.http-address.mycluster.nn1</name>
	  <value>hadoop201:50070</value>
	</property>
	<property>
	  <name>dfs.namenode.http-address.mycluster.nn2</name>
	  <value>hadoop202:50070</value>
	</property>
	<!-- JournalNode quorum URI that the NameNodes use for the shared edits log -->
	<property>
	  <name>dfs.namenode.shared.edits.dir</name>
	  <value>qjournal://hadoop201:8485;hadoop202:8485/mycluster</value>
	</property>
	<property>
	  <name>dfs.journalnode.edits.dir</name>
	  <value>/home/hadoop/hadoop/journal</value>
	</property>
	<!-- Class that clients use to locate the active NameNode (failover proxy provider) -->
	<property>
	  <name>dfs.client.failover.proxy.provider.mycluster</name>
	  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	<!-- Fencing methods. They cover two main scenarios:
			1. nn1 misbehaves but has not fully gone down
			2. nn1 and its zkfc die at the same time (so nothing can report to ZooKeeper) -->
	<property>
      <name>dfs.ha.fencing.methods</name>
      <value>
			sshfence
			shell(/bin/true)
	  </value>
    </property>
	<!-- sshfence requires passwordless SSH; point it at the private key -->
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
	<!-- sshfence connection timeout (ms) -->
	<property>
      <name>dfs.ha.fencing.ssh.connect-timeout</name>
      <value>30000</value>
    </property>
	<!-- Enable automatic failover -->
	<property>
	   <name>dfs.ha.automatic-failover.enabled</name>
	   <value>true</value>
	 </property>
</configuration>

     slaves: list the DataNode hostnames, one per line.
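Based on the planning table (DataNodes on 203-205), etc/hadoop/slaves would contain:

hadoop203
hadoop204
hadoop205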
    4. Configure YARN: two files, following the official documentation.
     mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<!-- Run MapReduce on YARN -->
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
</configuration>

     yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
	<!-- Enable ResourceManager HA -->
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
	    <value>true</value>
	 </property>
	 <!-- Automatic failover for the ResourceManager -->
	 <property>  
		<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>  
		<value>true</value>  
     </property> 
	 <property>  
           <name>yarn.resourcemanager.recovery.enabled</name>  
          <value>true</value>  
     </property>
	 <!-- Unique id of this YARN cluster -->
	 <property>
	    <name>yarn.resourcemanager.cluster-id</name>
	    <value>cluster1</value>
	 </property>
	 <property>
	    <name>yarn.resourcemanager.ha.rm-ids</name>
	    <value>rm1,rm2</value>
	 </property>
	 <!-- ResourceManager hostnames -->
	 <property>
	    <name>yarn.resourcemanager.hostname.rm1</name>
	    <value>hadoop201</value>
	 </property>
	 <property>
	    <name>yarn.resourcemanager.hostname.rm2</name>
	    <value>hadoop202</value>
	 </property>
	 <!-- rm1 address / port -->
	 <property>  
           <name>yarn.resourcemanager.address.rm1</name>  
          <value>hadoop201:8032</value>  
     </property>   
	 <!-- rm1 scheduler port -->
     <property>  
          <name>yarn.resourcemanager.scheduler.address.rm1</name>  
          <value>hadoop201:8034</value>  
     </property>  
     <!-- rm1 webapp port -->
     <property>  
          <name>yarn.resourcemanager.webapp.address.rm1</name>  
          <value>hadoop201:8088</value>  
     </property>
	<!-- rm2 address / port -->
	 <property>  
           <name>yarn.resourcemanager.address.rm2</name>  
          <value>hadoop202:8032</value>  
     </property>   
	 <!-- rm2 scheduler port -->
     <property>  
          <name>yarn.resourcemanager.scheduler.address.rm2</name>  
          <value>hadoop202:8034</value>  
     </property>  
     <!-- rm2 webapp port -->
     <property>  
          <name>yarn.resourcemanager.webapp.address.rm2</name>  
          <value>hadoop202:8088</value>  
     </property>  
	 <!-- ZooKeeper quorum address -->
	 <property>
	    <name>yarn.resourcemanager.zk-address</name>
	    <value>hadoop203:2181,hadoop204:2181,hadoop205:2181</value>
	 </property>
	 <property>  
          <name>yarn.resourcemanager.zk.state-store.address</name>  
           <value>hadoop203:2181,hadoop204:2181,hadoop205:2181</value>  
     </property> 
	 <!-- Shuffle service needed to run MapReduce -->
	 <property>  
           <name>yarn.nodemanager.aux-services</name>  
          <value>mapreduce_shuffle</value>  
     </property>  
     <property>  
           <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>  
          <value>org.apache.hadoop.mapred.ShuffleHandler</value>  
     </property>
</configuration>

For more configurable properties, consult the corresponding *-default.xml files.
At this point the configuration work is essentially done; the remaining tasks are distributing the configuration to all nodes (a sketch follows) and then initializing, starting, and stopping the cluster.
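The distribution step is not spelled out above, but as problem 3 in section 12 shows, every node must carry identical configuration. A sketch, assuming Hadoop is installed at /home/hadoop/hadoop on every node:

# on hadoop201: push the configured Hadoop directory to the other nodes
for host in hadoop202 hadoop203 hadoop204 hadoop205; do
  scp -r /home/hadoop/hadoop ${host}:/home/hadoop/
done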


## 8. Initializing the Hadoop cluster

    1. Start ZooKeeper on 203, 204, and 205.
    2. Start the JournalNodes: on 201 and 202 run hadoop-daemon.sh start journalnode.
    3. Format HDFS: on 201 run hdfs namenode -format.
    4. Format the HA state in ZooKeeper: on 201 run hdfs zkfc -formatZK; afterwards an extra HA znode appears under the ZooKeeper root.
    5. Sync the NameNode metadata:
     On 201 run: hdfs namenode
     On 202 run: hdfs namenode -bootstrapStandby. When the sync finishes, Ctrl+C the process running on 201.
    6. Start HDFS: on 201 run sbin/start-dfs.sh.
    7. Start YARN:
     On 201 run sbin/start-yarn.sh.
     On 202 run sbin/yarn-daemon.sh start resourcemanager (starts the second ResourceManager separately).
    8. The processes on each node should now match the planning table (a consolidated command sketch is given at the end of this section).


    9. Test HDFS HA: kill one NameNode process, then watch the active/standby roles of 201 and 202 switch at http://hadoop201:50070/ and http://hadoop202:50070/ (not shown here).
    10. Test HDFS and YARN:
     Create a local file: vim a.txt
      Upload it to HDFS:
      hdfs dfs -mkdir /test
      hdfs dfs -put a.txt /test
      hdfs dfs -ls /test/


    Test YARN: check the ResourceManager state: bin/yarn rmadmin -getServiceState rm1
    Run a MapReduce job using the wordcount example:
     hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.9.3.jar wordcount /test/a.txt /test/out/


     The job's counters and word-count statistics can then be checked in the console output and under /test/out/.
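The initialization steps above, condensed into one sketch (paths assume ZooKeeper in /home/hadoop/zookeeper and Hadoop in $HADOOP_HOME=/home/hadoop/hadoop; both are assumptions):

# 1. start ZooKeeper on 203/204/205
/home/hadoop/zookeeper/bin/zkServer.sh start

# 2. start the JournalNodes on 201 and 202
$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode

# 3-4. on 201: format HDFS and the HA znode in ZooKeeper
hdfs namenode -format
hdfs zkfc -formatZK

# 5. sync metadata: run "hdfs namenode" on 201, then on 202:
hdfs namenode -bootstrapStandby      # Ctrl+C the process on 201 afterwards

# 6-7. start HDFS and YARN
$HADOOP_HOME/sbin/start-dfs.sh       # on 201
$HADOOP_HOME/sbin/start-yarn.sh      # on 201
$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager   # on 202

# quick HA checks
hdfs haadmin -getServiceState nn1    # expect one active, one standby
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1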

9. Cluster startup order

    1. Start ZooKeeper: on 203, 204, and 205 run ./zkServer.sh start from the bin directory.
    2. Start HDFS: on 201 run sbin/start-dfs.sh.
    3. Start YARN:
     201: sbin/start-yarn.sh
     202: sbin/yarn-daemon.sh start resourcemanager

10. Cluster shutdown order

    1. Stop YARN:
     On 201 run sbin/stop-yarn.sh
     On 202 run sbin/yarn-daemon.sh stop resourcemanager
    2. Stop HDFS: on 201 run sbin/stop-dfs.sh.
    3. Stop ZooKeeper: on 203, 204, and 205 run ./zkServer.sh stop.

11. Observing the DFS start and stop order

    Watch the console output of start-dfs.sh and stop-dfs.sh to see the order in which the NameNodes, DataNodes, JournalNodes, and ZKFC daemons are started and stopped.

12. Problems encountered during setup

    1. Linux user/permission problems (not an issue if you do everything as root):
      chown -R hadoop zookeeper/   (recursively change the directory's owner)
      chgrp -R hadoop zookeeper/   (recursively change the directory's group)
    2. Formatting HDFS reports "connection refused".
      Cause: the JournalNode processes were not running when the NameNode was formatted.
      Fix: on the NameNode nodes, start the JournalNodes first, then format.
    3. After starting DFS from 201, the NameNode on 202 does not come up; its log shows "connection refused".
      Cause: core-site.xml on 202 did not set the Hadoop temporary directory, which must match the one configured on 201.
       Fix: configure the same directory in core-site.xml on every node, delete the tmp directory, and re-format.

Summary:

     This article covered setting up the ZooKeeper cluster and the Hadoop HA cluster. Looking back, most of the work is in the configuration files; everything else is fairly simple. The next chapter continues the series.

Next: Big Data Platform Setup (3): Hive introduction, installation, and configuration
