YARN High-Availability Cluster Deployment
1. Deploying the YARN high-availability cluster
Edit the mapred-site.xml file
[hadoop@server1 hadoop]$ vim etc/hadoop/mapred-site.xml
<configuration>
<!-- Use YARN as the MapReduce framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit the yarn-site.xml file
[hadoop@server1 hadoop]$ cat etc/hadoop/yarn-site.xml
<configuration>
<!-- Enable the shuffle auxiliary service so that MapReduce jobs can run on the NodeManagers -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable ResourceManager (RM) high availability -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Specify the RM cluster ID -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>RM_CLUSTER</value>
</property>
<!-- Define the RM node IDs -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Address of rm1 -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>172.25.3.1</value>
</property>
<!-- Address of rm2 -->
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>172.25.3.5</value>
</property>
<!-- Enable automatic RM recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Choose how RM state is stored, for example MemStore or ZKStore -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<!-- When ZooKeeper is used as the store, list the ZooKeeper ensemble addresses -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>172.25.3.2:2181,172.25.3.3:2181,172.25.3.4:2181</value>
</property>
</configuration>
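Before starting the daemons it can help to confirm that the HA settings really are in the file. The following is a minimal sketch, not part of the original setup: get_prop is a hypothetical helper name, and it assumes each <name> and <value> element sits on a single line, as in the configuration above. It is not a general XML parser.

```shell
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Assumption: every <name> and <value> element is on one line of its own.
get_prop() {
    awk -v key="$2" '
        # Remember that the wanted <name> line has been seen.
        index($0, "<name>" key "</name>") { found = 1; next }
        # The first <value> line after it holds the setting.
        found && /<value>/ {
            sub(/^.*<value>/, "")
            sub(/<\/value>.*$/, "")
            print
            exit
        }
    ' "$1"
}
```

For example, get_prop etc/hadoop/yarn-site.xml yarn.resourcemanager.ha.rm-ids should print rm1,rm2 for the file shown here.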
On server1, start the ResourceManagers for the cluster
[hadoop@server1 hadoop]$ sbin/start-yarn.sh
Starting resourcemanagers on [ 172.25.3.1 172.25.3.5]
Starting nodemanagers
[hadoop@server1 hadoop]$ jps
17602 DFSZKFailoverController
18914 ResourceManager
17779 NameNode
19226 Jps
Processes on server5
[hadoop@server5 hadoop]$ jps
15840 ResourceManager
4342 DFSZKFailoverController
16153 Jps
4687 NameNode
server2, server3 and server4 each run a NodeManager (server3 shown here)
[hadoop@server3 hadoop]$ jps
5649 Jps
4338 QuorumPeerMain
5111 DataNode
4940 JournalNode
5551 NodeManager
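The process listings above can also be checked mechanically instead of by eye. This is a small sketch under the assumption that the input is in the usual "PID Name" format printed by jps; has_daemon is a hypothetical helper name, not a Hadoop tool.

```shell
# has_daemon NAME: succeed if jps-style output on stdin lists a JVM
# whose name (second column of "PID Name") is exactly NAME.
has_daemon() {
    awk -v want="$1" '$2 == want { ok = 1 } END { exit (ok ? 0 : 1) }'
}
```

On a worker node this could be used as: jps | has_daemon NodeManager && echo "NodeManager is running"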
Open 172.25.3.1:8088 in a browser.
server1 is active.
The YARN cluster information shows rm1 as the active node.
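Besides the web UI, the state of each ResourceManager can be queried from the command line with yarn rmadmin -getServiceState, which prints active or standby. The wrapper below is only a sketch: rm_states and the YARN_CMD override are hypothetical names introduced here, and the default bin/yarn path assumes the same working directory as the commands above.

```shell
# Hypothetical helper: print "<rm-id> <state>" for each ResourceManager ID given.
# YARN_CMD is an assumed override for the yarn launcher; it defaults to bin/yarn.
rm_states() {
    for id in "$@"; do
        # An RM that cannot be reached leaves the output empty.
        state=$("${YARN_CMD:-bin/yarn}" rmadmin -getServiceState "$id" 2>/dev/null) || state=""
        echo "$id ${state:-unreachable}"
    done
}
```

With the configuration in this walkthrough, rm_states rm1 rm2 would be expected to report rm1 as active and rm2 as standby at this point.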
2. YARN failover
Kill the ResourceManager process on server1
[hadoop@server1 hadoop]$ jps
17602 DFSZKFailoverController
18914 ResourceManager
17779 NameNode
19305 Jps
[hadoop@server1 hadoop]$ kill 18914
[hadoop@server1 hadoop]$ jps
17602 DFSZKFailoverController
17779 NameNode
19305 Jps
Check the failover: server5 is now active.
The cluster's active node has switched to rm2.
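Failover is not instantaneous, so a scripted check may need to poll for a short while after the kill. The following is a rough sketch under the same assumptions as the rest of this walkthrough: wait_active and YARN_CMD are hypothetical names, and the default retry count and delay are arbitrary values, not recommendations.

```shell
# Hypothetical helper: poll until RM_ID reports "active" or the tries run out.
# Usage: wait_active RM_ID [TRIES] [DELAY_SECONDS]
wait_active() {
    rm_id=$1
    tries=${2:-10}
    delay=${3:-2}
    while [ "$tries" -gt 0 ]; do
        state=$("${YARN_CMD:-bin/yarn}" rmadmin -getServiceState "$rm_id" 2>/dev/null) || state=""
        if [ "$state" = "active" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

After killing the ResourceManager on server1, something like wait_active rm2 && echo "rm2 took over" could confirm the switch without opening the web UI.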
Restart the ResourceManager on server1
[hadoop@server1 hadoop]$ bin/yarn --daemon start resourcemanager
[hadoop@server1 hadoop]$ jps
19424 Jps
17602 DFSZKFailoverController
17779 NameNode
19372 ResourceManager
server1 now reports the standby state.