1. Introduction to the YARN HA architecture
2. Configure yarn-site.xml
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>10000</value>
</property>
<!-- Not used when HA is enabled -->
<!--
<property>
<name>yarn.resourcemanager.hostname</name>
<value>bigdata-pro01.kfk.com</value>
</property>
-->
<!-- HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- cluster-id changed from cluster1 to rs -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>rs</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>bigdata-pro01.kfk.com</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>bigdata-pro02.kfk.com</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>bigdata-pro01.kfk.com:2181,bigdata-pro02.kfk.com:2181,bigdata-pro03.kfk.com:2181</value>
</property>
<!-- default is false-->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- default is FileSystemRMStateStore-->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
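
Each ID listed in yarn.resourcemanager.ha.rm-ids needs a matching yarn.resourcemanager.hostname.<id> entry. The following is a minimal sketch for checking that before the file is distributed. The helper names get_prop and check_rm_hosts and the CONF variable are hypothetical, and the parsing is a simple line-based match that assumes the layout shown above (it is not a real XML parser).

```shell
# Sketch only: get_prop, check_rm_hosts and CONF are illustrative names, not part of Hadoop.
# Assumes <name> and <value> sit on the same or consecutive lines, as in the listing above.

# Print the value of property $1 found in file $2.
get_prop() {
  awk -v name="$1" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$2"
}

# For every id in ha.rm-ids, print "id -> hostname" or fail if the hostname is missing.
check_rm_hosts() {
  conf="$1"
  ids=$(get_prop yarn.resourcemanager.ha.rm-ids "$conf")
  for id in $(echo "$ids" | tr ',' ' '); do
    host=$(get_prop "yarn.resourcemanager.hostname.$id" "$conf")
    if [ -z "$host" ]; then
      echo "missing hostname for $id"
      return 1
    fi
    echo "$id -> $host"
  done
}

# Only runs when the file exists (assumed to be run from the Hadoop home directory).
CONF="${CONF:-etc/hadoop/yarn-site.xml}"
if [ -f "$CONF" ]; then
  check_rm_hosts "$CONF"
fi
```

With the configuration above, the expected output would be one line per ResourceManager ID, each mapped to its host name.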
3. Distribute the configuration
scp -r etc/hadoop/yarn-site.xml bigdata-pro02.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
scp -r etc/hadoop/yarn-site.xml bigdata-pro03.kfk.com:/opt/modules/hadoop-2.5.0/etc/hadoop/
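
The two copy commands above can also be written as a loop, which is easier to extend if more nodes are added. This is a sketch under assumptions: it is run from the Hadoop home directory on bigdata-pro01, passwordless SSH is already set up, and the RUN switch and the distribute_conf function are hypothetical names introduced here. By default it only prints the commands.

```shell
# Sketch only: distribute_conf and RUN are illustrative, not part of Hadoop.
REMOTE_HADOOP_HOME=/opt/modules/hadoop-2.5.0
CONF_FILE=etc/hadoop/yarn-site.xml
TARGET_HOSTS="bigdata-pro02.kfk.com bigdata-pro03.kfk.com"

# Copy yarn-site.xml to every target host; print the commands unless RUN=1.
distribute_conf() {
  for host in $TARGET_HOSTS; do
    dest="$host:$REMOTE_HADOOP_HOME/etc/hadoop/"
    if [ "${RUN:-0}" = "1" ]; then
      scp "$CONF_FILE" "$dest" || return 1
    else
      echo "scp $CONF_FILE $dest"
    fi
  done
}

distribute_conf
```

Setting RUN=1 in the environment performs the actual copy; without it the script is a dry run.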
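4. Start the services
Start the ResourceManager on machines 1 and 2, and the NodeManager on machine 3.
In practice, if sbin/start-all.sh was run first, machine 1 already has its ResourceManager running, and the NodeManagers on machines 1, 2, and 3 have all been started as well.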
[kfk@bigdata-pro01 hadoop-2.5.0]$ sbin/yarn-daemon.sh start resourcemanager
[kfk@bigdata-pro02 hadoop-2.5.0]$ sbin/yarn-daemon.sh start resourcemanager
[kfk@bigdata-pro03 hadoop-2.5.0]$ sbin/yarn-daemon.sh start nodemanager
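
After starting the daemons, jps can confirm that they are actually up on each machine. The sketch below compares jps output against a list of expected daemon names. The function missing_daemons is a hypothetical helper, and the expected list should be adjusted per machine (for example, only NodeManager on machine 3).

```shell
# Sketch only: missing_daemons is an illustrative helper, not part of Hadoop.
# $1 = jps output, remaining arguments = daemon names that should be present.
# Prints the name of every expected daemon that is not in the jps output.
missing_daemons() {
  jps_out="$1"
  shift
  for d in "$@"; do
    echo "$jps_out" | awk -v d="$d" '$2 == d { f = 1 } END { exit !f }' || echo "$d"
  done
}

# Only runs where jps is available (it ships with the JDK).
if command -v jps >/dev/null 2>&1; then
  missing=$(missing_daemons "$(jps)" ResourceManager NodeManager)
  if [ -z "$missing" ]; then
    echo "all expected daemons are running"
  else
    echo "not running: $missing"
  fi
fi
```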
5. Check the result in the web UI
http://bigdata-pro01.kfk.com:8088/cluster/cluster
One ResourceManager is active and the other is standby; the active one is elected automatically through ZooKeeper.
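
The same state can be read from the command line with yarn rmadmin -getServiceState, which prints active or standby for a given ResourceManager ID. The following is a minimal sketch that checks for exactly one active and one standby. The function names get_state and check_ha and the YARN_BIN variable are hypothetical, and the script assumes it is run from the Hadoop home directory with the rm1 and rm2 IDs configured above.

```shell
# Sketch only: get_state, check_ha and YARN_BIN are illustrative names.
YARN_BIN="${YARN_BIN:-bin/yarn}"

# Ask the given ResourceManager id for its HA state (prints "active" or "standby").
get_state() {
  "$YARN_BIN" rmadmin -getServiceState "$1" 2>/dev/null
}

# $1 = state of rm1, $2 = state of rm2. Succeeds only for one active plus one standby.
check_ha() {
  case "$1,$2" in
    active,standby|standby,active)
      echo "OK: rm1=$1 rm2=$2"
      ;;
    *)
      echo "UNEXPECTED: rm1=$1 rm2=$2"
      return 1
      ;;
  esac
}

# Only queries the cluster when the yarn launcher is present.
if [ -x "$YARN_BIN" ]; then
  check_ha "$(get_state rm1)" "$(get_state rm2)"
fi
```

If a ResourceManager is down, get_state returns an empty string and the check reports the pair as unexpected.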
6. Test automatic failover
[kfk@bigdata-pro01 hadoop-2.5.0]$ sbin/yarn-daemon.sh start resourcemanager
resourcemanager running as process 13247. Stop it first.
[kfk@bigdata-pro01 hadoop-2.5.0]$ jps
13360 NodeManager
13072 JournalNode
11156 QuorumPeerMain
11780 DFSZKFailoverController
12884 DataNode
12777 NameNode
13693 Jps
13247 ResourceManager
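Once the input file and directories have been created, run a MapReduce job. While the job is in the map phase, kill the active ResourceManager process (13247 ResourceManager).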
[kfk@bigdata-pro01 hadoop-2.5.0]$ bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount /user/kfk/data/wc.input /user/kfk/data/output/1
[kfk@bigdata-pro01 hadoop-2.5.0]$ kill -9 13247
The client can no longer connect to rm1 and fails over to rm2.
The job eventually completes successfully.
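
When repeating the experiment, the ResourceManager PID changes on every restart, so it can be convenient to extract it from jps instead of reading it by eye. The sketch below does only that and prints the kill command rather than running it. The function rm_pid is a hypothetical helper introduced for illustration.

```shell
# Sketch only: rm_pid is an illustrative helper, not part of Hadoop.
# Reads jps output on stdin and prints the PID of the ResourceManager, if any.
rm_pid() {
  awk '$2 == "ResourceManager" { print $1 }'
}

# Only runs where jps is available; nothing is killed, the command is just printed.
if command -v jps >/dev/null 2>&1; then
  pid=$(jps | rm_pid)
  if [ -n "$pid" ]; then
    echo "To simulate a crash, run: kill -9 $pid"
  else
    echo "No ResourceManager is running on this machine"
  fi
fi
```

Run against the jps listing shown above, rm_pid would yield 13247, the PID used in the kill command.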