1. Advantages
ResourceManager HA removes the ResourceManager single point of failure in YARN. The ZooKeeper-based ActiveStandbyElector decides which ResourceManager should be Active.
2. Modifying yarn-site.xml
On master, edit yarn-site.xml: replace the properties that already exist and add the ones that are missing.
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>rm-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>master</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>slave1</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
On slave1, change the value of yarn.resourcemanager.ha.id to rm2. On every other node, simply delete this property.
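The per-node difference described above can be scripted instead of edited by hand. The following is a minimal sketch, assuming the file is a standard Hadoop `<configuration>` document; the helper name `set_ha_id` is hypothetical and not part of Hadoop.

```python
import xml.etree.ElementTree as ET

HA_ID = "yarn.resourcemanager.ha.id"


def set_ha_id(xml_text, rm_id):
    """Return yarn-site.xml text with yarn.resourcemanager.ha.id set to rm_id.

    Passing rm_id=None removes the property, which is what nodes that do
    not run a ResourceManager need.
    """
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == HA_ID:
            if rm_id is None:
                root.remove(prop)
            else:
                value = prop.find("value")
                if value is None:
                    value = ET.SubElement(prop, "value")
                value.text = rm_id
            break
    else:
        # Property not present yet: add it only when an id is requested.
        if rm_id is not None:
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = HA_ID
            ET.SubElement(prop, "value").text = rm_id
    return ET.tostring(root, encoding="unicode")
```

For example, `set_ha_id(master_xml, "rm2")` would give the text for slave1 and `set_ha_id(master_xml, None)` the text for slave2, where `master_xml` is the content of master's file. Note that ElementTree drops the XML declaration, the stylesheet instruction, and comments, so the output is functionally equivalent but not byte-identical.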
3. Modifying mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
</configuration>
Note: if startup fails with the error org.apache.hadoop.yarn.server.resourcemanager.recovery .ZKRMStateStore not found, see: https://editor.csdn.net/md/?articleId=136384732
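One possible cause of a class-not-found error like this is a space or line break inside the `<value>` of yarn.resourcemanager.store.class. This is an assumption, not the linked article's diagnosis. The sketch below lists properties whose value contains inner whitespace so they can be checked by eye; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def values_with_inner_whitespace(xml_text):
    """Return the names of properties whose <value> has whitespace
    between non-blank characters, e.g. a class name split over two lines.
    Leading and trailing whitespace is ignored."""
    suspicious = []
    for prop in ET.fromstring(xml_text).findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if len(value.split()) > 1:
            suspicious.append(name)
    return suspicious
```

A hit is only a hint: some Hadoop properties, such as JVM option strings, legitimately contain spaces, whereas class names and host lists should not.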
4. Starting ResourceManager HA
4.1 Start on the master node
[root@master ~]# cd /home/hadoop/hadoop-2.7.3/sbin
[root@master sbin]# start-dfs.sh
[root@master sbin]# start-yarn.sh
4.2 Start on the slave1 node
[root@slave1 ~]# yarn-daemon.sh start resourcemanager
[root@master ~]# jps
30288 Jps
2164 QuorumPeerMain
29460 NameNode
29689 SecondaryNameNode
29897 ResourceManager
[root@slave1 ~]# jps
2563 DataNode
2692 NodeManager
2886 ResourceManager
1080 QuorumPeerMain
2942 Jps
[root@slave2 ~]# jps
1082 QuorumPeerMain
2683 DataNode
2812 NodeManager
2988 Jps
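The jps listings above can also be checked automatically. This is a minimal sketch that parses jps text, assuming the plain "<pid> <name>" format shown above. How the text is collected from each node (for example over ssh) is left out, and both function names are hypothetical.

```python
def running_daemons(jps_output):
    """Parse `jps` output ("<pid> <name>" per line) into a set of names,
    skipping the Jps process itself."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit() and parts[1] != "Jps":
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected):
    """Return the expected daemon names that do not appear in the output."""
    return sorted(set(expected) - running_daemons(jps_output))
```

With the slave1 listing above, `missing_daemons(text, ["ResourceManager", "NodeManager"])` returns an empty list. The same call on the slave2 listing reports `ResourceManager`, which is expected because slave2 does not run one.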
5. Viewing the Hadoop YARN HA cluster in a browser
The master node is in the Active state.
slave1 is in the Standby state.
6. Checking the RM state from the command line
[root@master sbin]# yarn rmadmin -getServiceState rm2
standby
[root@master sbin]# yarn rmadmin -getServiceState rm1
active
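The two checks above can be wrapped in a small script. This is a sketch that assumes the `yarn` command is on the PATH and that the RM ids are rm1 and rm2 as configured earlier. The "unreachable" label and the function names are illustrative choices, not Hadoop output.

```python
import subprocess


def get_rm_state(rm_id):
    """Run `yarn rmadmin -getServiceState <rm_id>` and return the state
    it prints, or "unreachable" if the command fails."""
    try:
        result = subprocess.run(
            ["yarn", "rmadmin", "-getServiceState", rm_id],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        # The yarn executable could not be started at all.
        return "unreachable"
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return "unreachable"
    return lines[-1].strip()


def summarize(states):
    """Given {rm_id: state}, return (active_ids, healthy), where healthy
    means exactly one ResourceManager reports "active"."""
    active = sorted(rm for rm, state in states.items() if state == "active")
    return active, len(active) == 1
```

A typical call would be `summarize({rm: get_rm_state(rm) for rm in ("rm1", "rm2")})`. Keeping `summarize` free of subprocess calls makes it easy to test without a running cluster.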
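7. Manual active/standby switchover
Note: when automatic failover is enabled (the default once HA is enabled), yarn rmadmin rejects manual transitions unless --forcemanual is added. Use that flag with caution.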
# yarn rmadmin -transitionToStandby rm1
# yarn rmadmin -transitionToActive rm2
8. Stopping the ResourceManager on master and checking the RM state
[root@master ~]# yarn-daemon.sh stop resourcemanager
Once the ResourceManager on master is stopped, the browser reports that http://master:8088/ cannot be reached.
The ResourceManager on slave1 automatically becomes the active one.
[root@master sbin]# yarn rmadmin -getServiceState rm2
active
When the ResourceManager on master is restarted, it automatically comes up as standby.
[root@master ~]# yarn-daemon.sh start resourcemanager
[root@master ~]# yarn rmadmin -getServiceState rm1
standby
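Failover is not instantaneous, so a scripted version of the test above would normally poll until the expected state appears. The sketch below shows one way to do that. `get_state` is any callable that returns the state string for an RM id (for instance a thin wrapper around `yarn rmadmin -getServiceState`), and the function name and default timings are assumptions.

```python
import time


def wait_for_state(get_state, rm_id, wanted,
                   timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll get_state(rm_id) until it returns `wanted`.

    Returns True as soon as the state matches, or False once `timeout`
    seconds have passed without a match. The state is always checked at
    least once, even with timeout=0.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_state(rm_id) == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

After stopping the ResourceManager on master, `wait_for_state(get_state, "rm2", "active")` would block until slave1 has taken over or the timeout expires. The `sleep` parameter exists only so tests can replace the real delay.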
9. Viewing the ZNode information
[root@master ~]# zkServer.sh start
[root@master ~]# zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, yarn-leader-election, rmstore]
[zk: localhost:2181(CONNECTED) 1] ls /rmstore
[ZKRMStateRoot]