YARN HA Deployment
The yarn-site.xml file:
<!-- NodeManager auxiliary service used by the MapReduce shuffle -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Declare the cluster ID and the addresses of the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster1</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>bigdata02.com</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>bigdata03.com</value>
</property>
<!-- Address of the ZooKeeper ensemble -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>bigdata01.com:2181,bigdata02.com:2181,bigdata03.com:2181</value>
</property>
<!-- Enable automatic recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Store the ResourceManager state in the ZooKeeper ensemble -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
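Before the file is copied out, it can help to confirm that every HA-related property listed above is actually present. The following is only a minimal sketch and not part of the original procedure: the function name `missing_ha_props` is hypothetical, and the check is a plain text match on `<name>` elements rather than real XML parsing, so it assumes each name sits on one line without extra whitespace.

```shell
#!/bin/sh
# Sketch: print the HA-related property names that are absent from a yarn-site.xml.
# missing_ha_props is a hypothetical helper; it matches text, it does not parse XML.

missing_ha_props() {
  file="$1"
  for prop in \
    yarn.resourcemanager.ha.enabled \
    yarn.resourcemanager.cluster-id \
    yarn.resourcemanager.ha.rm-ids \
    yarn.resourcemanager.hostname.rm1 \
    yarn.resourcemanager.hostname.rm2 \
    yarn.resourcemanager.zk-address \
    yarn.resourcemanager.recovery.enabled \
    yarn.resourcemanager.store.class
  do
    # -F: fixed string, -q: quiet; print the name only when it is not found
    grep -Fq "<name>${prop}</name>" "$file" || echo "$prop"
  done
}
```

For example, `missing_ha_props etc/hadoop/yarn-site.xml` prints nothing when all eight properties are present, and otherwise prints one missing name per line.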
Copy the configuration file to the other servers
scp -r etc/hadoop/yarn-site.xml node1:/opt/modules/hadoop-2.5.0-cdh5.3.6/etc/hadoop/
scp -r etc/hadoop/yarn-site.xml node2:/opt/modules/hadoop-2.5.0-cdh5.3.6/etc/hadoop/
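With more than a couple of hosts, the same copy can be written as a loop. This is a sketch under assumptions: `push_config` and the `DRY_RUN` switch are hypothetical names introduced here, while the host names and target path are the ones used in the commands above. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: push yarn-site.xml to several hosts in one loop.
# push_config and DRY_RUN are hypothetical; hosts and paths are taken from the commands above.

HADOOP_HOME=/opt/modules/hadoop-2.5.0-cdh5.3.6
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = actually run scp

push_config() {
  for host in "$@"; do
    dest="${host}:${HADOOP_HOME}/etc/hadoop/"
    if [ "$DRY_RUN" = "1" ]; then
      echo "scp -r etc/hadoop/yarn-site.xml ${dest}"
    else
      # stop at the first host that cannot be reached
      scp -r etc/hadoop/yarn-site.xml "${dest}" || return 1
    fi
  done
}

push_config node1 node2
```

Running it with `DRY_RUN=0` from the Hadoop installation directory performs the actual copy.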
Start the services on each server
1. Start HDFS
** zookeeper
bin/zkServer.sh start
** resourcemanager and nodemanager
On bigdata02:
sbin/start-yarn.sh
On bigdata03:
sbin/yarn-daemon.sh start resourcemanager
Test:
Run a jar as a test job
Check the HA state
bin/yarn rmadmin -getServiceState rm1
bin/yarn rmadmin -getServiceState rm2
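The two state checks can be wrapped in a small helper that reports which ResourceManager is currently active. This is a sketch rather than a fixed procedure: `active_rm` and the `YARN_CMD` variable are hypothetical names, and it assumes `-getServiceState` prints exactly `active` or `standby` on standard output.

```shell
#!/bin/sh
# Sketch: find the ResourceManager ID that currently reports "active".
# active_rm and YARN_CMD are hypothetical names; rm1 and rm2 come from yarn.resourcemanager.ha.rm-ids.

YARN_CMD=${YARN_CMD:-bin/yarn}

active_rm() {
  for id in "$@"; do
    # stderr is discarded so that warnings do not end up in the state string
    state=$($YARN_CMD rmadmin -getServiceState "$id" 2>/dev/null)
    if [ "$state" = "active" ]; then
      echo "$id"
      return 0
    fi
  done
  return 1   # no ResourceManager reported "active"
}
```

Calling `active_rm rm1 rm2` from the Hadoop installation directory prints the ID of the active ResourceManager, or returns a non-zero status if neither one reports `active`.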
$ bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar wordcount /input /output
Kill the resourcemanager process on the active server and watch how the job keeps running
$ kill -9 6854
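The PID in the command above is specific to one run. As a sketch, the PID can instead be read from the output of `jps`, which lists one `PID ClassName` pair per line; `rm_pid` is a hypothetical helper name introduced here.

```shell
#!/bin/sh
# Sketch: extract the ResourceManager PID from jps-style output on stdin.
# rm_pid is a hypothetical helper; it expects lines of the form "PID ClassName".

rm_pid() {
  # exact match on the second field, so NodeManager lines are skipped
  awk '$2 == "ResourceManager" { print $1 }'
}
```

On the active server this could be used as `kill -9 "$(jps | rm_pid)"`.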
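Summary
HA NameNode & ResourceManager
1. Startup order for the HA NameNode
2. Startup order for the HA ResourceManager
3. Kill one NameNode and one ResourceManager, then observe the wordcount program,
the HDFS files it uses, and the job records on YARN: when one node is shut down, the other keeps the information consistent.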