YARN architecture
YARN configuration
1. End goal: develop MapReduce (MR) computation programs.
    * Note: HDFS and YARN are two separate concepts.
2. Hadoop 2.x introduced YARN for resource management; MR itself no longer has long-running background services.
    YARN model: the container, inside which our ApplicationMaster and the map/reduce tasks run.
    This decouples resource management from the computation framework:
    MapReduce on YARN
Architecture:
    RM (ResourceManager): schedules cluster resources (two RMs when HA is enabled)
    NM (NodeManager): per-node agent that launches and monitors containers
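Once the cluster is running, both roles can be observed from the command line. A minimal sketch, assuming a started YARN cluster and the `yarn` CLI on PATH:

```shell
# List the NodeManagers that have registered with the ResourceManager
yarn node -list -all

# Show the status of the default scheduler queue on the RM
yarn queue -status default
```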
Deployment plan:
          NN  NN  JN  ZKFC  ZK  DN  RM  NM
node01    *       *   *
node02        *   *   *     *   *       *
node03            *         *   *   *   *
node04                      *   *   *   *
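After the procedure below has started everything, a quick way to check that each node runs exactly the roles in the table above is to run `jps` on every host. A sketch, assuming passwordless ssh as root to all four nodes:

```shell
# Print the running Java processes (NN, JN, ZKFC, DN, RM, NM, ...) on each node
for n in node01 node02 node03 node04; do
  echo "== $n =="
  ssh "$n" jps
done
```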
          hadoop 1.x    2.x       3.x
hdfs:     no HA         HA (backward compatible: rather than heavily modifying the NN, a new role, ZKFC, was added)
yarn:     no yarn       yarn (HA here is not a new role; an HA module was added directly inside the RM process)
----- Following the official docs:
mapred-site.xml  >  run MapReduce on YARN
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
yarn-site.xml
// shuffle: map output is shuffled to the reducers (M -shuffle-> R)
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>node02:2181,node03:2181,node04:2181</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>mashibing</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node03</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node04</value>
</property>
Procedure:
I run HDFS and everything else as root here.
node01:
cd $HADOOP_HOME/etc/hadoop
cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
vi yarn-site.xml
scp mapred-site.xml yarn-site.xml node02:`pwd`
scp mapred-site.xml yarn-site.xml node03:`pwd`
scp mapred-site.xml yarn-site.xml node04:`pwd`
vi slaves    // no change needed; it was already edited during the HDFS setup
start-yarn.sh
node02: zkCli.sh    // optional: inspect ZooKeeper, where the RMs run their leader election
node03~node04:
    yarn-daemon.sh start resourcemanager    // start-yarn.sh does not start the RMs on other hosts, so start each one manually
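With both ResourceManagers up, the `yarn rmadmin` CLI can confirm which one won the election. A sketch, using the rm-ids defined in yarn-site.xml above:

```shell
# Query the HA state of each ResourceManager
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
```

One command should report `active` and the other `standby`.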
http://node03:8088
http://node04:8088
// visiting the standby RM's web UI redirects to the active one:
This is standby RM. Redirecting to the current active RM: http://node03:8088/
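A final smoke test is to submit the wordcount example that ships with Hadoop and confirm it runs on YARN. A sketch, assuming root's HDFS home directory; the example jar's version suffix varies by release:

```shell
# Put some input into HDFS, run the bundled wordcount job, and read the result
hdfs dfs -mkdir -p /user/root/in
hdfs dfs -put $HADOOP_HOME/etc/hadoop/*.xml /user/root/in
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
  wordcount /user/root/in /user/root/out
hdfs dfs -cat /user/root/out/part-r-00000 | head
```

While the job runs it should also appear in the active RM's web UI on port 8088.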