Cluster planning
Configuration
Edit the configuration file mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- Cross-platform job submission -->
<property>
<name>mapreduce.app-submission.cross-platform</name>
<value>true</value>
</property>
Edit the configuration file yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster1</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>node01</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>node02</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>node02:2181,node03:2181,node04:2181</value>
</property>
Distribute the finished configuration files to all nodes
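The distribution step can be scripted. The following is only a minimal sketch, assuming passwordless SSH from node01, the same HADOOP_HOME on every node, and configuration files under $HADOOP_HOME/etc/hadoop. The default path /opt/hadoop, the DRY_RUN switch, and the function name distribute_conf are hypothetical and not part of the original setup.

```shell
#!/usr/bin/env bash
# Sketch: copy the edited config files from node01 to the other nodes.
# Assumptions: passwordless SSH, identical HADOOP_HOME on every node,
# config files live in $HADOOP_HOME/etc/hadoop.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"   # hypothetical default path
CONF_DIR="$HADOOP_HOME/etc/hadoop"

distribute_conf() {
  local node file
  for node in "$@"; do
    for file in mapred-site.xml yarn-site.xml; do
      # With DRY_RUN=1 the command is only printed, not executed.
      if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "scp $CONF_DIR/$file $node:$CONF_DIR/"
      else
        scp "$CONF_DIR/$file" "$node:$CONF_DIR/"
      fi
    done
  done
}

# Preview the copy commands for the other three nodes.
DRY_RUN=1 distribute_conf node02 node03 node04
```

Calling distribute_conf without DRY_RUN=1 performs the actual copy.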
Start ZooKeeper on node02, node03, and node04
./zkServer.sh start
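After starting all three servers, it may be worth confirming that a quorum has formed. A minimal sketch, assuming passwordless SSH and that ZooKeeper's scripts live in the same directory on every node. The path /opt/zookeeper/bin, the RUN_ON_CLUSTER switch, and the function name zk_status_all are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: check the role of every ZooKeeper server after startup.
ZK_BIN="${ZK_BIN:-/opt/zookeeper/bin}"   # hypothetical install path

zk_status_all() {
  local node
  for node in "$@"; do
    echo "== $node =="
    # In a multi-node ensemble, "zkServer.sh status" reports
    # "Mode: leader" or "Mode: follower" once a quorum has formed.
    ssh "$node" "$ZK_BIN/zkServer.sh status"
  done
}

# Only contact the nodes when explicitly asked to (hypothetical switch).
if [ "${RUN_ON_CLUSTER:-0}" = "1" ]; then
  zk_status_all node02 node03 node04
fi
```

In a healthy three-node ensemble, one node should report leader and the other two follower.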
Start the HDFS and YARN clusters on node01
start-dfs.sh
start-yarn.sh
Start a standby ResourceManager separately on node02
(an active RM has already been started automatically on node01)
yarn-daemon.sh start resourcemanager
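To confirm which ResourceManager is active and which is standby, the HA state can be queried with yarn rmadmin. A minimal sketch, using the ids rm1 and rm2 from yarn.resourcemanager.ha.rm-ids; the function name rm_states is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the HA state of each ResourceManager.
# The ids are the ones listed in yarn.resourcemanager.ha.rm-ids.
rm_states() {
  local rm
  for rm in "$@"; do
    # "yarn rmadmin -getServiceState <id>" prints active or standby.
    printf '%s: %s\n' "$rm" "$(yarn rmadmin -getServiceState "$rm")"
  done
}

if command -v yarn >/dev/null 2>&1; then
  rm_states rm1 rm2
else
  echo "yarn is not on PATH; run this on a cluster node" >&2
fi
```

With the startup order described here, rm1 would normally be active and rm2 standby, although the roles can swap after a failover.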
Open the ResourceManager web UI on port 8088 (for example http://node01:8088)
Test case
wordcount
Use the wordcount example that ships with MapReduce
Change to the directory that holds the MapReduce jars
cd $HADOOP_HOME/share/hadoop/mapreduce
Run the example
hadoop jar hadoop-mapreduce-examples-2.6.5.jar wordcount /input /output
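input: the HDFS directory that holds the input data
output: an HDFS directory that must not exist yet; the job writes its results there and fails if the directory already exists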
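Before running the job, the two directories can be prepared as follows. This is a minimal sketch: the local sample file /tmp/words.txt and the function name prepare_wordcount_dirs are hypothetical, and removing an existing output directory is an assumption about what is wanted, so check its contents first if they matter.

```shell
#!/usr/bin/env bash
# Sketch: make sure the input directory has data and the output
# directory does not exist before the job is submitted.
prepare_wordcount_dirs() {
  local input_dir="$1" output_dir="$2" sample_file="$3"
  # Create the input directory and upload a local file (overwrite if present).
  hdfs dfs -mkdir -p "$input_dir"
  hdfs dfs -put -f "$sample_file" "$input_dir"
  # "hdfs dfs -test -d" returns 0 when the directory exists.
  if hdfs dfs -test -d "$output_dir"; then
    hdfs dfs -rm -r "$output_dir"
  fi
}

if command -v hdfs >/dev/null 2>&1; then
  prepare_wordcount_dirs /input /output /tmp/words.txt   # hypothetical sample file
else
  echo "hdfs is not on PATH; run this on a cluster node" >&2
fi
```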
View the results
hdfs dfs -cat /output/*
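The wordcount output consists of lines of the form word, tab, count, so it can be post-processed with ordinary text tools. A minimal sketch that lists the most frequent words; the function name top_words and the sample data are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list the N most frequent words from wordcount output.
# Each output line has the form "<word><TAB><count>".
top_words() {
  local n="${1:-10}"
  # Sort numerically on the count column, largest first, then keep N lines.
  sort -t "$(printf '\t')" -k2,2nr | head -n "$n"
}

# Local demonstration with made-up data; on a cluster, pipe in
# "hdfs dfs -cat /output/*" instead of printf.
printf 'hadoop\t3\nyarn\t5\nhdfs\t1\n' | top_words 2
```

On the cluster this would be used as: hdfs dfs -cat /output/* | top_words 10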