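Hadoop YARN resolves many of the problems in MRv1. Here I install Hadoop YARN as groundwork for learning Spark and YARN.
Setup steps shared with Hadoop version 1, such as /etc/hosts and passwordless SSH login, are not covered in detail; this post only describes where the basic configuration of YARN differs from Hadoop version 1.
The three basic configuration files differ. Below I list my own settings for these three files, which are the most basic configuration, then give the configuration of yarn-site.xml, and finally resolve one problem I ran into.
1. The core-site.xml file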
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://10.11.1.42:9100</value>
</property>
</configuration>
2. The mapred-site.xml file
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
3. The hdfs-site.xml file
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/egraldlo/hadoop/yarn/hadoop_related/data1,/home/egraldlo/hadoop/yarn/hadoop_related/data2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/egraldlo/hadoop/yarn/hadoop_related/data</value>
</property>
</configuration>
4. The yarn-site.xml file
<configuration>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>10.11.1.42:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>10.11.1.42:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>10.11.1.42:18088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>10.11.1.42:8025</value>
</property>
</configuration>
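A quick way to double-check what a *-site.xml file actually defines is to load it into a dictionary. The following is a minimal sketch using only Python's standard library; the helper name `load_hadoop_conf` and the inline `SAMPLE` string are illustrative assumptions, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Inline sample copied from the yarn-site.xml above (two properties only).
SAMPLE = """<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>10.11.1.42:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>10.11.1.42:18030</value>
</property>
</configuration>"""


def load_hadoop_conf(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # Missing or empty <value> elements become an empty string.
            conf[name.strip()] = (value or "").strip()
    return conf


if __name__ == "__main__":
    for name, value in load_hadoop_conf(SAMPLE).items():
        print(name, "=", value)
```

To inspect a real file, read its contents and pass the text to `load_hadoop_conf` in place of `SAMPLE`.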
Problem encountered:
FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Failed to initialize mapreduce.shuffle java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers
Solution:
In yarn-site.xml, change mapreduce.shuffle to mapreduce_shuffle:
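<!-- The service name is set in yarn.nodemanager.aux-services; the .class property names the handler class. -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>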
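The naming rule quoted in the error message (only a-zA-Z0-9_, and no leading digit) can be checked before restarting the NodeManager. This is a small sketch that restates the rule from the log text as a regular expression; it is not Hadoop's own validation code, and the function name `is_valid_aux_service_name` is made up for illustration.

```python
import re

# Rule as worded in the NodeManager error: only a-zA-Z0-9_, no leading digit.
_VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_aux_service_name(name):
    """Return True if the whole name satisfies the rule from the error message."""
    return _VALID_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("mapreduce.shuffle", "mapreduce_shuffle"):
        print(candidate, is_valid_aux_service_name(candidate))
```

The old name fails because of the dot, while the underscore form passes.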