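Each HDFS and YARN service can be deployed independently. For example, the NameNode can run Hadoop 3.2 while the ResourceManager and the NodeManagers run 2.7. In that case HDFS needs one set of configuration files and YARN needs another.
HDFS configuration (NN and DN):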
- hdfs-site.xml
- core-site.xml
YARN configuration (RM and NM):
- hdfs-site.xml
- core-site.xml
- mapred-site.xml
- yarn-site.xml
Client:
- hdfs-site.xml
- core-site.xml
- mapred-site.xml
- yarn-site.xml
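The lists above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of the original setup: the role names (`hdfs`, `yarn`, `client`) and the function `missing_config_files` are hypothetical names chosen for illustration.

```python
from pathlib import Path
from typing import List

# Files each role needs, copied from the lists above.
# The role keys are illustrative names, not Hadoop terminology.
REQUIRED_FILES = {
    "hdfs": ["hdfs-site.xml", "core-site.xml"],
    "yarn": ["hdfs-site.xml", "core-site.xml", "mapred-site.xml", "yarn-site.xml"],
    "client": ["hdfs-site.xml", "core-site.xml", "mapred-site.xml", "yarn-site.xml"],
}


def missing_config_files(role: str, conf_dir: str) -> List[str]:
    """Return the files required for `role` that are absent from `conf_dir`."""
    conf = Path(conf_dir)
    return [name for name in REQUIRED_FILES[role] if not (conf / name).is_file()]
```

For the YARN installation used in this post, a call such as `missing_config_files("yarn", "/home/hadoop/yarn-current/etc/hadoop")` should return an empty list before the daemons are started.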
With the FairScheduler configured for YARN, the configuration files look like this:
yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop1</value>
</property>
<!-- How reducers fetch map output (the shuffle service) -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>file:///home/hadoop/data/yarn/nm</value>
</property>
<!-- Log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>600000</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://hadoop1:19888/jobhistory/logs</value>
</property>
<!-- Avoid running out of virtual memory -->
<!--property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property-->
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.continuous-scheduling-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.scheduler.fair.continuous-scheduling-sleep-ms</name>
<value>500</value>
</property>
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/home/hadoop/yarn-current/etc/hadoop/fair-scheduler.xml</value>
</property>
</configuration>
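The configuration above enables YARN log aggregation and fair scheduling. The fair scheduler's allocation file, which defines a single queue named root.hadoop, is shown below.
fair-scheduler.xml: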
<?xml version="1.0" encoding="UTF-8"?>
<allocations>
<userMaxAppsDefault>4000</userMaxAppsDefault>
<queue name="root">
<aclSubmitApps>dps_jobcenter,dps_data_center,hive_admin</aclSubmitApps>
<aclAdministerApps>a</aclAdministerApps>
<queue name="hadoop">
<minResources>245760mb, 240vcores</minResources>
<maxResources>245760mb, 240vcores</maxResources>
<maxRunningApps>500</maxRunningApps>
<maxAMShare>0.25</maxAMShare>
<aclSubmitApps>*</aclSubmitApps>
<aclAdministerApps>*</aclAdministerApps>
<weight>1.0</weight>
</queue>
</queue>
</allocations>
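To confirm which queues an allocation file actually defines, the file can be parsed with the Python standard library. This is a minimal sketch under the assumption that queues are declared with nested `<queue name="...">` elements, as in the file above. The function name `list_queues` is hypothetical.

```python
import xml.etree.ElementTree as ET


def list_queues(allocation_file):
    """Return {full queue name: {setting tag: text}} for every <queue> element."""
    root = ET.parse(allocation_file).getroot()
    queues = {}

    def walk(element, prefix):
        for queue in element.findall("queue"):
            name = queue.get("name")
            full_name = f"{prefix}.{name}" if prefix else name
            # Record the queue's own settings, skipping nested child queues.
            queues[full_name] = {
                child.tag: (child.text or "").strip()
                for child in queue
                if child.tag != "queue"
            }
            walk(queue, full_name)

    walk(root, "")
    return queues
```

Run against the file above, the result should have the keys `root` and `root.hadoop`, with the resource limits listed under `root.hadoop`.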
mapred-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- JobHistory server -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop1:19888</value>
</property>
<!-- Hadoop 3 additionally requires HADOOP_MAPRED_HOME to be set -->
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/home/hadoop/yarn-current</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/home/hadoop/yarn-current</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/home/hadoop/yarn-current</value>
</property>
</configuration>
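The configuration above makes MapReduce jobs run on YARN and sets the addresses of the MapReduce JobHistory server. On Hadoop 3, HADOOP_MAPRED_HOME must also be set for the application master and for the map and reduce tasks.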
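Because a job can fail at launch when one of these three environment settings is missing, it can be worth checking mapred-site.xml before submitting anything. The following is a minimal sketch. The helper names `load_properties` and `properties_missing_mapred_home` are hypothetical, and the check only looks for the string `HADOOP_MAPRED_HOME=` in each value.

```python
import xml.etree.ElementTree as ET

# The three properties set in the mapred-site.xml above.
ENV_PROPERTIES = (
    "yarn.app.mapreduce.am.env",
    "mapreduce.map.env",
    "mapreduce.reduce.env",
)


def load_properties(site_xml_path):
    """Read a Hadoop *-site.xml file into a {name: value} dict."""
    root = ET.parse(site_xml_path).getroot()
    return {
        (prop.findtext("name") or "").strip(): (prop.findtext("value") or "").strip()
        for prop in root.findall("property")
    }


def properties_missing_mapred_home(site_xml_path):
    """Return the env properties that do not set HADOOP_MAPRED_HOME."""
    props = load_properties(site_xml_path)
    return [
        name
        for name in ENV_PROPERTIES
        if "HADOOP_MAPRED_HOME=" not in props.get(name, "")
    ]
```

An empty list means all three properties carry the variable.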
core-site.xml and hdfs-site.xml need only the usual settings.
core-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop1:9000</value>
</property>
<!-- Size of the buffer used for file I/O -->
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
<description>The value is in bytes: 131072 bytes is 128 KB (the default is 4096 bytes).</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/data/tmp</value>
<description>A base for other temporary directories.</description>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- Mainly NameNode and DataNode settings -->
<property>
<name>dfs.nameservices</name>
<value>hadoop-cluster</value>
</property>
<!-- HDFS replication factor: the number of replicas kept for each block after an uploaded file is split into blocks -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- Where the NameNode stores its data, i.e. the metadata of the files in HDFS -->
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/data/dfs/nn</value>
</property>
<!-- Where the DataNode stores its data, i.e. the directory that holds the blocks -->
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/data/dfs/dn</value>
</property>
<!-- HTTP address of the secondary NameNode -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop1:9001</value>
</property>
</configuration>
Start the ResourceManager:
[hadoop@hadoop1 shellUtils]$ /home/hadoop/yarn-current/sbin/yarn-daemon.sh start resourcemanager
The queue then appears in the ResourceManager web UI.
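The same check can be scripted against the ResourceManager REST endpoint `/ws/v1/cluster/scheduler`. This is a minimal sketch under stated assumptions: the web address `hadoop1:8088` assumes the default web port, which this post does not configure, and the response layout (`scheduler.schedulerInfo.rootQueue` with nested `childQueues`) is assumed from the FairScheduler output and may differ between Hadoop versions. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_scheduler_info(rm_web_address="hadoop1:8088"):
    """Fetch the scheduler description from the ResourceManager REST API.

    The default address is an assumption (default RM web port).
    """
    url = f"http://{rm_web_address}/ws/v1/cluster/scheduler"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def queue_names(scheduler_json):
    """Collect queue names from a FairScheduler response, depth first."""
    names = []

    def walk(queue):
        names.append(queue["queueName"])
        children = queue.get("childQueues", [])
        # Assumption: some versions wrap the children as {"queue": [...]}.
        if isinstance(children, dict):
            children = children.get("queue", [])
        # Assumption: a single child may be serialized as one object.
        if isinstance(children, dict):
            children = [children]
        for child in children:
            walk(child)

    walk(scheduler_json["scheduler"]["schedulerInfo"]["rootQueue"])
    return names
```

Calling `queue_names(fetch_scheduler_info())` on a running cluster should list `root.hadoop` if the allocation file was loaded.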