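The Hadoop version used here is Cloudera Hadoop CDH3u3.
The configuration steps are:
1. Copy $HADOOP_HOME/contrib/fairscheduler/hadoop-fairscheduler-0.20.2-cdh3u3.jar into the $HADOOP_HOME/lib directory.
2. Edit the configuration file $HADOOP_HOME/conf/mapred-site.xml and add the following properties: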
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
  <name>mapred.fairscheduler.allocation.file</name>
  <value>/home/hadoop/hadoop-0.20.2-cdh3u3/conf/fair-scheduler.xml</value>
</property>
<property>
  <name>mapred.fairscheduler.preemption</name>
  <value>true</value>
</property>
<property>
  <name>mapred.fairscheduler.assignmultiple</name>
  <value>true</value>
</property>
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>mapred.job.queue.name</value>
  <description>job.set("mapred.job.queue.name",pool);</description>
</property>
<property>
  <name>mapred.fairscheduler.preemption.only.log</name>
  <value>true</value>
</property>
<property>
  <name>mapred.fairscheduler.preemption.interval</name>
  <value>15000</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <value>default,hadoop,hive</value>
</property>
3. Create a new configuration file named fair-scheduler.xml in $HADOOP_HOME/conf/:
<?xml version="1.0"?>
<allocations>
  <pool name="hive">
    <minMaps>90</minMaps>
    <minReduces>20</minReduces>
    <maxRunningJobs>20</maxRunningJobs>
    <weight>2.0</weight>
    <minSharePreemptionTimeout>30</minSharePreemptionTimeout>
  </pool>
  <pool name="hadoop">
    <minMaps>9</minMaps>
    <minReduces>2</minReduces>
    <maxRunningJobs>20</maxRunningJobs>
    <weight>1.0</weight>
    <minSharePreemptionTimeout>30</minSharePreemptionTimeout>
  </pool>
  <user name="hadoop">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <poolMaxJobsDefault>10</poolMaxJobsDefault>
  <userMaxJobsDefault>8</userMaxJobsDefault>
  <defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
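4. Repeat the steps above on every node in the cluster, then restart the cluster. The scheduler's running status can then be viewed at http://namenode:50030/scheduler. To change the scheduler configuration later, only fair-scheduler.xml needs to be edited; the changes take effect without a restart.
5. When running Hive jobs, set the queue that Hive should use: set mapred.job.queue.name=hive;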
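Instead of issuing the set command in every Hive session, the queue could probably be fixed once in Hive's own configuration. The fragment below is only a sketch: it assumes Hive passes Hadoop job properties found in hive-site.xml through to the jobs it submits, and that the file lives under $HIVE_HOME/conf/. Neither assumption comes from the steps above, so verify it against your Hive version.

```xml
<!-- hive-site.xml (assumed location: $HIVE_HOME/conf/) -->
<property>
  <name>mapred.job.queue.name</name>
  <!-- Must match an entry in mapred.queue.names and a pool name in fair-scheduler.xml -->
  <value>hive</value>
</property>
```

A per-session set mapred.job.queue.name=...; should still override this default when a different queue is needed.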
##########
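In addition, if a MapReduce job fails because user XX is not allowed to access queue YY, configure the corresponding properties in mapred-queue-acls.xml to control access permissions, for example: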
<property>
  <name>mapred.queue.default.acl-submit-job</name>
  <value>*</value>
  <description>Comma separated list of user and group names that are allowed
    to submit jobs to the 'default' queue. The user list and the group list
    are separated by a blank. For e.g. user1,user2 group1,group2.
    If set to the special value '*', it means all users are allowed to
    submit jobs. If set to ' ' (i.e. space), no user will be allowed to submit
    jobs.
    It is only used if authorization is enabled in Map/Reduce by setting the
    configuration property mapred.acls.enabled to true.
    Irrespective of this ACL configuration, the user who started the cluster and
    cluster administrators configured via
    mapreduce.cluster.administrators can submit jobs.
  </description>
</property>
<property>
  <name>mapred.queue.default.acl-administer-jobs</name>
  <value>*</value>
  <description>Comma separated list of user and group names that are allowed
    to view job details, kill jobs or modify job's priority for all the jobs
    in the 'default' queue. The user list and the group list
    are separated by a blank. For e.g. user1,user2 group1,group2.
    If set to the special value '*', it means all users are allowed to do
    this operation. If set to ' ' (i.e. space), no user will be allowed to do
    this operation.
    It is only used if authorization is enabled in Map/Reduce by setting the
    configuration property mapred.acls.enabled to true.
    Irrespective of this ACL configuration, the user who started the cluster and
    cluster administrators configured via
    mapreduce.cluster.administrators can do the above operations on all the jobs
    in all the queues. The job owner can do all the above operations on his/her
    job irrespective of this ACL configuration.
  </description>
</property>
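As the descriptions above state, these ACLs are only consulted when mapred.acls.enabled is set to true. The sketch below shows how that might be combined with a restricted hive queue. The user name etl and the group name analysts are hypothetical placeholders, and the per-queue property name simply follows the mapred.queue.<queue-name>.acl-submit-job pattern of the default-queue example.

```xml
<!-- mapred-site.xml: turn on ACL checking (without this, the queue ACLs are ignored) -->
<property>
  <name>mapred.acls.enabled</name>
  <value>true</value>
</property>

<!-- mapred-queue-acls.xml: restrict who may submit to the 'hive' queue -->
<property>
  <name>mapred.queue.hive.acl-submit-job</name>
  <!-- Format: users, then a blank, then groups. 'etl' and 'analysts' are hypothetical. -->
  <value>etl analysts</value>
</property>
```

With a value like this, only the listed user and members of the listed group (plus the cluster owner and administrators) would be able to submit to the hive queue; everyone else would get the "cannot access queue" error described earlier.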