Configuring node-label scheduling in YARN

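Goal (tested on a cluster with only two NodeManager nodes): create two labels, normal and highmem, and configure two queues, dev and prd. dev may use at most 50% of the cluster's resources, and prd may use 50% of the cluster's resources (its maximum-capacity is set to 100 in the configuration below).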

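After changing the configuration as shown below, restart YARN, then create the labels and assign them to the nodes.
Create the labels
yarn rmadmin -addToClusterNodeLabels "normal,highmem"

Assign labels to the nodes
yarn rmadmin -replaceLabelsOnNode "hostname01:45454=normal hostname02:45454=highmem"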
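
The argument to -replaceLabelsOnNode is a single quoted string of space-separated node=label pairs, which is easy to get wrong by hand once there are more than a couple of nodes. As a minimal sketch, the helper below builds that command line from a mapping. The function name replace_labels_command and the host names are hypothetical, not part of YARN; it only formats a string and does not run anything.

```python
import shlex

def replace_labels_command(node_labels):
    """Build a `yarn rmadmin -replaceLabelsOnNode` command line.

    node_labels maps a NodeManager address ("host:port") to one label name.
    """
    pairs = " ".join(f"{node}={label}" for node, label in node_labels.items())
    # shlex.join quotes the space-separated pair list as a single shell argument.
    return shlex.join(["yarn", "rmadmin", "-replaceLabelsOnNode", pairs])

# Hypothetical host names; replace them with the real NodeManager hosts.
mapping = {"host01:45454": "normal", "host02:45454": "highmem"}
print(replace_labels_command(mapping))
```

The port in each pair matches the fixed yarn.nodemanager.address port used in the yarn-site.xml settings of this setup.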

The configuration below can be used as a reference.
1. Add the following to yarn-site.xml

        <property>
                <name>yarn.nodemanager.address</name>
                <value>0.0.0.0:45454</value>
        </property>
        <property>
                <name>yarn.node-labels.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>yarn.node-labels.fs-store.root-dir</name>
                <value>/user/node-label</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.class</name>
                <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
        </property>
        <property>
                <name>yarn.node-labels.manager-class</name>
                <value>org.apache.hadoop.yarn.server.resourcemanager.nodelabels.RMNodeLabelsManager</value>
        </property>
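
The label commands in this post address each NodeManager as host:45454, which only works if the NodeManager listens on a fixed port; to my knowledge the stock default for yarn.nodemanager.address ends in port 0, meaning an ephemeral port. The following is a rough sketch of a pre-restart sanity check for the three settings that matter most here. The function names are hypothetical, and the checks are my own choice rather than anything YARN itself performs.

```python
import xml.etree.ElementTree as ET

def read_properties(xml_text):
    """Return {name: value} from Hadoop-style <property> elements."""
    root = ET.fromstring(xml_text)
    return {
        (p.findtext("name") or "").strip(): (p.findtext("value") or "").strip()
        for p in root.iter("property")
    }

def check_node_label_settings(props):
    """Return a list of problems found in the node-label related settings."""
    problems = []
    if props.get("yarn.node-labels.enabled", "false").lower() != "true":
        problems.append("yarn.node-labels.enabled is not true")
    if not props.get("yarn.node-labels.fs-store.root-dir"):
        problems.append("yarn.node-labels.fs-store.root-dir is not set")
    # Take whatever follows the last ':' as the port; "" or "0" means not fixed.
    port = props.get("yarn.nodemanager.address", "").rpartition(":")[2]
    if port in ("", "0"):
        problems.append("yarn.nodemanager.address has no fixed port")
    return problems

# A trimmed copy of the settings from this post, wrapped in <configuration>.
sample = """<configuration>
  <property><name>yarn.nodemanager.address</name><value>0.0.0.0:45454</value></property>
  <property><name>yarn.node-labels.enabled</name><value>true</value></property>
  <property><name>yarn.node-labels.fs-store.root-dir</name><value>/user/node-label</value></property>
</configuration>"""
print(check_node_label_settings(read_properties(sample)))
```

An empty list means none of the three checks failed; it says nothing about the rest of yarn-site.xml.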

2. Modify capacity-scheduler.xml

<configuration>

  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
    <description>
      Maximum number of applications that can be pending and running.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.8</value>
    <description>
      Maximum percent of resources in the cluster which can be used to run
      application masters i.e. controls number of concurrent running
      applications.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    <description>
      The ResourceCalculator implementation to be used to compare
      Resources in the scheduler.
      The default i.e. DefaultResourceCalculator only uses Memory while
      DominantResourceCalculator uses dominant-resource to compare
      multi-dimensional resources such as Memory, CPU etc.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>dev,prd</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.dev.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prd.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.maximum-capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prd.maximum-capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.accessible-node-labels</name>
    <value>*</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.accessible-node-labels</name>
    <value>normal</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prd.accessible-node-labels</name>
    <value>highmem</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.accessible-node-labels.normal.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.accessible-node-labels.highmem.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.accessible-node-labels.normal.capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prd.accessible-node-labels.highmem.capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.default-node-label-expression</name>
    <value>normal</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prd.default-node-label-expression</name>
    <value>highmem</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.state</name>
    <value>RUNNING</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prd.state</name>
    <value>RUNNING</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.dev.acl_submit_applications</name>
    <value>*</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prd.acl_submit_applications</name>
    <value>*</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.dev.acl_administer_queue</name>
    <value>*</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prd.acl_administer_queue</name>
    <value>*</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>2</value>
    <description>
      Number of missed scheduling opportunities after which the CapacityScheduler
      attempts to schedule rack-local containers.
      Typically this should be set to number of nodes in the cluster, By default is setting
      approximately number of nodes in one rack which is 40.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.queue-mappings</name>
    <value></value>
    <description>
      A list of mappings that will be used to assign jobs to queues
      The syntax for this list is [u|g]:[name]:[queue_name][,next mapping]*
      Typically this list will be used to map users to queues,
      for example, u:%user:%user maps all users to queues with the same name
      as the user.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
    <value>false</value>
    <description>
      If a queue mapping is present, will it override the value specified
      by the user? This can be used by administrators to place jobs in queues
      that are different than the one specified by the user.
      The default is false.
    </description>
  </property>

</configuration>
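
The Capacity Scheduler expects the capacities of the direct children of a parent queue to add up to 100, both for the default partition and for each node label. As a minimal sketch, the script below sums those values for the children of root using the numbers from the configuration above. The helper names (read_properties, child_capacity_sums, prop) are hypothetical, and the check is a simplification of what the ResourceManager validates at startup.

```python
import xml.etree.ElementTree as ET

PREFIX = "yarn.scheduler.capacity."

def read_properties(xml_text):
    """Return {name: value} from Hadoop-style <property> elements."""
    root = ET.fromstring(xml_text)
    return {
        (p.findtext("name") or "").strip(): (p.findtext("value") or "").strip()
        for p in root.iter("property")
    }

def child_capacity_sums(props, parent, labels):
    """Sum the child queue capacities under `parent`, per partition."""
    children = [q.strip() for q in props[f"{PREFIX}{parent}.queues"].split(",")]
    # Default (unlabeled) partition: <parent>.<child>.capacity
    sums = {
        "default": sum(
            float(props.get(f"{PREFIX}{parent}.{c}.capacity", 0)) for c in children
        )
    }
    # Labeled partitions: <parent>.<child>.accessible-node-labels.<label>.capacity
    for label in labels:
        sums[label] = sum(
            float(props.get(
                f"{PREFIX}{parent}.{c}.accessible-node-labels.{label}.capacity", 0))
            for c in children
        )
    return sums

def prop(name, value):
    """Format one <property> element (used only to build the sample below)."""
    return f"<property><name>{PREFIX}{name}</name><value>{value}</value></property>"

# The queue and label capacities from the configuration in this post.
sample = "<configuration>" + "".join([
    prop("root.queues", "dev,prd"),
    prop("root.dev.capacity", 50),
    prop("root.prd.capacity", 50),
    prop("root.dev.accessible-node-labels.normal.capacity", 100),
    prop("root.prd.accessible-node-labels.highmem.capacity", 100),
]) + "</configuration>"

sums = child_capacity_sums(read_properties(sample), "root", ["normal", "highmem"])
print(sums)
```

Each partition should come out at 100. In this layout dev holds the whole normal partition and prd holds the whole highmem partition, so with one node per label each queue effectively ends up on its own machine.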
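Hadoop YARN is Hadoop's next-generation cluster resource management system. It divides cluster resources into containers, and a resource scheduler allocates and manages those containers. The scheduler's configuration matters a great deal for cluster performance and resource utilization.

First, choose the scheduler type. YARN's two main schedulers are the Capacity Scheduler and the Fair Scheduler. The Capacity Scheduler divides cluster resources among queues by proportion, with each queue given a configured capacity. The Fair Scheduler assigns resources to applications dynamically and adjusts the allocation as the applications run.

Next, configure the queue properties, including the queue name, resource capacity, and resource limits. The capacity sets how much of the cluster a queue can use, and the limits keep one queue from taking so many resources that other queues cannot run normally.

The scheduling policy can also be configured. The policy decides how resources are handed out: a fair policy tries to give every application the same amount of resources, while a capacity policy allocates according to preset proportions.

Queue priority can be configured as well, so that a given queue receives more resources when they are scarce and high-priority applications keep running.

Finally, other parameters such as the maximum number of containers and the maximum AM resource percentage can be tuned to the workload.

In short, the YARN scheduler configuration should be adjusted to the cluster's actual conditions and requirements to get efficient resource management and scheduling.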