YARN DRF Resource Allocation Algorithm


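DRF (Dominant Resource Fairness) is a general max-min fairness strategy for multiple resource types. Its core idea is that, in a multi-resource environment, a user's allocation should be determined by that user's dominant share. The dominant resource is the resource type of which the user holds the largest fraction of cluster capacity, among all the resources already allocated to the user; that fraction is the dominant share. In short, DRF tries to maximize the smallest dominant share across all users.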
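
The definition can also be written compactly. The block below is only a restatement of the paragraph above, where $u_{ij}$ denotes the amount of resource $j$ allocated to user $i$, $r_j$ the cluster capacity of resource $j$, and $s_i$ the dominant share of user $i$:

```latex
s_i = \max_{j = 1, \dots, m} \frac{u_{ij}}{r_j},
\qquad
\text{DRF objective: } \max \; \min_{i} \, s_i
```

For example, a user holding <2 CPU, 8 GB> in a cluster of <9 CPU, 18 GB> has a CPU share of 2/9 and a memory share of 8/18 = 4/9, so memory is the dominant resource and the dominant share is 4/9.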

1. How DRF Is Computed

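Suppose the system has 9 CPU cores and 18 GB of memory. Each task of application A requests <1 CPU, 4 GB>, and each task of application B requests <3 CPU, 1 GB>. How should a fair allocation be constructed for this case?
[Figure: resource allocation]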
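
One way to answer the question is to equalize the two dominant shares subject to the capacity limits. The derivation below is a worked sketch, with $x$ and $y$ introduced here as the number of tasks given to A and B. Per task, A's dominant resource is memory (4/18 > 1/9) and B's is CPU (3/9 > 1/18):

```latex
\begin{aligned}
\max \; & (x, y) \\
\text{s.t. } & x + 3y \le 9 && \text{(CPU)} \\
& 4x + y \le 18 && \text{(memory)} \\
& \frac{4x}{18} = \frac{3y}{9} && \text{(equal dominant shares)} \\[4pt]
\Rightarrow \; & y = \tfrac{2}{3}x, \quad x + 3 \cdot \tfrac{2}{3}x = 3x \le 9
\;\Rightarrow\; x = 3, \; y = 2
\end{aligned}
```

Under this solution A receives <3 CPU, 12 GB> and B receives <6 CPU, 2 GB>. Both dominant shares equal 2/3, CPU is fully used (9/9), and memory usage is 14/18.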

2. DRF Pseudocode

$$
\begin{aligned}
& \textbf{Algorithm: DRF pseudocode} \\
& \textbf{Notation} \\
& \qquad R \ - \ \text{total resource capacity of the system} \\
& \qquad C \ - \ \text{resources already allocated in the system} \\
& \qquad s \ - \ \text{dominant shares of the applications} \\
& \qquad U_i \ - \ \text{resources allocated to application } i \\
& \qquad D_i \ - \ \text{resources required by the next task of application } i \text{ (the application with the smallest dominant share)} \\
& \textbf{Initialization} \\
& \qquad R = \{r_1, \cdots, r_m\} \qquad \text{note: system resource capacity} \\
& \qquad C = \{c_1, \cdots, c_m\} \qquad \text{note: resources already allocated, initialized to } 0 \\
& \qquad s = \overbrace{\{0, \cdots, 0\}}^{n} \qquad \text{note: dominant share of each application, initialized to } 0 \\
& \qquad U_i = \{u_{i1}, \cdots, u_{im}\} \qquad \text{note: resources allocated to application } i \text{, one entry per system resource, i.e. } m \text{ entries} \\
& \textbf{Allocation logic} \\
& \qquad \textbf{if } (C + D_i) \le R \textbf{ then} \\
& \qquad \qquad C = C + D_i \qquad \text{note: update the allocated-resource vector} \\
& \qquad \qquad U_i = U_i + D_i \qquad \text{note: update the resource vector of application } i \\
& \qquad \qquad s_i = \max\{u_{ij} / r_j\}_{j=1}^{m} \qquad \text{note: recompute the dominant share of application } i \\
& \qquad \textbf{else} \\
& \qquad \qquad \textbf{return} \\
& \qquad \textbf{end if}
\end{aligned}
$$
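
The pseudocode can be sketched as a small, self-contained Java program. This is a minimal illustration and not YARN's implementation: the class `DrfSketch`, its method names, and the `pending` task counts are assumptions made for the example. The outer loop that repeatedly picks the application with the smallest dominant share is added here so the sketch runs to completion.

```java
import java.util.Arrays;

public class DrfSketch {

    /** s_i = max_j (u_ij / r_j): the largest fraction of any one resource held by an application. */
    static double dominantShare(double[] allocated, double[] capacity) {
        double share = 0.0;
        for (int j = 0; j < capacity.length; j++) {
            share = Math.max(share, allocated[j] / capacity[j]);
        }
        return share;
    }

    /** Checks C + D_i <= R component-wise. */
    static boolean fits(double[] consumed, double[] demand, double[] capacity) {
        for (int j = 0; j < capacity.length; j++) {
            if (consumed[j] + demand[j] > capacity[j]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Repeats the allocation step until the chosen task no longer fits.
     * demands[i] is the per-task demand D_i of application i and pending[i]
     * (an assumption of this sketch) is how many tasks it wants to run.
     * Returns the number of tasks launched per application.
     */
    static int[] allocate(double[] capacity, double[][] demands, int[] pending) {
        int n = demands.length;
        int m = capacity.length;
        double[] consumed = new double[m];    // C, initialized to 0
        double[][] given = new double[n][m];  // U_i, initialized to 0
        double[] shares = new double[n];      // s, initialized to 0
        int[] remaining = pending.clone();
        int[] launched = new int[n];

        while (true) {
            // Pick the application with the smallest dominant share that still has work.
            int pick = -1;
            for (int i = 0; i < n; i++) {
                if (remaining[i] > 0 && (pick < 0 || shares[i] < shares[pick])) {
                    pick = i;
                }
            }
            // Mirrors the "else return" branch: stop when nothing is left or the task does not fit.
            if (pick < 0 || !fits(consumed, demands[pick], capacity)) {
                return launched;
            }
            for (int j = 0; j < m; j++) {
                consumed[j] += demands[pick][j];     // C = C + D_i
                given[pick][j] += demands[pick][j];  // U_i = U_i + D_i
            }
            shares[pick] = dominantShare(given[pick], capacity);
            remaining[pick]--;
            launched[pick]++;
        }
    }

    public static void main(String[] args) {
        // 9 cores and 18 GB; A asks <1 CPU, 4 GB> per task, B asks <3 CPU, 1 GB>; 3 tasks each.
        int[] launched = allocate(new double[] {9, 18},
                new double[][] {{1, 4}, {3, 1}}, new int[] {3, 3});
        System.out.println(Arrays.toString(launched)); // prints [3, 2]
    }
}
```

With the 9-core, 18 GB example, the sketch launches 3 tasks for A and 2 tasks for B. B's third task is rejected because CPU is already fully allocated.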

3. Allocation Example

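Suppose the system has 9 CPU cores and 18 GB of memory. Each task of application A requests <1 CPU, 4 GB>, and each task of application B requests <3 CPU, 1 GB>. Application A has 3 tasks to run and application B has 3 tasks to run.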

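| Scheduled app | Resources allocated to A | A's dominant share | Resources allocated to B | B's dominant share | Total CPU allocated | Total RAM allocated |
|---|---|---|---|---|---|---|
| A | (1/9, 4/18) | 4/18 | (0, 0) | 0 | 1/9 | 4/18 |
| B | (1/9, 4/18) | 4/18 | (3/9, 1/18) | 3/9 | 4/9 | 5/18 |
| A | (2/9, 8/18) | 8/18 | (3/9, 1/18) | 3/9 | 5/9 | 9/18 |
| B | (2/9, 8/18) | 8/18 | (6/9, 2/18) | 6/9 | 8/9 | 10/18 |
| A | (3/9, 12/18) | 12/18 | (6/9, 2/18) | 6/9 | 9/9 | 14/18 |

At each step the application with the smaller dominant share is scheduled. After the fifth step the CPU is fully allocated (9/9), so B's third task cannot be placed.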

4. YARN Source Code

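The code below is the DRF policy as implemented in the Fair Scheduler, in the hadoop-yarn-server-resourcemanager module of Hadoop 2.0 YARN. (YARN's main scheduler implementations are Capacity and Fair; DRF is one of the Fair Scheduler's policies, alongside FIFO and Fair Share.)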

package org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies;

import static org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceType.*;

import java.util.Collection;
import java.util.Comparator;

import org.apache.hadoop.classification.InterfaceAudience.Private;
import org.apache.hadoop.classification.InterfaceStability.Unstable;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceType;
import org.apache.hadoop.yarn.server.resourcemanager.resource.ResourceWeights;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Schedulable;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.SchedulingPolicy;
import org.apache.hadoop.yarn.util.resource.DominantResourceCalculator;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

@Private
@Unstable
public class DominantResourceFairnessPolicy extends SchedulingPolicy {
  public static final String NAME = "DRF";

  private static final DominantResourceFairnessComparator COMPARATOR =
      new DominantResourceFairnessComparator();
  private static final DominantResourceCalculator CALCULATOR =
      new DominantResourceCalculator();

  @Override
  public String getName() {
    return NAME;
  }

  @Override
  public byte getApplicableDepth() {
    return SchedulingPolicy.DEPTH_ANY;
  }

  @Override
  public Comparator<Schedulable> getComparator() {
    return COMPARATOR;
  }

  @Override
  public ResourceCalculator getResourceCalculator() {
    return CALCULATOR;
  }

  @Override
  public void computeShares(Collection<? extends Schedulable> schedulables,
      Resource totalResources) {
    for (ResourceType type : ResourceType.values()) {
      ComputeFairShares.computeShares(schedulables, totalResources, type);
    }
  }

  @Override
  public void computeSteadyShares(Collection<? extends FSQueue> queues,
      Resource totalResources) {
    for (ResourceType type : ResourceType.values()) {
      ComputeFairShares.computeSteadyShares(queues, totalResources, type);
    }
  }

  @Override
  public boolean checkIfUsageOverFairShare(Resource usage, Resource fairShare) {
    return !Resources.fitsIn(usage, fairShare);
  }

  @Override
  public boolean checkIfAMResourceUsageOverLimit(Resource usage, Resource maxAMResource) {
    return !Resources.fitsIn(usage, maxAMResource);
  }

  @Override
  public Resource getHeadroom(Resource queueFairShare, Resource queueUsage,
                              Resource maxAvailable) {
    long queueAvailableMemory =
        Math.max(queueFairShare.getMemorySize() - queueUsage.getMemorySize(), 0);
    int queueAvailableCPU =
        Math.max(queueFairShare.getVirtualCores() - queueUsage
            .getVirtualCores(), 0);
    Resource headroom = Resources.createResource(
        Math.min(maxAvailable.getMemorySize(), queueAvailableMemory),
        Math.min(maxAvailable.getVirtualCores(),
            queueAvailableCPU));
    return headroom;
  }

  @Override
  public void initialize(Resource clusterCapacity) {
    COMPARATOR.setClusterCapacity(clusterCapacity);
  }

  public static class DominantResourceFairnessComparator implements Comparator<Schedulable> {
    private static final int NUM_RESOURCES = ResourceType.values().length;
    
    private Resource clusterCapacity;

    public void setClusterCapacity(Resource clusterCapacity) {
      this.clusterCapacity = clusterCapacity;
    }

    @Override
    public int compare(Schedulable s1, Schedulable s2) {
      ResourceWeights sharesOfCluster1 = new ResourceWeights();
      ResourceWeights sharesOfCluster2 = new ResourceWeights();
      ResourceWeights sharesOfMinShare1 = new ResourceWeights();
      ResourceWeights sharesOfMinShare2 = new ResourceWeights();
      ResourceType[] resourceOrder1 = new ResourceType[NUM_RESOURCES];
      ResourceType[] resourceOrder2 = new ResourceType[NUM_RESOURCES];
      
      // Calculate shares of the cluster for each resource both schedulables.
      calculateShares(s1.getResourceUsage(),
          clusterCapacity, sharesOfCluster1, resourceOrder1, s1.getWeights());
      calculateShares(s1.getResourceUsage(),
          s1.getMinShare(), sharesOfMinShare1, null, ResourceWeights.NEUTRAL);
      calculateShares(s2.getResourceUsage(),
          clusterCapacity, sharesOfCluster2, resourceOrder2, s2.getWeights());
      calculateShares(s2.getResourceUsage(),
          s2.getMinShare(), sharesOfMinShare2, null, ResourceWeights.NEUTRAL);
      // A queue is needy for its min share if its dominant resource
      // (with respect to the cluster capacity) is below its configured min share
      // for that resource
      boolean s1Needy = sharesOfMinShare1.getWeight(resourceOrder1[0]) < 1.0f;
      boolean s2Needy = sharesOfMinShare2.getWeight(resourceOrder2[0]) < 1.0f;
      
      int res = 0;
      if (!s2Needy && !s1Needy) {
        res = compareShares(sharesOfCluster1, sharesOfCluster2,
            resourceOrder1, resourceOrder2);
      } else if (s1Needy && !s2Needy) {
        res = -1;
      } else if (s2Needy && !s1Needy) {
        res = 1;
      } else { // both are needy below min share
        res = compareShares(sharesOfMinShare1, sharesOfMinShare2,
            resourceOrder1, resourceOrder2);
      }
      if (res == 0) {
        // Apps are tied in fairness ratio. Break the tie by submit time.
        res = (int)(s1.getStartTime() - s2.getStartTime());
      }
      return res;
    }
    
    /**
     * Calculates and orders a resource's share of a pool in terms of two vectors.
     * The shares vector contains, for each resource, the fraction of the pool that
     * it takes up.  The resourceOrder vector contains an ordering of resources
     * by largest share.  So if resource=<10 MB, 5 CPU>, and pool=<100 MB, 10 CPU>,
     * shares will be [.1, .5] and resourceOrder will be [CPU, MEMORY].
     */
    void calculateShares(Resource resource, Resource pool,
        ResourceWeights shares, ResourceType[] resourceOrder, ResourceWeights weights) {
      shares.setWeight(MEMORY, (float)resource.getMemorySize() /
          (pool.getMemorySize() * weights.getWeight(MEMORY)));
      shares.setWeight(CPU, (float)resource.getVirtualCores() /
          (pool.getVirtualCores() * weights.getWeight(CPU)));
      // sort order vector by resource share
      if (resourceOrder != null) {
        if (shares.getWeight(MEMORY) > shares.getWeight(CPU)) {
          resourceOrder[0] = MEMORY;
          resourceOrder[1] = CPU;
        } else  {
          resourceOrder[0] = CPU;
          resourceOrder[1] = MEMORY;
        }
      }
    }
    
    private int compareShares(ResourceWeights shares1, ResourceWeights shares2,
        ResourceType[] resourceOrder1, ResourceType[] resourceOrder2) {
      for (int i = 0; i < resourceOrder1.length; i++) {
        int ret = (int)Math.signum(shares1.getWeight(resourceOrder1[i])
            - shares2.getWeight(resourceOrder2[i]));
        if (ret != 0) {
          return ret;
        }
      }
      return 0;
    }
  }
}
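
To see the ordering idea behind `calculateShares` and `compareShares` in isolation, the following standalone sketch compares two entities by their cluster shares sorted from largest to smallest. It is a simplified illustration under stated assumptions and not the Hadoop code: `DominantShareOrdering`, `App`, `sortedShares`, and `byDominantShare` are hypothetical names, and weights, min-share "needy" handling, and the start-time tie-break are all left out.

```java
import java.util.Arrays;
import java.util.Comparator;

public class DominantShareOrdering {

    /** Hypothetical stand-in for a schedulable entity: a name plus a usage vector. */
    static final class App {
        final String name;
        final double[] usage;

        App(String name, double... usage) {
            this.name = name;
            this.usage = usage;
        }
    }

    /** Per-resource shares of the cluster, sorted descending so index 0 is the dominant share. */
    static double[] sortedShares(double[] usage, double[] capacity) {
        double[] shares = new double[capacity.length];
        for (int j = 0; j < capacity.length; j++) {
            shares[j] = usage[j] / capacity[j];
        }
        Arrays.sort(shares); // ascending
        for (int lo = 0, hi = shares.length - 1; lo < hi; lo++, hi--) {
            double tmp = shares[lo];
            shares[lo] = shares[hi];
            shares[hi] = tmp;
        }
        return shares;
    }

    /**
     * Orders apps by dominant share first, then by the next-largest share, and so on.
     * The app that compares as smaller is the one that would be served first.
     */
    static Comparator<App> byDominantShare(double[] capacity) {
        return (a, b) -> {
            double[] sa = sortedShares(a.usage, capacity);
            double[] sb = sortedShares(b.usage, capacity);
            for (int i = 0; i < sa.length; i++) {
                int c = Double.compare(sa[i], sb[i]);
                if (c != 0) {
                    return c;
                }
            }
            return 0;
        };
    }

    public static void main(String[] args) {
        double[] capacity = {18, 9};      // <18 GB, 9 CPU>, the example cluster
        App a = new App("A", 8, 2);       // shares 8/18 and 2/9, dominant share 4/9
        App b = new App("B", 1, 3);       // shares 1/18 and 3/9, dominant share 1/3
        App[] apps = {a, b};
        Arrays.sort(apps, byDominantShare(capacity));
        System.out.println(apps[0].name); // prints B
    }
}
```

Comparing the sorted share vectors position by position means a tie on the dominant share falls through to the second-largest share, which is the same fall-through that the loop in `compareShares` performs over `resourceOrder1` and `resourceOrder2`.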
