YARN Source Code Analysis: The ApplicationMaster Allocation Strategy

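This article looks at the container allocation strategy for the ApplicationMaster (AM) in YARN. Reading the source shows that the AM container is not placed at random but according to the free resources on each node. For container scheduling, the CapacityScheduler does the work and allocates according to node state and resource demand. When an NM heartbeats, the CapacityScheduler either checks for a reserved container or starts a new round of scheduling. The scheduling path involves FiCaSchedulerApp, AbstractCSQueue, LeafQueue and other components, whose combined logic keeps resource allocation effective.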

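In a conversation with a friend, the question came up of how the ApplicationMaster's container is allocated. My impression was that it is allocated randomly, but he said it is allocated according to the free resources on each node.
I had never paid attention to this logic when reading the code, so now that there is a doubt, let's go and look at the code.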

Personal site: http://bigdatadecode.club/YARN源码分析之ApplicationMaster分配策略.html

The log of an MR run shows when the AM container is allocated:

2017-04-09 03:26:17,113 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1491729774382_0001_000001 State change from SUBMITTED to SCHEDULED
2017-04-09 03:26:17,415 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1491729774382_0001_01_000001 Container Transitioned from NEW to ALLOCATED

The AM container is set up when the appattempt's state changes from SUBMITTED to SCHEDULED.
The handler for the SUBMITTED to SCHEDULED transition is:

public static final class ScheduleTransition
    implements
    MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
   
  @Override
  public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
      RMAppAttemptEvent event) {
    ApplicationSubmissionContext subCtx = appAttempt.submissionContext;
    if (!subCtx.getUnmanagedAM()) {
      // Need reset #containers before create new attempt, because this request
      // will be passed to scheduler, and scheduler will deduct the number after
      // AM container allocated
      // Set up the request for the AM container
      appAttempt.amReq.setNumContainers(1);
      appAttempt.amReq.setPriority(AM_CONTAINER_PRIORITY);
      // A ResourceName of ANY means any machine on any rack
      appAttempt.amReq.setResourceName(ResourceRequest.ANY);
      appAttempt.amReq.setRelaxLocality(true);

      // Let the scheduler allocate the resources
      Allocation amContainerAllocation =
          appAttempt.scheduler.allocate(appAttempt.applicationAttemptId,
              Collections.singletonList(appAttempt.amReq),
              EMPTY_CONTAINER_RELEASE_LIST, null, null);
      ...
      return RMAppAttemptState.SCHEDULED;
    } else {
      ...
    }
  }
}
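
To make the meaning of the ANY resource name concrete, here is a minimal sketch. It does not use the real YARN classes: `Request` and `nameMatches` are hypothetical stand-ins written for illustration, and only the constant value `"*"` is taken from `ResourceRequest.ANY`. It only shows that a request named ANY rules out no node, while a request that names a host does.

```java
public class AmRequestSketch {
    // Same value as ResourceRequest.ANY in YARN: "*" stands for any host on any rack.
    static final String ANY = "*";

    /** Hypothetical stand-in for a ResourceRequest, holding only the fields set in ScheduleTransition. */
    static final class Request {
        final String resourceName;
        final int numContainers;
        final boolean relaxLocality;

        Request(String resourceName, int numContainers, boolean relaxLocality) {
            this.resourceName = resourceName;
            this.numContainers = numContainers;
            this.relaxLocality = relaxLocality;
        }
    }

    /** Returns true if the request's resource name does not rule out this node. */
    static boolean nameMatches(Request req, String host, String rack) {
        return req.resourceName.equals(ANY)
            || req.resourceName.equals(host)
            || req.resourceName.equals(rack);
    }

    public static void main(String[] args) {
        Request amReq = new Request(ANY, 1, true);         // shaped like the AM request above
        Request pinned = new Request("host-a", 1, false);  // a request tied to one host, for contrast
        String[][] nodes = { {"host-a", "/rack-1"}, {"host-b", "/rack-2"} };
        for (String[] node : nodes) {
            System.out.println(node[0]
                + " AM request: " + nameMatches(amReq, node[0], node[1])
                + ", pinned request: " + nameMatches(pinned, node[0], node[1]));
        }
    }
}
```

The name check says nothing about which eligible node finally receives the container; that is decided later by the scheduler.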

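First a container request is built for the AM container. In fact, appAttempt.amReq.setResourceName(ResourceRequest.ANY) already suggests that the AM container is allocated randomly, because the request places no requirement on ResourceName. Still, let's keep reading the code to verify this.
Once the request has been created, the scheduler allocates the resources. The Capacity scheduler is used by default here; the code is: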

// CapacityScheduler.java
public Allocation allocate(ApplicationAttemptId applicationAttemptId,
    List<ResourceRequest> ask, List<ContainerId> release, 
    List<String> blacklistAdditions, List<String> blacklistRemovals) {

  FiCaSchedulerApp application = getApplicationAttempt(applicationAttemptId);
  ...
  // Release containers
  releaseContainers(release, application);

  synchronized (application) {
    ...
    if (!ask.isEmpty()) {
      ...
      application.showRequests();
      // Put the request into this application attempt's map
      // Update application requests
      application.updateResourceRequests(ask);
      application.showRequests();
    }

    application.updateBlacklist(blacklistAdditions, blacklistRemovals);
    // Return whatever has been allocated to this attempt since the last call
    return application.getAllocation(getResourceCalculator(),
                 clusterResource, getMinimumResourceCapability());
  }
}

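When the CapacityScheduler handles the request, it calls application.updateResourceRequests(ask) to put the request into a map, where it waits to be picked up on an NM heartbeat.
This application is a FiCaSchedulerApp object, and a FiCaSchedulerApp actually corresponds to an application attempt. The code of updateResourceRequests is: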

public synchronized void updateResourceRequests(
    List<ResourceRequest> requests) {
  if (!isStopped) {
    // AppSchedulingInfo.updateResourceRequests
    appSchedulingInfo.updateResourceRequests(requests, false);
  }
}
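
The hand-off described above, where allocate() only records the ask and a later NM heartbeat consumes it, can be modeled with a small sketch. This is not the CapacityScheduler code: the class, the `nodeHeartbeat` method and its memory-only fit check are assumptions made for illustration, and the real scheduler also walks queues, limits and reservations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HeartbeatAllocationSketch {
    static final String ANY = "*";

    // Pending container counts keyed by resource name; a simplified stand-in
    // for the per-attempt bookkeeping that updateResourceRequests feeds.
    private final Map<String, Integer> pending = new HashMap<>();
    // Containers handed out on heartbeats but not yet returned to the caller.
    private final List<String> newlyAllocated = new ArrayList<>();

    /** Models allocate(): record the ask, then return what earlier heartbeats produced. */
    synchronized List<String> allocate(String resourceName, int numContainers) {
        if (numContainers > 0) {          // in this sketch, 0 means "no new ask"
            pending.put(resourceName, numContainers);
        }
        List<String> result = new ArrayList<>(newlyAllocated);
        newlyAllocated.clear();
        return result;
    }

    /** Models an NM heartbeat: the reporting node serves a pending ANY ask if it has room (assumption). */
    synchronized void nodeHeartbeat(String host, int freeMemoryMb, int containerMemoryMb) {
        int outstanding = pending.getOrDefault(ANY, 0);
        if (outstanding > 0 && freeMemoryMb >= containerMemoryMb) {
            pending.put(ANY, outstanding - 1);
            newlyAllocated.add("container@" + host);
        }
    }

    public static void main(String[] args) {
        HeartbeatAllocationSketch scheduler = new HeartbeatAllocationSketch();
        System.out.println(scheduler.allocate(ANY, 1));  // ask recorded, nothing allocated yet
        scheduler.nodeHeartbeat("host-b", 512, 1024);    // not enough free memory, skipped
        scheduler.nodeHeartbeat("host-a", 4096, 1024);   // has room, receives the container
        System.out.println(scheduler.allocate(ANY, 0));  // a later call picks up the container
    }
}
```

In this model the first allocate() call returns an empty list, which matches the gap in the log between the attempt becoming SCHEDULED and the container becoming ALLOCATED.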

AppSchedulingInfo keeps track of all of the application's resource consumption, including the containers of this application that are running or have already completed.

synchronized public void updateResourceRequests(
    List<ResourceRequest> requests, boolean recoverPreemptedRequest) {
  // Update resource requests
  for (ResourceRequest request : requests) {
    Priority priority = request.getPriority();
    String resourceName = request.getResourceName();
    boolean updatePendingResources = false;
    ResourceRequest lastRequest = null;
    // If the request's ResourceName is ResourceRequest.ANY
    // Is the AM container the only one that uses ANY? That seems unlikely
    if (resourceName.equals(ResourceRequest.ANY)) {
      ...
      // Set to true only for ResourceRequest.ANY?
      updatePendingResources = true;

      // Premature optimization?
      // Assumes that we won't see 