Hadoop Yarn (FairScheduler): Multi-Angle Core Source Code Analysis, Part 01

凡哲_Lucas

Category: Yarn

Article link: https://blog.csdn.net/weixin_35792948/article/details/80754071

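To study a scheduler, we first need to know who its clients are, that is, who asks to be scheduled. Requests come from two sources:
(1) At job submission, when a container is requested to run the AppMaster.
(2) While the job is running, when the AppMaster requests the containers needed for map and reduce tasks, Spark tasks, and so on.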

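This part analyzes the source code for the first kind of request: the whole process of requesting the container that the AppMaster needs. Where that request originates:
The container for the AppMaster is requested when the RMAppAttempt object is in the SUBMITTED state and receives an ATTEMPT_ADDED event, which invokes ScheduleTransition.
The concrete implementation of RMAppAttempt is RMAppAttemptImpl, and the state-machine transition code for this request is: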

    @Override
    public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      ApplicationSubmissionContext subCtx = appAttempt.submissionContext;
      if (!subCtx.getUnmanagedAM()) {
        // Need reset #containers before create new attempt, because this request
        // will be passed to scheduler, and scheduler will deduct the number after
        // AM container allocated

        // Currently, following fields are all hard coded,
        // TODO: change these fields when we want to support
        // priority or multiple containers AM container allocation.
        for (ResourceRequest amReq : appAttempt.amReqs) {
          amReq.setNumContainers(1);
          amReq.setPriority(AM_CONTAINER_PRIORITY);
        }

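Once the container count and priority are set, the container request is made. getUnmanagedAM() reports whether the AM is unmanaged. When it returns false, this is a normal AM whose container is requested through the RM. When it returns true, the user started the AM themselves (for example from the command line) instead of requesting it through the RM.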

        // AM resource has been checked when submission
        Allocation amContainerAllocation =
            appAttempt.scheduler.allocate(
                appAttempt.applicationAttemptId,
                appAttempt.amReqs,
                EMPTY_CONTAINER_RELEASE_LIST,
                amBlacklist.getAdditions(),
                amBlacklist.getRemovals());
        if (amContainerAllocation != null
            && amContainerAllocation.getContainers() != null) {
          assert (amContainerAllocation.getContainers().size() == 0);
        }
        return RMAppAttemptState.SCHEDULED;
      } else {
        // save state and then go to LAUNCHED state
        appAttempt.storeAttempt();
        return RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING;
      }
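
To make the two-way outcome of this transition easier to see, here is a minimal sketch. It does not use the Hadoop classes: `ScheduleTransitionSketch`, `AmRequest`, `RecordingScheduler`, and the value chosen for `AM_CONTAINER_PRIORITY` are all hypothetical stand-ins invented for illustration. It only mirrors the shape of the logic above: an unmanaged AM skips the scheduler, while a managed AM has its requests reset to one container at the AM priority before the scheduler is asked.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the Hadoop types; not the real API.
class ScheduleTransitionSketch {
    enum AttemptState { SCHEDULED, LAUNCHED_UNMANAGED_SAVING }

    // Stand-in for ResourceRequest with only the two fields the transition resets.
    static class AmRequest {
        int numContainers;
        int priority;

        AmRequest(int numContainers, int priority) {
            this.numContainers = numContainers;
            this.priority = priority;
        }
    }

    // Assumed value, for illustration only.
    static final int AM_CONTAINER_PRIORITY = 0;

    // Stand-in for the scheduler: it records the ask and returns no containers,
    // matching the assert in the source that this first call yields none.
    static class RecordingScheduler {
        final List<AmRequest> received = new ArrayList<>();

        List<String> allocate(List<AmRequest> ask) {
            received.addAll(ask);
            return new ArrayList<>();
        }
    }

    // One transition with two possible target states, chosen at run time.
    static AttemptState transition(boolean unmanagedAm, List<AmRequest> amReqs,
                                   RecordingScheduler scheduler) {
        if (unmanagedAm) {
            // The user launches the AM, so no AM container is requested.
            return AttemptState.LAUNCHED_UNMANAGED_SAVING;
        }
        for (AmRequest req : amReqs) {
            req.numContainers = 1;                 // exactly one AM container
            req.priority = AM_CONTAINER_PRIORITY;  // fixed AM priority
        }
        List<String> containers = scheduler.allocate(amReqs);
        if (!containers.isEmpty()) {
            throw new IllegalStateException("no container expected on the first ask");
        }
        return AttemptState.SCHEDULED;
    }

    public static void main(String[] args) {
        List<AmRequest> reqs = new ArrayList<>();
        reqs.add(new AmRequest(5, 7));
        RecordingScheduler scheduler = new RecordingScheduler();
        System.out.println(transition(false, reqs, scheduler));
        System.out.println(reqs.get(0).numContainers + " " + reqs.get(0).priority);
        System.out.println(transition(true, reqs, new RecordingScheduler()));
    }
}
```

In the sketch the unmanaged branch never touches the scheduler, which is the point of the `getUnmanagedAM()` check in the real transition.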

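Which allocate function runs depends on the configured scheduler; here it is FairScheduler's allocate. If this call picks up no containers for the RMAppAttempt (amContainerAllocation.getContainers().size() == 0), the attempt stays in RMAppAttemptState.SCHEDULED. Inside appAttempt.scheduler.allocate(), once a container has been picked up, an RMContainerEventType.ACQUIRED event is fired, which in turn drives the RMAppAttempt state machine through its RMAppAttemptEventType.CONTAINER_ALLOCATED event. The state machines push one another forward in this way.
The allocate function itself does little more than collect the RMContainer objects from the list of newly allocated containers, arrange NMTokens and similar objects for those containers, and package everything into an Allocation object.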
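
The "collect and package" idea can be sketched in a few lines. This is an illustration under stated assumptions, not FairScheduler code: `AllocationSketch`, `Container`, `Allocation`, `AppAttempt`, `containerAssigned`, and `pull` are hypothetical names, and the token is just a placeholder string. The scheduler side appends containers to a per-attempt list as it assigns them; the allocate side drains that list, attaches one token per container, and returns the bundle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; none of these are the real Hadoop classes.
class AllocationSketch {
    static class Container {
        final String id;
        final String nodeId;

        Container(String id, String nodeId) {
            this.id = id;
            this.nodeId = nodeId;
        }
    }

    // The bundle handed back to the caller of allocate.
    static class Allocation {
        final List<Container> containers;
        final List<String> nmTokens;

        Allocation(List<Container> containers, List<String> nmTokens) {
            this.containers = containers;
            this.nmTokens = nmTokens;
        }
    }

    static class AppAttempt {
        private final List<Container> newlyAllocated = new ArrayList<>();

        // Called by the scheduling side when it assigns a container to this attempt.
        void containerAssigned(Container c) {
            newlyAllocated.add(c);
        }

        // Called by the allocate side: drain the list and package the result.
        Allocation pull() {
            List<Container> handedOut = new ArrayList<>(newlyAllocated);
            newlyAllocated.clear();
            List<String> tokens = new ArrayList<>();
            for (Container c : handedOut) {
                // Placeholder for a per-node token; real tokens are not plain strings.
                tokens.add("token-for-" + c.nodeId);
            }
            return new Allocation(handedOut, tokens);
        }
    }

    public static void main(String[] args) {
        AppAttempt attempt = new AppAttempt();
        System.out.println(attempt.pull().containers.size()); // nothing assigned yet
        attempt.containerAssigned(new Container("container_1", "node_a"));
        Allocation allocation = attempt.pull();
        System.out.println(allocation.containers.size() + " " + allocation.nmTokens.get(0));
    }
}
```

Draining the list on each call means a container is handed out only once: a second `pull()` with no new assignment returns an empty bundle, which is consistent with the first AM request above coming back with zero containers.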

  @Override
  public Allocation allocate(ApplicationAttemptId appAttemptId,
      List<ResourceRequest> ask, List<ContainerId> release,
      List<String> blacklistAdditions, List<String> blacklistRemovals) {

    // Make sure this application exists
    FSAppAttempt application = getSchedulerApp(appAttemptId);
    if (application == null) {
      LOG.info("Calling allocate on removed " +
          "or non existant application " + appAttemptId);
      return EMPTY_ALLOCATION;
    }

    // Sanity check
    SchedulerUtils.normalizeRequests(ask, DOMINANT_RESOURCE_CALCULATOR,
        getClusterResource(), minimumAllocation, getMaximumResourceCapability(),
        incrAllocation);

    // Record container allocation start time
    application.recordContainerRequestTime(getClock().getTime());

    // Release containers
    releaseContainers(release, application);

    synchronized (application) {
      if (!ask.isEmpty()) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("allocate: pre-update" +
              " applicationAttemptId=" + appAttemptId +
              " application=" + application.getApplicationId());
        }
        application.showRequests();

        // Update application requests
        application.updateResourceRequests(ask);

        application.showRequests();
      }

      Set<ContainerId> preemptionContainerIds =
          application.getPreemptionContainerIds();
      if (LOG.isDebugEnabled()) {
        LOG.debug(
            "allocate: post-update" + " applicationAttemptId=" + appAttemptId
                + " #ask=" + ask.size() + " reservation= " + application
                .getCurrentReservation());

        LOG.debug("Preempting " + preemptionContainerIds.size()
            + " container(s)");
      }

      if (application.isWaitingForAMContainer(application.getApplicationId())) {
        // Allocate is for AM and update AM blacklist for this
        application.updateAMBlacklist(
            blacklistAdditions, blacklistRemov