YARN 3.2 source code analysis: the startContainer flow on the NM side

Overview

Starting a container on the NM passes through the following series of events:

  • ApplicationEvent of type INIT_APPLICATION
  • LocalizationEvent of type INIT_APPLICATION_RESOURCES
    ResourceLocalizationService is the handler for LocalizationEvent. When it handles INIT_APPLICATION_RESOURCES, it records the user-to-LocalResourcesTracker mapping in the privateRsrc map and the appId-to-LocalResourcesTracker mapping in the appRsrc map.
  • ApplicationEvent of type APPLICATION_INITED
  • ContainerEvent of type INIT_CONTAINER
  • LocalizationEvent of type LOCALIZE_CONTAINER_RESOURCES
    ResourceLocalizationService is the handler for LocalizationEvent. When it handles LOCALIZE_CONTAINER_RESOURCES, it casts the LocalizationEvent to ContainerLocalizationRequestEvent. Based on the visibility, user, and appId of each requested resource, it looks up the matching LocalResourcesTracker in the privateRsrc or appRsrc map (public resources use the publicRsrc tracker) and passes a ResourceEvent of type REQUEST to that tracker.
  • ResourceEvent of type REQUEST
    LocalResourcesTrackerImpl is the handler for ResourceEvent. When it handles a REQUEST, it creates a LocalizedResource and records the request-to-LocalizedResource mapping in a map. LocalizedResource is itself a ResourceEvent handler: the tracker forwards the event to it, and LocalizedResource then raises a LocalizerEvent of type REQUEST_RESOURCE_LOCALIZATION.
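
The chain listed above can be pictured as a queue in which handling one event enqueues the next. The following is a minimal sketch under that assumption only: `EventChainSketch`, `NEXT`, and `run` are hypothetical names, events are reduced to plain strings, and the log-handler hop described later in this article is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical model of the event chain; none of this is NodeManager code. */
class EventChainSketch {
    // Maps each event type to the event type its handler raises next.
    static final Map<String, String> NEXT = Map.of(
        "INIT_APPLICATION", "INIT_APPLICATION_RESOURCES",
        "INIT_APPLICATION_RESOURCES", "APPLICATION_INITED",
        "APPLICATION_INITED", "INIT_CONTAINER",
        "INIT_CONTAINER", "LOCALIZE_CONTAINER_RESOURCES",
        "LOCALIZE_CONTAINER_RESOURCES", "REQUEST");

    /** Drains the queue, letting each handled event enqueue its successor. */
    static List<String> run(String firstEvent) {
        Queue<String> queue = new ArrayDeque<>();
        List<String> handled = new ArrayList<>();
        queue.add(firstEvent);
        while (!queue.isEmpty()) {
            String event = queue.poll();
            handled.add(event);
            String next = NEXT.get(event);
            if (next != null) {
                queue.add(next);
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        run("INIT_APPLICATION").forEach(System.out::println);
    }
}
```

In the real NodeManager each hop is raised by a different component through the dispatcher, as the following sections show.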

ContainerManagerImpl

The startContainers() method

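startContainers iterates over the container requests in the StartContainersRequest and calls startContainerInternal for each one to start a container that will run a task on this NodeManager. Containers that start successfully are added to the succeededContainers list; those that fail with a YarnException are recorded in failedContainers. When the loop finishes, a StartContainersResponse is built and returned to the RPC caller, which is the RM when it launches the AM container and otherwise typically the AM.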

/**
   * Start a list of containers on this NodeManager.
   */
  @Override
  public StartContainersResponse startContainers(
      StartContainersRequest requests) throws YarnException, IOException {
    UserGroupInformation remoteUgi = getRemoteUgi();
    NMTokenIdentifier nmTokenIdentifier = selectNMTokenIdentifier(remoteUgi);
    authorizeUser(remoteUgi, nmTokenIdentifier);
    List<ContainerId> succeededContainers = new ArrayList<ContainerId>();
    Map<ContainerId, SerializedException> failedContainers =
        new HashMap<ContainerId, SerializedException>();
    // Synchronize with NodeStatusUpdaterImpl#registerWithRM
    // to avoid race condition during NM-RM resync (due to RM restart) while a
    // container is being started, in particular when the container has not yet
    // been added to the containers map in NMContext.
    synchronized (this.context) {
      for (StartContainerRequest request : requests
          .getStartContainerRequests()) {
        ContainerId containerId = null;
        try {
          if (request.getContainerToken() == null
              || request.getContainerToken().getIdentifier() == null) {
            throw new IOException(INVALID_CONTAINERTOKEN_MSG);
          }

          ContainerTokenIdentifier containerTokenIdentifier = BuilderUtils
              .newContainerTokenIdentifier(request.getContainerToken());
          verifyAndGetContainerTokenIdentifier(request.getContainerToken(),
              containerTokenIdentifier);
          containerId = containerTokenIdentifier.getContainerID();

          // Initialize the AMRMProxy service instance only if the container is of
          // type AM and if the AMRMProxy service is enabled
          if (amrmProxyEnabled && containerTokenIdentifier.getContainerType()
              .equals(ContainerType.APPLICATION_MASTER)) {
            this.getAMRMProxyService().processApplicationStartRequest(request);
          }
          performContainerPreStartChecks(nmTokenIdentifier, request,
              containerTokenIdentifier);
          startContainerInternal(containerTokenIdentifier, request);
          succeededContainers.add(containerId);
        } catch (YarnException e) {
          failedContainers.put(containerId, SerializedException.newInstance(e));
        } catch (InvalidToken ie) {
          failedContainers
              .put(containerId, SerializedException.newInstance(ie));
          throw ie;
        } catch (IOException e) {
          throw RPCUtil.getRemoteException(e);
        }
      }
      return StartContainersResponse
          .newInstance(getAuxServiceMetaData(), succeededContainers,
              failedContainers);
    }
  }
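
The success/failure bookkeeping in the loop above can be isolated into a small sketch. Everything here is hypothetical (`StartLoopSketch`, `Result`, `startOne`); container ids are plain strings and a single exception type stands in for YarnException. The point is only that one failing request is recorded and the loop continues with the next.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the per-request bookkeeping; not Hadoop code. */
class StartLoopSketch {
    /** Stand-in for StartContainersResponse. */
    static final class Result {
        final List<String> succeeded = new ArrayList<>();
        final Map<String, String> failed = new HashMap<>();
    }

    /** Stand-in for startContainerInternal; fails for ids starting with "bad". */
    static void startOne(String containerId) {
        if (containerId.startsWith("bad")) {
            throw new IllegalStateException("cannot start " + containerId);
        }
    }

    static Result startAll(List<String> containerIds) {
        Result result = new Result();
        for (String id : containerIds) {
            try {
                startOne(id);
                result.succeeded.add(id);
            } catch (IllegalStateException e) {
                // Record the failure and keep going with the next request.
                result.failed.put(id, e.getMessage());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = startAll(List.of("c1", "bad2", "c3"));
        System.out.println(r.succeeded + " " + r.failed);
    }
}
```

Note that the real method treats exception types differently: as the listing shows, an InvalidToken is recorded and rethrown, and an IOException aborts the whole call.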

The startContainerInternal() method

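Creates the container.

Raises an ApplicationEvent of type INIT_APPLICATION (only when the application is not yet known to this NM), followed by an ApplicationEvent of type INIT_CONTAINER.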

protected void startContainerInternal(
      ContainerTokenIdentifier containerTokenIdentifier,
      StartContainerRequest request) throws YarnException, IOException {

    ContainerId containerId = containerTokenIdentifier.getContainerID();
    String containerIdStr = containerId.toString();
    String user = containerTokenIdentifier.getApplicationSubmitter();

    LOG.info("Start request for " + containerIdStr + " by user " + user);

    ContainerLaunchContext launchContext = request.getContainerLaunchContext();

    // Sanity check for local resources
    for (Map.Entry<String, LocalResource> rsrc : launchContext
        .getLocalResources().entrySet()) {
      if (rsrc.getValue() == null || rsrc.getValue().getResource() == null) {
        throw new YarnException(
            "Null resource URL for local resource " + rsrc.getKey() + " : " + rsrc.getValue());
      } else if (rsrc.getValue().getType() == null) {
        throw new YarnException(
            "Null resource type for local resource " + rsrc.getKey() + " : " + rsrc.getValue());
      } else if (rsrc.getValue().getVisibility() == null) {
        throw new YarnException(
            "Null resource visibility for local resource " + rsrc.getKey() + " : " + rsrc.getValue());
      }
    }

    Credentials credentials =
        YarnServerSecurityUtils.parseCredentials(launchContext);

    long containerStartTime = SystemClock.getInstance().getTime();
    // Create the container
    Container container =
        new ContainerImpl(getConfig(), this.dispatcher,
            launchContext, credentials, metrics, containerTokenIdentifier,
            context, containerStartTime);
    ApplicationId applicationID =
        containerId.getApplicationAttemptId().getApplicationId();
    if (context.getContainers().putIfAbsent(containerId, container) != null) {
      NMAuditLogger.logFailure(user, AuditConstants.START_CONTAINER,
        "ContainerManagerImpl", "Container already running on this node!",
        applicationID, containerId);
      throw RPCUtil.getRemoteException("Container " + containerIdStr
          + " already is running on this node!!");
    }

    this.readLock.lock();
    try {
      if (!isServiceStopped()) {
        if (!context.getApplications().containsKey(applicationID)) {
          // Create the application
          // populate the flow context from the launch context if the timeline
          // service v.2 is enabled
          FlowContext flowContext =
              getFlowContext(launchContext, applicationID);

          Application application =
              new ApplicationImpl(dispatcher, user, flowContext,
                  applicationID, credentials, context);
          if (context.getApplications().putIfAbsent(applicationID,
              application) == null) {
            LOG.info("Creating a new application reference for app "
                + applicationID);
            LogAggregationContext logAggregationContext =
                containerTokenIdentifier.getLogAggregationContext();
            Map<ApplicationAccessType, String> appAcls =
                container.getLaunchContext().getApplicationACLs();
            context.getNMStateStore().storeApplication(applicationID,
                buildAppProto(applicationID, user, credentials, appAcls,
                    logAggregationContext, flowContext));
            // Raise the ApplicationEvent of type INIT_APPLICATION
            dispatcher.getEventHandler().handle(new ApplicationInitEvent(
                applicationID, appAcls, logAggregationContext));
          }
        } else if (containerTokenIdentifier.getContainerType()
            == ContainerType.APPLICATION_MASTER) {
          FlowContext flowContext =
              getFlowContext(launchContext, applicationID);
          if (flowContext != null) {
            ApplicationImpl application =
                (ApplicationImpl) context.getApplications().get(applicationID);

            // update flowContext reference in ApplicationImpl
            application.setFlowContext(flowContext);

            // Required to update state store for recovery.
            context.getNMStateStore().storeApplication(applicationID,
                buildAppProto(applicationID, user, credentials,
                    container.getLaunchContext().getApplicationACLs(),
                    containerTokenIdentifier.getLogAggregationContext(),
                    flowContext));

            LOG.info(
                "Updated application reference with flowContext " + flowContext
                    + " for app " + applicationID);
          } else {
            LOG.info("TimelineService V2.0 is not enabled. Skipping updating "
                + "flowContext for application " + applicationID);
          }
        }

        this.context.getNMStateStore().storeContainer(containerId,
            containerTokenIdentifier.getVersion(), containerStartTime, request);
        // Raise the ApplicationEvent of type INIT_CONTAINER
        dispatcher.getEventHandler().handle(
          new ApplicationContainerInitEvent(container));

        this.context.getContainerTokenSecretManager().startContainerSuccessful(
          containerTokenIdentifier);
        NMAuditLogger.logSuccess(user, AuditConstants.START_CONTAINER,
          "ContainerManageImpl", applicationID, containerId);
        // TODO launchedContainer misplaced -> doesn't necessarily mean a container
        // launch. A finished Application will not launch containers.
        metrics.launchedContainer();
        metrics.allocateContainer(containerTokenIdentifier.getResource());
      } else {
        throw new YarnException(
            "Container start failed as the NodeManager is " +
            "in the process of shutting down");
      }
    } finally {
      this.readLock.unlock();
    }
  }

public ApplicationInitEvent(ApplicationId appId,
      Map<ApplicationAccessType, String> acls,
      LogAggregationContext logAggregationContext) {
    super(appId, ApplicationEventType.INIT_APPLICATION);
    this.applicationACLs = acls;
    this.logAggregationContext = logAggregationContext;
  }

public ApplicationContainerInitEvent(Container container) {
    super(container.getContainerId().getApplicationAttemptId()
        .getApplicationId(), ApplicationEventType.INIT_CONTAINER);
    this.container = container;
  }

Registering handlers with the AsyncDispatcher

  public ContainerManagerImpl(Context context, ContainerExecutor exec,
      DeletionService deletionContext, NodeStatusUpdater nodeStatusUpdater,
      NodeManagerMetrics metrics, LocalDirsHandlerService dirsHandler) {
..................

    // ContainerManager level dispatcher.
    dispatcher = new AsyncDispatcher("NM ContainerManager dispatcher");
  

    containersLauncher = createContainersLauncher(context, exec);
    addService(containersLauncher);


    dispatcher.register(ContainerEventType.class,
        new ContainerEventDispatcher());
    dispatcher.register(ApplicationEventType.class,
        createApplicationEventDispatcher());
    dispatcher.register(LocalizationEventType.class,
        new LocalizationEventHandlerWrapper(rsrcLocalizationSrvc,
            nmMetricsPublisher));
    dispatcher.register(AuxServicesEventType.class, auxiliaryServices);
    dispatcher.register(ContainersMonitorEventType.class, containersMonitor);
    dispatcher.register(ContainersLauncherEventType.class, containersLauncher);
    dispatcher.register(ContainerSchedulerEventType.class, containerScheduler);

    addService(dispatcher);

  }

protected EventHandler<ApplicationEvent> createApplicationEventDispatcher() {
    return new ApplicationEventDispatcher();
  }

  @Override
  public void serviceInit(Configuration conf) throws Exception {

    logHandler =
      createLogHandler(conf, this.context, this.deletionService);
    addIfService(logHandler);
    dispatcher.register(LogHandlerEventType.class, logHandler);
    
    // add the shared cache upload service (it will do nothing if the shared
    // cache is disabled)
    SharedCacheUploadService sharedCacheUploader =
        createSharedCacheUploaderService();
    addService(sharedCacheUploader);
    dispatcher.register(SharedCacheUploadEventType.class, sharedCacheUploader);
 .............
  }
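
The registrations above key each handler by the enum class of its event type. Below is a minimal, synchronous sketch of that routing idea; `DispatcherSketch` and its two enums are hypothetical, and unlike the real AsyncDispatcher nothing is queued or processed on a separate thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical type-keyed dispatcher; a simplification, not AsyncDispatcher. */
class DispatcherSketch {
    enum AppEventType { INIT_APPLICATION }
    enum ContainerEventType { INIT_CONTAINER }

    // One handler per event-type enum class, analogous to dispatcher.register(...).
    private final Map<Class<?>, Consumer<Enum<?>>> handlers = new HashMap<>();

    void register(Class<? extends Enum<?>> eventTypeClass,
                  Consumer<Enum<?>> handler) {
        handlers.put(eventTypeClass, handler);
    }

    void dispatch(Enum<?> eventType) {
        // Route by the enum class that declares this constant.
        Consumer<Enum<?>> handler = handlers.get(eventType.getDeclaringClass());
        if (handler == null) {
            throw new IllegalStateException("No handler for " + eventType);
        }
        handler.accept(eventType);
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        DispatcherSketch d = new DispatcherSketch();
        d.register(AppEventType.class, e -> log.add("app:" + e));
        d.register(ContainerEventType.class, e -> log.add("container:" + e));
        d.dispatch(ContainerEventType.INIT_CONTAINER);
        d.dispatch(AppEventType.INIT_APPLICATION);
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```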

Handler for ApplicationEvent: ApplicationImpl

ApplicationEventDispatcher

 class ApplicationEventDispatcher implements EventHandler<ApplicationEvent> {
    @Override
    public void handle(ApplicationEvent event) {
      Application app =
          ContainerManagerImpl.this.context.getApplications().get(
              event.getApplicationID());
      if (app != null) {
        app.handle(event);
        if (nmMetricsPublisher != null) {
          nmMetricsPublisher.publishApplicationEvent(event);
        }
      } else {
        LOG.warn("Event " + event + " sent to absent application "
            + event.getApplicationID());
      }
    }
  }

ApplicationImpl handles ApplicationEvent through state machine transitions

@Override
  public void handle(ApplicationEvent event) {

    this.writeLock.lock();

    try {
      ApplicationId applicationID = event.getApplicationID();
      if (LOG.isDebugEnabled()) {
        LOG.debug(
            "Processing " + applicationID + " of type " + event.getType());
      }
      ApplicationState oldState = stateMachine.getCurrentState();
      ApplicationState newState = null;
      try {
        // queue event requesting init of the same app
        newState = stateMachine.doTransition(event.getType(), event);
      } catch (InvalidStateTransitionException e) {
        LOG.warn("Can't handle this event at current state", e);
      }
      if (newState != null && oldState != newState) {
        LOG.info("Application " + applicationID + " transitioned from "
            + oldState + " to " + newState);
      }
    } finally {
      this.writeLock.unlock();
    }
  }

Transition that handles the INIT_APPLICATION ApplicationEvent

ApplicationImpl's internal stateMachineFactory registers the transition that handles INIT_APPLICATION

// Transitions from NEW state
           .addTransition(ApplicationState.NEW, ApplicationState.INITING,
               ApplicationEventType.INIT_APPLICATION, new AppInitTransition())

 AppInitTransition

Creates a LogHandlerEvent of type APPLICATION_STARTED and submits it to the AsyncDispatcher

  /**
   * Notify services of new application.
   * 
   * In particular, this initializes the {@link LogAggregationService}
   */
  @SuppressWarnings("unchecked")
  static class AppInitTransition implements
      SingleArcTransition<ApplicationImpl, ApplicationEvent> {
    @Override
    public void transition(ApplicationImpl app, ApplicationEvent event) {
      ApplicationInitEvent initEvent = (ApplicationInitEvent)event;
      app.applicationACLs = initEvent.getApplicationACLs();
      app.aclsManager.addApplication(app.getAppId(), app.applicationACLs);
      // Inform the logAggregator
      app.logAggregationContext = initEvent.getLogAggregationContext();
      app.dispatcher.getEventHandler().handle(
          new LogHandlerAppStartedEvent(app.appId, app.user,
              app.credentials, app.applicationACLs,
              app.logAggregationContext, app.applicationLogInitedTimestamp));
    }
  }

public LogHandlerAppStartedEvent(ApplicationId appId, String user,
      Credentials credentials, Map<ApplicationAccessType, String> appAcls,
      LogAggregationContext logAggregationContext, long appLogInitedTime) {
    // Create the LogHandlerEvent of type APPLICATION_STARTED
    super(LogHandlerEventType.APPLICATION_STARTED);
    this.applicationId = appId;
    this.user = user;
    this.credentials = credentials;
    this.appAcls = appAcls;
    this.logAggregationContext = logAggregationContext;
    this.recoveredAppLogInitedTime = appLogInitedTime;
  }

Registering the LogHandlerEvent handler with the AsyncDispatcher

dispatcher.register(LogHandlerEventType.class, logHandler);

protected LogHandler createLogHandler(Configuration conf, Context context,
      DeletionService deletionService) {
    if (conf.getBoolean(YarnConfiguration.LOG_AGGREGATION_ENABLED,
        YarnConfiguration.DEFAULT_LOG_AGGREGATION_ENABLED)) {
      return new LogAggregationService(this.dispatcher, context,
          deletionService, dirsHandler);
    } else {
      return new NonAggregatingLogHandler(this.dispatcher, deletionService,
                                          dirsHandler,
                                          context.getNMStateStore());
    }
  }

NonAggregatingLogHandler (the handler createLogHandler returns when log aggregation is disabled) handles the APPLICATION_STARTED LogHandlerEvent

@Override
  public void handle(LogHandlerEvent event) {
    switch (event.getType()) {
      case APPLICATION_STARTED:
        LogHandlerAppStartedEvent appStartedEvent =
            (LogHandlerAppStartedEvent) event;
        this.appOwners.put(appStartedEvent.getApplicationId(),
            appStartedEvent.getUser());
        this.dispatcher.getEventHandler().handle(
            new ApplicationEvent(appStartedEvent.getApplicationId(),
                ApplicationEventType.APPLICATION_LOG_HANDLING_INITED));
        break;
..................
    }
  }

Transition that handles the APPLICATION_LOG_HANDLING_INITED ApplicationEvent

ApplicationImpl's internal stateMachineFactory registers the transition that handles APPLICATION_LOG_HANDLING_INITED

 .addTransition(ApplicationState.INITING, ApplicationState.INITING,
               ApplicationEventType.APPLICATION_LOG_HANDLING_INITED,
               new AppLogInitDoneTransition())

 AppLogInitDoneTransition

Creates a LocalizationEvent of type INIT_APPLICATION_RESOURCES and submits it to the AsyncDispatcher.

 /**
   * Handles the APPLICATION_LOG_HANDLING_INITED event that occurs after
   * {@link LogAggregationService} has created the directories for the app
   * and started the aggregation thread for the app.
   * 
   * In particular, this requests that the {@link ResourceLocalizationService}
   * localize the application-scoped resources.
   */
  @SuppressWarnings("unchecked")
  static class AppLogInitDoneTransition implements
      SingleArcTransition<ApplicationImpl, ApplicationEvent> {
    @Override
    public void transition(ApplicationImpl app, ApplicationEvent event) {
      app.dispatcher.getEventHandler().handle(
          new ApplicationLocalizationEvent(
              LocalizationEventType.INIT_APPLICATION_RESOURCES, app));
      app.setAppLogInitedTimestamp(event.getTimestamp());
      try {
        app.appStateStore.storeApplication(app.appId, buildAppProto(app));
      } catch (Exception ex) {
        LOG.warn("failed to update application state in state store", ex);
      }
    }
  }

Handler for LocalizationEvent: ResourceLocalizationService

 private final ResourceLocalizationService rsrcLocalizationSrvc;

dispatcher.register(LocalizationEventType.class,
        new LocalizationEventHandlerWrapper(rsrcLocalizationSrvc,
            nmMetricsPublisher));

LocalizationEventHandlerWrapper is only a thin proxy: it delegates to ResourceLocalizationService and then, if a timeline publisher is set, publishes the event

 @Override
    public void handle(LocalizationEvent event) {
      origLocalizationEventHandler.handle(event);
      if (timelinePublisher != null) {
        timelinePublisher.publishLocalizationEvent(event);
      }
    }

 ResourceLocalizationService

Creates an ApplicationEvent of type APPLICATION_INITED and submits it to the AsyncDispatcher

public void handle(LocalizationEvent event) {
    // TODO: create log dir as $logdir/$user/$appId
    switch (event.getType()) {
    case INIT_APPLICATION_RESOURCES:
      handleInitApplicationResources(
          ((ApplicationLocalizationEvent)event).getApplication());
      break;
    case LOCALIZE_CONTAINER_RESOURCES:
      handleInitContainerResources((ContainerLocalizationRequestEvent) event);
      break;
    case CONTAINER_RESOURCES_LOCALIZED:
      handleContainerResourcesLocalized((ContainerLocalizationEvent) event);
      break;
    .................
    }
  }

private void handleInitApplicationResources(Application app) {
    // 0) Create application tracking structs
    String userName = app.getUser();
    privateRsrc.putIfAbsent(userName, new LocalResourcesTrackerImpl(userName,
        null, dispatcher, true, super.getConfig(), stateStore, dirsHandler));
    String appIdStr = app.getAppId().toString();
    appRsrc.putIfAbsent(appIdStr, new LocalResourcesTrackerImpl(app.getUser(),
        app.getAppId(), dispatcher, false, super.getConfig(), stateStore,
        dirsHandler));
    // 1) Signal container init
    //
    // This is handled by the ApplicationImpl state machine and allows
    // containers to proceed with launching.
    dispatcher.getEventHandler().handle(new ApplicationInitedEvent(
          app.getAppId()));
  }

public ApplicationInitedEvent(ApplicationId appID) {
    super(appID, ApplicationEventType.APPLICATION_INITED);
  }

Transition that handles the APPLICATION_INITED ApplicationEvent

ApplicationImpl's internal stateMachineFactory registers the transition that handles APPLICATION_INITED

.addTransition(ApplicationState.INITING, ApplicationState.RUNNING,
               ApplicationEventType.APPLICATION_INITED,
               new AppInitDoneTransition())

 AppInitDoneTransition

static class AppInitDoneTransition implements
      SingleArcTransition<ApplicationImpl, ApplicationEvent> {
    @Override
    public void transition(ApplicationImpl app, ApplicationEvent event) {
      // Start all the containers waiting for ApplicationInit
      for (Container container : app.containers.values()) {
        app.dispatcher.getEventHandler().handle(new ContainerInitEvent(
              container.getContainerId()));
      }
    }
  }

public ContainerInitEvent(ContainerId c) {
    super(c, ContainerEventType.INIT_CONTAINER);
  }
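
The three ApplicationImpl transitions quoted so far (NEW to INITING, INITING to INITING, INITING to RUNNING) can be modelled as a lookup table keyed by current state and event type. This is a sketch under that assumption; `AppStateMachineSketch` is hypothetical, covers only these three arcs, and simply keeps the current state on an unregistered pair, in the spirit of the caught InvalidStateTransitionException in handle().

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical table-driven state machine; not Hadoop's StateMachineFactory. */
class AppStateMachineSketch {
    enum State { NEW, INITING, RUNNING }
    enum EventType {
        INIT_APPLICATION, APPLICATION_LOG_HANDLING_INITED, APPLICATION_INITED
    }

    private static final Map<State, Map<EventType, State>> TABLE = new HashMap<>();

    static {
        add(State.NEW, State.INITING, EventType.INIT_APPLICATION);
        add(State.INITING, State.INITING,
            EventType.APPLICATION_LOG_HANDLING_INITED);
        add(State.INITING, State.RUNNING, EventType.APPLICATION_INITED);
    }

    private static void add(State from, State to, EventType on) {
        TABLE.computeIfAbsent(from, k -> new HashMap<>()).put(on, to);
    }

    private State current = State.NEW;

    /** Applies the event and returns the resulting state. */
    State handle(EventType type) {
        Map<EventType, State> arcs = TABLE.getOrDefault(current, Map.of());
        State next = arcs.get(type);
        if (next == null) {
            // No arc registered for (current, type): stay where we are.
            return current;
        }
        current = next;
        return current;
    }

    public static void main(String[] args) {
        AppStateMachineSketch sm = new AppStateMachineSketch();
        for (EventType t : EventType.values()) {
            System.out.println(t + " -> " + sm.handle(t));
        }
    }
}
```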

Handler for ContainerEvent and the transition that handles the INIT_CONTAINER ContainerEvent

The AsyncDispatcher in ContainerManagerImpl registers the handler for ContainerEvent: ContainerEventDispatcher

dispatcher.register(ContainerEventType.class,
        new ContainerEventDispatcher());

class ContainerEventDispatcher implements EventHandler<ContainerEvent> {
    @Override
    public void handle(ContainerEvent event) {
      Map<ContainerId,Container> containers =
        ContainerManagerImpl.this.context.getContainers();
      Container c = containers.get(event.getContainerID());
      if (c != null) {
        c.handle(event);
        if (nmMetricsPublisher != null) {
          nmMetricsPublisher.publishContainerEvent(event);
        }
      } else {
        LOG.warn("Event " + event + " sent to absent container " +
            event.getContainerID());
      }
    }
  }

ContainerImpl handles ContainerEvent

public void handle(ContainerEvent event) {
    try {
      this.writeLock.lock();

      ContainerId containerID = event.getContainerID();
      if (LOG.isDebugEnabled()) {
        LOG.debug("Processing " + containerID + " of type " + event.getType());
      }
      ContainerState oldState = stateMachine.getCurrentState();
      ContainerState newState = null;
      try {
        newState =
            stateMachine.doTransition(event.getType(), event);
      } catch (InvalidStateTransitionException e) {
        LOG.warn("Can't handle this event at current state: Current: ["
            + oldState + "], eventType: [" + event.getType() + "]," +
            " container: [" + containerID + "]", e);
      }
      if (newState != null && oldState != newState) {
        LOG.info("Container " + containerID + " transitioned from "
            + oldState
            + " to " + newState);
      }
    } finally {
      this.writeLock.unlock();
    }
  }

Transition that handles the INIT_CONTAINER ContainerEvent
ContainerImpl's internal stateMachineFactory registers the transition that handles the INIT_CONTAINER ContainerEvent

  // From NEW State
    .addTransition(ContainerState.NEW,
        EnumSet.of(ContainerState.LOCALIZING,
            ContainerState.SCHEDULED,
            ContainerState.LOCALIZATION_FAILED,
            ContainerState.DONE),
        ContainerEventType.INIT_CONTAINER, new RequestResourcesTransition())

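RequestResourcesTransition handles the INIT_CONTAINER ContainerEvent

This transition runs when a container in the NEW state receives INIT_CONTAINER.

If there are resources to localize, it sends a ContainerLocalizationRequestEvent (LOCALIZE_CONTAINER_RESOURCES), which ResourceLocalizationService handles, and the container enters the LOCALIZING state.

If there are no resources to localize, it calls sendScheduleEvent() and the container enters the SCHEDULED state directly.

If a resource request is malformed, the container enters LOCALIZATION_FAILED. A container recovered as already completed, or as killed before launch, goes straight to DONE.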

  /**
   * State transition when a NEW container receives the INIT_CONTAINER
   * message.
   * 
   * If there are resources to localize, sends a
   * ContainerLocalizationRequest (LOCALIZE_CONTAINER_RESOURCES)
   * to the ResourceLocalizationManager and enters LOCALIZING state.
   * 
   * If there are no resources to localize, sends LAUNCH_CONTAINER event
   * and enters SCHEDULED state directly.
   * 
   * If there are any invalid resources specified, enters LOCALIZATION_FAILED
   * directly.
   */
  @SuppressWarnings("unchecked") // dispatcher not typed
  static class RequestResourcesTransition implements
      MultipleArcTransition<ContainerImpl,ContainerEvent,ContainerState> {
    @Override
    public ContainerState transition(ContainerImpl container,
        ContainerEvent event) {
      if (container.recoveredStatus == RecoveredContainerStatus.COMPLETED) {
        container.sendFinishedEvents();
        return ContainerState.DONE;
      } else if (container.recoveredStatus == RecoveredContainerStatus.QUEUED) {
        return ContainerState.SCHEDULED;
      } else if (container.recoveredAsKilled &&
          container.recoveredStatus == RecoveredContainerStatus.REQUESTED) {
        // container was killed but never launched
        container.metrics.killedContainer();
        NMAuditLogger.logSuccess(container.user,
            AuditConstants.FINISH_KILLED_CONTAINER, "ContainerImpl",
            container.containerId.getApplicationAttemptId().getApplicationId(),
            container.containerId);
        container.metrics.releaseContainer(
            container.containerTokenIdentifier.getResource());
        container.sendFinishedEvents();
        return ContainerState.DONE;
      }

      final ContainerLaunchContext ctxt = container.launchContext;
      container.metrics.initingContainer();

      container.dispatcher.getEventHandler().handle(new AuxServicesEvent
          (AuxServicesEventType.CONTAINER_INIT, container));

      // Inform the AuxServices about the opaque serviceData
      Map<String,ByteBuffer> csd = ctxt.getServiceData();
      if (csd != null) {
        // This can happen more than once per Application as each container may
        // have distinct service data
        for (Map.Entry<String,ByteBuffer> service : csd.entrySet()) {
          container.dispatcher.getEventHandler().handle(
              new AuxServicesEvent(AuxServicesEventType.APPLICATION_INIT,
                  container.user, container.containerId
                      .getApplicationAttemptId().getApplicationId(),
                  service.getKey().toString(), service.getValue()));
        }
      }

      container.containerLocalizationStartTime = clock.getTime();

      // Send requests for public, private resources
      Map<String,LocalResource> cntrRsrc = ctxt.getLocalResources();
      if (!cntrRsrc.isEmpty()) {
        try {
          Map<LocalResourceVisibility, Collection<LocalResourceRequest>> req =
              container.resourceSet.addResources(ctxt.getLocalResources());
          container.dispatcher.getEventHandler().handle(
              new ContainerLocalizationRequestEvent(container, req));
        } catch (URISyntaxException e) {
          // malformed resource; abort container launch
          LOG.warn("Failed to parse resource-request", e);
          container.cleanup();
          container.metrics.endInitingContainer();
          return ContainerState.LOCALIZATION_FAILED;
        }
        return ContainerState.LOCALIZING;
      } else {
        // No resources to localize: send the schedule event through the AsyncDispatcher
        container.sendScheduleEvent();
        container.metrics.endInitingContainer();
        return ContainerState.SCHEDULED;
      }
    }
  }
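
The multi-arc decision in RequestResourcesTransition can be reduced to a function from the container's local resources to a target state. The sketch below assumes resources are given as a name-to-URL map of strings and uses `java.net.URI` parsing as the validity check; `RequestResourcesSketch` is hypothetical, and the recovery branches and event sending are omitted.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Map;

/** Hypothetical reduction of the transition's branching; not ContainerImpl code. */
class RequestResourcesSketch {
    enum State { LOCALIZING, SCHEDULED, LOCALIZATION_FAILED }

    static State nextState(Map<String, String> localResources) {
        if (localResources.isEmpty()) {
            // Nothing to localize: the container can be scheduled right away.
            return State.SCHEDULED;
        }
        try {
            for (String url : localResources.values()) {
                new URI(url); // throws URISyntaxException on malformed input
            }
        } catch (URISyntaxException e) {
            // Malformed resource: abort before any localization starts.
            return State.LOCALIZATION_FAILED;
        }
        return State.LOCALIZING;
    }

    public static void main(String[] args) {
        System.out.println(nextState(Map.of()));
        System.out.println(nextState(Map.of("job.jar", "hdfs://nn:8020/app/job.jar")));
        System.out.println(nextState(Map.of("broken", "hdfs://nn/a b")));
    }
}
```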

ContainerLocalizationRequestEvent: the LocalizationEvent of type LOCALIZE_CONTAINER_RESOURCES

/**
   * Event requesting the localization of the rsrc.
   * @param c Container
   * @param rsrc LocalResourceRequests map
   */
  public ContainerLocalizationRequestEvent(Container c,
      Map<LocalResourceVisibility, Collection<LocalResourceRequest>> rsrc) {
    super(LocalizationEventType.LOCALIZE_CONTAINER_RESOURCES, c);
    this.rsrc = rsrc;
  }

Handler for LocalizationEvent: ResourceLocalizationService

Handling the LOCALIZE_CONTAINER_RESOURCES LocalizationEvent

public void handle(LocalizationEvent event) {
    // TODO: create log dir as $logdir/$user/$appId
    switch (event.getType()) {
    case INIT_APPLICATION_RESOURCES:
      handleInitApplicationResources(
          ((ApplicationLocalizationEvent)event).getApplication());
      break;
    case LOCALIZE_CONTAINER_RESOURCES:
      handleInitContainerResources((ContainerLocalizationRequestEvent) event);
      break;
    case CONTAINER_RESOURCES_LOCALIZED:
      handleContainerResourcesLocalized((ContainerLocalizationEvent) event);
      break;
    .................
    }
  }

/**
   * For each of the requested resources for a container, determines the
   * appropriate {@link LocalResourcesTracker} and forwards a 
   * {@link LocalResourceRequest} to that tracker.
   */
  private void handleInitContainerResources(
      ContainerLocalizationRequestEvent rsrcReqs) {
    Container c = rsrcReqs.getContainer();
    // Pre-check the container's current state: if it is not one of
    // LOCALIZING, RUNNING, or REINITIALIZING, stop here
    EnumSet<ContainerState> set =
        EnumSet.of(ContainerState.LOCALIZING,
            ContainerState.RUNNING, ContainerState.REINITIALIZING);
    if (!set.contains(c.getContainerState())) {
      LOG.warn(c.getContainerId() + " is at " + c.getContainerState()
          + " state, do not localize resources.");
      return;
    }
    // create a loading cache for the file statuses
    LoadingCache<Path,Future<FileStatus>> statCache =
        CacheBuilder.newBuilder().build(FSDownload.createStatusCacheLoader(getConfig()));
    LocalizerContext ctxt = new LocalizerContext(
        c.getUser(), c.getContainerId(), c.getCredentials(), statCache);
    Map<LocalResourceVisibility, Collection<LocalResourceRequest>> rsrcs =
      rsrcReqs.getRequestedResources();
    for (Map.Entry<LocalResourceVisibility, Collection<LocalResourceRequest>> e :
         rsrcs.entrySet()) {
      // ResourceLocalizationService keeps one LocalResourcesTracker per user and one per application.
      // The tracker mapped from the user tracks private resources;
      // the tracker mapped from the appId tracks application resources;
      // the LocalResourcesTracker field publicRsrc tracks public resources.
      // Pick the tracker by the LocalResource's visibility (public, private, or application), user name, and appId
      LocalResourcesTracker tracker =
          getLocalResourcesTracker(e.getKey(), c.getUser(),
              c.getContainerId().getApplicationAttemptId()
                  .getApplicationId());
      for (LocalResourceRequest req : e.getValue()) {
        // Hand a ResourceRequestEvent to the LocalResourcesTracker for processing
        tracker.handle(new ResourceRequestEvent(req, e.getKey(), ctxt));
        if (LOG.isDebugEnabled()) {
          LOG.debug("Localizing " + req.getPath() +
              " for container " + c.getContainerId());
        }
      }
    }
  }

public ResourceRequestEvent(LocalResourceRequest resource,
      LocalResourceVisibility vis, LocalizerContext context) {
    super(resource, ResourceEventType.REQUEST);
    this.vis = vis;
    this.context = context;
  }
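
The listing calls getLocalResourcesTracker without showing its body. Based only on the comments above (one tracker per user for private resources, one per application for application resources, a single shared one for public resources), the lookup might look like the sketch below. `TrackerLookupSketch` and `Tracker` are hypothetical stand-ins, and ids are plain strings.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical model of tracker registration and lookup; not Hadoop code. */
class TrackerLookupSketch {
    enum Visibility { PUBLIC, PRIVATE, APPLICATION }

    /** Stand-in for LocalResourcesTracker. */
    static final class Tracker {
        final String label;
        Tracker(String label) { this.label = label; }
    }

    private final Tracker publicRsrc = new Tracker("public");
    private final ConcurrentMap<String, Tracker> privateRsrc =
        new ConcurrentHashMap<>();
    private final ConcurrentMap<String, Tracker> appRsrc =
        new ConcurrentHashMap<>();

    /** Mirrors the putIfAbsent calls made when an application is initialized. */
    void initApplication(String user, String appId) {
        // putIfAbsent keeps the existing per-user tracker across applications.
        privateRsrc.putIfAbsent(user, new Tracker("private:" + user));
        appRsrc.putIfAbsent(appId, new Tracker("app:" + appId));
    }

    Tracker lookup(Visibility vis, String user, String appId) {
        switch (vis) {
            case PRIVATE:
                return privateRsrc.get(user);
            case APPLICATION:
                return appRsrc.get(appId);
            default:
                return publicRsrc;
        }
    }

    public static void main(String[] args) {
        TrackerLookupSketch s = new TrackerLookupSketch();
        s.initApplication("alice", "app_1");
        System.out.println(s.lookup(Visibility.APPLICATION, "alice", "app_1").label);
    }
}
```

Because of putIfAbsent, two applications from the same user share one private tracker while each gets its own application tracker.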

Handler for ResourceEvent: LocalResourcesTrackerImpl

/*
   * Synchronizing this method for avoiding races due to multiple ResourceEvent's
   * coming to LocalResourcesTracker from Public/Private localizer and
   * Resource Localization Service.
   */
  @Override
  public synchronized void handle(ResourceEvent event) {
    LocalResourceRequest req = event.getLocalResourceRequest();
    LocalizedResource rsrc = localrsrc.get(req);
    switch (event.getType()) {
    case LOCALIZED:
      if (useLocalCacheDirectoryManager) {
        inProgressLocalResourcesMap.remove(req);
      }
      break;
    case REQUEST:
      if (rsrc != null && (!isResourcePresent(rsrc))) {
        LOG.info("Resource " + rsrc.getLocalPath()
            + " is missing, localizing it again");
        removeResource(req);
        rsrc = null;
      }
      if (null == rsrc) {
        rsrc = new LocalizedResource(req, dispatcher);
        localrsrc.put(req, rsrc);
      }
      break;
  
    }

    if (rsrc == null) {
      LOG.warn("Received " + event.getType() + " event for request " + req
          + " but localized resource is missing");
      return;
    }
    rsrc.handle(event);

    // Remove the resource if its downloading and its reference count has
    // become 0 after RELEASE. This maybe because a container was killed while
    // localizing and no other container is referring to the resource.
    // NOTE: This should NOT be done for public resources since the
    //       download is not associated with a container-specific localizer.
    if (event.getType() == ResourceEventType.RELEASE) {
      if (rsrc.getState() == ResourceState.DOWNLOADING &&
          rsrc.getRefCount() <= 0 &&
          rsrc.getRequest().getVisibility() != LocalResourceVisibility.PUBLIC) {
        removeResource(req);
      }
    }

    if (event.getType() == ResourceEventType.LOCALIZED) {
      if (rsrc.getLocalPath() != null) {
        try {
          stateStore.finishResourceLocalization(user, appId,
              buildLocalizedResourceProto(rsrc));
        } catch (IOException ioe) {
          LOG.error("Error storing resource state for " + rsrc, ioe);
        }
      } else {
        LOG.warn("Resource " + rsrc + " localized without a location");
      }
    }
  }
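The REQUEST branch above amounts to "reuse the cached entry if it is still usable, otherwise create a new one, then forward the event". A minimal sketch of that pattern follows. All names (`CachedResourceSketch`, `RequestTrackerSketch`, `handleRequest`, the `presentOnDisk` predicate) are hypothetical, strings stand in for LocalResourceRequest, and the predicate only approximates whatever `isResourcePresent` checks in the real class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for LocalizedResource; it only counts forwarded events.
class CachedResourceSketch {
  final String request;
  int handled = 0;

  CachedResourceSketch(String request) {
    this.request = request;
  }

  void handle(String eventType) {
    handled++;
  }
}

// Hypothetical stand-in for the REQUEST path of the tracker.
class RequestTrackerSketch {
  private final Map<String, CachedResourceSketch> localrsrc = new HashMap<>();
  // Assumption: a caller-supplied check that says whether a cached entry
  // is still usable (for example, still present on local disk).
  private final Predicate<CachedResourceSketch> presentOnDisk;

  RequestTrackerSketch(Predicate<CachedResourceSketch> presentOnDisk) {
    this.presentOnDisk = presentOnDisk;
  }

  // synchronized for the same reason as in the original: several event
  // sources may call into the tracker concurrently.
  synchronized CachedResourceSketch handleRequest(String req) {
    CachedResourceSketch rsrc = localrsrc.get(req);
    if (rsrc != null && !presentOnDisk.test(rsrc)) {
      // The cached entry is stale: drop it so it gets localized again.
      localrsrc.remove(req);
      rsrc = null;
    }
    if (rsrc == null) {
      rsrc = new CachedResourceSketch(req);
      localrsrc.put(req, rsrc);
    }
    // Forward the event to the per-resource handler.
    rsrc.handle("REQUEST");
    return rsrc;
  }
}
```

Two requests for the same resource share one entry in this sketch, so the second container does not trigger a second entry unless the first one has gone missing.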

Handler for ResourceEvent: LocalizedResource

public void handle(ResourceEvent event) {
    try {
      this.writeLock.lock();

      Path resourcePath = event.getLocalResourceRequest().getPath();
      if (LOG.isDebugEnabled()) {
        LOG.debug("Processing " + resourcePath + " of type " + event.getType());
      }
      ResourceState oldState = this.stateMachine.getCurrentState();
      ResourceState newState = null;
      try {
        newState = this.stateMachine.doTransition(event.getType(), event);
      } catch (InvalidStateTransitionException e) {
        LOG.warn("Can't handle this event at current state", e);
      }
      if (newState != null && oldState != newState) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("Resource " + resourcePath + (localPath != null ?
              "(->" + localPath + ")": "") + " size : " + getSize()
              + " transitioned from " + oldState + " to " + newState);
        }
      }
    } finally {
      this.writeLock.unlock();
    }
  }
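The handler above takes the write lock, asks the state machine for the next state, and only logs when the event is not valid in the current state. A table-driven sketch of the same idea is shown below. It does not use Hadoop's StateMachineFactory; `RsrcStateSketch`, `RsrcEventSketch`, and `ResourceFsmSketch` are hypothetical names. Only the INIT to DOWNLOADING transition on REQUEST comes from the walkthrough; the second registered transition is an assumption added to make the table less trivial, and the real class registers more transitions than this.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-ins for ResourceState and ResourceEventType.
enum RsrcStateSketch { INIT, DOWNLOADING, LOCALIZED }

enum RsrcEventSketch { REQUEST, LOCALIZED, RELEASE }

class ResourceFsmSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Map<RsrcStateSketch, Map<RsrcEventSketch, RsrcStateSketch>> table =
      new EnumMap<>(RsrcStateSketch.class);
  private RsrcStateSketch state = RsrcStateSketch.INIT;

  ResourceFsmSketch() {
    // From the walkthrough: INIT -> DOWNLOADING on REQUEST.
    addTransition(RsrcStateSketch.INIT, RsrcStateSketch.DOWNLOADING,
        RsrcEventSketch.REQUEST);
    // Assumption for illustration only: DOWNLOADING -> LOCALIZED on LOCALIZED.
    addTransition(RsrcStateSketch.DOWNLOADING, RsrcStateSketch.LOCALIZED,
        RsrcEventSketch.LOCALIZED);
  }

  private void addTransition(RsrcStateSketch from, RsrcStateSketch to,
      RsrcEventSketch on) {
    table.computeIfAbsent(from,
        k -> new EnumMap<RsrcEventSketch, RsrcStateSketch>(RsrcEventSketch.class))
        .put(on, to);
  }

  // Returns true if the event changed the state. An event with no registered
  // transition leaves the state untouched, similar to the original catching
  // InvalidStateTransitionException and only logging a warning.
  boolean handle(RsrcEventSketch event) {
    lock.lock();
    try {
      Map<RsrcEventSketch, RsrcStateSketch> row = table.get(state);
      RsrcStateSketch next = (row == null) ? null : row.get(event);
      if (next == null) {
        return false;
      }
      boolean changed = next != state;
      state = next;
      return changed;
    } finally {
      lock.unlock();
    }
  }

  RsrcStateSketch getState() {
    lock.lock();
    try {
      return state;
    } finally {
      lock.unlock();
    }
  }
}
```

Keeping the transitions in a table means an unexpected event cannot corrupt the resource state; it is simply ignored.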

Transition for REQUEST-type ResourceEvents

The internal StateMachineFactory of LocalizedResource registers FetchResourceTransition as the transition that handles REQUEST-type ResourceEvents:

  .addTransition(ResourceState.INIT, ResourceState.DOWNLOADING,
        ResourceEventType.REQUEST, new FetchResourceTransition())

FetchResourceTransition

/**
   * Transition from INIT to DOWNLOADING.
   * Sends a {@link LocalizerResourceRequestEvent} to the
   * {@link ResourceLocalizationService}.
   */
  @SuppressWarnings("unchecked") // dispatcher not typed
  private static class FetchResourceTransition extends ResourceTransition {
    @Override
    public void transition(LocalizedResource rsrc, ResourceEvent event) {
      ResourceRequestEvent req = (ResourceRequestEvent) event;
      LocalizerContext ctxt = req.getContext();
      ContainerId container = ctxt.getContainerId();
      rsrc.ref.add(container);
      rsrc.dispatcher.getEventHandler().handle(
          new LocalizerResourceRequestEvent(rsrc, req.getVisibility(), ctxt, 
              req.getLocalResourceRequest().getPattern()));
    }
  }

public LocalizerResourceRequestEvent(LocalizedResource resource,
      LocalResourceVisibility vis, LocalizerContext context, String pattern) {
    super(LocalizerEventType.REQUEST_RESOURCE_LOCALIZATION,
        context.getContainerId().toString());
    this.vis = vis;
    this.context = context;
    this.resource = resource;
    this.pattern = pattern;
  }
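Taken together, FetchResourceTransition does two things: it records the requesting container as a reference on the resource, and it hands a REQUEST_RESOURCE_LOCALIZATION event to the dispatcher. The sketch below shows that shape with a plain queue standing in for the dispatcher. `LocalizerRequestSketch`, `FetchSketch`, `onRequest`, and the field names are hypothetical, and strings replace ContainerId and LocalResourceVisibility.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for LocalizerResourceRequestEvent.
class LocalizerRequestSketch {
  final String eventType = "REQUEST_RESOURCE_LOCALIZATION";
  final String resourcePath;
  final String visibility;
  // The original passes the container id string to the superclass constructor.
  final String containerIdString;

  LocalizerRequestSketch(String resourcePath, String visibility,
      String containerIdString) {
    this.resourcePath = resourcePath;
    this.visibility = visibility;
    this.containerIdString = containerIdString;
  }
}

// Hypothetical stand-in for the resource plus its fetch transition.
class FetchSketch {
  final String path;
  // Containers that currently reference this resource.
  final List<String> refs = new ArrayList<>();
  // Assumption: a simple queue plays the role of the event dispatcher.
  private final Queue<LocalizerRequestSketch> dispatcher;

  FetchSketch(String path, Queue<LocalizerRequestSketch> dispatcher) {
    this.path = path;
    this.dispatcher = dispatcher;
  }

  // Mirrors the transition body: add the reference, then emit the event.
  void onRequest(String containerId, String visibility) {
    refs.add(containerId);
    dispatcher.add(new LocalizerRequestSketch(path, visibility, containerId));
  }
}
```

The resource itself does not download anything in this sketch; it only announces that a download is needed, and whoever consumes the queue is responsible for the actual localization.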
