YARN 3.2 Source Analysis: submitApplication Communication Between YarnClient and ResourceManager

Overview

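YarnClient talks to the ResourceManager (RM) through ApplicationClientProtocol. It uses this protocol to submit applications to the RM, query application status, and control applications (for example, kill them). Inside the ResourceManager, the component that serves YarnClient requests is ClientRMService. For calls such as submitApplication, ClientRMService hands the work over to RMAppManager.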

Submitting an application with YarnClient

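submitApplication() submits an application to YARN. It is a blocking call: it returns the ApplicationId only after the application has been submitted and accepted by the ResourceManager.

After sending the submission, it internally calls ApplicationClientProtocol#getApplicationReport() in a loop and waits until it can confirm that the application was submitted properly. If an RM failover or RM restart happens before the RM has saved the application's state, getApplicationReport() throws ApplicationNotFoundException, and submitApplication() automatically resubmits the application with the same ApplicationSubmissionContext.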

YarnClient interface

/**
   * <p>
   * Submit a new application to <code>YARN.</code> It is a blocking call - it
   * will not return {@link ApplicationId} until the submitted application is
   * submitted successfully and accepted by the ResourceManager.
   * </p>
   * 
   * <p>
   * Users should provide an {@link ApplicationId} as part of the parameter
   * {@link ApplicationSubmissionContext} when submitting a new application,
   * otherwise it will throw the {@link ApplicationIdNotProvidedException}.
   * </p>
   *
   * <p>This internally calls {@link ApplicationClientProtocol#submitApplication
   * (SubmitApplicationRequest)}, and after that, it internally invokes
   * {@link ApplicationClientProtocol#getApplicationReport
   * (GetApplicationReportRequest)} and waits till it can make sure that the
   * application gets properly submitted. If RM fails over or RM restart
   * happens before ResourceManager saves the application's state,
   * {@link ApplicationClientProtocol
   * #getApplicationReport(GetApplicationReportRequest)} will throw
   * the {@link ApplicationNotFoundException}. This API automatically resubmits
   * the application with the same {@link ApplicationSubmissionContext} when it
   * catches the {@link ApplicationNotFoundException}</p>
   *
   * @param appContext
   *          {@link ApplicationSubmissionContext} containing all the details
   *          needed to submit a new application
   * @return {@link ApplicationId} of the accepted application
   * @throws YarnException
   * @throws IOException
   * @see #createApplication()
   */
  public abstract ApplicationId submitApplication(
      ApplicationSubmissionContext appContext) throws YarnException,
      IOException;

YarnClientImpl

// The submit request is sent through ApplicationClientProtocol
protected ApplicationClientProtocol rmClient;

@Override
  public ApplicationId
      submitApplication(ApplicationSubmissionContext appContext)
          throws YarnException, IOException {
    ApplicationId applicationId = appContext.getApplicationId();
    if (applicationId == null) {
      throw new ApplicationIdNotProvidedException(
          "ApplicationId is not provided in ApplicationSubmissionContext");
    }
    SubmitApplicationRequest request =
        Records.newRecord(SubmitApplicationRequest.class);
    request.setApplicationSubmissionContext(appContext);

    // Automatically add the timeline DT into the CLC
    // Only when the security and the timeline service are both enabled
    if (isSecurityEnabled() && timelineServiceEnabled) {
      addTimelineDelegationToken(appContext.getAMContainerSpec());
    }

    //TODO: YARN-1763:Handle RM failovers during the submitApplication call.
    rmClient.submitApplication(request);

    int pollCount = 0;
    long startTime = System.currentTimeMillis();

    while (true) {
      try {
        YarnApplicationState state =
            getApplicationReport(applicationId).getYarnApplicationState();
        if (!state.equals(YarnApplicationState.NEW) &&
            !state.equals(YarnApplicationState.NEW_SAVING)) {
          LOG.info("Submitted application " + applicationId);
          break;
        }

        long elapsedMillis = System.currentTimeMillis() - startTime;
        if (enforceAsyncAPITimeout() &&
            elapsedMillis >= asyncApiPollTimeoutMillis) {
          throw new YarnException("Timed out while waiting for application " +
              applicationId + " to be submitted successfully");
        }

        // Notify the client through the log every 10 poll, in case the client
        // is blocked here too long.
        if (++pollCount % 10 == 0) {
          LOG.info("Application submission is not finished, " +
              "submitted application " + applicationId +
              " is still in " + state);
        }
        try {
          Thread.sleep(submitPollIntervalMillis);
        } catch (InterruptedException ie) {
          LOG.error("Interrupted while waiting for application "
              + applicationId
              + " to be successfully submitted.");
        }
      } catch (ApplicationNotFoundException ex) {
        // FailOver or RM restart happens before RMStateStore saves
        // ApplicationState
        LOG.info("Re-submit application " + applicationId + "with the " +
            "same ApplicationSubmissionContext");
        rmClient.submitApplication(request);
      }
    }

    return applicationId;
  }
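
To make the control flow easier to see in isolation, here is a minimal sketch of the same submit, poll, and resubmit loop using only the JDK. FakeRm, AppNotFoundException, and the State enum are hypothetical stand-ins invented for this illustration, not Hadoop classes; the sleep between polls and the timeout check of the real method are left out on purpose.

```java
import java.util.HashMap;
import java.util.Map;

public class SubmitPollSketch {
    enum State { NEW, NEW_SAVING, SUBMITTED }

    /** Hypothetical: thrown when the fake RM has no record of the application. */
    static class AppNotFoundException extends Exception {}

    /** Hypothetical in-memory stand-in for the ResourceManager. */
    static class FakeRm {
        private final Map<String, Integer> polls = new HashMap<>();
        int submitCount = 0;
        private boolean loseStateOnce;

        FakeRm(boolean loseStateOnce) { this.loseStateOnce = loseStateOnce; }

        void submit(String appId) {
            submitCount++;
            polls.putIfAbsent(appId, 0);
        }

        State getState(String appId) throws AppNotFoundException {
            if (loseStateOnce) {           // simulate a restart before state was saved
                loseStateOnce = false;
                polls.clear();
            }
            Integer n = polls.get(appId);
            if (n == null) {
                throw new AppNotFoundException();
            }
            polls.put(appId, n + 1);
            // The first two polls report NEW and NEW_SAVING, later polls SUBMITTED.
            return n == 0 ? State.NEW : n == 1 ? State.NEW_SAVING : State.SUBMITTED;
        }
    }

    /** Submit, then poll until the app has left NEW / NEW_SAVING; resubmit if the RM forgot it. */
    static String submitAndWait(FakeRm rm, String appId) {
        rm.submit(appId);
        while (true) {
            try {
                State s = rm.getState(appId);
                if (s != State.NEW && s != State.NEW_SAVING) {
                    return appId;
                }
            } catch (AppNotFoundException e) {
                rm.submit(appId);          // same request, sent again
            }
        }
    }

    public static void main(String[] args) {
        FakeRm rm = new FakeRm(true);
        System.out.println(submitAndWait(rm, "app_1") + " submissions=" + rm.submitCount);
    }
}
```

With loseStateOnce set to true, the first poll fails, the application is submitted a second time, and the loop then proceeds as in the normal case.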

How ClientRMService handles submitApplication

ClientRMService implements ApplicationClientProtocol and handles the RPC requests that come from YarnClient.

For submitApplication, it calls RMAppManager to submit the application.

@Override
  public SubmitApplicationResponse submitApplication(
      SubmitApplicationRequest request) throws YarnException, IOException {
    ApplicationSubmissionContext submissionContext = request
        .getApplicationSubmissionContext();
    ApplicationId applicationId = submissionContext.getApplicationId();
   ...................
    // Check whether app has already been put into rmContext,
    // If it is, simply return the response
    if (rmContext.getRMApps().get(applicationId) != null) {
      LOG.info("This is an earlier submitted application: " + applicationId);
      return SubmitApplicationResponse.newInstance();
    }
     ....................

    try {
      // call RMAppManager to submit application directly
      rmAppManager.submitApplication(submissionContext,
          System.currentTimeMillis(), user);

      LOG.info("Application with id " + applicationId.getId() + 
          " submitted by user " + user);
      RMAuditLogger.logSuccess(user, AuditConstants.SUBMIT_APP_REQUEST,
          "ClientRMService", applicationId, callerContext);
    } catch (YarnException e) {
      LOG.info("Exception in submitting " + applicationId, e);
      RMAuditLogger.logFailure(user, AuditConstants.SUBMIT_APP_REQUEST,
          e.getMessage(), "ClientRMService",
          "Exception in submitting application", applicationId, callerContext);
      throw e;
    }

    return recordFactory
        .newRecordInstance(SubmitApplicationResponse.class);
  }

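How RMAppManager handles submitApplication

RMAppManager creates an RMAppImpl from the ApplicationSubmissionContext, which carries the application's details, and stores it in RMContext.

It then sends the application's START event to the EventHandler of the Dispatcher held by RMContext. When security is enabled, the credentials are first handed to the DelegationTokenRenewer asynchronously, as the code below shows.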

protected void submitApplication(
      ApplicationSubmissionContext submissionContext, long submitTime,
      String user) throws YarnException {
    ApplicationId applicationId = submissionContext.getApplicationId();

    // Passing start time as -1. It will be eventually set in RMAppImpl
    // constructor.
    // Create the RMAppImpl and store it in RMContext
    RMAppImpl application = createAndPopulateNewRMApp(
        submissionContext, submitTime, user, false, -1);
    try {
      if (UserGroupInformation.isSecurityEnabled()) {
        this.rmContext.getDelegationTokenRenewer()
            .addApplicationAsync(applicationId,
                BuilderUtils.parseCredentials(submissionContext),
                submissionContext.getCancelTokensWhenComplete(),
                application.getUser(),
                BuilderUtils.parseTokensConf(submissionContext));
      } else {
        // Dispatcher is not yet started at this time, so these START events
        // enqueued should be guaranteed to be first processed when dispatcher
        // gets started.
        this.rmContext.getDispatcher().getEventHandler()
            .handle(new RMAppEvent(applicationId, RMAppEventType.START));
      }
    } catch (Exception e) {
      LOG.warn("Unable to parse credentials for " + applicationId, e);
      // Sending APP_REJECTED is fine, since we assume that the
      // RMApp is in NEW state and thus we haven't yet informed the
      // scheduler about the existence of the application
      this.rmContext.getDispatcher().getEventHandler()
          .handle(new RMAppEvent(applicationId,
              RMAppEventType.APP_REJECTED, e.getMessage()));
      throw RPCUtil.getRemoteException(e);
    }
  }

Creating the RMApp (RMAppImpl)

// The RMApp for the submitted applicationId is stored in RMContext
private final RMContext rmContext;

private RMAppImpl createAndPopulateNewRMApp(
      ApplicationSubmissionContext submissionContext, long submitTime,
      String user, boolean isRecovery, long startTime) throws YarnException {

    .......................
    // Create RMApp
    RMAppImpl application =
        new RMAppImpl(applicationId, rmContext, this.conf,
            submissionContext.getApplicationName(), user,
            submissionContext.getQueue(),
            submissionContext, this.scheduler, this.masterService,
            submitTime, submissionContext.getApplicationType(),
            submissionContext.getApplicationTags(), amReqs, placementContext,
            startTime);
    // Concurrent app submissions with same applicationId will fail here
    // Concurrent app submissions with different applicationIds will not
    // influence each other
    // Store the RMAppImpl in RMContext
    if (rmContext.getRMApps().putIfAbsent(applicationId, application) !=
        null) {
      String message = "Application with id " + applicationId
          + " is already present! Cannot add a duplicate!";
      LOG.warn(message);
      throw new YarnException(message);
    }

    if (YarnConfiguration.timelineServiceV2Enabled(conf)) {
      // Start timeline collector for the submitted app
      application.startTimelineCollector();
    }
    // Inform the ACLs Manager
    this.applicationACLsManager.addApplication(applicationId,
        submissionContext.getAMContainerSpec().getApplicationACLs());
    return application;
  }
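
The duplicate check above relies on the atomic putIfAbsent of a concurrent map. The following is a small sketch of that idea with plain strings in place of ApplicationId and RMApp; the class and method names are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class DuplicateSubmitSketch {
    // Stand-in for rmContext.getRMApps(): application id -> application object.
    private final ConcurrentMap<String, String> apps = new ConcurrentHashMap<>();

    /**
     * Registers the application. putIfAbsent returns null only for the caller
     * that actually inserted the entry, so exactly one of several concurrent
     * submissions with the same id can succeed.
     */
    boolean register(String appId, String app) {
        return apps.putIfAbsent(appId, app) == null;
    }

    String get(String appId) {
        return apps.get(appId);
    }

    public static void main(String[] args) {
        DuplicateSubmitSketch s = new DuplicateSubmitSketch();
        System.out.println(s.register("app_1", "first"));   // inserted
        System.out.println(s.register("app_1", "second"));  // rejected as duplicate
        System.out.println(s.get("app_1"));
    }
}
```

A plain get followed by put would leave a window in which two threads both see no entry; putIfAbsent closes that window, which is presumably why the original code uses it.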

Source analysis of rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.START))

The Dispatcher associated with RMContext

RMContext uses an AsyncDispatcher.

  protected RMContextImpl rmContext;
  private Dispatcher rmDispatcher;

@Override
  protected void serviceInit(Configuration conf) throws Exception {
     .............
    // register the handlers for all AlwaysOn services using setupDispatcher().
    rmDispatcher = setupDispatcher();
    addIfService(rmDispatcher);
    rmContext.setDispatcher(rmDispatcher);
    .............
  }

  /**
   * Register the handlers for alwaysOn services
   */
  private Dispatcher setupDispatcher() {
    Dispatcher dispatcher = createDispatcher();
    dispatcher.register(RMFatalEventType.class,
        new ResourceManager.RMFatalEventDispatcher());
    return dispatcher;
  }

protected Dispatcher createDispatcher() {
    return new AsyncDispatcher("RM Event dispatcher");
  }
  

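How AsyncDispatcher registers an event type and its handler

Each event type and its handler are stored in the map eventDispatchers. If a second handler is registered for the same type, the two are wrapped in a MultiListenerHandler.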

protected final Map<Class<? extends Enum>, EventHandler> eventDispatchers;

public void register(Class<? extends Enum> eventType, EventHandler handler) {
        EventHandler<Event> registeredHandler = (EventHandler)this.eventDispatchers.get(eventType);
        LOG.info("Registering " + eventType + " for " + handler.getClass());
        if (registeredHandler == null) {
            this.eventDispatchers.put(eventType, handler);
        } else {
            AsyncDispatcher.MultiListenerHandler multiHandler;
            if (!(registeredHandler instanceof AsyncDispatcher.MultiListenerHandler)) {
                multiHandler = new AsyncDispatcher.MultiListenerHandler();
                multiHandler.addHandler(registeredHandler);
                multiHandler.addHandler(handler);
                this.eventDispatchers.put(eventType, multiHandler);
            } else {
                multiHandler = (AsyncDispatcher.MultiListenerHandler)registeredHandler;
                multiHandler.addHandler(handler);
            }
        }

    }
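
The registration logic can be sketched without Hadoop as follows. This is a simplified illustration, not the AsyncDispatcher API: RegisterSketch, MultiHandler, and AppEventType are hypothetical names, and java.util.function.Consumer stands in for EventHandler.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class RegisterSketch {
    enum AppEventType { START, KILL }

    /** Fans one event out to every handler registered for the same type. */
    static class MultiHandler implements Consumer<Enum<?>> {
        final List<Consumer<Enum<?>>> handlers = new ArrayList<>();

        @Override
        public void accept(Enum<?> event) {
            for (Consumer<Enum<?>> h : handlers) {
                h.accept(event);
            }
        }
    }

    // Event type (the enum class) -> handler for that type.
    private final Map<Class<? extends Enum<?>>, Consumer<Enum<?>>> table = new HashMap<>();

    void register(Class<? extends Enum<?>> type, Consumer<Enum<?>> handler) {
        Consumer<Enum<?>> existing = table.get(type);
        if (existing == null) {
            table.put(type, handler);                      // first handler for this type
        } else if (existing instanceof MultiHandler) {
            ((MultiHandler) existing).handlers.add(handler);
        } else {
            MultiHandler multi = new MultiHandler();       // upgrade to a fan-out handler
            multi.handlers.add(existing);
            multi.handlers.add(handler);
            table.put(type, multi);
        }
    }

    /** Looks up the handler by the enum class that declares the event constant. */
    void dispatch(Enum<?> event) {
        Consumer<Enum<?>> h = table.get(event.getDeclaringClass());
        if (h == null) {
            throw new IllegalStateException("No handler registered for " + event.getDeclaringClass());
        }
        h.accept(event);
    }
}
```

Registering two handlers for AppEventType.class and dispatching AppEventType.START invokes both, in registration order.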

RMActiveServices, an inner class of ResourceManager, registers event handlers with the AsyncDispatcher

 @Override
    protected void serviceInit(Configuration configuration) throws Exception {

     rmDispatcher.register(SchedulerEventType.class, schedulerDispatcher);

      // Register event handler for RmAppEvents
      rmDispatcher.register(RMAppEventType.class,
          new ApplicationEventDispatcher(rmContext));

      // Register event handler for RmAppAttemptEvents
      rmDispatcher.register(RMAppAttemptEventType.class,
          new ApplicationAttemptEventDispatcher(rmContext));

      // Register event handler for RmNodes
      rmDispatcher.register(
          RMNodeEventType.class, new NodeEventDispatcher(rmContext));

       rmDispatcher.register(RMAppManagerEventType.class, rmAppManager);

       rmDispatcher.register(AMLauncherEventType.class,
          applicationMasterLauncher);

}

How AsyncDispatcher handles events of type RMAppEventType

AsyncDispatcher's EventHandler: GenericEventHandler

GenericEventHandler.handle() does nothing more than put the event into eventQueue.

private EventHandler handlerInstance;
private final BlockingQueue<Event> eventQueue;

public EventHandler getEventHandler() {
        if (this.handlerInstance == null) {
            this.handlerInstance = new AsyncDispatcher.GenericEventHandler();
        }

        return this.handlerInstance;
    }

  class GenericEventHandler implements EventHandler<Event> {
        GenericEventHandler() {
        }

        public void handle(Event event) {
            if (!AsyncDispatcher.this.blockNewEvents) {
                AsyncDispatcher.this.drained = false;
                int qSize = AsyncDispatcher.this.eventQueue.size();
                if (qSize != 0 && qSize % 1000 == 0) {
                    AsyncDispatcher.LOG.info("Size of event-queue is " + qSize);
                }

                int remCapacity = AsyncDispatcher.this.eventQueue.remainingCapacity();
                if (remCapacity < 1000) {
                    AsyncDispatcher.LOG.warn("Very low remaining capacity in the event-queue: " + remCapacity);
                }

                try {
                    AsyncDispatcher.this.eventQueue.put(event);
                } catch (InterruptedException var5) {
                    if (!AsyncDispatcher.this.stopped) {
                        AsyncDispatcher.LOG.warn("AsyncDispatcher thread interrupted", var5);
                    }

                    throw new YarnRuntimeException(var5);
                }
            }
        }
    }

AsyncDispatcher's background thread takes events from eventQueue and processes them

Creating and starting the background thread

private Thread eventHandlingThread;

 @Override
  protected void serviceStart() throws Exception {
    //start all the components
    super.serviceStart();
    eventHandlingThread = new Thread(createThread());
    eventHandlingThread.setName("AsyncDispatcher event handler");
    eventHandlingThread.start();
  }

The background thread takes each event from eventQueue and dispatches it to the matching EventHandler

Runnable createThread() {
    return new Runnable() {
      @Override
      public void run() {
        while (!stopped && !Thread.currentThread().isInterrupted()) {
          drained = eventQueue.isEmpty();
          // blockNewEvents is only set when dispatcher is draining to stop,
          // adding this check is to avoid the overhead of acquiring the lock
          // and calling notify every time in the normal run of the loop.
          if (blockNewEvents) {
            synchronized (waitForDrained) {
              if (drained) {
                waitForDrained.notify();
              }
            }
          }
          Event event;
          try {
            // Take the next event from eventQueue (blocks while the queue is empty)
            event = eventQueue.take();
          } catch(InterruptedException ie) {
            if (!stopped) {
              LOG.warn("AsyncDispatcher thread interrupted", ie);
            }
            return;
          }
          if (event != null) {
            // Dispatch the event
            dispatch(event);
          }
        }
      }
    };
  }

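The eventDispatchers map in AsyncDispatcher records the EventHandler for each event type, keyed by the enum class that declares the type. Dispatching an event therefore means looking up its handler and calling handle() on it.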

  @SuppressWarnings("unchecked")
  protected void dispatch(Event event) {
    //all events go thru this loop
    if (LOG.isDebugEnabled()) {
      LOG.debug("Dispatching the event " + event.getClass().getName() + "."
          + event.toString());
    }

    Class<? extends Enum> type = event.getType().getDeclaringClass();

    try{
      // Look up the EventHandler registered for this event type in eventDispatchers
      EventHandler handler = eventDispatchers.get(type);
      if(handler != null) {
        // The type-specific EventHandler processes the event
        handler.handle(event);
      } else {
        throw new Exception("No handler for registered for " + type);
      }
    } catch (Throwable t) {
      //TODO Maybe log the state of the queue
      LOG.fatal("Error in dispatcher thread", t);
      // If serviceStop is called, we should exit this thread gracefully.
      if (exitOnDispatchException
          && (ShutdownHookManager.get().isShutdownInProgress()) == false
          && stopped == false) {
        Thread shutDownThread = new Thread(createShutDownThread());
        shutDownThread.setName("AsyncDispatcher ShutDown handler");
        shutDownThread.start();
      }
    }
  }
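
Putting the producer side (handle() only enqueues) and the consumer side (one thread takes and dispatches) together, the pattern looks roughly like the sketch below. MiniAsyncDispatcher is a hypothetical, heavily reduced illustration with a single handler and string events; draining, the blockNewEvents flag, and the shutdown-on-error path of the real class are omitted.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

public class MiniAsyncDispatcher {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Consumer<String> handler;
    private final Thread worker;
    private volatile boolean stopped = false;

    MiniAsyncDispatcher(Consumer<String> handler) {
        this.handler = handler;
        this.worker = new Thread(this::loop, "mini-dispatcher");
    }

    void start() { worker.start(); }

    /** Producer side: only enqueues; the handler never runs on the caller's thread. */
    void handle(String event) {
        try {
            queue.put(event);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Consumer side: take one event at a time and pass it to the handler. */
    private void loop() {
        while (!stopped && !Thread.currentThread().isInterrupted()) {
            String event;
            try {
                event = queue.take();
            } catch (InterruptedException e) {
                return;                      // interrupted by stop()
            }
            handler.accept(event);
        }
    }

    void stop() {
        stopped = true;
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Sends one event and reports which thread handled it. */
    static String demo(String event) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> seen = new AtomicReference<>();
        MiniAsyncDispatcher d = new MiniAsyncDispatcher(e -> {
            seen.set(Thread.currentThread().getName() + ":" + e);
            done.countDown();
        });
        d.start();
        d.handle(event);
        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("event was not handled in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            d.stop();
        }
        return seen.get();
    }
}
```

Calling MiniAsyncDispatcher.demo("START") shows that the event is handled on the "mini-dispatcher" thread rather than on the thread that called handle(), which is the point of the asynchronous design: an RPC handler thread can return as soon as the event is queued.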

How ApplicationEventDispatcher handles RMAppEvent

The EventHandler registered for RMAppEventType is ApplicationEventDispatcher.

ApplicationEventDispatcher looks up the RMAppImpl that was created and stored in RMContext earlier, and delegates the RMAppEvent to it.

@Private
  public static final class ApplicationEventDispatcher implements
      EventHandler<RMAppEvent> {

    private final RMContext rmContext;

    public ApplicationEventDispatcher(RMContext rmContext) {
      this.rmContext = rmContext;
    }

    @Override
    public void handle(RMAppEvent event) {
      ApplicationId appID = event.getApplicationId();
      RMApp rmApp = this.rmContext.getRMApps().get(appID);
      if (rmApp != null) {
        try {
          rmApp.handle(event);
        } catch (Throwable t) {
          LOG.error("Error in handling event type " + event.getType()
              + " for application " + appID, t);
        }
      }
    }
  }

How RMAppImpl handles RMAppEvent

@Override
  public void handle(RMAppEvent event) {

    this.writeLock.lock();

    try {
      ApplicationId appID = event.getApplicationId();
      LOG.debug("Processing event for " + appID + " of type "
          + event.getType());
      final RMAppState oldState = getState();
      try {
        /* keep the master in sync with the state machine */
        this.stateMachine.doTransition(event.getType(), event);
      } catch (InvalidStateTransitionException e) {
        LOG.error("App: " + appID
            + " can't handle this event at current state", e);
        onInvalidStateTransition(event.getType(), oldState);
      }

      // Log at INFO if we're not recovering or not in a terminal state.
      // Log at DEBUG otherwise.
      if ((oldState != getState()) &&
          (((recoveredFinalState == null)) ||
            (event.getType() != RMAppEventType.RECOVER))) {
        LOG.info(String.format(STATE_CHANGE_MESSAGE, appID, oldState,
            getState(), event.getType()));
      } else if ((oldState != getState()) && LOG.isDebugEnabled()) {
        LOG.debug(String.format(STATE_CHANGE_MESSAGE, appID, oldState,
            getState(), event.getType()));
      }
    } finally {
      this.writeLock.unlock();
    }
  }

The stateMachine of RMAppImpl

Every RMAppImpl has its own stateMachine, which is initialized when the RMAppImpl is instantiated.

The factory creates an InternalStateMachine, which is an inner class of StateMachineFactory.

//RMAppImpl.java
private final StateMachine<RMAppState, RMAppEventType, RMAppEvent> stateMachine;

public RMAppImpl(ApplicationId applicationId, RMContext rmContext,
      Configuration config, String name, String user, String queue,
      ApplicationSubmissionContext submissionContext, YarnScheduler scheduler,
      ApplicationMasterService masterService, long submitTime,
      String applicationType, Set<String> applicationTags,
      List<ResourceRequest> amReqs, ApplicationPlacementContext
      placementContext, long startTime) {
     ...................
      this.stateMachine = stateMachineFactory.make(this);
     ...................

}

//StateMachineFactory.java
public StateMachine<STATE, EVENTTYPE, EVENT> make(OPERAND operand) {
        return new StateMachineFactory.InternalStateMachine(operand, this.defaultInitialState);
    }

State transitions

private class InternalStateMachine implements StateMachine<STATE, EVENTTYPE, EVENT> {
        private final OPERAND operand;
        private STATE currentState;

        InternalStateMachine(OPERAND operand, STATE initialState) {
            this.operand = operand;
            this.currentState = initialState;
            if (!StateMachineFactory.this.optimized) {
                StateMachineFactory.this.maybeMakeStateMachineTable();
            }

        }

        public synchronized STATE getCurrentState() {
            return this.currentState;
        }

        public synchronized STATE doTransition(EVENTTYPE eventType, EVENT event) throws InvalidStateTransitonException {
            this.currentState = StateMachineFactory.this.doTransition(this.operand, this.currentState, eventType, event);
            return this.currentState;
        }
    }

StateMachineFactory#doTransition() looks up the Transition registered for the current state and event type, and uses it to handle the RMAppEvent

/**
   * Effect a transition due to the effecting stimulus.
   * @param state current state
   * @param eventType trigger to initiate the transition
   * @param cause causal eventType context
   * @return transitioned state
   */
  private STATE doTransition
           (OPERAND operand, STATE oldState, EVENTTYPE eventType, EVENT event)
      throws InvalidStateTransitionException {
    // We can assume that stateMachineTable is non-null because we call
    //  maybeMakeStateMachineTable() when we build an InnerStateMachine ,
    //  and this code only gets called from inside a working InnerStateMachine .
    Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap
      = stateMachineTable.get(oldState);
    if (transitionMap != null) {
      Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition
          = transitionMap.get(eventType);
      if (transition != null) {
        return transition.doTransition(operand, oldState, event, eventType);
      }
    }
    throw new InvalidStateTransitionException(oldState, eventType);
  }

RMAppImpl registers the Transition that handles RMAppEventType.START

 .addTransition(RMAppState.NEW, RMAppState.NEW_SAVING,
        RMAppEventType.START, new RMAppNewlySavingTransition())
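
The table lookup behind addTransition and doTransition can be illustrated with a small sketch. MiniStateMachine is a hypothetical class written for this article's purposes, not the Hadoop StateMachineFactory; it stores only the target state per (state, event type) pair and leaves out the transition hook objects such as RMAppNewlySavingTransition. The APP_NEW_SAVED event type and SUBMITTED state are included as an assumed follow-up step for illustration only; they do not appear in the excerpt above.

```java
import java.util.EnumMap;
import java.util.Map;

public class MiniStateMachine {
    enum State { NEW, NEW_SAVING, SUBMITTED }
    enum EventType { START, APP_NEW_SAVED }

    // current state -> (event type -> next state)
    private final Map<State, Map<EventType, State>> table = new EnumMap<>(State.class);
    private State current = State.NEW;

    /** Registers one arc of the state graph and returns this for chaining. */
    MiniStateMachine addTransition(State pre, State post, EventType type) {
        table.computeIfAbsent(pre, s -> new EnumMap<EventType, State>(EventType.class))
             .put(type, post);
        return this;
    }

    synchronized State getCurrentState() {
        return current;
    }

    /** Two-level lookup: by current state, then by event type. */
    synchronized State doTransition(EventType type) {
        Map<EventType, State> byEvent = table.get(current);
        State next = (byEvent == null) ? null : byEvent.get(type);
        if (next == null) {
            throw new IllegalStateException("Invalid event " + type + " at " + current);
        }
        current = next;
        return current;
    }

    public static void main(String[] args) {
        MiniStateMachine m = new MiniStateMachine()
            .addTransition(State.NEW, State.NEW_SAVING, EventType.START)
            .addTransition(State.NEW_SAVING, State.SUBMITTED, EventType.APP_NEW_SAVED);
        System.out.println(m.doTransition(EventType.START));
    }
}
```

A second START in the NEW_SAVING state finds no entry in the table and is rejected, which mirrors the InvalidStateTransitionException path in the factory code.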

 
