PAI FrameworkLauncher (3): The Lifecycle of an AM

weixin_30521649

Original link: http://www.cnblogs.com/chenqy930/p/9474984.html


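Because an AM is designed to be responsible for exactly one framework, the lifecycle of an AM is effectively the lifecycle of a framework. The flow summarized below follows a framework from start to finish. It does not cover task failures and assumes a perfectly healthy framework.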

I. Overview of the AM subServices involved

The main ones are requestManager and statusManager.

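1. requestManager
requestManager runs a thread that, for the AM's entire lifecycle, calls pullRequest() against zk every 30s. What it pulls is always the information for the same FrameworkLauncher, which consists of two parts: LauncherRequest and AggregatedFrameworkRequest.

Here AggregatedFrameworkRequest = FrameworkRequest + all other feedback requests.

LauncherRequest contains ClusterConfiguration and other information.

After pulling, it checks a few conditions and updates the run-related state held in requestManager, such as FrameworkDescriptor, TaskRoles, and platParams. The purpose is presumably to keep the AM in sync with the content of zkStore.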

  private void pullRequest() throws Exception {
    // Pull LauncherRequest
    LOGGER.logDebug("Pulling LauncherRequest");
    LauncherRequest newLauncherRequest = zkStore.getLauncherRequest();
    LOGGER.logDebug("Pulled LauncherRequest");

    // newLauncherRequest is always not null
    updateLauncherRequest(newLauncherRequest);

    // Pull AggregatedFrameworkRequest
    AggregatedFrameworkRequest aggFrameworkRequest;
    try {
      LOGGER.logDebug("Pulling AggregatedFrameworkRequest");
      aggFrameworkRequest = zkStore.getAggregatedFrameworkRequest(conf.getFrameworkName());
      LOGGER.logDebug("Pulled AggregatedFrameworkRequest");
    } catch (NoNodeException e) {
      existsLocalVersionFrameworkRequest = 0;
      throw new NonTransientException(
          "Failed to getAggregatedFrameworkRequest, FrameworkRequest is already deleted on ZK", e);
    }

    // newFrameworkDescriptor is always not null
    FrameworkDescriptor newFrameworkDescriptor = aggFrameworkRequest.getFrameworkRequest().getFrameworkDescriptor();
    checkFrameworkVersion(newFrameworkDescriptor);
    flattenFrameworkDescriptor(newFrameworkDescriptor);
    updateFrameworkDescriptor(newFrameworkDescriptor);
    updateOverrideApplicationProgressRequest(aggFrameworkRequest.getOverrideApplicationProgressRequest());
    updateMigrateTaskRequests(aggFrameworkRequest.getMigrateTaskRequests());
  }
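
The "pull every 30s" loop described above can be sketched with a scheduled executor. This is only an illustration under assumptions: `PeriodicPullSketch`, `start`, and `runDemo` are hypothetical names, and the real requestManager may schedule and handle errors differently (for example, the excerpt above throws NonTransientException when the FrameworkRequest has been deleted, whereas this sketch just logs and continues).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for requestManager's pull thread (not the real class).
class PeriodicPullSketch {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();
  private final AtomicInteger pullCount = new AtomicInteger();

  // Runs pullOnce immediately, then again intervalMs after each run finishes.
  void start(Runnable pullOnce, long intervalMs) {
    scheduler.scheduleWithFixedDelay(() -> {
      try {
        pullOnce.run();
        pullCount.incrementAndGet();
      } catch (RuntimeException e) {
        // An uncaught exception would cancel all further runs, so log and continue.
        System.err.println("pull failed: " + e.getMessage());
      }
    }, 0, intervalMs, TimeUnit.MILLISECONDS);
  }

  int getPullCount() {
    return pullCount.get();
  }

  void stop() {
    scheduler.shutdownNow();
  }

  // Demo helper: waits until `times` pulls have run (10 ms apart), then stops.
  static int runDemo(int times) {
    PeriodicPullSketch sketch = new PeriodicPullSketch();
    CountDownLatch latch = new CountDownLatch(times);
    sketch.start(latch::countDown, 10);
    try {
      latch.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    sketch.stop();
    return times - (int) latch.getCount();
  }

  public static void main(String[] args) {
    // The article's interval would be 30_000 ms; the demo uses 10 ms to finish quickly.
    System.out.println("pulls completed: " + runDemo(3));
  }
}
```

A fixed delay (rather than a fixed rate) is used here so that a slow pull never overlaps with the next one; whether FrameworkLauncher makes the same choice is not stated in the article.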

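2. statusManager

statusManager also runs a thread that, for the AM's entire lifecycle, pushes status to zk every 30s.

statusManager tracks how the framework is running. A framework consists of a set of taskRoles, and each taskRole contains one or more tasks. The tasks within a taskRole are identical, so they can be told apart by an index such as 1, 2, 3.

The object that identifies a task: TaskStatusLocator = TaskRoleName + TaskIndex

The object that tracks a task: TaskStatus

statusManager uses the following Maps to manage taskRoles and tasks: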

// Manage the CURD to ZK Status
public class StatusManager extends AbstractService {  // THREAD SAFE
  private static final DefaultLogger LOGGER = new DefaultLogger(StatusManager.class);

  private final ApplicationMaster am;
  private final Configuration conf;
  private final ZookeeperStore zkStore;

  /**
   * REGION BaseStatus
   */
  // AM only need to maintain TaskRoleStatus and TaskStatuses, and it is the only maintainer.
  // TaskRoleName -> TaskRoleStatus
  private Map<String, TaskRoleStatus> taskRoleStatuses = new HashMap<>();
  // TaskRoleName -> TaskStatuses
  private Map<String, TaskStatuses> taskStatuseses = new HashMap<>();

  /**
   * REGION ExtensionStatus
   * ExtensionStatus should be always CONSISTENT with BaseStatus
   */
  // Used to invert index TaskStatus by ContainerId/TaskState instead of TaskStatusLocator, i.e. TaskRoleName + TaskIndex
  // TaskState -> TaskStatusLocators
  private Map<TaskState, HashSet<TaskStatusLocator>> taskStateLocators = new HashMap<>();
  // Live Associated ContainerId -> TaskStatusLocator
  private Map<String, TaskStatusLocator> liveAssociatedContainerIdLocators = new HashMap<>();
  // Live Associated HostNames
  // TODO: Using MachineName instead of HostName to avoid unstable HostName Resolution
  private HashSet<String> liveAssociatedHostNames = new HashSet<>();

  /**
   * REGION StateVariable
   */
  // Whether Mem Status is changed since previous zkStore update
  // TaskRoleName -> TaskRoleStatusChanged
  private Map<String, Boolean> taskRoleStatusesChanged = new HashMap<>();
  // TaskRoleName -> TaskStatusesChanged
  private Map<String, Boolean> taskStatusesesChanged = new HashMap<>();

  // No need to persistent ContainerRequest since it is only valid within one application attempt.
  // Used to generate an unique Priority for each ContainerRequest in current application attempt.
  // This helps to match ContainerRequest and allocated Container.
  // Besides, it can also avoid the issue YARN-314.
  private Priority nextContainerRequestPriority = Priority.newInstance(0);
  // Used to track current ContainerRequest for Tasks in CONTAINER_REQUESTED state
  // TaskStatusLocator -> ContainerRequest
  private Map<TaskStatusLocator, ContainerRequest> taskContainerRequests = new HashMap<>();
  // Used to invert index TaskStatusLocator by ContainerRequest.Priority
  // Priority -> TaskStatusLocator
  private Map<Priority, TaskStatusLocator> priorityLocators = new HashMap<>();
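
The fields above pair a base status map with inverted indexes that must stay consistent with it. A minimal sketch of that idea, assuming a simplified locator and only the three task states named in this article; `LocatorSketch`, `StateSketch`, and `StatusIndexSketch` are hypothetical names, not FrameworkLauncher classes:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical, simplified locator: TaskRoleName + TaskIndex, usable as a map key.
final class LocatorSketch {
  final String taskRoleName;
  final int taskIndex;

  LocatorSketch(String taskRoleName, int taskIndex) {
    this.taskRoleName = taskRoleName;
    this.taskIndex = taskIndex;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof LocatorSketch)) {
      return false;
    }
    LocatorSketch other = (LocatorSketch) o;
    return taskIndex == other.taskIndex && taskRoleName.equals(other.taskRoleName);
  }

  @Override
  public int hashCode() {
    return Objects.hash(taskRoleName, taskIndex);
  }
}

// Only the states named in the article; the real TaskState enum likely has more.
enum StateSketch { CONTAINER_REQUESTED, CONTAINER_COMPLETED, TASK_COMPLETED }

class StatusIndexSketch {
  // Base status: locator -> current state.
  private final Map<LocatorSketch, StateSketch> taskStates = new HashMap<>();
  // Extension status: inverted index, state -> locators currently in that state.
  private final Map<StateSketch, Set<LocatorSketch>> stateLocators = new HashMap<>();

  synchronized void addTask(LocatorSketch locator, StateSketch initial) {
    if (taskStates.containsKey(locator)) {
      throw new IllegalArgumentException("Task already tracked");
    }
    taskStates.put(locator, initial);
    stateLocators.computeIfAbsent(initial, s -> new HashSet<>()).add(locator);
  }

  // Updates the base map and the inverted index together so they never diverge.
  synchronized void transition(LocatorSketch locator, StateSketch dst) {
    StateSketch src = taskStates.get(locator);
    if (src == null) {
      throw new IllegalArgumentException("Unknown task");
    }
    stateLocators.get(src).remove(locator);
    stateLocators.computeIfAbsent(dst, s -> new HashSet<>()).add(locator);
    taskStates.put(locator, dst);
  }

  synchronized int countInState(StateSketch state) {
    Set<LocatorSketch> locators = stateLocators.get(state);
    return locators == null ? 0 : locators.size();
  }

  synchronized boolean isAllInState(StateSketch state) {
    return countInState(state) == taskStates.size();
  }

  public static void main(String[] args) {
    StatusIndexSketch status = new StatusIndexSketch();
    status.addTask(new LocatorSketch("worker", 0), StateSketch.CONTAINER_REQUESTED);
    status.transition(new LocatorSketch("worker", 0), StateSketch.TASK_COMPLETED);
    System.out.println(status.isAllInState(StateSketch.TASK_COMPLETED));
  }
}
```

Because the locator defines equals() and hashCode() over both fields, a freshly constructed locator with the same role name and index finds the same entry, which is what lets callers rebuild a locator from a TaskStatus instead of holding on to the original object.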

II. Starting the framework

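The pullRequest() in RequestManager described above fetches frameworkRequest and launcherRequest from zk. Which framework to pull is determined from environment variables that the AM reads through its conf: when the client submitted the AM and set up the runtime environment of the AM's container, it wrote frameworkName, frameworkVersion, zkConnectString, and so on into that container's environment variables.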
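
A minimal sketch of reading that identity from the environment. The variable names (`FRAMEWORK_NAME`, `FRAMEWORK_VERSION`, `ZK_CONNECT_STRING`), the integer version type, and the class `AmEnvSketch` are assumptions made for illustration; the article does not show the actual names used by the AM's conf.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the values the client writes into the AM container's environment.
class AmEnvSketch {
  // Assumed variable names, for illustration only.
  static final String ENV_FRAMEWORK_NAME = "FRAMEWORK_NAME";
  static final String ENV_FRAMEWORK_VERSION = "FRAMEWORK_VERSION";
  static final String ENV_ZK_CONNECT_STRING = "ZK_CONNECT_STRING";

  final String frameworkName;
  final int frameworkVersion;
  final String zkConnectString;

  // Takes the environment as a map so it can be exercised without a real container.
  AmEnvSketch(Map<String, String> env) {
    frameworkName = require(env, ENV_FRAMEWORK_NAME);
    frameworkVersion = Integer.parseInt(require(env, ENV_FRAMEWORK_VERSION));
    zkConnectString = require(env, ENV_ZK_CONNECT_STRING);
  }

  // Fails fast: without these values the AM cannot know which framework to pull.
  private static String require(Map<String, String> env, String key) {
    String value = env.get(key);
    if (value == null || value.isEmpty()) {
      throw new IllegalStateException("Missing environment variable: " + key);
    }
    return value;
  }

  // In a real process the map would come from System.getenv().
  static AmEnvSketch fromProcess() {
    return new AmEnvSketch(System.getenv());
  }

  public static void main(String[] args) {
    Map<String, String> env = new HashMap<>();
    env.put(ENV_FRAMEWORK_NAME, "demo-framework");
    env.put(ENV_FRAMEWORK_VERSION, "1");
    env.put(ENV_ZK_CONNECT_STRING, "127.0.0.1:2181");
    AmEnvSketch conf = new AmEnvSketch(env);
    System.out.println(conf.frameworkName + " v" + conf.frameworkVersion);
  }
}
```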

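Once frameworkRequest and launcherRequest have been fetched from zk, updateFrameworkDescriptor() is called, which in turn calls onTaskNumbersUpdated(). That method submits two jobs to the AM's thread pool: updating statusManager, and addContainerRequest. addContainerRequest() has the AMRMClient ask the RM for containers to run tasks. Once the RM has allocated containers, the subsequent onContainerAllocated operation is triggered, which is what (2) describes.

During this process, the AM also checks whether the FrameworkRequest has been updated; if it has, the new work should be executed. That path is not covered here.

Below is the source of updateFrameworkDescriptor.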

private void updateFrameworkDescriptor(FrameworkDescriptor newFrameworkDescriptor) throws Exception {
    if (YamlUtils.deepEquals(frameworkDescriptor, newFrameworkDescriptor)) {
      return;
    }

    LOGGER.logSplittedLines(Level.INFO,
        "Detected FrameworkDescriptor changes. Updating to new FrameworkDescriptor:\n%s",
        WebCommon.toJson(newFrameworkDescriptor));

    checkUnsupportedOnTheFlyChanges(newFrameworkDescriptor);

    // Replace on the fly FrameworkDescriptor with newFrameworkDescriptor.
    // The operation is Atomic, since it only modifies the reference.
    // So, the on going read for the old FrameworkDescriptor will not get intermediate results
    frameworkDescriptor = newFrameworkDescriptor;

    // Backup old to detect changes
    PlatformSpecificParametersDescriptor oldPlatParams = platParams;
    Map<String, TaskRoleDescriptor> oldTaskRoles = taskRoles;
    Map<String, ServiceDescriptor> oldTaskServices = taskServices;

    // Update ExtensionRequest
    user = frameworkDescriptor.getUser();
    platParams = frameworkDescriptor.getPlatformSpecificParameters();
    taskRoles = frameworkDescriptor.getTaskRoles();
    Map<String, RetryPolicyDescriptor> newTaskRetryPolicies = new HashMap<>();
    Map<String, ServiceDescriptor> newTaskServices = new HashMap<>();
    Map<String, ResourceDescriptor> newTaskResources = new HashMap<>();
    Map<String, TaskRolePlatformSpecificParametersDescriptor> newTaskPlatParams = new HashMap<>();
    for (Map.Entry<String, TaskRoleDescriptor> taskRole : taskRoles.entrySet()) {
      String taskRoleName = taskRole.getKey();
      TaskRoleDescriptor taskRoleDescriptor = taskRole.getValue();
      newTaskRetryPolicies.put(taskRoleName, taskRoleDescriptor.getTaskRetryPolicy());
      newTaskServices.put(taskRoleName, taskRoleDescriptor.getTaskService());
      newTaskResources.put(taskRoleName, taskRoleDescriptor.getTaskService().getResource());
      newTaskPlatParams.put(taskRoleName, taskRoleDescriptor.getPlatformSpecificParameters());
    }
    taskRetryPolicies = newTaskRetryPolicies;
    taskServices = newTaskServices;
    taskResources = newTaskResources;
    taskPlatParams = newTaskPlatParams;
    Map<String, Integer> taskNumbers = getTaskNumbers(taskRoles);
    Map<String, Integer> serviceVersions = getServiceVersions(taskServices);

    // Notify AM to take actions for Request
    if (oldPlatParams == null) {
      // For the first time, send all Request to AM
      am.onServiceVersionsUpdated(serviceVersions);
      am.onTaskNumbersUpdated(taskNumbers);
      {
        // Only start them for the first time
        am.onStartRMResyncHandler();
        // Start TransitionTaskStateQueue at last, in case some Tasks in the queue
        // depend on the Request or previous AM Notify.
        am.onStartTransitionTaskStateQueue();
      }
    } else {
      // For the other times, only send changed Request to AM
      if (!CommonExts.equals(getServiceVersions(oldTaskServices), serviceVersions)) {
        am.onServiceVersionsUpdated(serviceVersions);
      }
      if (!CommonExts.equals(getTaskNumbers(oldTaskRoles), taskNumbers)) {
        am.onTaskNumbersUpdated(taskNumbers);
      }
    }
  }
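
The last part of the method follows a simple pattern: notify with everything on the first update, and afterwards only when something actually changed. A reduced sketch of that pattern, assuming a single map of task numbers; `TaskNumberNotifySketch` and the notification labels are hypothetical:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reduction of the "first time vs. changed" notification logic.
class TaskNumberNotifySketch {
  // null until the first update, mirroring the oldPlatParams == null check.
  private Map<String, Integer> taskNumbers;
  final List<String> notifications = new ArrayList<>();

  void update(Map<String, Integer> newTaskNumbers) {
    // Back up the old value before replacing it, so changes can be detected.
    Map<String, Integer> old = taskNumbers;
    taskNumbers = new HashMap<>(newTaskNumbers);

    if (old == null) {
      // First update: always notify.
      notifications.add("initial");
    } else if (!old.equals(taskNumbers)) {
      // Later updates: notify only on an actual change.
      notifications.add("changed");
    }
  }

  public static void main(String[] args) {
    TaskNumberNotifySketch sketch = new TaskNumberNotifySketch();
    sketch.update(Collections.singletonMap("worker", 2));
    sketch.update(Collections.singletonMap("worker", 2));
    sketch.update(Collections.singletonMap("worker", 3));
    System.out.println(sketch.notifications);
  }
}
```

Since pullRequest() runs every 30s, most pulls return an unchanged descriptor; suppressing the notification in that case keeps the AM from re-requesting containers on every pull.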

III. Stopping the framework and the AM

The entry point is the onContainersCompleted() method in RMClientCallbackHandler.

public void onContainersCompleted(List<ContainerStatus> completedContainers) {
    am.onContainersCompleted(completedContainers);
  }

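So what happens when the RM reports that a batch of containers has finished? They are checked one by one in a for loop. This check is also submitted to the thread pool.

For each finished container, its containerId, exitStatus, and diagnostics are extracted and passed to the next step.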

private void completeContainers(List<ContainerStatus> containerStatuses) throws Exception {
    for (ContainerStatus containerStatus : containerStatuses) {
      completeContainer(
          containerStatus.getContainerId().toString(),
          containerStatus.getExitStatus(),
          containerStatus.getDiagnostics(),
          false);
    }
  }

The next step looks up the taskStatus of the task that the container was running, marks that task as CONTAINER_COMPLETED, and then calls attemptToRetry() for a "post-mortem".

  private void completeContainer(String containerId, int exitCode, String diagnostics, Boolean needToRelease) throws Exception {
    if (needToRelease) {
      tryToReleaseContainer(containerId);
      if (exitCode == ExitStatusKey.CONTAINER_MIGRATE_TASK_REQUESTED.toInt()) {
        requestManager.onMigrateTaskRequestContainerReleased(containerId);
      }
    }

    String logSuffix = String.format(
        "[%s]: completeContainer: ExitCode: %s, ExitDiagnostics: %s, NeedToRelease: %s",
        containerId, exitCode, diagnostics, needToRelease);

    if (!statusManager.isContainerIdLiveAssociated(containerId)) {
      LOGGER.logDebug("[NotLiveAssociated]%s", logSuffix);
      return;
    }

    TaskStatus taskStatus = statusManager.getTaskStatusWithLiveAssociatedContainerId(containerId);
    String taskRoleName = taskStatus.getTaskRoleName();
    TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());
    String linePrefix = String.format("%s: ", taskLocator);

    LOGGER.logSplittedLines(Level.INFO,
        "%s%s\n%s",
        taskLocator, logSuffix, generateContainerDiagnostics(taskStatus, linePrefix));

    statusManager.transitionTaskState(taskLocator, TaskState.CONTAINER_COMPLETED,
        new TaskEvent().setContainerExitCode(exitCode).setContainerExitDiagnostics(diagnostics));

    // Post-mortem CONTAINER_COMPLETED Task
    attemptToRetry(taskStatus);
  }

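The post-mortem further checks whether the container's exit type is SUCCEEDED. If it is not, there are two retry mechanisms: fancyRetryPolicy and normalRetryPolicy. Here we focus on the successful exit, which leads to completeTask() (unless maxRetryCount is set to USING_EXTENDED_UNLIMITED_VALUE, in which case even a succeeded task is retried).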

private void attemptToRetry(TaskStatus taskStatus) throws Exception {
    String taskRoleName = taskStatus.getTaskRoleName();
    TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());
    Integer exitCode = taskStatus.getContainerExitCode();
    ExitType exitType = taskStatus.getContainerExitType();
    Integer retriedCount = taskStatus.getTaskRetryPolicyState().getRetriedCount();
    RetryPolicyState newRetryPolicyState = YamlUtils.deepCopy(taskStatus.getTaskRetryPolicyState(), RetryPolicyState.class);

    RetryPolicyDescriptor retryPolicy = requestManager.getTaskRetryPolicy(taskRoleName);
    Boolean fancyRetryPolicy = retryPolicy.getFancyRetryPolicy();
    Integer maxRetryCount = retryPolicy.getMaxRetryCount();

    String logPrefix = String.format("%s: attemptToRetry: ", taskLocator);

    LOGGER.logSplittedLines(Level.INFO,
        logPrefix + "ContainerExitCode: [%s], ContainerExitType: [%s], RetryPolicyState:\n[%s]",
        exitCode, exitType, WebCommon.toJson(newRetryPolicyState));

    String completeTaskLogPrefix = logPrefix + "Will completeTask. Reason: ";
    String retryTaskLogPrefix = logPrefix + "Will retryTask with new Container. Reason: ";

    // 1. FancyRetryPolicy
    String fancyRetryPolicyLogSuffix = String.format("FancyRetryPolicy: Task exited due to %s.", exitType);
    if (exitType == ExitType.TRANSIENT_NORMAL) {
      newRetryPolicyState.setTransientNormalRetriedCount(newRetryPolicyState.getTransientNormalRetriedCount() + 1);
      if (fancyRetryPolicy) {
        LOGGER.logWarning(retryTaskLogPrefix + fancyRetryPolicyLogSuffix);
        retryTask(taskStatus, newRetryPolicyState);
        return;
      }
    } else if (exitType == ExitType.TRANSIENT_CONFLICT) {
      newRetryPolicyState.setTransientConflictRetriedCount(newRetryPolicyState.getTransientConflictRetriedCount() + 1);
      if (fancyRetryPolicy) {
        LOGGER.logWarning(retryTaskLogPrefix + fancyRetryPolicyLogSuffix);
        retryTask(taskStatus, newRetryPolicyState);
        return;
      }
    } else if (exitType == ExitType.NON_TRANSIENT) {
      newRetryPolicyState.setNonTransientRetriedCount(newRetryPolicyState.getNonTransientRetriedCount() + 1);
      if (fancyRetryPolicy) {
        LOGGER.logWarning(completeTaskLogPrefix + fancyRetryPolicyLogSuffix);
        completeTask(taskStatus);
        return;
      }
    } else {
      if (exitType == ExitType.SUCCEEDED) {
        newRetryPolicyState.setSucceededRetriedCount(newRetryPolicyState.getSucceededRetriedCount() + 1);
      } else {
        newRetryPolicyState.setUnKnownRetriedCount(newRetryPolicyState.getUnKnownRetriedCount() + 1);
      }
      if (fancyRetryPolicy) {
        // FancyRetryPolicy only handle exit due to transient and non-transient failure specially,
        // Leave exit due to others to NormalRetryPolicy
        LOGGER.logInfo(logPrefix +
            "Transfer the RetryDecision to NormalRetryPolicy. Reason: " +
            fancyRetryPolicyLogSuffix);
      }
    }

    // 2. NormalRetryPolicy
    if (maxRetryCount == GlobalConstants.USING_EXTENDED_UNLIMITED_VALUE ||
        (exitType != ExitType.SUCCEEDED && maxRetryCount == GlobalConstants.USING_UNLIMITED_VALUE) ||
        (exitType != ExitType.SUCCEEDED && retriedCount < maxRetryCount)) {
      newRetryPolicyState.setRetriedCount(newRetryPolicyState.getRetriedCount() + 1);

      LOGGER.logWarning(retryTaskLogPrefix +
              "RetriedCount %s has not reached MaxRetryCount %s.",
          retriedCount, maxRetryCount);
      retryTask(taskStatus, newRetryPolicyState);
      return;
    } else {
      if (exitType == ExitType.SUCCEEDED) {
        LOGGER.logInfo(completeTaskLogPrefix +
            "Task exited due to %s.", exitType);
        completeTask(taskStatus);
        return;
      } else {
        LOGGER.logWarning(completeTaskLogPrefix +
                "RetriedCount %s has reached MaxRetryCount %s.",
            retriedCount, maxRetryCount);
        completeTask(taskStatus);
        return;
      }
    }
  }
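
Stripped of logging and counter bookkeeping, the decision made by attemptToRetry() can be condensed into a pure function. The sketch below is a simplification under assumptions: `RetryDecisionSketch` and its enums are hypothetical, and the sentinel values -1 and -2 are assumed because the article does not show GlobalConstants.

```java
// Hypothetical condensation of the retry decision; it omits the RetryPolicyState counters.
class RetryDecisionSketch {
  enum Exit { SUCCEEDED, TRANSIENT_NORMAL, TRANSIENT_CONFLICT, NON_TRANSIENT, UNKNOWN }

  enum Decision { RETRY, COMPLETE }

  // Assumed sentinel values; the article does not show the real constants.
  static final int UNLIMITED = -1;
  static final int EXTENDED_UNLIMITED = -2;

  static Decision decide(Exit exit, boolean fancyRetryPolicy, int retriedCount, int maxRetryCount) {
    // 1. FancyRetryPolicy: transient failures always retry, non-transient ones never do.
    if (fancyRetryPolicy) {
      if (exit == Exit.TRANSIENT_NORMAL || exit == Exit.TRANSIENT_CONFLICT) {
        return Decision.RETRY;
      }
      if (exit == Exit.NON_TRANSIENT) {
        return Decision.COMPLETE;
      }
      // SUCCEEDED and UNKNOWN fall through to the normal policy.
    }

    // 2. NormalRetryPolicy: retry failures until maxRetryCount is reached;
    //    the extended unlimited value retries even succeeded tasks.
    boolean failed = exit != Exit.SUCCEEDED;
    if (maxRetryCount == EXTENDED_UNLIMITED
        || (failed && maxRetryCount == UNLIMITED)
        || (failed && retriedCount < maxRetryCount)) {
      return Decision.RETRY;
    }
    return Decision.COMPLETE;
  }

  public static void main(String[] args) {
    // A healthy task with a finite retry budget completes on success.
    System.out.println(decide(Exit.SUCCEEDED, false, 0, 3));
  }
}
```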

In completeTask(), statusManager marks the task as TASK_COMPLETED and then attemptToStop() is called.

private void completeTask(TaskStatus taskStatus) throws Exception {
    String taskRoleName = taskStatus.getTaskRoleName();
    TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());

    LOGGER.logSplittedLines(Level.INFO,
        "%s: completeTask: TaskStatus:\n%s",
        taskLocator, WebCommon.toJson(taskStatus));

    statusManager.transitionTaskState(taskLocator, TaskState.TASK_COMPLETED);
    attemptToStop(taskStatus);
  }

attemptToStop(taskStatus) actually attempts to stop the whole AM, so in effect it checks whether the given taskStatus belongs to the last task in the framework.

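The code below distinguishes three cases:

1) The exit type is not SUCCEEDED and minFailedTaskCount is set. The number of failed tasks in this taskRole is checked; if it has reached minFailedTaskCount (minFailedTaskCount <= count), the AM is stopped.

2) The exit type is SUCCEEDED and minSucceededTaskCount is set. The number of succeeded tasks in this taskRole is checked; if it has reached minSucceededTaskCount, the AM is stopped.

3) statusManager is asked whether all tasks are in their final state (finalState == TASK_COMPLETED). If so, the body of the if statement runs. In other words, if only this one task is in its final state and others are not, this step does nothing and returns.

If all tasks are in their final state, the code further checks whether any task failed along the way in order to build the diagnostics, and then stops the AM.

All three cases end up calling stopForApplicationCompletion().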

private void attemptToStop(TaskStatus taskStatus) throws IOException {
    String taskRoleName = taskStatus.getTaskRoleName();
    ExitType exitType = taskStatus.getContainerExitType();

    TaskRoleApplicationCompletionPolicyDescriptor applicationCompletionPolicy =
        requestManager.getTaskRoleApplicationCompletionPolicy(taskRoleName);
    Integer minFailedTaskCount = applicationCompletionPolicy.getMinFailedTaskCount();
    Integer minSucceededTaskCount = applicationCompletionPolicy.getMinSucceededTaskCount();

    if (exitType != ExitType.SUCCEEDED && minFailedTaskCount != null) {
      List<TaskStatus> failedTaskStatuses = statusManager.getFailedTaskStatus(taskRoleName);
      if (minFailedTaskCount <= failedTaskStatuses.size()) {
        String applicationCompletionReason = String.format(
            "[%s]: FailedTaskCount %s has reached MinFailedTaskCount %s.",
            taskRoleName, failedTaskStatuses.size(), minFailedTaskCount);
        stopForApplicationCompletion(applicationCompletionReason, failedTaskStatuses);
      }
    }

    if (exitType == ExitType.SUCCEEDED && minSucceededTaskCount != null) {
      List<TaskStatus> succeededTaskStatuses = statusManager.getSucceededTaskStatus(taskRoleName);
      if (minSucceededTaskCount <= succeededTaskStatuses.size()) {
        String applicationCompletionReason = String.format(
            "[%s]: SucceededTaskCount %s has reached MinSucceededTaskCount %s.",
            taskRoleName, succeededTaskStatuses.size(), minSucceededTaskCount);
        stopForApplicationCompletion(applicationCompletionReason, succeededTaskStatuses);
      }
    }

    if (statusManager.isAllTaskInFinalState()) {
      int totalTaskCount = statusManager.getTaskCount();
      List<TaskStatus> failedTaskStatuses = statusManager.getFailedTaskStatus();
      String applicationCompletionReason = String.format(
          "All Tasks completed and no ApplicationCompletionPolicy has ever been triggered: " +
              "TotalTaskCount: %s, FailedTaskCount: %s.",
          totalTaskCount, failedTaskStatuses.size());
      stopForApplicationCompletion(applicationCompletionReason);
    }
  }
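
The three checks can be condensed into a single function that returns a completion reason, or null when the application should keep running. This is a sketch under assumptions: `CompletionPolicySketch` and its parameters are hypothetical, and the counts that the real code obtains from statusManager are passed in directly.

```java
// Hypothetical condensation of attemptToStop(); counts are supplied by the caller.
class CompletionPolicySketch {
  // Returns the reason to stop the application, or null if it should keep running.
  static String check(boolean succeeded,
                      Integer minFailedTaskCount,
                      Integer minSucceededTaskCount,
                      int failedInRole,
                      int succeededInRole,
                      boolean allTasksInFinalState) {
    // Case 1: a failed task may trip the taskRole's failure threshold.
    if (!succeeded && minFailedTaskCount != null && minFailedTaskCount <= failedInRole) {
      return String.format("FailedTaskCount %s has reached MinFailedTaskCount %s.",
          failedInRole, minFailedTaskCount);
    }
    // Case 2: a succeeded task may trip the taskRole's success threshold.
    if (succeeded && minSucceededTaskCount != null && minSucceededTaskCount <= succeededInRole) {
      return String.format("SucceededTaskCount %s has reached MinSucceededTaskCount %s.",
          succeededInRole, minSucceededTaskCount);
    }
    // Case 3: no policy fired, but nothing is left to run.
    if (allTasksInFinalState) {
      return "All Tasks completed.";
    }
    return null;
  }

  public static void main(String[] args) {
    // One of three tasks succeeded and no threshold is configured: keep running.
    System.out.println(check(true, null, null, 0, 1, false));
  }
}
```

Note that the comparison is `min <= count`, so the policy fires as soon as the count reaches the threshold, not only once it exceeds it.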

stopForApplicationCompletion() wraps the result in a StopStatus object and passes it to stop().

In stop(), requestManager and statusManager are stopped, rmClient unregisters from the RM, and the process exits. This is where the AM's lifecycle ends.

  // THREAD SAFE
  @Override
  public synchronized void stop(StopStatus stopStatus) {
    // Best Effort to stop Gracefully
    super.stop(stopStatus);

    AggregateException ae = new AggregateException();

    // Stop AM's SubServices
    // No need to stop nmClient, since it may be time consuming to stop all Containers, leave it for RM.
    // Since here is Best Effort, leave the GC work of zkStore and hdfsStore to LauncherService.
    try {
      if (yarnClient != null) {
        yarnClient.stop();
      }
    } catch (Exception e) {
      ae.addException(e);
    }

    try {
      if (statusManager != null) {
        statusManager.stop(stopStatus);
      }
    } catch (Exception e) {
      ae.addException(e);
    }

    try {
      if (requestManager != null) {
        requestManager.stop(stopStatus);
      }
    } catch (Exception e) {
      ae.addException(e);
    }

    // Stop rmClient at last, since there is no work left in current AM, and only then RM is
    // allowed to process the application, such as generate application's diagnostics.
    try {
      if (rmClient != null) {
        if (stopStatus.getNeedUnregister()) {
          LOGGER.logInfo("Unregistering %s to RM", serviceName);
          rmClient.unregisterApplicationMaster(
              stopStatus.getCode() == 0 ?
                  FinalApplicationStatus.SUCCEEDED :
                  FinalApplicationStatus.FAILED,
              stopStatus.getDiagnostics(), conf.getAmTrackingUrl());
        }
        rmClient.stop();
      }
    } catch (Exception e) {
      ae.addException(e);
    }

    if (ae.getExceptions().size() > 0) {
      LOGGER.logWarning(ae, "Failed to stop %s gracefully", serviceName);
    }

    LOGGER.logInfo("%s stopped", serviceName);
    System.exit(stopStatus.getCode());
  }
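
The stop() method above is "best effort": each subService is stopped inside its own try/catch so that one failure does not prevent the remaining ones from stopping. The same pattern in isolation, with hypothetical names (`BestEffortStopSketch`, `stopAll`) and a plain list standing in for AggregateException:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of best-effort shutdown with collected failures.
class BestEffortStopSketch {
  // Runs every stop action in order; a failure is recorded but does not skip later actions.
  static List<Exception> stopAll(List<Runnable> stopActions) {
    List<Exception> failures = new ArrayList<>();
    for (Runnable action : stopActions) {
      try {
        action.run();
      } catch (Exception e) {
        failures.add(e);
      }
    }
    return failures;
  }

  // Demo: the middle action fails, yet the last one still runs.
  static String runDemo() {
    List<String> stopped = new ArrayList<>();
    List<Runnable> actions = Arrays.asList(
        () -> stopped.add("statusManager"),
        () -> { throw new IllegalStateException("requestManager failed to stop"); },
        () -> stopped.add("rmClient"));
    List<Exception> failures = stopAll(actions);
    return stopped + " failures=" + failures.size();
  }

  public static void main(String[] args) {
    System.out.println(runDemo());
  }
}
```

Ordering still matters in this pattern: as the comment in the original code explains, rmClient is stopped last because only after unregistering is the RM allowed to process the application.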

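As the flow above shows, zk plays a very important role. Through it, requestManager supplies the AM with all the information needed to run tasks and keeps the AM in sync with the user's requests. Through it, statusManager publishes the status of tasks and taskRoles so that users can query them.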
