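Because an AM is designed to manage exactly one framework, the lifecycle of an AM is effectively the lifecycle of that framework. The flow summarized below follows a framework from start to finish. It does not cover task failures and assumes a perfectly healthy framework.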
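I. Overview of the AM subservices involved
The two main ones are requestManager and statusManager.
1. requestManager
requestManager runs a thread that, for the entire lifetime of the AM, calls pullRequest() against zk every 30s. What it pulls always belongs to the same FrameworkLauncher and consists of two parts: LauncherRequest and AggregatedFrameworkRequest.
Here AggregatedFrameworkRequest = FrameworkRequest + all other feedback requests.
LauncherRequest contains the ClusterConfiguration and other information.
After pulling, it checks a few conditions and updates the parts of requestManager that relate to running tasks, such as the FrameworkDescriptor, taskRoles, and platParams. The purpose is presumably to keep the AM in sync with what is stored in zkStore.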

    private void pullRequest() throws Exception {
      // Pull LauncherRequest
      LOGGER.logDebug("Pulling LauncherRequest");
      LauncherRequest newLauncherRequest = zkStore.getLauncherRequest();
      LOGGER.logDebug("Pulled LauncherRequest");

      // newLauncherRequest is always not null
      updateLauncherRequest(newLauncherRequest);

      // Pull AggregatedFrameworkRequest
      AggregatedFrameworkRequest aggFrameworkRequest;
      try {
        LOGGER.logDebug("Pulling AggregatedFrameworkRequest");
        aggFrameworkRequest = zkStore.getAggregatedFrameworkRequest(conf.getFrameworkName());
        LOGGER.logDebug("Pulled AggregatedFrameworkRequest");
      } catch (NoNodeException e) {
        existsLocalVersionFrameworkRequest = 0;
        throw new NonTransientException(
            "Failed to getAggregatedFrameworkRequest, FrameworkRequest is already deleted on ZK", e);
      }

      // newFrameworkDescriptor is always not null
      FrameworkDescriptor newFrameworkDescriptor = aggFrameworkRequest.getFrameworkRequest().getFrameworkDescriptor();
      checkFrameworkVersion(newFrameworkDescriptor);
      flattenFrameworkDescriptor(newFrameworkDescriptor);
      updateFrameworkDescriptor(newFrameworkDescriptor);
      updateOverrideApplicationProgressRequest(aggFrameworkRequest.getOverrideApplicationProgressRequest());
      updateMigrateTaskRequests(aggFrameworkRequest.getMigrateTaskRequests());
    }

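The "one thread pulling every 30s" pattern can be sketched with a scheduled executor. This is only an illustration, not how requestManager is actually implemented: `PeriodicPuller`, `start`, and `getPullCount` are hypothetical names, and swallowing runtime exceptions is an assumption made so the schedule keeps running (the real pullRequest() throws a NonTransientException when the request has been deleted from ZK).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: runs a pull action at a fixed delay until stopped.
class PeriodicPuller {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();
  private final AtomicInteger pullCount = new AtomicInteger();

  // pullAction stands in for pullRequest(); intervalMs would be 30_000 in the AM.
  void start(Runnable pullAction, long intervalMs) {
    scheduler.scheduleWithFixedDelay(() -> {
      pullCount.incrementAndGet();
      try {
        pullAction.run();
      } catch (RuntimeException e) {
        // Assumption: keep pulling after a failed attempt. An exception that
        // escapes this lambda would silently cancel all future runs.
      }
    }, 0, intervalMs, TimeUnit.MILLISECONDS);
  }

  // Number of pull attempts started so far.
  int getPullCount() {
    return pullCount.get();
  }

  void stop() {
    scheduler.shutdownNow();
  }
}
```

A fixed delay (rather than a fixed rate) is used here so that a slow pull never overlaps with the next one.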
2. statusManager
statusManager also runs a thread that, for the entire lifetime of the AM, pushes status to zk every 30s.
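One way to keep such a periodic push cheap is to write only the entries that changed since the previous push. The sketch below assumes that is what the `...Changed` maps in StatusManager are for; `DirtyFlagStatusStore` and its methods are hypothetical names, and a plain Map stands in for zk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: in-memory status plus "changed" flags, flushed by a periodic push.
class DirtyFlagStatusStore {
  // TaskRoleName -> status (a String stands in for the real status object)
  private final Map<String, String> statuses = new HashMap<>();
  // TaskRoleName -> whether the status changed since the previous push
  private final Map<String, Boolean> statusesChanged = new HashMap<>();

  synchronized void update(String taskRoleName, String status) {
    String old = statuses.put(taskRoleName, status);
    if (!status.equals(old)) {
      statusesChanged.put(taskRoleName, true);
    }
  }

  // Called by the push thread. Writes only changed entries and returns their names.
  synchronized List<String> push(Map<String, String> remote) {
    List<String> pushed = new ArrayList<>();
    for (Map.Entry<String, Boolean> entry : statusesChanged.entrySet()) {
      if (entry.getValue()) {
        remote.put(entry.getKey(), statuses.get(entry.getKey()));
        entry.setValue(false); // value update only, safe while iterating
        pushed.add(entry.getKey());
      }
    }
    return pushed;
  }
}
```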
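statusManager tracks how the framework is running. A framework consists of a set of taskRoles, and each taskRole contains one or more tasks. The tasks within a taskRole are identical, so they can be told apart by an index such as 1, 2, 3.
The object that identifies a task: TaskStatusLocator = TaskRoleName + TaskIndex
The object that manages a task: TaskStatus
In statusManager, the following Maps manage the taskRoles and their tasks: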

    // Manage the CURD to ZK Status
    public class StatusManager extends AbstractService {  // THREAD SAFE
      private static final DefaultLogger LOGGER = new DefaultLogger(StatusManager.class);

      private final ApplicationMaster am;
      private final Configuration conf;
      private final ZookeeperStore zkStore;

      /**
       * REGION BaseStatus
       */
      // AM only need to maintain TaskRoleStatus and TaskStatuses, and it is the only maintainer.
      // TaskRoleName -> TaskRoleStatus
      private Map<String, TaskRoleStatus> taskRoleStatuses = new HashMap<>();
      // TaskRoleName -> TaskStatuses
      private Map<String, TaskStatuses> taskStatuseses = new HashMap<>();

      /**
       * REGION ExtensionStatus
       * ExtensionStatus should be always CONSISTENT with BaseStatus
       */
      // Used to invert index TaskStatus by ContainerId/TaskState instead of TaskStatusLocator, i.e. TaskRoleName + TaskIndex
      // TaskState -> TaskStatusLocators
      private Map<TaskState, HashSet<TaskStatusLocator>> taskStateLocators = new HashMap<>();
      // Live Associated ContainerId -> TaskStatusLocator
      private Map<String, TaskStatusLocator> liveAssociatedContainerIdLocators = new HashMap<>();
      // Live Associated HostNames
      // TODO: Using MachineName instead of HostName to avoid unstable HostName Resolution
      private HashSet<String> liveAssociatedHostNames = new HashSet<>();

      /**
       * REGION StateVariable
       */
      // Whether Mem Status is changed since previous zkStore update
      // TaskRoleName -> TaskRoleStatusChanged
      private Map<String, Boolean> taskRoleStatusesChanged = new HashMap<>();
      // TaskRoleName -> TaskStatusesChanged
      private Map<String, Boolean> taskStatusesesChanged = new HashMap<>();

      // No need to persistent ContainerRequest since it is only valid within one application attempt.
      // Used to generate an unique Priority for each ContainerRequest in current application attempt.
      // This helps to match ContainerRequest and allocated Container.
      // Besides, it can also avoid the issue YARN-314.
      private Priority nextContainerRequestPriority = Priority.newInstance(0);

      // Used to track current ContainerRequest for Tasks in CONTAINER_REQUESTED state
      // TaskStatusLocator -> ContainerRequest
      private Map<TaskStatusLocator, ContainerRequest> taskContainerRequests = new HashMap<>();
      // Used to invert index TaskStatusLocator by ContainerRequest.Priority
      // Priority -> TaskStatusLocator
      private Map<Priority, TaskStatusLocator> priorityLocators = new HashMap<>();

      // ... (the excerpt ends here)
    }

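The comment "ExtensionStatus should be always CONSISTENT with BaseStatus" is the key invariant: every state change has to update both the base map and the inverted index. A minimal sketch of that idea follows. All `Sketch*` names are hypothetical, only three of the task states mentioned in this post are modeled, and this is not the real StatusManager code.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

enum SketchTaskState { CONTAINER_REQUESTED, CONTAINER_COMPLETED, TASK_COMPLETED }

// Value object usable as a map key: identity is TaskRoleName + TaskIndex.
final class SketchLocator {
  final String taskRoleName;
  final int taskIndex;

  SketchLocator(String taskRoleName, int taskIndex) {
    this.taskRoleName = taskRoleName;
    this.taskIndex = taskIndex;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof SketchLocator)) {
      return false;
    }
    SketchLocator other = (SketchLocator) o;
    return taskIndex == other.taskIndex && taskRoleName.equals(other.taskRoleName);
  }

  @Override
  public int hashCode() {
    return Objects.hash(taskRoleName, taskIndex);
  }
}

class SketchStatusIndex {
  // Base status: locator -> state
  private final Map<SketchLocator, SketchTaskState> taskStates = new HashMap<>();
  // Extension status (inverted index): state -> locators
  private final Map<SketchTaskState, Set<SketchLocator>> stateLocators =
      new EnumMap<>(SketchTaskState.class);

  // Adds a task or transitions an existing one; both maps are updated together.
  synchronized void setState(SketchLocator locator, SketchTaskState newState) {
    SketchTaskState oldState = taskStates.put(locator, newState);
    if (oldState != null) {
      stateLocators.get(oldState).remove(locator);
    }
    stateLocators.computeIfAbsent(newState, s -> new HashSet<>()).add(locator);
  }

  synchronized Set<SketchLocator> locatorsIn(SketchTaskState state) {
    return new HashSet<>(stateLocators.getOrDefault(state, Collections.emptySet()));
  }

  synchronized boolean isAllInState(SketchTaskState state) {
    return !taskStates.isEmpty() && locatorsIn(state).size() == taskStates.size();
  }
}
```

With the inverted index in place, a question such as "are all tasks in the final state?" becomes a size comparison instead of a scan over every task.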
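II. Starting the framework
As described above, pullRequest() in RequestManager fetches the FrameworkRequest and the LauncherRequest from zk. Which framework to pull is known from environment variables read through the AM's conf: when the client submitted the AM and set up the runtime environment of the AM's container, it wrote frameworkName, frameworkVersion, zkConnectString, and similar values into that container's environment variables.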
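Reading such settings from the container environment might look like the sketch below. The variable names (`FRAMEWORK_NAME`, `FRAMEWORK_VERSION`, `ZK_CONNECT_STRING`) and the `AmEnvConfig` class are illustrative assumptions, not the keys or classes the launcher actually uses.

```java
import java.util.Map;

// Hypothetical sketch: the environment variable names are illustrative only.
final class AmEnvConfig {
  final String frameworkName;
  final int frameworkVersion;
  final String zkConnectString;

  private AmEnvConfig(String frameworkName, int frameworkVersion, String zkConnectString) {
    this.frameworkName = frameworkName;
    this.frameworkVersion = frameworkVersion;
    this.zkConnectString = zkConnectString;
  }

  // Takes the environment as a Map so it can be tested without real env vars.
  static AmEnvConfig fromEnv(Map<String, String> env) {
    return new AmEnvConfig(
        require(env, "FRAMEWORK_NAME"),
        Integer.parseInt(require(env, "FRAMEWORK_VERSION")),
        require(env, "ZK_CONNECT_STRING"));
  }

  // Fails fast: an AM that does not know its framework cannot do anything useful.
  private static String require(Map<String, String> env, String key) {
    String value = env.get(key);
    if (value == null || value.isEmpty()) {
      throw new IllegalStateException("Missing environment variable: " + key);
    }
    return value;
  }
}
```

In a running container this would be called as `AmEnvConfig.fromEnv(System.getenv())`.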
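Once the FrameworkRequest and LauncherRequest have been fetched from zk, updateFrameworkDescriptor() is called, which in turn calls onTaskNumbersUpdated(). That method submits two jobs to the AM's thread pool: updating statusManager and addContainerRequest. addContainerRequest() has AMRMClient ask the RM for containers in which to run tasks. Once the RM allocates containers, the onContainerAllocated handling is triggered, which is what part (2) describes.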
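The StatusManager comments say that each ContainerRequest gets a unique Priority so that an allocated container can be matched back to the task that asked for it. The sketch below illustrates that matching with plain ints and Strings in place of YARN's Priority and the task locator; `ContainerRequestMatcher` is a hypothetical class, not part of the launcher.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: match allocated containers to tasks through unique priorities.
class ContainerRequestMatcher {
  private int nextPriority = 0;
  // Priority -> task locator of the outstanding request
  private final Map<Integer, String> priorityLocators = new HashMap<>();

  // Registers a request for the task and returns the unique priority to send to the RM.
  synchronized int addContainerRequest(String taskLocator) {
    int priority = nextPriority++;
    priorityLocators.put(priority, taskLocator);
    return priority;
  }

  // Returns the task waiting on this priority, or null if nothing is outstanding
  // (for example a duplicate allocation, which the caller should release).
  synchronized String onContainerAllocated(int allocatedPriority) {
    return priorityLocators.remove(allocatedPriority);
  }
}
```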
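Along the way it also checks whether the FrameworkRequest has been updated. If it has, the new request should be carried out, but that is not covered here.
Below is the source of updateFrameworkDescriptor().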

    private void updateFrameworkDescriptor(FrameworkDescriptor newFrameworkDescriptor) throws Exception {
      if (YamlUtils.deepEquals(frameworkDescriptor, newFrameworkDescriptor)) {
        return;
      }

      LOGGER.logSplittedLines(Level.INFO,
          "Detected FrameworkDescriptor changes. Updating to new FrameworkDescriptor:\n%s",
          WebCommon.toJson(newFrameworkDescriptor));

      checkUnsupportedOnTheFlyChanges(newFrameworkDescriptor);

      // Replace on the fly FrameworkDescriptor with newFrameworkDescriptor.
      // The operation is Atomic, since it only modifies the reference.
      // So, the on going read for the old FrameworkDescriptor will not get intermediate results
      frameworkDescriptor = newFrameworkDescriptor;

      // Backup old to detect changes
      PlatformSpecificParametersDescriptor oldPlatParams = platParams;
      Map<String, TaskRoleDescriptor> oldTaskRoles = taskRoles;
      Map<String, ServiceDescriptor> oldTaskServices = taskServices;

      // Update ExtensionRequest
      user = frameworkDescriptor.getUser();
      platParams = frameworkDescriptor.getPlatformSpecificParameters();
      taskRoles = frameworkDescriptor.getTaskRoles();
      Map<String, RetryPolicyDescriptor> newTaskRetryPolicies = new HashMap<>();
      Map<String, ServiceDescriptor> newTaskServices = new HashMap<>();
      Map<String, ResourceDescriptor> newTaskResources = new HashMap<>();
      Map<String, TaskRolePlatformSpecificParametersDescriptor> newTaskPlatParams = new HashMap<>();
      for (Map.Entry<String, TaskRoleDescriptor> taskRole : taskRoles.entrySet()) {
        String taskRoleName = taskRole.getKey();
        TaskRoleDescriptor taskRoleDescriptor = taskRole.getValue();
        newTaskRetryPolicies.put(taskRoleName, taskRoleDescriptor.getTaskRetryPolicy());
        newTaskServices.put(taskRoleName, taskRoleDescriptor.getTaskService());
        newTaskResources.put(taskRoleName, taskRoleDescriptor.getTaskService().getResource());
        newTaskPlatParams.put(taskRoleName, taskRoleDescriptor.getPlatformSpecificParameters());
      }
      taskRetryPolicies = newTaskRetryPolicies;
      taskServices = newTaskServices;
      taskResources = newTaskResources;
      taskPlatParams = newTaskPlatParams;
      Map<String, Integer> taskNumbers = getTaskNumbers(taskRoles);
      Map<String, Integer> serviceVersions = getServiceVersions(taskServices);

      // Notify AM to take actions for Request
      if (oldPlatParams == null) {
        // For the first time, send all Request to AM
        am.onServiceVersionsUpdated(serviceVersions);
        am.onTaskNumbersUpdated(taskNumbers);
        {
          // Only start them for the first time
          am.onStartRMResyncHandler();
          // Start TransitionTaskStateQueue at last, in case some Tasks in the queue
          // depend on the Request or previous AM Notify.
          am.onStartTransitionTaskStateQueue();
        }
      } else {
        // For the other times, only send changed Request to AM
        if (!CommonExts.equals(getServiceVersions(oldTaskServices), serviceVersions)) {
          am.onServiceVersionsUpdated(serviceVersions);
        }
        if (!CommonExts.equals(getTaskNumbers(oldTaskRoles), taskNumbers)) {
          am.onTaskNumbersUpdated(taskNumbers);
        }
      }
    }

III. Finishing the framework and the AM
The entry point is the onContainersCompleted() method in RMClientCallbackHandler.

    public void onContainersCompleted(List<ContainerStatus> completedContainers) {
      am.onContainersCompleted(completedContainers);
    }

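So what happens once the RM reports that a batch of containers has completed? They are checked one by one in a for loop, and this checking job is also submitted to the thread pool.
For each completed container, its containerId, exitStatus, and diagnostics are extracted and passed on to the next step.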

    private void completeContainers(List<ContainerStatus> containerStatuses) throws Exception {
      for (ContainerStatus containerStatus : containerStatuses) {
        completeContainer(
            containerStatus.getContainerId().toString(),
            containerStatus.getExitStatus(),
            containerStatus.getDiagnostics(),
            false);
      }
    }

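The hand-off from the RM callback to the AM's thread pool can be pictured roughly as follows. This is a sketch under assumptions: a single-thread executor stands in for the AM's pool, `Completed` stands in for YARN's ContainerStatus, and `CompletedContainerDispatcher` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: the callback thread only enqueues, a worker thread does the checks.
class CompletedContainerDispatcher {
  // Minimal stand-in for YARN's ContainerStatus.
  static final class Completed {
    final String containerId;
    final int exitStatus;
    final String diagnostics;

    Completed(String containerId, int exitStatus, String diagnostics) {
      this.containerId = containerId;
      this.exitStatus = exitStatus;
      this.diagnostics = diagnostics;
    }
  }

  private final ExecutorService queue = Executors.newSingleThreadExecutor();
  private final List<String> handled = new ArrayList<>();

  // Called from the RM callback: returns quickly and leaves the work to the queue.
  Future<?> onContainersCompleted(List<Completed> batch) {
    List<Completed> snapshot = new ArrayList<>(batch);
    return queue.submit(() -> {
      for (Completed c : snapshot) {
        completeContainer(c.containerId, c.exitStatus, c.diagnostics);
      }
    });
  }

  // Placeholder for the real per-container handling; it only records what it saw.
  private void completeContainer(String containerId, int exitCode, String diagnostics) {
    synchronized (handled) {
      handled.add(containerId + ":" + exitCode);
    }
  }

  List<String> handled() {
    synchronized (handled) {
      return new ArrayList<>(handled);
    }
  }

  void shutdown() {
    queue.shutdown();
  }
}
```

Using a single worker keeps the per-container handling sequential, so state transitions for different containers never interleave.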
The next step looks up the TaskStatus of the task that was running in the container, marks that task as CONTAINER_COMPLETED, and then calls attemptToRetry() to perform a "post-mortem".

    private void completeContainer(String containerId, int exitCode, String diagnostics, Boolean needToRelease) throws Exception {
      if (needToRelease) {
        tryToReleaseContainer(containerId);
        if (exitCode == ExitStatusKey.CONTAINER_MIGRATE_TASK_REQUESTED.toInt()) {
          requestManager.onMigrateTaskRequestContainerReleased(containerId);
        }
      }

      String logSuffix = String.format(
          "[%s]: completeContainer: ExitCode: %s, ExitDiagnostics: %s, NeedToRelease: %s",
          containerId, exitCode, diagnostics, needToRelease);

      if (!statusManager.isContainerIdLiveAssociated(containerId)) {
        LOGGER.logDebug("[NotLiveAssociated]%s", logSuffix);
        return;
      }

      TaskStatus taskStatus = statusManager.getTaskStatusWithLiveAssociatedContainerId(containerId);
      String taskRoleName = taskStatus.getTaskRoleName();
      TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());
      String linePrefix = String.format("%s: ", taskLocator);

      LOGGER.logSplittedLines(Level.INFO,
          "%s%s\n%s",
          taskLocator, logSuffix, generateContainerDiagnostics(taskStatus, linePrefix));

      statusManager.transitionTaskState(taskLocator, TaskState.CONTAINER_COMPLETED,
          new TaskEvent().setContainerExitCode(exitCode).setContainerExitDiagnostics(diagnostics));

      // Post-mortem CONTAINER_COMPLETED Task
      attemptToRetry(taskStatus);
    }

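The post-mortem checks whether the container's exit type is SUCCEEDED. If it is not, there are two retry mechanisms: fancyRetryPolicy and normalRetryPolicy. The case of interest here is a successful exit, which leads into completeTask() (unless maxRetryCount is set to USING_EXTENDED_UNLIMITED_VALUE, in which case even a succeeded task is retried).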

    private void attemptToRetry(TaskStatus taskStatus) throws Exception {
      String taskRoleName = taskStatus.getTaskRoleName();
      TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());
      Integer exitCode = taskStatus.getContainerExitCode();
      ExitType exitType = taskStatus.getContainerExitType();
      Integer retriedCount = taskStatus.getTaskRetryPolicyState().getRetriedCount();
      RetryPolicyState newRetryPolicyState = YamlUtils.deepCopy(taskStatus.getTaskRetryPolicyState(), RetryPolicyState.class);
      RetryPolicyDescriptor retryPolicy = requestManager.getTaskRetryPolicy(taskRoleName);
      Boolean fancyRetryPolicy = retryPolicy.getFancyRetryPolicy();
      Integer maxRetryCount = retryPolicy.getMaxRetryCount();

      String logPrefix = String.format("%s: attemptToRetry: ", taskLocator);

      LOGGER.logSplittedLines(Level.INFO,
          logPrefix + "ContainerExitCode: [%s], ContainerExitType: [%s], RetryPolicyState:\n[%s]",
          exitCode, exitType, WebCommon.toJson(newRetryPolicyState));

      String completeTaskLogPrefix = logPrefix + "Will completeTask. Reason: ";
      String retryTaskLogPrefix = logPrefix + "Will retryTask with new Container. Reason: ";

      // 1. FancyRetryPolicy
      String fancyRetryPolicyLogSuffix = String.format("FancyRetryPolicy: Task exited due to %s.", exitType);
      if (exitType == ExitType.TRANSIENT_NORMAL) {
        newRetryPolicyState.setTransientNormalRetriedCount(newRetryPolicyState.getTransientNormalRetriedCount() + 1);
        if (fancyRetryPolicy) {
          LOGGER.logWarning(retryTaskLogPrefix + fancyRetryPolicyLogSuffix);
          retryTask(taskStatus, newRetryPolicyState);
          return;
        }
      } else if (exitType == ExitType.TRANSIENT_CONFLICT) {
        newRetryPolicyState.setTransientConflictRetriedCount(newRetryPolicyState.getTransientConflictRetriedCount() + 1);
        if (fancyRetryPolicy) {
          LOGGER.logWarning(retryTaskLogPrefix + fancyRetryPolicyLogSuffix);
          retryTask(taskStatus, newRetryPolicyState);
          return;
        }
      } else if (exitType == ExitType.NON_TRANSIENT) {
        newRetryPolicyState.setNonTransientRetriedCount(newRetryPolicyState.getNonTransientRetriedCount() + 1);
        if (fancyRetryPolicy) {
          LOGGER.logWarning(completeTaskLogPrefix + fancyRetryPolicyLogSuffix);
          completeTask(taskStatus);
          return;
        }
      } else {
        if (exitType == ExitType.SUCCEEDED) {
          newRetryPolicyState.setSucceededRetriedCount(newRetryPolicyState.getSucceededRetriedCount() + 1);
        } else {
          newRetryPolicyState.setUnKnownRetriedCount(newRetryPolicyState.getUnKnownRetriedCount() + 1);
        }
        if (fancyRetryPolicy) {
          // FancyRetryPolicy only handle exit due to transient and non-transient failure specially,
          // Leave exit due to others to NormalRetryPolicy
          LOGGER.logInfo(logPrefix +
              "Transfer the RetryDecision to NormalRetryPolicy. Reason: " +
              fancyRetryPolicyLogSuffix);
        }
      }

      // 2. NormalRetryPolicy
      if (maxRetryCount == GlobalConstants.USING_EXTENDED_UNLIMITED_VALUE ||
          (exitType != ExitType.SUCCEEDED && maxRetryCount == GlobalConstants.USING_UNLIMITED_VALUE) ||
          (exitType != ExitType.SUCCEEDED && retriedCount < maxRetryCount)) {
        newRetryPolicyState.setRetriedCount(newRetryPolicyState.getRetriedCount() + 1);

        LOGGER.logWarning(retryTaskLogPrefix +
                "RetriedCount %s has not reached MaxRetryCount %s.",
            retriedCount, maxRetryCount);
        retryTask(taskStatus, newRetryPolicyState);
        return;
      } else {
        if (exitType == ExitType.SUCCEEDED) {
          LOGGER.logInfo(completeTaskLogPrefix +
              "Task exited due to %s.", exitType);
          completeTask(taskStatus);
          return;
        } else {
          LOGGER.logWarning(completeTaskLogPrefix +
                  "RetriedCount %s has reached MaxRetryCount %s.",
              retriedCount, maxRetryCount);
          completeTask(taskStatus);
          return;
        }
      }
    }

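Stripped of logging and counter bookkeeping, the decision made by attemptToRetry() can be condensed into a pure function, which makes the two-stage policy easier to see. This is a simplified sketch, not the real method: `RetryDecisionSketch` is a hypothetical class, `UNKNOWN` is a placeholder for any exit type not named in the source, and the two sentinel values are assumptions (the real ones live in GlobalConstants).

```java
// Hypothetical sketch of the retry decision, without logging or state updates.
class RetryDecisionSketch {
  enum ExitType { SUCCEEDED, TRANSIENT_NORMAL, TRANSIENT_CONFLICT, NON_TRANSIENT, UNKNOWN }

  enum Decision { RETRY, COMPLETE }

  // Assumed placeholder values for the sentinels defined in GlobalConstants.
  static final int USING_UNLIMITED_VALUE = -1;
  static final int USING_EXTENDED_UNLIMITED_VALUE = -2;

  static Decision decide(ExitType exitType, boolean fancyRetryPolicy,
                         int retriedCount, int maxRetryCount) {
    // 1. FancyRetryPolicy: transient failures always retry, non-transient never.
    if (fancyRetryPolicy) {
      if (exitType == ExitType.TRANSIENT_NORMAL || exitType == ExitType.TRANSIENT_CONFLICT) {
        return Decision.RETRY;
      }
      if (exitType == ExitType.NON_TRANSIENT) {
        return Decision.COMPLETE;
      }
    }

    // 2. NormalRetryPolicy: everything else is decided by maxRetryCount.
    boolean succeeded = exitType == ExitType.SUCCEEDED;
    if (maxRetryCount == USING_EXTENDED_UNLIMITED_VALUE
        || (!succeeded && maxRetryCount == USING_UNLIMITED_VALUE)
        || (!succeeded && retriedCount < maxRetryCount)) {
      return Decision.RETRY;
    }
    return Decision.COMPLETE;
  }
}
```

For instance, a succeeded task with `maxRetryCount = 3` completes, while the same task with the "extended unlimited" sentinel is retried, which matches a long-running service that should be restarted even after a clean exit.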
In completeTask(), statusManager marks the task as TASK_COMPLETED, and then attemptToStop() is called.

    private void completeTask(TaskStatus taskStatus) throws Exception {
      String taskRoleName = taskStatus.getTaskRoleName();
      TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());

      LOGGER.logSplittedLines(Level.INFO,
          "%s: completeTask: TaskStatus:\n%s",
          taskLocator, WebCommon.toJson(taskStatus));

      statusManager.transitionTaskState(taskLocator, TaskState.TASK_COMPLETED);
      attemptToStop(taskStatus);
    }

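attemptToStop(taskStatus) actually attempts to stop the whole AM, so in effect it checks whether the task that just completed is the one that finishes the framework.
The code below distinguishes three cases:
1) The exit type is not SUCCEEDED and minFailedTaskCount is set. It counts the failed tasks of this taskRole, and if that count has reached minFailedTaskCount (the check is minFailedTaskCount <= count), the AM is stopped.
2) The exit type is SUCCEEDED and minSucceededTaskCount is set. It counts the succeeded tasks of this taskRole, and if that count has reached minSucceededTaskCount, the AM is stopped.
3) statusManager is asked whether all tasks are in their final state (finalState == TASK_COMPLETED). If so, the body of the if statement runs. In other words, if only this one task (and not all tasks) has reached the final state, this step does nothing and simply returns.
If all tasks are in the final state, it additionally counts the tasks that failed along the way in order to build the diagnostic message, and then stops the AM.
All three cases end by calling stopForApplicationCompletion().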

    private void attemptToStop(TaskStatus taskStatus) throws IOException {
      String taskRoleName = taskStatus.getTaskRoleName();
      ExitType exitType = taskStatus.getContainerExitType();
      TaskRoleApplicationCompletionPolicyDescriptor applicationCompletionPolicy =
          requestManager.getTaskRoleApplicationCompletionPolicy(taskRoleName);
      Integer minFailedTaskCount = applicationCompletionPolicy.getMinFailedTaskCount();
      Integer minSucceededTaskCount = applicationCompletionPolicy.getMinSucceededTaskCount();

      if (exitType != ExitType.SUCCEEDED && minFailedTaskCount != null) {
        List<TaskStatus> failedTaskStatuses = statusManager.getFailedTaskStatus(taskRoleName);
        if (minFailedTaskCount <= failedTaskStatuses.size()) {
          String applicationCompletionReason = String.format(
              "[%s]: FailedTaskCount %s has reached MinFailedTaskCount %s.",
              taskRoleName, failedTaskStatuses.size(), minFailedTaskCount);
          stopForApplicationCompletion(applicationCompletionReason, failedTaskStatuses);
        }
      }

      if (exitType == ExitType.SUCCEEDED && minSucceededTaskCount != null) {
        List<TaskStatus> succeededTaskStatuses = statusManager.getSucceededTaskStatus(taskRoleName);
        if (minSucceededTaskCount <= succeededTaskStatuses.size()) {
          String applicationCompletionReason = String.format(
              "[%s]: SucceededTaskCount %s has reached MinSucceededTaskCount %s.",
              taskRoleName, succeededTaskStatuses.size(), minSucceededTaskCount);
          stopForApplicationCompletion(applicationCompletionReason, succeededTaskStatuses);
        }
      }

      if (statusManager.isAllTaskInFinalState()) {
        int totalTaskCount = statusManager.getTaskCount();
        List<TaskStatus> failedTaskStatuses = statusManager.getFailedTaskStatus();
        String applicationCompletionReason = String.format(
            "All Tasks completed and no ApplicationCompletionPolicy has ever been triggered: " +
                "TotalTaskCount: %s, FailedTaskCount: %s.",
            totalTaskCount, failedTaskStatuses.size());
        stopForApplicationCompletion(applicationCompletionReason);
      }
    }

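The three cases can be restated as a small pure function that returns the reason to stop, or null when the AM should keep running. This is an illustrative sketch under assumptions: `CompletionPolicySketch` and its parameters are hypothetical, the counts are passed in as plain ints instead of being queried from statusManager, and it assumes the first matching case wins.

```java
// Hypothetical sketch of the application completion check.
class CompletionPolicySketch {
  // Returns a human-readable reason to stop the AM, or null to keep running.
  static String stopReason(boolean succeeded,
                           Integer minFailedTaskCount, Integer minSucceededTaskCount,
                           int failedInRole, int succeededInRole,
                           int finalStateTaskCount, int totalTaskCount) {
    // Case 1: enough tasks of this task role have failed.
    if (!succeeded && minFailedTaskCount != null && minFailedTaskCount <= failedInRole) {
      return String.format("FailedTaskCount %s has reached MinFailedTaskCount %s.",
          failedInRole, minFailedTaskCount);
    }
    // Case 2: enough tasks of this task role have succeeded.
    if (succeeded && minSucceededTaskCount != null && minSucceededTaskCount <= succeededInRole) {
      return String.format("SucceededTaskCount %s has reached MinSucceededTaskCount %s.",
          succeededInRole, minSucceededTaskCount);
    }
    // Case 3: no policy triggered, but every task is in its final state.
    if (finalStateTaskCount == totalTaskCount) {
      return String.format("All Tasks completed: TotalTaskCount: %s.", totalTaskCount);
    }
    return null;
  }
}
```

Note that the comparison is "has reached" (`<=`), so a threshold of 1 stops the AM as soon as the first matching task completes.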
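stopForApplicationCompletion() wraps the result in a StopStatus object and passes it to stop().
In stop(), yarnClient, statusManager, and requestManager are stopped, rmClient unregisters from the RM and is stopped last, and the process exits. This is where the AM's lifecycle ends.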

    // THREAD SAFE
    @Override
    public synchronized void stop(StopStatus stopStatus) {
      // Best Effort to stop Gracefully
      super.stop(stopStatus);

      AggregateException ae = new AggregateException();

      // Stop AM's SubServices
      // No need to stop nmClient, since it may be time consuming to stop all Containers, leave it for RM.
      // Since here is Best Effort, leave the GC work of zkStore and hdfsStore to LauncherService.
      try {
        if (yarnClient != null) {
          yarnClient.stop();
        }
      } catch (Exception e) {
        ae.addException(e);
      }

      try {
        if (statusManager != null) {
          statusManager.stop(stopStatus);
        }
      } catch (Exception e) {
        ae.addException(e);
      }

      try {
        if (requestManager != null) {
          requestManager.stop(stopStatus);
        }
      } catch (Exception e) {
        ae.addException(e);
      }

      // Stop rmClient at last, since there is no work left in current AM, and only then RM is
      // allowed to process the application, such as generate application's diagnostics.
      try {
        if (rmClient != null) {
          if (stopStatus.getNeedUnregister()) {
            LOGGER.logInfo("Unregistering %s to RM", serviceName);
            rmClient.unregisterApplicationMaster(
                stopStatus.getCode() == 0 ?
                    FinalApplicationStatus.SUCCEEDED :
                    FinalApplicationStatus.FAILED,
                stopStatus.getDiagnostics(), conf.getAmTrackingUrl());
          }
          rmClient.stop();
        }
      } catch (Exception e) {
        ae.addException(e);
      }

      if (ae.getExceptions().size() > 0) {
        LOGGER.logWarning(ae, "Failed to stop %s gracefully", serviceName);
      }

      LOGGER.logInfo("%s stopped", serviceName);
      System.exit(stopStatus.getCode());
    }

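The "best effort" shape of stop() (try each subservice, remember failures, keep going) can be isolated in a small helper. The sketch below is an illustration only: `BestEffortStopper` and `Stoppable` are hypothetical names, failures are collected in a plain list instead of the AggregateException used above, and the final System.exit is left out so the helper stays testable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: stop services in order without letting one failure block the rest.
class BestEffortStopper {
  interface Stoppable {
    void stop() throws Exception;
  }

  // Iterates in the map's order, so pass a LinkedHashMap to control the sequence
  // (for example, putting the RM client last). Returns every failure encountered.
  static List<Exception> stopAll(Map<String, Stoppable> services) {
    List<Exception> failures = new ArrayList<>();
    for (Map.Entry<String, Stoppable> entry : services.entrySet()) {
      try {
        entry.getValue().stop();
      } catch (Exception e) {
        failures.add(new Exception("Failed to stop " + entry.getKey(), e));
      }
    }
    return failures;
  }
}
```

The ordering matters for the same reason given in the source comment: the RM client goes last because unregistering tells the RM that the application attempt is finished.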
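As the flow above shows, zk plays a very important role. Through it, requestManager provides the AM with all the information needed to run tasks and keeps the AM in sync with the user's request. Through it, statusManager publishes the status of tasks and taskRoles to zk, where users can query it.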