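This article covers the NodeManager, focusing on its initialization.
As before, tracing through the start-yarn.sh script shows that the class ultimately launched is NodeManager: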
package org.apache.hadoop.yarn.server.nodemanager;
public static void main(String[] args) {
Thread.setDefaultUncaughtExceptionHandler(new YarnUncaughtExceptionHandler());
StringUtils.startupShutdownMessage(NodeManager.class, args, LOG);
NodeManager nodeManager = new NodeManager();
Configuration conf = new YarnConfiguration();
nodeManager.initAndStartNodeManager(conf, false);
}
Let's start from the main method:
NodeManager nodeManager = new NodeManager();
This line constructs the NodeManager. Its constructor goes through the parent class CompositeService and ultimately reaches the AbstractService constructor:
/**
* Construct the service.
* @param name service name
*/
public AbstractService(String name) {
this.name = name;
stateModel = new ServiceStateModel(name);
}
/**
* Implements the service state model.
*/
@Public
@Evolving
public class ServiceStateModel
/**
* Create the service state model in the {@link Service.STATE#NOTINITED}
* state.
*/
public ServiceStateModel(String name) {
this(name, Service.STATE.NOTINITED);
}
/** Constructed but not initialized */
NOTINITED(0, "NOTINITED"),
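During construction, the NodeManager is given a state model whose initial state is STATE.NOTINITED: constructed but not yet initialized. Service states deserve close attention here, because the core of YARN is an asynchronous processing mechanism driven by state transitions.
Next comes the configuration: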
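To make the idea of a state model concrete, here is a minimal sketch in plain Java. It is not Hadoop's ServiceStateModel: the class name StateModelSketch, the four-state enum and the transition table are assumptions chosen for illustration. The one behaviour it borrows from the code discussed in this article is that entering a state returns the previous state.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

/** Hypothetical sketch of a service state model; not Hadoop's actual class. */
final class StateModelSketch {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    // Allowed transitions, assumed here for illustration only.
    private static final Map<State, EnumSet<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.NOTINITED, EnumSet.of(State.NOTINITED, State.INITED, State.STOPPED));
        ALLOWED.put(State.INITED, EnumSet.of(State.INITED, State.STARTED, State.STOPPED));
        ALLOWED.put(State.STARTED, EnumSet.of(State.STARTED, State.STOPPED));
        ALLOWED.put(State.STOPPED, EnumSet.of(State.STOPPED));
    }

    // A freshly constructed model is "constructed but not initialized".
    private State state = State.NOTINITED;

    synchronized State getState() {
        return state;
    }

    /** Moves to the proposed state and returns the previous one, or throws if the move is illegal. */
    synchronized State enterState(State proposed) {
        if (!ALLOWED.get(state).contains(proposed)) {
            throw new IllegalStateException("Cannot go from " + state + " to " + proposed);
        }
        State old = state;
        state = proposed;
        return old;
    }

    public static void main(String[] args) {
        StateModelSketch model = new StateModelSketch();
        System.out.println(model.enterState(State.INITED));  // previous state: NOTINITED
        System.out.println(model.enterState(State.STARTED)); // previous state: INITED
    }
}
```

Returning the previous state lets a caller distinguish "I performed the transition" from "someone else already did", which is the kind of check an init method needs.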
Configuration conf = new YarnConfiguration();
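There is not much to say here; we will look at individual settings when they are used. YarnConfiguration defines a large number of parameters. The lines so far all look simple, so the real work must be in this call: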
nodeManager.initAndStartNodeManager(conf, false);
private void initAndStartNodeManager(Configuration conf, boolean hasToReboot) {
try {
// Remove the old hook if we are rebooting.
if (hasToReboot && null != nodeManagerShutdownHook) {
ShutdownHookManager.get().removeShutdownHook(nodeManagerShutdownHook);
}
nodeManagerShutdownHook = new CompositeServiceShutdownHook(this);
ShutdownHookManager.get().addShutdownHook(nodeManagerShutdownHook, SHUTDOWN_HOOK_PRIORITY);
// System exit should be called only when NodeManager is instantiated from
// main() function
this.shouldExitOnShutdownEvent = true;
this.init(conf);
this.start();
} catch (Throwable t) {
LOG.fatal("Error starting NodeManager", t);
System.exit(-1);
}
}
Sure enough. We skip the reboot check and the shutdown hook and go straight to the init method.
@Override
public void init(Configuration conf) {
if (conf == null) {
throw new ServiceStateException("Cannot initialize service "
+ getName() + ": null configuration");
}
if (isInState(STATE.INITED)) {
return;
}
synchronized (stateChangeLock) {
if (enterState(STATE.INITED) != STATE.INITED) {
setConfig(conf);
try {
serviceInit(config);
if (isInState(STATE.INITED)) {
//if the service ended up here during init,
//notify the listeners
notifyListeners();
}
} catch (Exception e) {
noteFailure(e);
ServiceOperations.stopQuietly(LOG, this);
throw ServiceStateException.convert(e);
}
}
}
}
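The call ends up in AbstractService.init, whose key step is serviceInit, a method that NodeManager overrides. Note that all the checks here pass, because our initial state is NOTINITED: isInState(STATE.INITED) is false, and enterState returns the previous state, NOTINITED, so the body runs.
Look at the argument passed to serviceInit: it is the config field held by AbstractService. Where does that field get its value?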
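The shape of init is a template method: the base class owns the checks, the locking and the state change, and the subclass only fills in a hook. The sketch below reproduces that shape in plain Java under simplifying assumptions. BaseService, CountingService and the Map-based configuration are hypothetical stand-ins, and the failure handling is reduced to a single state change.

```java
import java.util.HashMap;
import java.util.Map;

final class InitTemplateSketch {
    enum State { NOTINITED, INITED, STOPPED }

    /** Hypothetical base class mimicking the shape of init(); names are illustrative. */
    abstract static class BaseService {
        private final Object stateChangeLock = new Object();
        private volatile State state = State.NOTINITED;
        private Map<String, String> config;

        final void init(Map<String, String> conf) {
            if (conf == null) {
                throw new IllegalArgumentException("Cannot initialize service: null configuration");
            }
            if (state == State.INITED) {
                return; // fast path: already initialized
            }
            synchronized (stateChangeLock) {
                State old = state;
                if (old == State.STOPPED) {
                    throw new IllegalStateException("Cannot initialize a stopped service");
                }
                state = State.INITED;
                if (old != State.INITED) {
                    config = conf; // store the configuration before calling the hook
                    try {
                        serviceInit(config);
                    } catch (Exception e) {
                        state = State.STOPPED; // simplified failure handling
                        throw new IllegalStateException(e);
                    }
                }
            }
        }

        State getState() {
            return state;
        }

        /** Subclass hook, invoked at most once per successful init. */
        protected abstract void serviceInit(Map<String, String> conf) throws Exception;
    }

    /** Example subclass that only counts how often the hook runs. */
    static final class CountingService extends BaseService {
        int initCalls = 0;

        @Override
        protected void serviceInit(Map<String, String> conf) {
            initCalls++;
        }
    }

    public static void main(String[] args) {
        CountingService service = new CountingService();
        service.init(new HashMap<>());
        service.init(new HashMap<>()); // second call is a no-op
        System.out.println(service.initCalls + " " + service.getState());
    }
}
```

Calling init twice runs the hook once, which is why the second state check inside the lock matters.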
protected void serviceInit(Configuration conf) throws Exception {
if (conf != config) {
LOG.debug("Config has been overridden during init");
setConfig(conf);
}
}
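So this is how our YarnConfiguration becomes AbstractService's internal Configuration: init stores it with setConfig(conf) before calling serviceInit, and the base serviceInit above only calls setConfig again if a subclass passes up a different instance.
Now for NodeManager's own serviceInit method: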
@Override
protected void serviceInit(Configuration conf) throws Exception {
conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, true);
rmWorkPreservingRestartEnabled = conf.getBoolean(YarnConfiguration.RM_WORK_PRESERVING_RECOVERY_ENABLED,
YarnConfiguration.DEFAULT_RM_WORK_PRESERVING_RECOVERY_ENABLED);
initAndStartRecoveryStore(conf);
NMContainerTokenSecretManager containerTokenSecretManager = new NMContainerTokenSecretManager(conf, nmStore);
NMTokenSecretManagerInNM nmTokenSecretManager = new NMTokenSecretManagerInNM(nmStore);
recoverTokens(nmTokenSecretManager, containerTokenSecretManager);
this.aclsManager = new ApplicationACLsManager(conf);
ContainerExecutor exec = ReflectionUtils.newInstance(conf.getClass(YarnConfiguration.NM_CONTAINER_EXECUTOR,
DefaultContainerExecutor.class, ContainerExecutor.class), conf);
try {
exec.init();
} catch (IOException e) {
throw new YarnRuntimeException("Failed to initialize container executor", e);
}
DeletionService del = createDeletionService(exec);
addService(del);
// NodeManager level dispatcher
this.dispatcher = new AsyncDispatcher();
nodeHealthChecker = new NodeHealthCheckerService();
addService(nodeHealthChecker);
dirsHandler = nodeHealthChecker.getDiskHandler();
this.context = createNMContext(containerTokenSecretManager, nmTokenSecretManager, nmStore);
nodeStatusUpdater = createNodeStatusUpdater(context, dispatcher, nodeHealthChecker);
NodeResourceMonitor nodeResourceMonitor = createNodeResourceMonitor();
addService(nodeResourceMonitor);
containerManager = createContainerManager(context, exec, del, nodeStatusUpdater, this.aclsManager, dirsHandler);
addService(containerManager);
((NMContext) context).setContainerManager(containerManager);
WebServer webServer = createWebServer(context, containerManager.getContainersMonitor(), this.aclsManager,
dirsHandler);
addService(webServer);
((NMContext) context).setWebServer(webServer);
dispatcher.register(ContainerManagerEventType.class, containerManager);
dispatcher.register(NodeManagerEventType.class, this);
addService(dispatcher);
DefaultMetricsSystem.initialize("NodeManager");
// StatusUpdater should be added last so that it get started last
// so that we make sure everything is up before registering with RM.
addService(nodeStatusUpdater);
((NMContext) context).setNodeStatusUpdater(nodeStatusUpdater);
super.serviceInit(conf);
// TODO add local dirs to del
}
It is long, so let's go through it piece by piece.
First, initAndStartRecoveryStore:
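One pattern recurs throughout the method above: child services are registered with addService, and the final super.serviceInit(conf) call initializes them in registration order. The following sketch shows that composite idea in plain Java. The Service interface, Named and Composite are hypothetical names, not Hadoop classes; stopping in reverse order mirrors how a composite normally unwinds its children.

```java
import java.util.ArrayList;
import java.util.List;

final class CompositeSketch {
    /** Hypothetical minimal lifecycle interface. */
    interface Service {
        void init();
        void start();
        void stop();
    }

    /** Leaf service that records each lifecycle call in a shared log. */
    static final class Named implements Service {
        private final String name;
        private final List<String> log;

        Named(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        public void init() { log.add("init " + name); }
        public void start() { log.add("start " + name); }
        public void stop() { log.add("stop " + name); }
    }

    /** Parent that forwards lifecycle calls to its children. */
    static final class Composite implements Service {
        private final List<Service> children = new ArrayList<>();

        void addService(Service child) {
            children.add(child);
        }

        public void init() {
            for (Service child : children) {
                child.init(); // registration order
            }
        }

        public void start() {
            for (Service child : children) {
                child.start(); // the child added last starts last
            }
        }

        public void stop() {
            for (int i = children.size() - 1; i >= 0; i--) {
                children.get(i).stop(); // reverse order
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Composite parent = new Composite();
        parent.addService(new Named("dispatcher", log));
        parent.addService(new Named("statusUpdater", log));
        parent.init();
        parent.start();
        parent.stop();
        System.out.println(log);
    }
}
```

This is why the order of addService calls matters: a service that must come up after everything else, such as the status updater, is simply registered last.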
private void initAndStartRecoveryStore(Configuration conf) throws IOException {
boolean recoveryEnabled = conf.getBoolean(YarnConfiguration.NM_RECOVERY_ENABLED,
YarnConfiguration.DEFAULT_NM_RECOVERY_ENABLED);
if (recoveryEnabled) {
FileSystem recoveryFs = FileSystem.getLocal(conf);
String recoveryDirName = conf.get(YarnConfiguration.NM_RECOVERY_DIR);
if (recoveryDirName == null) {
throw new IllegalArgumentException(
"Recovery is enabled but " + YarnConfiguration.NM_RECOVERY_DIR + " is not set.");
}
Path recoveryRoot = new Path(recoveryDirName);
recoveryFs.mkdirs(recoveryRoot, new FsPermission((short) 0700));
nmStore = new NMLeveldbStateStoreService();
} else {
nmStore = new NMNullStateStoreService();
}
nmStore.init(conf);
nmStore.start();
}
By default recoveryEnabled is false, so we go straight to the else branch. The init call there eventually reaches the store's own serviceInit method:
/** Initialize the state storage */
@Override
public void serviceInit(Configuration conf) throws IOException {
initStorage(conf);
}
For the null store, initStorage does nothing, and the startStorage method called from start has no implementation either.
Moving on:
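The else branch is an instance of the null object pattern: when recovery is disabled, the rest of the code still talks to a store, but the store silently does nothing. A minimal sketch of that choice follows. StateStore, NullStore, MapStore and create are hypothetical names, and MapStore is only an in-memory stand-in for a persistent store.

```java
import java.util.HashMap;
import java.util.Map;

final class RecoveryStoreSketch {
    /** Hypothetical store interface; method names are illustrative only. */
    interface StateStore {
        boolean canRecover();
        void put(String key, String value);
        String get(String key);
    }

    /** Null object: accepts writes and remembers nothing. */
    static final class NullStore implements StateStore {
        public boolean canRecover() { return false; }
        public void put(String key, String value) { /* intentionally empty */ }
        public String get(String key) { return null; }
    }

    /** In-memory stand-in for a persistent store. */
    static final class MapStore implements StateStore {
        private final Map<String, String> data = new HashMap<>();

        public boolean canRecover() { return true; }
        public void put(String key, String value) { data.put(key, value); }
        public String get(String key) { return data.get(key); }
    }

    /** Picks the store the same way the branch above does: by a flag, then a required directory. */
    static StateStore create(boolean recoveryEnabled, String recoveryDir) {
        if (!recoveryEnabled) {
            return new NullStore();
        }
        if (recoveryDir == null) {
            throw new IllegalArgumentException("Recovery is enabled but no directory is set");
        }
        return new MapStore();
    }

    public static void main(String[] args) {
        StateStore store = create(false, null);
        store.put("container_1", "RUNNING"); // accepted but discarded
        System.out.println(store.canRecover() + " " + store.get("container_1"));
    }
}
```

Callers never need an `if (recoveryEnabled)` check of their own, which keeps the recovery feature out of the main code paths.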
ContainerExecutor exec = ReflectionUtils.newInstance(conf.getClass(YarnConfiguration.NM_CONTAINER_EXECUTOR,
DefaultContainerExecutor.class, ContainerExecutor.class), conf);
try {
exec.init();
} catch (IOException e) {
throw new YarnRuntimeException("Failed to initialize container executor", e);
}
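Containers come up constantly in the interaction between the RM and the NM. On the NM side, the component that actually launches, signals and cleans up container processes is the ContainerExecutor. Its implementation class is read from the configuration, and the default is DefaultContainerExecutor. (How much resource the node offers is not decided here; that is read from the configuration in the NodeStatusUpdater, which we reach further down.)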
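The executor is pluggable: a class name comes from configuration, falls back to a default, and is instantiated by reflection. The sketch below shows the same idea with java.util.Properties and plain reflection instead of Hadoop's Configuration and ReflectionUtils. The property key sketch.executor.class and the Executor types are assumptions made up for this example.

```java
import java.util.Properties;

final class PluggableExecutorSketch {
    /** Hypothetical plug-in interface. */
    interface Executor {
        String describe();
    }

    static final class DefaultExecutor implements Executor {
        public String describe() { return "default"; }
    }

    static final class CustomExecutor implements Executor {
        public String describe() { return "custom"; }
    }

    // Hypothetical configuration key, not a real YARN property.
    static final String EXECUTOR_KEY = "sketch.executor.class";

    /** Reads the class name from conf, falls back to the default, and instantiates it. */
    static Executor newExecutor(Properties conf) {
        String className = conf.getProperty(EXECUTOR_KEY, DefaultExecutor.class.getName());
        try {
            // asSubclass rejects classes that do not implement the expected interface.
            Class<? extends Executor> clazz = Class.forName(className).asSubclass(Executor.class);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot create executor " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(newExecutor(conf).describe());
        conf.setProperty(EXECUTOR_KEY, CustomExecutor.class.getName());
        System.out.println(newExecutor(conf).describe());
    }
}
```

With an empty configuration the default implementation is used; setting the key swaps the implementation without touching the calling code.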
// NodeManager level dispatcher
this.dispatcher = new AsyncDispatcher();
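The comment is explicit: this is the NodeManager-level dispatcher, which implies there are dispatchers at other levels as well. This one handles the events that the NodeManager itself has to process.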
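For readers who have not met an asynchronous dispatcher before, the following is a rough sketch of the idea: events go into a queue, a single worker thread takes them out, and each event is routed to the handler registered for its enum type. It is a simplified illustration, not AsyncDispatcher itself. DispatcherSketch, NodeEventType and the Consumer-based handlers are assumptions.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class DispatcherSketch {
    /** Hypothetical event type used only in this example. */
    enum NodeEventType { SHUTDOWN, RESYNC }

    private final BlockingQueue<Enum<?>> queue = new LinkedBlockingQueue<>();
    private final Map<Class<?>, Consumer<Enum<?>>> handlers = new ConcurrentHashMap<>();
    private final Thread worker = new Thread(this::loop, "dispatcher-sketch");

    /** Associates one handler with all events of the given enum type. */
    void register(Class<? extends Enum<?>> eventType, Consumer<Enum<?>> handler) {
        handlers.put(eventType, handler);
    }

    void start() {
        worker.setDaemon(true);
        worker.start();
    }

    /** Returns immediately; the handler runs later on the worker thread. */
    void dispatch(Enum<?> event) {
        queue.add(event);
    }

    void stop() {
        worker.interrupt();
    }

    private void loop() {
        try {
            while (true) {
                Enum<?> event = queue.take();
                Consumer<Enum<?>> handler = handlers.get(event.getDeclaringClass());
                if (handler != null) {
                    handler.accept(event);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop() was called
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DispatcherSketch dispatcher = new DispatcherSketch();
        CountDownLatch done = new CountDownLatch(1);
        dispatcher.register(NodeEventType.class, event -> {
            System.out.println("handled " + event);
            done.countDown();
        });
        dispatcher.start();
        dispatcher.dispatch(NodeEventType.RESYNC);
        done.await(1, TimeUnit.SECONDS);
        dispatcher.stop();
    }
}
```

The sender never blocks on the handler, which is what makes the state-machine style of processing asynchronous.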
nodeHealthChecker = new NodeHealthCheckerService();
addService(nodeHealthChecker);
dirsHandler = nodeHealthChecker.getDiskHandler();
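nodeHealthChecker monitors the health of the NM node. The addService call registers it as a child service so that it is initialized together with the others at the end, so let's look at its serviceInit method, followed by that of the NodeHealthScriptRunner it may add: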
@Override
protected void serviceInit(Configuration conf) throws Exception {
if (NodeHealthScriptRunner.shouldRun(conf)) {
nodeHealthScriptRunner = new NodeHealthScriptRunner();
addService(nodeHealthScriptRunner);
}
addService(dirsHandler);
super.serviceInit(conf);
}
/*
* Method which initializes the values for the script path and interval time.
*/
@Override
protected void serviceInit(Configuration conf) throws Exception {
this.conf = conf;
this.nodeHealthScript =
conf.get(YarnConfiguration.NM_HEALTH_CHECK_SCRIPT_PATH);
this.intervalTime = conf.getLong(YarnConfiguration.NM_HEALTH_CHECK_INTERVAL_MS,
YarnConfiguration.DEFAULT_NM_HEALTH_CHECK_INTERVAL_MS);
this.scriptTimeout = conf.getLong(
YarnConfiguration.NM_HEALTH_CHECK_SCRIPT_TIMEOUT_MS,
YarnConfiguration.DEFAULT_NM_HEALTH_CHECK_SCRIPT_TIMEOUT_MS);
String[] args = conf.getStrings(YarnConfiguration.NM_HEALTH_CHECK_SCRIPT_OPTS,
new String[] {});
timer = new NodeHealthMonitorExecutor(args);
super.serviceInit(conf);
}
public NodeHealthMonitorExecutor(String[] args) {
ArrayList<String> execScript = new ArrayList<String>();
execScript.add(nodeHealthScript);
if (args != null) {
execScript.addAll(Arrays.asList(args));
}
shexec = new ShellCommandExecutor(execScript
.toArray(new String[execScript.size()]), null, null, scriptTimeout);
}
Tracing this through, we find that when a health script is configured, it is run on a timer to check the NM's health periodically.
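Two small pieces of this can be sketched in isolation: assembling the command line from the script path plus its options, and running a check at a fixed interval. The example below uses a ScheduledExecutorService as a stand-in for the timer and a plain Runnable as a stand-in for the script execution, so nothing is actually executed on the shell. HealthCheckSketch, buildCommand, startPeriodic and the script path are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class HealthCheckSketch {
    /** Script path first, then any configured options. */
    static List<String> buildCommand(String script, String[] args) {
        List<String> command = new ArrayList<>();
        command.add(script);
        if (args != null) {
            command.addAll(Arrays.asList(args));
        }
        return command;
    }

    /** Runs the check repeatedly; the caller must shut the returned executor down. */
    static ScheduledExecutorService startPeriodic(Runnable check, long intervalMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "health-check-sketch");
            thread.setDaemon(true); // do not keep the JVM alive for the checker
            return thread;
        });
        timer.scheduleAtFixedRate(check, 0, intervalMs, TimeUnit.MILLISECONDS);
        return timer;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder path, purely illustrative.
        System.out.println(buildCommand("/path/to/health.sh", new String[] {"-v"}));
        ScheduledExecutorService timer = startPeriodic(() -> System.out.println("checking"), 50);
        Thread.sleep(120);
        timer.shutdownNow();
    }
}
```

In a real checker the Runnable would launch the script with a timeout and translate its exit status and output into a health report.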
this.context = createNMContext(containerTokenSecretManager, nmTokenSecretManager, nmStore);
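This line looks simple, but it gathers a lot of state in one place, and the object it constructs is important. Let's look at the constructor: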
public NMContext(NMContainerTokenSecretManager containerTokenSecretManager,
NMTokenSecretManagerInNM nmTokenSecretManager, LocalDirsHandlerService dirsHandler,
ApplicationACLsManager aclsManager, NMStateStoreService stateStore) {
this.containerTokenSecretManager = containerTokenSecretManager;
this.nmTokenSecretManager = nmTokenSecretManager;
this.dirsHandler = dirsHandler;
this.aclsManager = aclsManager;
this.nodeHealthStatus.setIsNodeHealthy(true);
this.nodeHealthStatus.setHealthReport("Healthy");
this.nodeHealthStatus.setLastHealthReportTime(System.currentTimeMillis());
this.stateStore = stateStore;
}
When discussing the RM's structure we met rmContext, the RM's central steward. NMContext plays the same role for each NM.
nodeStatusUpdater = createNodeStatusUpdater(context, dispatcher, nodeHealthChecker);
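Hadoop's naming is clear: you can tell at a glance that this class periodically reports the NM's node status. Since this service is eventually added to the service list, let's look at its initialization logic: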
@Override
protected void serviceInit(Configuration conf) throws Exception {
int memoryMb = conf.getInt(YarnConfiguration.NM_PMEM_MB, YarnConfiguration.DEFAULT_NM_PMEM_MB);
float vMemToPMem = conf.getFloat(YarnConfiguration.NM_VMEM_PMEM_RATIO,
YarnConfiguration.DEFAULT_NM_VMEM_PMEM_RATIO);
int virtualMemoryMb = (int) Math.ceil(memoryMb * vMemToPMem);
int virtualCores = conf.getInt(YarnConfiguration.NM_VCORES, YarnConfiguration.DEFAULT_NM_VCORES);
this.totalResource = Resource.newInstance(memoryMb, virtualCores);
metrics.addResource(totalResource);
this.tokenKeepAliveEnabled = isTokenKeepAliveEnabled(conf);
this.tokenRemovalDelayMs = conf.getInt(YarnConfiguration.RM_NM_EXPIRY_INTERVAL_MS,
YarnConfiguration.DEFAULT_RM_NM_EXPIRY_INTERVAL_MS);
this.minimumResourceManagerVersion = conf.get(YarnConfiguration.NM_RESOURCEMANAGER_MINIMUM_VERSION,
YarnConfiguration.DEFAULT_NM_RESOURCEMANAGER_MINIMUM_VERSION);
// Default duration to track stopped containers on nodemanager is 10Min.
// This should not be assigned very large value as it will remember all the
// containers stopped during that time.
durationToTrackStoppedContainers = conf.getLong(YARN_NODEMANAGER_DURATION_TO_TRACK_STOPPED_CONTAINERS, 600000);
if (durationToTrackStoppedContainers < 0) {
String message = "Invalid configuration for " + YARN_NODEMANAGER_DURATION_TO_TRACK_STOPPED_CONTAINERS
+ " default " + "value is 10Min(600000).";
LOG.error(message);
throw new YarnException(message);
}
if (LOG.isDebugEnabled()) {
LOG.debug(YARN_NODEMANAGER_DURATION_TO_TRACK_STOPPED_CONTAINERS + " :" + durationToTrackStoppedContainers);
}
super.serviceInit(conf);
LOG.info("Initialized nodemanager for " + nodeId + ":" + " physical-memory=" + memoryMb + " virtual-memory="
+ virtualMemoryMb + " virtual-cores=" + virtualCores);
}
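Many values here are loaded from YarnConfiguration. This explains why, by default, an NM offers only 8 GB of memory to containers, why the ratio of virtual to physical memory is 2.1, and why 8 virtual cores are used. These are the resources the NodeManager reports as its own, that is, the resources that can be handed out to containers.
Next is the part used to monitor NM resources; let's check its serviceInit method: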
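As a worked example of the arithmetic in this method, the snippet below plugs in the default values mentioned above (8192 MB of physical memory and a ratio of 2.1) and applies the same expression, `(int) Math.ceil(memoryMb * vMemToPMem)`. The class and method names are made up for the example.

```java
final class ResourceMathSketch {
    /** Same expression as in the serviceInit method discussed above. */
    static int virtualMemoryMb(int memoryMb, float vMemToPMem) {
        return (int) Math.ceil(memoryMb * vMemToPMem);
    }

    public static void main(String[] args) {
        int memoryMb = 8 * 1024;   // 8 GB expressed in MB
        float vMemToPMem = 2.1f;   // virtual-to-physical memory ratio
        // 8192 * 2.1 is about 17203.2, and the ceiling rounds it up to 17204.
        System.out.println(virtualMemoryMb(memoryMb, vMemToPMem)); // prints 17204
    }
}
```

Note that in the method above the virtual memory figure is only logged; the Resource object itself is built from the physical memory and the virtual core count.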
NodeResourceMonitor nodeResourceMonitor = createNodeResourceMonitor();
addService(nodeResourceMonitor);
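I half suspected a bug here, because this class does not seem to be used for anything. It is added as a child service and so goes through init and start like the others, but in this version its implementation appears to be an empty placeholder with no logic of its own.
Next, the container manager: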
containerManager = createContainerManager(context, exec, del, nodeStatusUpdater, this.aclsManager, dirsHandler);
addService(containerManager);
((NMContext) context).setContainerManager(containerManager);
Let's look at its initialization, picking out the key code:
// ContainerManager level dispatcher.
dispatcher = new AsyncDispatcher();
It has its own dispatcher, which handles the following events:
dispatcher.register(ContainerEventType.class, new ContainerEventDispatcher());
dispatcher.register(ApplicationEventType.class, new ApplicationEventDispatcher());
dispatcher.register(LocalizationEventType.class, rsrcLocalizationSrvc);
dispatcher.register(AuxServicesEventType.class, auxiliaryServices);
dispatcher.register(ContainersMonitorEventType.class, containersMonitor);
dispatcher.register(ContainersLauncherEventType.class, containersLauncher);
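Requests to start and stop containers on this node are handled by this class. The most important part is launching a container, and that code lives in ContainersLauncher; we will not go into it here.
Most of this class's important setup is already done in its constructor, so we skip its serviceInit method. Next comes the web server: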
WebServer webServer = createWebServer(context, containerManager.getContainersMonitor(), this.aclsManager,
dirsHandler);
addService(webServer);
The NM also has its own web UI for monitoring, and this is where it is created:
public WebServer(Context nmContext, ResourceView resourceView, ApplicationACLsManager aclsManager,
LocalDirsHandlerService dirsHandler) {
super(WebServer.class.getName());
this.nmContext = nmContext;
this.nmWebApp = new NMWebApp(resourceView, aclsManager, dirsHandler);
}
Its serviceInit is empty, so we skip it.
dispatcher.register(ContainerManagerEventType.class, containerManager);
dispatcher.register(NodeManagerEventType.class, this);
addService(dispatcher);
DefaultMetricsSystem.initialize("NodeManager");
// StatusUpdater should be added last so that it get started last
// so that we make sure everything is up before registering with RM.
addService(nodeStatusUpdater);
((NMContext) context).setNodeStatusUpdater(nodeStatusUpdater);
The remaining code is shown above and is not analyzed further.
The next article will cover how these services are started.