在关于Yarn那些事的博客里,介绍的主要是针对任务提交的一个动态流程说明,而其中牵涉到的一些细节问题,必须通过Resourcemanager的启动和NodeManager的启动,来更好的说明。
而本系列,就详细说说ResourceManager启动过程中,都发生了什么。
我们都知道,Yarn的启动脚本是start-yan.sh,我们就从这个脚本开始,琢磨琢磨。
"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR start resourcemanager
脚本里这句话,指向了本目录下的yarn-daemon.sh脚本,命令参数指定了resourcemanager,接着看yarn-daemon.sh脚本:
nohup nice -n $YARN_NICENESS "$HADOOP_YARN_HOME"/bin/yarn --config $YARN_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
这句话很关键,交代了我们实际的启动脚本是bin目录下的yarn,我们看下:
elif [ "$COMMAND" = "resourcemanager" ] ; then
CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/rm-config/log4j.properties
CLASSPATH=${CLASSPATH}:"$HADOOP_YARN_HOME/$YARN_DIR/timelineservice/*"
CLASSPATH=${CLASSPATH}:"$HADOOP_YARN_HOME/$YARN_DIR/timelineservice/lib/*"
CLASS='org.apache.hadoop.yarn.server.resourcemanager.ResourceManager'
YARN_OPTS="$YARN_OPTS $YARN_RESOURCEMANAGER_OPTS"
if [ "$YARN_RESOURCEMANAGER_HEAPSIZE" != "" ]; then
JAVA_HEAP_MAX="-Xmx""$YARN_RESOURCEMANAGER_HEAPSIZE""m"
fi
终于顺利找到了根基,原来根据我们指定的脚本,找到的是ResourceManager这个类来启动的,下面就看看这个类。
先来看下注释:
/**
* The ResourceManager is the main class that is a set of components. "I am the
* ResourceManager. All your resources belong to us..."
*
*/
@SuppressWarnings("unchecked")
public class ResourceManager extends CompositeService implements Recoverable
格外霸气,管理整个集群内所有的资源,并且继承了CompositeService类,这是一个服务类,不多介绍了,主要提供了一些服务初始化和启动的方法,供子类使用。
从ResourceManager的成员变量开始看起:
protected ClientToAMTokenSecretManagerInRM clientToAMSecretManager = new ClientToAMTokenSecretManagerInRM();
protected RMContainerTokenSecretManager containerTokenSecretManager;
protected NMTokenSecretManagerInRM nmTokenSecretManager;
protected AMRMTokenSecretManager amRmTokenSecretManager;
private Dispatcher rmDispatcher;
protected ResourceScheduler scheduler;
private ClientRMService clientRM;
protected ApplicationMasterService masterService;
private ApplicationMasterLauncher applicationMasterLauncher;
private AdminService adminService;
private ContainerAllocationExpirer containerAllocationExpirer;
protected NMLivelinessMonitor nmLivelinessMonitor;
protected NodesListManager nodesListManager;
private EventHandler<SchedulerEvent> schedulerDispatcher;
protected RMAppManager rmAppManager;
protected ApplicationACLsManager applicationACLsManager;
protected QueueACLsManager queueACLsManager;
protected RMDelegationTokenSecretManager rmDTSecretManager;
private DelegationTokenRenewer delegationTokenRenewer;
private WebApp webApp;
protected RMContext rmContext;
protected ResourceTrackerService resourceTracker;
private boolean recoveryEnabled;
很多,具体可以参照每个类的用法,在此不多说,其中牵涉到Application较多的,比如RMAppManager,RMContext等,需要细看,这是废话,每个都值得研究。
看Main方法:
Configuration conf = new YarnConfiguration();
ResourceManager resourceManager = new ResourceManager();
ShutdownHookManager.get().addShutdownHook(new CompositeServiceShutdownHook(resourceManager),
SHUTDOWN_HOOK_PRIORITY);
setHttpPolicy(conf);
resourceManager.init(conf);
resourceManager.start();
直接把目光聚焦在这里,我们重点研究下服务的初始化和启动。
this.rmDispatcher = createDispatcher();
addIfService(this.rmDispatcher);
初始化的第一个关键点,创建调度器,这是ResourceManager异步调度的关键,看看这个方法:很简单:
protected Dispatcher createDispatcher() {
return new AsyncDispatcher();
}
很明显,这是个异步调度器,看看这个类的注释和初始化步骤:
/**
* Dispatches {@link Event}s in a separate thread. Currently only single thread
* does that. Potentially there could be multiple channels for each event type
* class and a thread pool can be used to dispatch the events.
*/
@SuppressWarnings("rawtypes")
@Public
@Evolving
public class AsyncDispatcher extends AbstractService implements Dispatcher
这也是一个需要启动的服务,用于事件的调度:
public AsyncDispatcher() {
this(new LinkedBlockingQueue<Event>());
}
public AsyncDispatcher(BlockingQueue<Event> eventQueue) {
super("Dispatcher");
this.eventQueue = eventQueue;
this.eventDispatchers = new HashMap<Class<? extends Enum>, EventHandler>();
}
注意,Dispatcher内部封装了一个阻塞队列,运行过程中会把事件都放在这个池子里,并进行调度处理,同时定义了一个eventDispatchers,后续代码更容易看懂这个map的作用:
我们仔细看看AsyncDispatcher的服务初始化代码:
@Override
protected void serviceInit(Configuration conf) throws Exception {
this.exitOnDispatchException = conf.getBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY,
Dispatcher.DEFAULT_DISPATCHER_EXIT_ON_ERROR);
super.serviceInit(conf);
}
目前来说,AsyncDispatcher的初始化代码先到这儿,我们继续看ResourceManager的服务初始化代码:
this.amRmTokenSecretManager = createAMRMTokenSecretManager(conf);
我们看到内部有个这个成员变量:
/**
* AMRM-tokens are per ApplicationAttempt. If users redistribute their
* tokens, it is their headache, god save them. I mean you are not supposed to
* distribute keys to your vault, right? Anyways, ResourceManager saves each
* token locally in memory till application finishes and to a store for restart,
* so no need to remember master-keys even after rolling them.
*/
public class AMRMTokenSecretManager extends
SecretManager<AMRMTokenIdentifier>
看注释,清晰明了,每次提交一个ApplicationAttempt时候,都不用再递交自己的token了,实际上还是一个身份验证工具(自己的理解)。
this.containerAllocationExpirer = new ContainerAllocationExpirer(this.rmDispatcher);
addService(this.containerAllocationExpirer);
看下这两句话ContainerAllocationExpirer,并且加到了serviceList中,用于最后的初始化,我们看看这个是什么作用,注释非常简单,还是留在动态提交ApplicationMaster的时候分析吧,其实主要是用来判断分配的container是否在规定时间内得到启动的:
AMLivelinessMonitor amLivelinessMonitor = createAMLivelinessMonitor();
addService(amLivelinessMonitor);
看这个AMLiveLinessMonitor,顾名思义,是用来检查ApplicationLivenessMonitor是否存活的,其继承了这个类:
/**
* A simple liveliness monitor with which clients can register, trust the
* component to monitor liveliness, get a call-back on expiry and then finally
* unregister.
*/
@Public
@Evolving
public abstract class AbstractLivelinessMonitor<O> extends AbstractService
同时,ContainerAllocationMonitor也继承了这个类,就是用于让客户端监控的:
AMLivelinessMonitor amFinishingMonitor = createAMLivelinessMonitor();
addService(amFinishingMonitor);
下面初始化了一个一样的AMLiveLinessMonitor,但是变量名不同,同样是起监控作用的:
接下来看RMStateStore的初始化:
boolean isRecoveryEnabled = conf.getBoolean(YarnConfiguration.RECOVERY_ENABLED,
YarnConfiguration.DEFAULT_RM_RECOVERY_ENABLED);
RMStateStore rmStore = null;
if (isRecoveryEnabled) {
recoveryEnabled = true;
rmStore = RMStateStoreFactory.getStore(conf);
} else {
recoveryEnabled = false;
rmStore = new NullRMStateStore();
}
对于大部分成员变量的初始化不予多说,先看下RMStateStore的初始化,在我们默认配置下:isRecoveryEnabled为false,所以创建了一个空的RMStateStore,即NullRMStateStore,对于ResourceManager的状态进行存储:
rmStore.init(conf);
rmStore.setRMDispatcher(rmDispatcher);
这里面注意下,rmStore内部的dispatcher与RM的dispatcher不是同一个,代码如下:
private Dispatcher rmDispatcher;
AsyncDispatcher dispatcher;
RMStateStore内部有两个调度器,rmDispatcher是RM的调度器,而dispatcher则是其内部用来调度事件的调度器,对于RMStateStore的init代码有些绕,仔细看下:
@Override
public void init(Configuration conf) {
if (conf == null) {
throw new ServiceStateException("Cannot initialize service " + getName() + ": null configuration");
}
if (isInState(STATE.INITED)) {
return;
}
synchronized (stateChangeLock) {
if (enterState(STATE.INITED) != STATE.INITED) {
setConfig(conf);
try {
serviceInit(config);
if (isInState(STATE.INITED)) {
// if the service ended up here during init,
// notify the listeners
notifyListeners();
}
} catch (Exception e) {
noteFailure(e);
ServiceOperations.stopQuietly(LOG, this);
throw ServiceStateException.convert(e);
}
}
}
}
其实际调用的是AbstractService的init方法,其中调用到了serviceInit方法,而这个方法,则是RMStateStore的方法:
public synchronized void serviceInit(Configuration conf) throws Exception {
// create async handler
dispatcher = new AsyncDispatcher();
dispatcher.init(conf);
dispatcher.register(RMStateStoreEventType.class, new ForwardingEventHandler());
initInternal(conf);
}
很清楚看到了内部封装了一个自己的dispatcher,用于调度RMStateStoreEventType类型的事件:
下面接着看:
this.rmContext = new RMContextImpl(this.rmDispatcher, rmStore, this.containerAllocationExpirer,
amLivelinessMonitor, amFinishingMonitor, delegationTokenRenewer, this.amRmTokenSecretManager,
this.containerTokenSecretManager, this.nmTokenSecretManager, this.clientToAMSecretManager);
这是重头戏,我们必须看看这个拥有如此多成员变量的RMContextImpl到底是什么:
/**
* Context of the ResourceManager.
*/
public interface RMContext {
这是RMContextImpl父类的注释,是ResourceManager的上下文,就相当于管家了,基本大权在握,是ResourceManager的心腹。
接下来是这儿:
this.nodesListManager = new NodesListManager(this.rmContext);
其实就相当于告诉了RM,这里到底有多少个子节点可供使用,而且给新建的NodesListManager内部也安插了RM的心腹,即RMContextImpl。
this.rmDispatcher.register(NodesListManagerEventType.class, this.nodesListManager);
看到这儿,我们又得回去看AsyncDispatcher中的一个register方法:
@SuppressWarnings("unchecked")
@Override
public void register(Class<? extends Enum> eventType, EventHandler handler) {
/* check to see if we have a listener registered */
EventHandler<Event> registeredHandler = (EventHandler<Event>) eventDispatchers.get(eventType);
LOG.info("Registering " + eventType + " for " + handler.getClass());
if (registeredHandler == null) {
eventDispatchers.put(eventType, handler);
} else if (!(registeredHandler instanceof MultiListenerHandler)) {
/* for multiple listeners of an event add the multiple listener handler */
MultiListenerHandler multiHandler = new MultiListenerHandler();
multiHandler.addHandler(registeredHandler);
multiHandler.addHandler(handler);
eventDispatchers.put(eventType, multiHandler);
} else {
/* already a multilistener, just add to it */
MultiListenerHandler multiHandler = (MultiListenerHandler) registeredHandler;
multiHandler.addHandler(handler);
}
}
仔细看来,其实就相当于eventDispatcher的充实,把各类事件即相应的处理,都送给eventdispatcher,方便后续出现类似事件,eventdispatcher能够迅速找到对应的对象来进行处理。
这里,就相当于把对应于NodeManager出现的事情,都交给了NodeListManager,这就是你的工作了。
public enum NodesListManagerEventType {
NODE_USABLE, NODE_UNUSABLE
}
如果出现了节点可用和不可用的事情,你就得迅速予以处理了。
// Initialize the scheduler
this.scheduler = createScheduler();
this.schedulerDispatcher = createSchedulerEventDispatcher();
addIfService(this.schedulerDispatcher);
this.rmDispatcher.register(SchedulerEventType.class, this.schedulerDispatcher);
接下来,创建了一个调度器,这一段得仔细看看了,因为yarn中的调度器非常重要,我们作业的初始化,都离不开它:
protected ResourceScheduler createScheduler() {
String schedulerClassName = conf.get(YarnConfiguration.RM_SCHEDULER, YarnConfiguration.DEFAULT_RM_SCHEDULER);
LOG.info("Using Scheduler: " + schedulerClassName);
try {
Class<?> schedulerClazz = Class.forName(schedulerClassName);
if (ResourceScheduler.class.isAssignableFrom(schedulerClazz)) {
return (ResourceScheduler) ReflectionUtils.newInstance(schedulerClazz, this.conf);
} else {
throw new YarnRuntimeException("Class: " + schedulerClassName + " not instance of "
+ ResourceScheduler.class.getCanonicalName());
}
} catch (ClassNotFoundException e) {
throw new YarnRuntimeException("Could not instantiate Scheduler: " + schedulerClassName, e);
}
}
我这里的代码是hadoop 2.2.0,在默认配置下,配置的调度器是:
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
而接着,我们给调度器自己定义了一个调度器,换句话说,对于调度器接受的事件类型,其会继续调度给其他的handler去处理
最后,把这个调度器也注册给了RM内部的调度器:
protected EventHandler<SchedulerEvent> createSchedulerEventDispatcher() {
return new SchedulerEventDispatcher(this.scheduler);
}
this.rmDispatcher.register(SchedulerEventType.class, this.schedulerDispatcher);
接着,我们的异步调度器又加入了三个成分,负责对Application相关事件,ApplicationAttempt事件,RMNode事件进行调度:
// Register event handler for RmAppEvents
this.rmDispatcher.register(RMAppEventType.class, new ApplicationEventDispatcher(this.rmContext));
// Register event handler for RmAppAttemptEvents
this.rmDispatcher.register(RMAppAttemptEventType.class, new ApplicationAttemptEventDispatcher(this.rmContext));
// Register event handler for RmNodes
this.rmDispatcher.register(RMNodeEventType.class, new NodeEventDispatcher(this.rmContext));
接着,我们看下这个:
this.resourceTracker = createResourceTrackerService();
addService(resourceTracker);
从官方文档中,我们知道RM实现了对于系统全部资源的管控,而这个管控是通过RPC来实现的,NodeManager调用ResourceTracker内的方法来提交自己的资源,而RM端有相应的处理,返回命令,让NM予以执行,而ResourceTrackerSerive就是在此处初始化的:
@Override
protected void serviceInit(Configuration conf) throws Exception {
resourceTrackerAddress = conf.getSocketAddr(YarnConfiguration.RM_RESOURCE_TRACKER_ADDRESS,
YarnConfiguration.DEFAULT_RM_RESOURCE_TRACKER_ADDRESS,
YarnConfiguration.DEFAULT_RM_RESOURCE_TRACKER_PORT);
RackResolver.init(conf);
nextHeartBeatInterval = conf.getLong(YarnConfiguration.RM_NM_HEARTBEAT_INTERVAL_MS,
YarnConfiguration.DEFAULT_RM_NM_HEARTBEAT_INTERVAL_MS);
if (nextHeartBeatInterval <= 0) {
throw new YarnRuntimeException("Invalid Configuration. " + YarnConfiguration.RM_NM_HEARTBEAT_INTERVAL_MS
+ " should be larger than 0.");
}
minAllocMb = conf.getInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);
minAllocVcores = conf.getInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES,
YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES);
super.serviceInit(conf);
}
可以看到,这里牵涉到了很多的默认配置的地址,所以,看代码是必要的,我们想要把配置文件全部搞通,其实认真搞通代码,就轻而易举了。
masterService = createApplicationMasterService();
addService(masterService);
看看这一段,新建了一个ApplicationMasterService,用于对所有提交的ApplicationMaster进行管理:其中的serviceInit方法不复杂,重点在其serviceStart方法中:
下面,我们注意看下这段代码:
this.rmAppManager = createRMAppManager();
// Register event handler for RMAppManagerEvents
this.rmDispatcher.register(RMAppManagerEventType.class, this.rmAppManager);
this.rmDTSecretManager = createRMDelegationTokenSecretManager(this.rmContext);
rmContext.setRMDelegationTokenSecretManager(this.rmDTSecretManager);
clientRM = createClientRMService();
rmContext.setClientRMService(clientRM);
addService(clientRM);
adminService = createAdminService(clientRM, masterService, resourceTracker);
addService(adminService);
this.applicationMasterLauncher = createAMLauncher();
this.rmDispatcher.register(AMLauncherEventType.class, this.applicationMasterLauncher);
这里有些东西需要注意,新建了一个ClientRMService,负责客户端与RM的所有交互,对于客户端的每个请求,我们都可以在ClientRMService下面找到相应的代码。
下面接着看,AMLauncher,这个很重要,负责ApplicationMaster的启动,必须重点分析下。
this.applicationMasterLauncher = createAMLauncher();
this.rmDispatcher.register(AMLauncherEventType.class, this.applicationMasterLauncher);
基于RM的管家,建立了一个AMLauncher:
protected ApplicationMasterLauncher createAMLauncher() {
return new ApplicationMasterLauncher(this.rmContext);
}
分析下其中的serviceInit方法:
@Override
protected void serviceStart() throws Exception {
launcherHandlingThread.start();
super.serviceStart();
}
看看其中的launchHandlingThread:
private class LauncherThread extends Thread {
public LauncherThread() {
super("ApplicationMaster Launcher");
}
@Override
public void run() {
while (!this.isInterrupted()) {
Runnable toLaunch;
try {
toLaunch = masterEvents.take();
launcherPool.execute(toLaunch);
} catch (InterruptedException e) {
LOG.warn(this.getClass().getName() + " interrupted. Returning.");
return;
}
}
}
}
其内部封装了一个线程池,对于调度给自身的事件不断进行处理:
在serviceInit方法的最后,调用了父类的serviceInit方法,我们看下其父类CompositeService的serviceInit方法:
protected void serviceInit(Configuration conf) throws Exception {
List<Service> services = getServices();
if (LOG.isDebugEnabled()) {
LOG.debug(getName() + ": initing services, size=" + services.size());
}
for (Service service : services) {
service.init(conf);
}
super.serviceInit(conf);
}
public List<Service> getServices() {
synchronized (serviceList) {
return Collections.unmodifiableList(serviceList);
}
}
在看源码的时候,发现RM的serviceInit方法中,所有服务都有一个操作:
protected void addService(Service service) {
if (LOG.isDebugEnabled()) {
LOG.debug("Adding service " + service.getName());
}
synchronized (serviceList) {
serviceList.add(service);
}
}
调用了父类中的addService方法,把所有服务添加到了serviceList中,在这里统一予以初始化,不得不说设计很精妙,在我们分析源码的时候,必须注意看下service相关的类;最顶层的父类是Serivce类,这是个接口,定义了服务的基本操作和生命周期,其唯一的实现类,是个抽象类,为AbstractService,一般来说,并不复杂的服务继承并实现AbstractService即可,复杂的服务如RM就会继承Compositeservice(AbstractService的子类)。
在RM服务初始化完毕之后,我们接着看服务的启动部分。