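Preface
Hadoop 2.x.x includes many optimizations in its underlying implementation: state machines manage the life cycle and state transitions of various objects; an event mechanism avoids thread synchronization and blocking; Protocol Buffers improve RPC performance; Apache Avro optimizes logging; and so on. This article analyzes how state machines are implemented in YARN and touches on events along the way.
Events
Many YARN components communicate with each other mainly through events. For readability, maintainability, and extensibility, an event in YARN consists of an event name (the event class) and an event type (an enum). For example, the events handled by JobImpl are named JobEvent, and their types are defined in JobEventType. For the classification and implementation of events in Hadoop 2.6.0, see the article 《Hadoop2.6.0的事件分类与实现》.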
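To make the split between event name and event type concrete, here is a minimal sketch. It does not use the Hadoop classes: DemoEvent and DemoEventType are hypothetical stand-ins for a pair such as JobEvent and JobEventType, and the describe handler is only an illustration of how a receiver can branch on the type.

```java
public class EventSketch {
    // Hypothetical event types, standing in for an enum such as JobEventType.
    public enum DemoEventType { DEMO_INIT, DEMO_START }

    // Hypothetical event class, standing in for a class such as JobEvent.
    public static final class DemoEvent {
        private final String jobId;
        private final DemoEventType type;

        public DemoEvent(String jobId, DemoEventType type) {
            this.jobId = jobId;
            this.type = type;
        }

        public String getJobId() { return jobId; }
        public DemoEventType getType() { return type; }
    }

    // A handler branches on the event type carried by the event object.
    public static String describe(DemoEvent event) {
        switch (event.getType()) {
            case DEMO_INIT:  return "initialize " + event.getJobId();
            case DEMO_START: return "start " + event.getJobId();
            default:         return "ignore " + event.getJobId();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new DemoEvent("job_1", DemoEventType.DEMO_INIT)));
    }
}
```

Keeping the type in an enum lets one event class serve many situations, and it gives the state machine a compact key to look transitions up by.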
States
Every YARN component has its own set of states. For example, the internal states of JobImpl are all defined in JobStateInternal, as shown in Listing 1.
Listing 1
- public enum JobStateInternal {
- NEW,
- SETUP,
- INITED,
- RUNNING,
- COMMITTING,
- SUCCEEDED,
- FAIL_WAIT,
- FAIL_ABORT,
- FAILED,
- KILL_WAIT,
- KILL_ABORT,
- KILLED,
- ERROR,
- REBOOT
- }
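As we can see, the internal states of JobImpl include NEW, INITED (initialized), RUNNING, COMMITTING, SUCCEEDED, FAILED, and so on.
Transitions
We now understand the basic concepts and implementation of events and states, but how are they related? Philosophically speaking, a state is a static property of a thing, while an event is the bridge between the thing and the outside world. A thing that only stays still and never changes is just stagnant water; only when it receives information and starts moving does it interact with the outside world. Once it moves, it changes gradually, and a transition happens inside it. Concretely, suppose an object is currently in state state0. When the object receives an event Event, the event triggers the transition action transition, and the object's state finally becomes state1. Figure 1 illustrates this process.
Figure 1 Example of a state transition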
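The process in Figure 1 can be sketched in a few lines. This is a hand-written illustration, not YARN code: the State and EventType enums, the Operand class, and the handle method are all hypothetical names chosen for the example.

```java
public class TransitionSketch {
    public enum State { STATE0, STATE1 }
    public enum EventType { GO }

    // Hypothetical object whose life cycle we track.
    public static final class Operand {
        public State state = State.STATE0;
        public int transitionsRun = 0;
    }

    // When the operand is in STATE0 and receives GO, run the transition
    // action and move to STATE1; any other combination leaves it unchanged.
    public static State handle(Operand operand, EventType eventType) {
        if (operand.state == State.STATE0 && eventType == EventType.GO) {
            operand.transitionsRun++;       // the transition action
            operand.state = State.STATE1;   // the state change
        }
        return operand.state;
    }

    public static void main(String[] args) {
        Operand operand = new Operand();
        System.out.println(handle(operand, EventType.GO));
    }
}
```

Hard-coding the condition works for one arc, but it does not scale to dozens of states and events, which is why YARN turns the (state, event type) pair into a table lookup.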
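Figure 2 shows the class diagram of the transition-related classes in YARN.
Figure 2 Class diagram of the transition-related classes in YARN
Changes in every YARN component depend on state transitions, so YARN abstracts this behavior into two kinds of transition: single-arc and multiple-arc. (I picture the term this way: when a transition from one state to another occurs, it is as if an arc were drawn between the two states.)
Single-arc transitions
Listing 2 shows how YARN implements a single-arc transition. It defines the behavior that accompanies a change of the finite state machine (FSM) to one specific state registered with the state machine.
Listing 2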
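- @Public
- @Evolving
- public interface SingleArcTransition<OPERAND, EVENT> {
-   // Performs the action for a transition whose post state is fixed when
-   // the transition is registered; the hook itself returns no state.
-   public void transition(OPERAND operand, EVENT event);
- }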
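Because a concrete SingleArcTransition implementation only handles the action to perform after an event is received and carries no state information, the state machine does not call its transition method directly when performing a state transition. Instead, the Transition interface (see Listing 3) defines the real state transition, which covers both the action and the state change.
Listing 3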
- private interface Transition<OPERAND, STATE extends Enum<STATE>,
- EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
- STATE doTransition(OPERAND operand, STATE oldState,
- EVENT event, EVENTTYPE eventType);
- }
SingleInternalArc implements the Transition interface. It delegates the action to SingleArcTransition and is also responsible for the state change, as shown in Listing 4.
Listing 4
- private class SingleInternalArc
- implements Transition<OPERAND, STATE, EVENTTYPE, EVENT> {
-
- private STATE postState;
- private SingleArcTransition<OPERAND, EVENT> hook;
-
- SingleInternalArc(STATE postState,
- SingleArcTransition<OPERAND, EVENT> hook) {
- this.postState = postState;
- this.hook = hook;
- }
-
- @Override
- public STATE doTransition(OPERAND operand, STATE oldState,
- EVENT event, EVENTTYPE eventType) {
- if (hook != null) {
- hook.transition(operand, event);
- }
- return postState;
- }
- }
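The delegation pattern used by SingleInternalArc can be tried out in isolation. The sketch below is a simplified stand-in, not the Hadoop code: Hook, FixedArc, and Phase are hypothetical names, and the event type parameter of the real Transition interface is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleArcSketch {
    public enum Phase { NEW, RUNNING }

    // Hypothetical stand-in for SingleArcTransition: action only, no state.
    public interface Hook<O, E> {
        void transition(O operand, E event);
    }

    // Hypothetical stand-in for SingleInternalArc: pairs a hook with the
    // post state that was fixed when the arc was created.
    public static final class FixedArc<O, E, S extends Enum<S>> {
        private final S postState;
        private final Hook<O, E> hook;

        public FixedArc(S postState, Hook<O, E> hook) {
            this.postState = postState;
            this.hook = hook;
        }

        public S doTransition(O operand, E event) {
            if (hook != null) {          // a null hook means "change state only"
                hook.transition(operand, event);
            }
            return postState;
        }
    }

    // Runs one arc whose action appends a line to the given log.
    public static Phase demo(List<String> log) {
        FixedArc<List<String>, String, Phase> arc =
            new FixedArc<List<String>, String, Phase>(
                Phase.RUNNING, (operand, event) -> operand.add("handled " + event));
        return arc.doTransition(log, "start");
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<String>();
        System.out.println(demo(log) + " " + log);
    }
}
```

The hook never sees the post state, so the same hook object can be reused on several arcs that end in different states.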
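Multiple-arc transitions
Listing 5 shows how YARN implements a multiple-arc transition. It defines the behavior and operations that accompany a change of the FSM to one of several valid states registered with the state machine.
Listing 5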
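- @Public
- @Evolving
- public interface MultipleArcTransition
-     <OPERAND, EVENT, STATE extends Enum<STATE>> {
-   // Performs the action and returns the post state chosen by the hook;
-   // the caller checks it against the post states registered as valid.
-   public STATE transition(OPERAND operand, EVENT event);
- }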
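A concrete MultipleArcTransition implementation performs the action after an event is received and returns the post state it chose, but it does not know which post states were registered as valid. So, when performing a state transition, the state machine does not call its transition method directly; it goes through the proxy class MultipleInternalArc (see Listing 6). MultipleInternalArc also implements the Transition interface: it delegates the transition behavior to MultipleArcTransition and takes care of the state change, rejecting any post state that is not in the valid set.
Listing 6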
- private class MultipleInternalArc
- implements Transition<OPERAND, STATE, EVENTTYPE, EVENT>{
-
-
- private Set<STATE> validPostStates;
- private MultipleArcTransition<OPERAND, EVENT, STATE> hook;
-
- MultipleInternalArc(Set<STATE> postStates,
- MultipleArcTransition<OPERAND, EVENT, STATE> hook) {
- this.validPostStates = postStates;
- this.hook = hook;
- }
-
- @Override
- public STATE doTransition(OPERAND operand, STATE oldState,
- EVENT event, EVENTTYPE eventType)
- throws InvalidStateTransitonException {
- STATE postState = hook.transition(operand, event);
-
- if (!validPostStates.contains(postState)) {
- throw new InvalidStateTransitonException(oldState, eventType);
- }
- return postState;
- }
- }
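The validation step in MultipleInternalArc is the part that differs from the single-arc case, so it is worth a small experiment. The sketch below is a simplified stand-in: ChoosingHook, CheckedArc, and Phase are hypothetical names, and IllegalStateException is used here in place of Hadoop's InvalidStateTransitonException.

```java
import java.util.EnumSet;
import java.util.Set;

public class MultiArcSketch {
    public enum Phase { NEW, INITED, FAILED, RUNNING }

    // Hypothetical stand-in for MultipleArcTransition: the hook picks the post state.
    public interface ChoosingHook<O, E, S extends Enum<S>> {
        S transition(O operand, E event);
    }

    // Hypothetical stand-in for MultipleInternalArc: validates the hook's choice.
    public static final class CheckedArc<O, E, S extends Enum<S>> {
        private final Set<S> validPostStates;
        private final ChoosingHook<O, E, S> hook;

        public CheckedArc(Set<S> validPostStates, ChoosingHook<O, E, S> hook) {
            this.validPostStates = validPostStates;
            this.hook = hook;
        }

        public S doTransition(O operand, E event) {
            S postState = hook.transition(operand, event);
            if (!validPostStates.contains(postState)) {
                throw new IllegalStateException("unexpected post state: " + postState);
            }
            return postState;
        }
    }

    // The hook decides between INITED and FAILED based on the operand.
    public static Phase init(boolean setupOk) {
        CheckedArc<Boolean, String, Phase> arc =
            new CheckedArc<Boolean, String, Phase>(
                EnumSet.of(Phase.INITED, Phase.FAILED),
                (ok, event) -> ok ? Phase.INITED : Phase.FAILED);
        return arc.doTransition(setupOk, "init");
    }

    // A hook that returns a state outside the registered set is rejected.
    public static boolean rejectsUndeclaredState() {
        CheckedArc<Boolean, String, Phase> arc =
            new CheckedArc<Boolean, String, Phase>(
                EnumSet.of(Phase.INITED), (ok, event) -> Phase.RUNNING);
        try {
            arc.doTransition(true, "init");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(init(true) + " " + init(false) + " " + rejectsUndeclaredState());
    }
}
```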
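To map every state in a state machine to its transitions, YARN provides the ApplicableTransition interface, which adds SingleInternalArc and MultipleInternalArc instances to the state machine's topology table so that the transition for a given state can be looked up quickly. Its implementation class is ApplicableSingleOrMultipleTransition, whose apply method wraps a SingleInternalArc or MultipleInternalArc and adds it to the state topology table. The ApplicableTransition interface is defined in Listing 7, and ApplicableSingleOrMultipleTransition is implemented in Listing 8.
Listing 7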
- private interface ApplicableTransition
- <OPERAND, STATE extends Enum<STATE>,
- EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
- void apply(StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> subject);
- }
Listing 8
- static private class ApplicableSingleOrMultipleTransition
- <OPERAND, STATE extends Enum<STATE>,
- EVENTTYPE extends Enum<EVENTTYPE>, EVENT>
- implements ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> {
- final STATE preState;
- final EVENTTYPE eventType;
- final Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition;
-
- ApplicableSingleOrMultipleTransition
- (STATE preState, EVENTTYPE eventType,
- Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition) {
- this.preState = preState;
- this.eventType = eventType;
- this.transition = transition;
- }
-
- @Override
- public void apply
- (StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> subject) {
- Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap
- = subject.stateMachineTable.get(preState);
- if (transitionMap == null) {
-
-
-
- transitionMap = new HashMap<EVENTTYPE,
- Transition<OPERAND, STATE, EVENTTYPE, EVENT>>();
- subject.stateMachineTable.put(preState, transitionMap);
- }
- transitionMap.put(eventType, transition);
- }
- }
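As we can see, the apply method of ApplicableSingleOrMultipleTransition exists to build the state topology table: it creates the inner map for preState on first use and then stores the transition under eventType.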
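The two-level map that apply fills in can be reproduced with plain collections. In this sketch the transition is represented by a string label instead of a Transition object, and Phase, Kind, register, and build are hypothetical names used only for illustration.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

public class TableSketch {
    public enum Phase { NEW, INITED }
    public enum Kind { INIT, UPDATE }

    // Mirrors the shape of apply: find or create the inner map for the
    // pre state, then store the transition under the event type.
    public static void register(Map<Phase, Map<Kind, String>> table,
                                Phase preState, Kind eventType, String transition) {
        Map<Kind, String> transitionMap = table.get(preState);
        if (transitionMap == null) {
            transitionMap = new HashMap<Kind, String>();
            table.put(preState, transitionMap);
        }
        transitionMap.put(eventType, transition);
    }

    public static Map<Phase, Map<Kind, String>> build() {
        Map<Phase, Map<Kind, String>> table =
            new EnumMap<Phase, Map<Kind, String>>(Phase.class);
        register(table, Phase.NEW, Kind.UPDATE, "updateArc");
        register(table, Phase.NEW, Kind.INIT, "initArc");
        register(table, Phase.INITED, Kind.UPDATE, "updateArc");
        return table;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

Registering the same (pre state, event type) pair twice simply overwrites the earlier entry, because the inner structure is an ordinary map.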
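State machine
The state machine in YARN is implemented by StateMachineFactory, which has four main fields:
- transitionsListNode: the transition list node. The name is not very intuitive. Put simply, it chains the ApplicableTransition implementations of the state machine's transitions into a list: each node holds one ApplicableTransition and a reference to the next node. Its implementation is shown in Listing 9.
Listing 9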
- private class TransitionsListNode {
- final ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> transition;
- final TransitionsListNode next;
-
- TransitionsListNode
- (ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> transition,
- TransitionsListNode next) {
- this.transition = transition;
- this.next = next;
- }
- }
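The list formed by transitionsListNode can be pictured as in Figure 3.
Figure 3 Structure of the transitionsListNode linked list
- stateMachineTable: the state topology table, a redundant data structure that speeds up finding the transition map for a state. It is generated from the transitionsListNode linked list when optimized is true. Figure 4 shows its structure.
Figure 4 Data structure of the state topology table
- defaultInitialState: the default initial state of the internal FSM when an object is created. For example, the default initial state of JobImpl's internal state machine is JobStateInternal.NEW.
- optimized: a boolean that marks whether the state machine needs to be optimized for performance, that is, whether the state topology table stateMachineTable is built.
Public constructor
StateMachineFactory has only one public constructor, shown in Listing 10.
Listing 10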
- public StateMachineFactory(STATE defaultInitialState) {
- this.transitionsListNode = null;
- this.defaultInitialState = defaultInitialState;
- this.optimized = false;
- this.stateMachineTable = null;
- }
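As we can see, the public constructor takes a single parameter, the default initial state defaultInitialState. A newly created StateMachineFactory has no transitions yet: transitionsListNode and stateMachineTable are null, and optimized is false.
Private constructors
StateMachineFactory has two private constructors. The one in Listing 11 is used by the addTransition methods. As its implementation shows, its main job is to build the transitionsListNode linked list: it prepends a new node holding the given ApplicableTransition to the list of the existing factory.
Listing 11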
- private StateMachineFactory
- (StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> that,
- ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> t) {
- this.defaultInitialState = that.defaultInitialState;
- this.transitionsListNode
- = new TransitionsListNode(t, that.transitionsListNode);
- this.optimized = false;
- this.stateMachineTable = null;
- }
The constructor in Listing 12, in contrast, is used by the installTopology method.
Listing 12
- private StateMachineFactory
- (StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> that,
- boolean optimized) {
- this.defaultInitialState = that.defaultInitialState;
- this.transitionsListNode = that.transitionsListNode;
- this.optimized = optimized;
- if (optimized) {
- makeStateMachineTable();
- } else {
- stateMachineTable = null;
- }
- }
When its optimized parameter is true, the constructor in Listing 12 calls makeStateMachineTable, whose implementation is shown in Listing 13.
Listing 13
- private void makeStateMachineTable() {
- Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>> stack =
- new Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>>();
-
- Map<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>
- prototype = new HashMap<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>();
-
- prototype.put(defaultInitialState, null);
-
-
-
- stateMachineTable
- = new EnumMap<STATE, Map<EVENTTYPE,
- Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>(prototype);
-
- for (TransitionsListNode cursor = transitionsListNode;
- cursor != null;
- cursor = cursor.next) {
- stack.push(cursor.transition);
- }
-
- while (!stack.isEmpty()) {
- stack.pop().apply(this);
- }
- }
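Reading makeStateMachineTable, its steps are easy to see:
- Create a stack to hold the ApplicableSingleOrMultipleTransition instances stored in the nodes of the transitionsListNode list.
- Create the state topology table stateMachineTable. It is built from a prototype map that contains one extra mapping from defaultInitialState to null; an EnumMap constructed from an ordinary Map needs at least one entry to determine its key type.
- Walk the transitionsListNode list and push the ApplicableSingleOrMultipleTransition of each node onto the stack.
- Pop the transitions off the stack one by one and call apply on each (described in the previous section), which fills in stateMachineTable step by step. Because each new node is prepended to the list, the stack restores the order in which the transitions were registered.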
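The effect of the stack is easy to check with a toy version of the list. In this sketch Node, add, and replayInRegistrationOrder are hypothetical names, transitions are plain strings, and an ArrayDeque plays the role of the Stack used in the Hadoop code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ReplaySketch {
    // Immutable singly linked node, shaped like TransitionsListNode.
    public static final class Node {
        final String transition;
        final Node next;

        Node(String transition, Node next) {
            this.transition = transition;
            this.next = next;
        }
    }

    // Each registration prepends, so the head is always the newest entry.
    public static Node add(Node head, String transition) {
        return new Node(transition, head);
    }

    // Push from newest to oldest, then pop: the oldest comes out first.
    public static List<String> replayInRegistrationOrder(Node head) {
        Deque<String> stack = new ArrayDeque<String>();
        for (Node cursor = head; cursor != null; cursor = cursor.next) {
            stack.push(cursor.transition);
        }
        List<String> order = new ArrayList<String>();
        while (!stack.isEmpty()) {
            order.add(stack.pop());
        }
        return order;
    }

    public static void main(String[] args) {
        Node head = add(add(add(null, "a"), "b"), "c");
        System.out.println(replayInRegistrationOrder(head)); // prints [a, b, c]
    }
}
```

Replaying in registration order matters when two transitions share the same pre state and event type: the one registered later is applied last and therefore wins.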
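That concludes the basic concepts and interfaces of the YARN state machine. Next we analyze how a state machine is built.
Building a state machine
To keep things simple, this section uses the state machine of JobImpl as the example. JobImpl registers a great many ApplicableSingleOrMultipleTransition instances through addTransition, so we pick just a few typical ones to analyze. Finally, we analyze the installTopology method. Listing 14 shows the definition of the state machine in JobImpl.
Listing 14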
- protected static final
- StateMachineFactory<JobImpl, JobStateInternal, JobEventType, JobEvent>
- stateMachineFactory
- = new StateMachineFactory<JobImpl, JobStateInternal, JobEventType, JobEvent>
- (JobStateInternal.NEW)
-
-
- .addTransition(JobStateInternal.NEW, JobStateInternal.NEW,
- JobEventType.JOB_DIAGNOSTIC_UPDATE,
- DIAGNOSTIC_UPDATE_TRANSITION)
- .addTransition(JobStateInternal.NEW, JobStateInternal.NEW,
- JobEventType.JOB_COUNTER_UPDATE, COUNTER_UPDATE_TRANSITION)
- .addTransition
- (JobStateInternal.NEW,
- EnumSet.of(JobStateInternal.INITED, JobStateInternal.NEW),
- JobEventType.JOB_INIT,
- new InitTransition())
-
-
- .installTopology();
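The state machine of JobImpl is built in the following steps:
- Call the StateMachineFactory constructor to create an initial state machine.
- Call addTransition(STATE preState, STATE postState, EVENTTYPE eventType, SingleArcTransition<OPERAND, EVENT> hook) to add a single-arc transition. Its implementation (see Listing 15) shows that addTransition wraps the SingleArcTransition in a SingleInternalArc, wraps that in an ApplicableSingleOrMultipleTransition, and finally calls the first private constructor described above to extend the transitionsListNode linked list.
- Call addTransition(STATE preState, Set<STATE> postStates, EVENTTYPE eventType, MultipleArcTransition<OPERAND, EVENT, STATE> hook) to add a multiple-arc transition. Its implementation (see Listing 16) shows that addTransition wraps the MultipleArcTransition in a MultipleInternalArc, wraps that in an ApplicableSingleOrMultipleTransition, and finally calls the first private constructor described above to extend the transitionsListNode linked list.
- Finally, call installTopology, whose implementation is shown in Listing 17. installTopology uses the second private constructor described above to build the state topology table stateMachineTable.
Listing 15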
- public StateMachineFactory
- <OPERAND, STATE, EVENTTYPE, EVENT>
- addTransition(STATE preState, STATE postState,
- EVENTTYPE eventType,
- SingleArcTransition<OPERAND, EVENT> hook){
- return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
- (this, new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
- (preState, eventType, new SingleInternalArc(postState, hook)));
- }
Listing 16
- public StateMachineFactory
- <OPERAND, STATE, EVENTTYPE, EVENT>
- addTransition(STATE preState, Set<STATE> postStates,
- EVENTTYPE eventType,
- MultipleArcTransition<OPERAND, EVENT, STATE> hook){
- return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
- (this,
- new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
- (preState, eventType, new MultipleInternalArc(postStates, hook)));
- }
Listing 17
- public StateMachineFactory
- <OPERAND, STATE, EVENTTYPE, EVENT>
- installTopology() {
- return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>(this, true);
- }
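Note that addTransition and installTopology never modify the factory they are called on; each returns a new StateMachineFactory. The sketch below illustrates that immutable-builder style with a hypothetical Factory class. For simplicity it copies a list of string labels on every call, whereas the Hadoop code shares the tail of its linked list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ImmutableFactorySketch {
    // Hypothetical, heavily simplified stand-in for StateMachineFactory.
    public static final class Factory {
        private final List<String> transitions;
        private final boolean installed;

        public Factory() {
            this(Collections.<String>emptyList(), false);
        }

        private Factory(List<String> transitions, boolean installed) {
            this.transitions = transitions;
            this.installed = installed;
        }

        // Returns a new factory with one more transition; "this" is untouched.
        public Factory addTransition(String name) {
            List<String> copy = new ArrayList<String>(transitions);
            copy.add(name);
            return new Factory(copy, false);
        }

        // Returns a new factory marked as installed.
        public Factory installTopology() {
            return new Factory(transitions, true);
        }

        public int size() { return transitions.size(); }
        public boolean isInstalled() { return installed; }
    }

    public static void main(String[] args) {
        Factory base = new Factory();
        Factory built = base.addTransition("NEW->NEW")
                            .addTransition("NEW->INITED")
                            .installTopology();
        System.out.println(base.size() + " " + built.size() + " " + built.isInstalled());
    }
}
```

Because every intermediate factory is immutable, a chained definition like the one in JobImpl can safely live in a static final field and be shared by all instances.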
Now let us look at DIAGNOSTIC_UPDATE_TRANSITION from Listing 14. It is defined as follows.
- private static final DiagnosticsUpdateTransition
- DIAGNOSTIC_UPDATE_TRANSITION = new DiagnosticsUpdateTransition();
The code of DiagnosticsUpdateTransition is shown below; its type is indeed SingleArcTransition. COUNTER_UPDATE_TRANSITION is similar, so it is not repeated here.
- private static class DiagnosticsUpdateTransition implements
- SingleArcTransition<JobImpl, JobEvent> {
- @Override
- public void transition(JobImpl job, JobEvent event) {
- job.addDiagnostic(((JobDiagnosticsUpdateEvent) event)
- .getDiagnosticUpdate());
- }
- }
InitTransition from Listing 14 is implemented as follows. Its detailed logic is not covered here; interested readers can analyze it further.
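- public static class InitTransition
-     implements MultipleArcTransition<JobImpl, JobEvent, JobStateInternal> {
-
-   @Override
-   public JobStateInternal transition(JobImpl job, JobEvent event) {
-     // Body omitted in this article. It must return one of the post states
-     // registered in Listing 14: JobStateInternal.INITED or JobStateInternal.NEW.
-   }
- }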
State transition
The code that StateMachineFactory uses to perform a state transition is as follows.
- private STATE doTransition
- (OPERAND operand, STATE oldState, EVENTTYPE eventType, EVENT event)
- throws InvalidStateTransitonException {
-
-
-
- Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap
- = stateMachineTable.get(oldState);
- if (transitionMap != null) {
- Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition
- = transitionMap.get(eventType);
- if (transition != null) {
- return transition.doTransition(operand, oldState, event, eventType);
- }
- }
- throw new InvalidStateTransitonException(oldState, eventType);
- }
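Reading the implementation, doTransition performs these steps:
- Using the component's current state oldState (for example, that of JobImpl), get the Transition map for oldState from the state topology table stateMachineTable.
- If that map is not null, get the Transition for the event type eventType from it.
- If such a Transition exists, call its doTransition method to perform the real state transition; otherwise an InvalidStateTransitonException is thrown.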
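The lookup steps above can be sketched with a table that maps straight to the post state. This is an illustration under simplifying assumptions: Phase, Kind, table, and doTransition are hypothetical names, no hook is executed, and IllegalStateException stands in for InvalidStateTransitonException.

```java
import java.util.EnumMap;
import java.util.Map;

public class LookupSketch {
    public enum Phase { NEW, INITED, RUNNING }
    public enum Kind { INIT, START }

    // Builds a tiny topology table: NEW --INIT--> INITED --START--> RUNNING.
    public static Map<Phase, Map<Kind, Phase>> table() {
        Map<Phase, Map<Kind, Phase>> table =
            new EnumMap<Phase, Map<Kind, Phase>>(Phase.class);
        Map<Kind, Phase> fromNew = new EnumMap<Kind, Phase>(Kind.class);
        fromNew.put(Kind.INIT, Phase.INITED);
        table.put(Phase.NEW, fromNew);
        Map<Kind, Phase> fromInited = new EnumMap<Kind, Phase>(Kind.class);
        fromInited.put(Kind.START, Phase.RUNNING);
        table.put(Phase.INITED, fromInited);
        return table;
    }

    // First look up by current state, then by event type; fail if either is missing.
    public static Phase doTransition(Map<Phase, Map<Kind, Phase>> table,
                                     Phase oldState, Kind eventType) {
        Map<Kind, Phase> transitionMap = table.get(oldState);
        if (transitionMap != null) {
            Phase postState = transitionMap.get(eventType);
            if (postState != null) {
                return postState;
            }
        }
        throw new IllegalStateException(
            "Invalid event " + eventType + " at " + oldState);
    }

    public static void main(String[] args) {
        Map<Phase, Map<Kind, Phase>> table = table();
        Phase state = doTransition(table, Phase.NEW, Kind.INIT);
        state = doTransition(table, state, Kind.START);
        System.out.println(state); // prints RUNNING
    }
}
```

An event that is not registered for the current state is treated as an error rather than silently ignored, which makes missing transitions visible early.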