Overview
Figure 1-1
After a Task is created, it is in the NEW state and waits for a T_SCHEDULE event, which is triggered by the Job object.
T_SCHEDULE Handle
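After the Task receives this event, it first creates an Attempt object and registers it. This object is used to execute the task and track its execution. Map tasks and Reduce tasks each have their own implementation; the Map case is shown here: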
TaskAttemptImpl attempt = createAttempt();
attempt.setAvataar(avataar);
if (LOG.isDebugEnabled()) {
  LOG.debug("Created attempt " + attempt.getID());
}
switch (attempts.size()) {
  case 0:
    attempts = Collections.singletonMap(attempt.getID(),
        (TaskAttempt) attempt);
    break;
  case 1:
    Map<TaskAttemptId, TaskAttempt> newAttempts
        = new LinkedHashMap<TaskAttemptId, TaskAttempt>(maxAttempts);
    newAttempts.putAll(attempts);
    attempts = newAttempts;
    attempts.put(attempt.getID(), attempt);
    break;
  default:
    attempts.put(attempt.getID(), attempt);
    break;
}

// Map-side implementation of createAttempt()
@Override
protected TaskAttemptImpl createAttempt() {
  return new MapTaskAttemptImpl(getID(), nextAttemptNumber,
      eventHandler, jobFile,
      partition, taskSplitMetaInfo, conf, taskAttemptListener,
      jobToken, credentials, clock, appContext);
}
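The switch on attempts.size() can be read as a small memory optimization: most tasks only ever need one attempt, so the first one is stored in an immutable singleton map, and a growable LinkedHashMap is allocated only when a second attempt appears. The sketch below illustrates that idea in isolation. AttemptRegistry and its methods are hypothetical names, String stands in for TaskAttemptId and TaskAttempt, and starting from an empty map is an assumption.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the Task's attempt bookkeeping.
final class AttemptRegistry {
    private final int maxAttempts;
    // Assumption: the map starts out empty.
    private Map<String, String> attempts = Collections.emptyMap();

    AttemptRegistry(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void register(String id, String attempt) {
        switch (attempts.size()) {
            case 0:
                // Common case: one attempt, held in a compact immutable map.
                attempts = Collections.singletonMap(id, attempt);
                break;
            case 1:
                // singletonMap cannot be modified, so copy it into a growable map first.
                Map<String, String> grown = new LinkedHashMap<String, String>(maxAttempts);
                grown.putAll(attempts);
                attempts = grown;
                attempts.put(id, attempt);
                break;
            default:
                attempts.put(id, attempt);
                break;
        }
    }

    int size() {
        return attempts.size();
    }

    boolean isGrowable() {
        return attempts instanceof LinkedHashMap;
    }

    public static void main(String[] args) {
        AttemptRegistry registry = new AttemptRegistry(4);
        registry.register("attempt_0", "first");
        System.out.println(registry.size() + " growable=" + registry.isGrowable());
        registry.register("attempt_1", "second");
        System.out.println(registry.size() + " growable=" + registry.isGrowable());
    }
}
```

Calling put directly on the singleton map would throw UnsupportedOperationException, which is why the size-1 case copies before inserting.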
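Once the Attempt is registered, the Task notifies the Attempt object and the event enters the TaskAttempt FSM. The Attempt requests from the Master the resources needed to run the task. The Task moves to the SCHEDULED state and waits for the T_ATTEMPT_LAUNCHED event, which arrives after the Attempt has successfully obtained its resources.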
T_ATTEMPT_LAUNCHED Handle
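After receiving this event, the Task records that the corresponding Attempt has been launched and enters the RUNNING state. There it handles T_ATTEMPT_COMMIT_PENDING and T_ADD_SPEC_ATTEMPT events until it receives the T_ATTEMPT_SUCCEEDED event.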
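The success path described so far (NEW, SCHEDULED, RUNNING, then SUCCEEDED) can be summarized as a small transition table. The following is only a minimal sketch under that reading: TaskLifecycleSketch is a hypothetical class, the state and event names are taken from the text, and failure and kill transitions are deliberately left out.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified model of the Task success path only.
final class TaskLifecycleSketch {
    enum State { NEW, SCHEDULED, RUNNING, SUCCEEDED }

    enum Event {
        T_SCHEDULE, T_ATTEMPT_LAUNCHED, T_ATTEMPT_COMMIT_PENDING,
        T_ADD_SPEC_ATTEMPT, T_ATTEMPT_SUCCEEDED
    }

    private static final Map<State, Map<Event, State>> TRANSITIONS =
        new EnumMap<State, Map<Event, State>>(State.class);

    static {
        add(State.NEW, Event.T_SCHEDULE, State.SCHEDULED);
        add(State.SCHEDULED, Event.T_ATTEMPT_LAUNCHED, State.RUNNING);
        // These two events are handled while staying in RUNNING.
        add(State.RUNNING, Event.T_ATTEMPT_COMMIT_PENDING, State.RUNNING);
        add(State.RUNNING, Event.T_ADD_SPEC_ATTEMPT, State.RUNNING);
        add(State.RUNNING, Event.T_ATTEMPT_SUCCEEDED, State.SUCCEEDED);
    }

    private static void add(State from, Event on, State to) {
        TRANSITIONS.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class))
                   .put(on, to);
    }

    private State state = State.NEW;

    // Applies one event and returns the new state; rejects events not in the table.
    State handle(Event event) {
        Map<Event, State> row = TRANSITIONS.get(state);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException(
                "Invalid event " + event + " in state " + state);
        }
        state = next;
        return state;
    }

    public static void main(String[] args) {
        TaskLifecycleSketch task = new TaskLifecycleSketch();
        System.out.println(task.handle(Event.T_SCHEDULE));
        System.out.println(task.handle(Event.T_ATTEMPT_LAUNCHED));
        System.out.println(task.handle(Event.T_ATTEMPT_COMMIT_PENDING));
        System.out.println(task.handle(Event.T_ATTEMPT_SUCCEEDED));
    }
}
```

Keeping the transitions in a table rather than in nested conditionals makes it easy to see at a glance which events are legal in which state.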
T_ATTEMPT_COMMIT_PENDING Handle
After receiving this event, the Task records the Attempt that is allowed to commit its output. If the Task already has a commit Attempt, it tries to kill the new one:
if (task.commitAttempt == null) {
  // TODO: validate attemptID
  task.commitAttempt = attemptID;
  LOG.info(attemptID + " given a go for committing the task output.");
} else {
  // Don't think this can be a pluggable decision, so simply raise an
  // event for the TaskAttempt to delete its output.
  LOG.info(task.commitAttempt
      + " already given a go for committing the task output, so killing "
      + attemptID);
  task.eventHandler.handle(new TaskAttemptEvent(
      attemptID, TaskAttemptEventType.TA_KILL));
}
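The rule in the snippet above is "first attempt to ask wins the commit; later requests are killed". A minimal standalone sketch of that arbitration follows. CommitArbiter is a hypothetical class, attempt IDs are plain strings, and the killed list merely stands in for raising a TA_KILL event.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the commit arbitration; not the Hadoop implementation.
final class CommitArbiter {
    private String commitAttempt; // null until some attempt has been granted the commit
    private final List<String> killed = new ArrayList<String>();

    // Returns true if attemptId was granted the commit, false if it was told to stop.
    boolean onCommitPending(String attemptId) {
        if (commitAttempt == null) {
            commitAttempt = attemptId;
            return true;
        }
        // Stands in for sending TA_KILL to the late attempt.
        killed.add(attemptId);
        return false;
    }

    String commitAttempt() {
        return commitAttempt;
    }

    List<String> killed() {
        return killed;
    }

    public static void main(String[] args) {
        CommitArbiter arbiter = new CommitArbiter();
        System.out.println(arbiter.onCommitPending("attempt_0")); // granted
        System.out.println(arbiter.onCommitPending("attempt_1")); // rejected
        System.out.println("commit=" + arbiter.commitAttempt()
            + " killed=" + arbiter.killed());
    }
}
```

With speculative execution, two attempts of the same task can finish at nearly the same time; allowing exactly one of them to commit keeps the task output from being written twice.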
T_ADD_SPEC_ATTEMPT Handle
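After receiving this event, the Task tries to create another new Attempt object to run the same task. This is used only for speculation for now; later it will be used to process the current task concurrently.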
T_ATTEMPT_SUCCEEDED Handle
Update the task state and finish:
task.handleTaskAttemptCompletion(
    taskAttemptId,
    TaskAttemptCompletionEventStatus.SUCCEEDED);
task.finishedAttempts.add(taskAttemptId);
task.inProgressAttempts.remove(taskAttemptId);
task.successfulAttempt = taskAttemptId;
task.sendTaskSucceededEvents();
for (TaskAttempt attempt : task.attempts.values()) {
  if (attempt.getID() != task.successfulAttempt &&
      // This is okay because it can only talk us out of sending a
      // TA_KILL message to an attempt that doesn't need one for
      // other reasons.
      !attempt.isFinished()) {
    LOG.info("Issuing kill to other attempt " + attempt.getID());
    task.eventHandler.handle(
        new TaskAttemptEvent(attempt.getID(),
            TaskAttemptEventType.TA_KILL));
  }
}
task.finished(TaskStateInternal.SUCCEEDED);
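The loop above selects every attempt that is neither the winner nor already finished. The sketch below isolates that selection. SuccessCleanupSketch and attemptsToKill are hypothetical names, attempts are modeled as a map from ID to a finished flag, and equals() is used for the ID check, whereas the quoted code compares references with !=.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: which attempts would receive TA_KILL after a success?
final class SuccessCleanupSketch {
    // finishedById maps an attempt ID to whether that attempt has already finished.
    static List<String> attemptsToKill(Map<String, Boolean> finishedById, String winnerId) {
        List<String> toKill = new ArrayList<String>();
        for (Map.Entry<String, Boolean> entry : finishedById.entrySet()) {
            boolean isWinner = entry.getKey().equals(winnerId);
            boolean finished = entry.getValue();
            if (!isWinner && !finished) {
                toKill.add(entry.getKey());
            }
        }
        return toKill;
    }

    public static void main(String[] args) {
        Map<String, Boolean> attempts = new LinkedHashMap<String, Boolean>();
        attempts.put("attempt_0", true);  // finished earlier, e.g. failed
        attempts.put("attempt_1", false); // the attempt that just succeeded
        attempts.put("attempt_2", false); // speculative attempt still running
        System.out.println(attemptsToKill(attempts, "attempt_1"));
    }
}
```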
If an error occurs while the task is running, the Task tries to create a new Attempt object and rerun the task.
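A rough sketch of this retry behavior is shown below. It assumes that retries are bounded by a maxAttempts limit (the name appears in the code earlier, but how it limits retries is an assumption here) and that the task is marked FAILED once the limit is reached. RetryingTaskSketch and its members are hypothetical.

```java
// Hypothetical model of "create a new Attempt when one fails".
final class RetryingTaskSketch {
    enum State { RUNNING, FAILED }

    private final int maxAttempts; // assumed upper bound on failed attempts
    private int failedAttempts = 0;
    private int attemptsLaunched = 0;
    private State state = State.RUNNING;

    RetryingTaskSketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
        launchAttempt(); // the first attempt, created when the task is scheduled
    }

    private void launchAttempt() {
        attemptsLaunched++;
    }

    // Called when the current attempt reports a failure.
    State onAttemptFailed() {
        failedAttempts++;
        if (failedAttempts < maxAttempts) {
            launchAttempt(); // rerun the task with a fresh attempt
        } else {
            state = State.FAILED; // retry budget exhausted
        }
        return state;
    }

    int attemptsLaunched() {
        return attemptsLaunched;
    }

    public static void main(String[] args) {
        RetryingTaskSketch task = new RetryingTaskSketch(3);
        System.out.println(task.onAttemptFailed() + " launched=" + task.attemptsLaunched());
        System.out.println(task.onAttemptFailed() + " launched=" + task.attemptsLaunched());
        System.out.println(task.onAttemptFailed() + " launched=" + task.attemptsLaunched());
    }
}
```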