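Overview
The JobMaster starts first. While the JobMaster object is being initialized, the JobGraph is converted into an ExecutionGraph and the execution topology is built. The scheduler then schedules according to that topology: it requests resources from the ResourceManager, the ResourceManager in turn requests slots from the TaskExecutors, and once the resources have been allocated the Flink tasks are deployed.
Overall Flow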
jobMaster.onStart()
->startJobExecution()
->startScheduling()
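schedulerNG is the scheduler object; its implementation here is DefaultScheduler
->startScheduling() starts scheduling, which covers resource requests and task deployment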
->startSchedulingInternal()
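schedulingStrategy is the scheduling-strategy interface. The job here runs in streaming mode, so the implementation class is PipelinedRegionSchedulingStrategy; batch processing uses a different implementation class.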
->startScheduling()
->maybeScheduleRegions()
->scheduleRegion()
->allocateSlotsAndDeploy()
->allocateSlotsAndDeploy()
->waitForSlotsAndDeploy() task deployment begins at this step
->deployAll(deploymentHandles)
a for loop deploys the subtasks one at a time
->deployOrHandleError(deploymentHandle)
if no exception occurred ->deployTaskSafe(execution)
->executionOperations.deploy(execution)
->execution.deploy()
This method is fairly long, so its source code is pasted below with explanatory comments.
public void deploy() throws JobException {
    // Verify that we are running in the JobMaster main thread
    assertRunningInJobMasterMainThread();

    // Fetch the assigned resource, i.e. the logical slot
    final LogicalSlot slot = assignedResource;

    // Fail if no resource has been assigned yet
    checkNotNull(
            slot,
            "In order to deploy the execution we first have to assign a resource via tryAssignResource.");

    // Check if the TaskManager died in the meantime
    // This only speeds up the response to TaskManagers failing concurrently to deployments.
    // The more general check is the rpcTimeout of the deployment call
    if (!slot.isAlive()) {
        throw new JobException("Target slot (TaskManager) for deployment is no longer alive.");
    }

    // make sure exactly one deployment call happens from the correct state
    ExecutionState previous = this.state;
    if (previous == SCHEDULED) {
        // A successful transition means this execution is now being deployed
        if (!transitionState(previous, DEPLOYING)) {
            // race condition, someone else beat us to the deploying call.
            // this should actually not happen and indicates a race somewhere else
            // The state transition failed, so throw
            throw new IllegalStateException(
                    "Cannot deploy task: Concurrent deployment call race.");
        }
    } else {
        // vertex may have been cancelled, or it was already scheduled
        // The previous state was not SCHEDULED, so throw
        throw new IllegalStateException(
                "The vertex must be in SCHEDULED state to be deployed. Found state "
                        + previous);
    }

    // Verify that this execution is the payload of the assigned slot
    if (this != slot.getPayload()) {
        throw new IllegalStateException(
                String.format(
                        "The execution %s has not been assigned to the assigned slot.", this));
    }

    try {
        // race double check, did we fail/cancel and do we need to release the slot?
        // If the state is no longer DEPLOYING, release the slot and stop
        if (this.state != DEPLOYING) {
            slot.releaseSlot(
                    new FlinkException(
                            "Actual state of execution "
                                    + this
                                    + " ("
                                    + state
                                    + ") does not match expected state DEPLOYING."));
            return;
        }

        LOG.info(
                "Deploying {} (attempt #{}) with attempt id {} and vertex id {} to {} with allocation id {}",
                vertex.getTaskNameWithSubtaskIndex(),
                getAttemptNumber(),
                attemptId,
                vertex.getID(),
                getAssignedResourceLocation(),
                slot.getAllocationId());

        // Obtain the TaskDeploymentDescriptor factory and build a new descriptor;
        // `deployment` holds the resulting descriptor instance
        final TaskDeploymentDescriptor deployment =
                vertex.getExecutionGraphAccessor()
                        .getTaskDeploymentDescriptorFactory()
                        .createDeploymentDescriptor(
                                this, // the Execution (one attempt of a subtask) being deployed
                                slot.getAllocationId(), // ID of the slot allocation assigned to the task
                                taskRestore, // state to restore the task from, if any (nulled out right after this call)
                                producedPartitions.values()); // descriptors of the result partitions this task produces

        // null taskRestore to let it be GC'ed
        taskRestore = null;

        // Gateway used to communicate with the TaskManager
        final TaskManagerGateway taskManagerGateway = slot.getTaskManagerGateway();

        // Executor that runs actions on the JobMaster main thread
        final ComponentMainThreadExecutor jobMasterMainThreadExecutor =
                vertex.getExecutionGraphAccessor().getJobMasterMainThreadExecutor();

        // Tell the vertex that a deployment is now pending
        getVertex().notifyPendingDeployment(this);

        // We run the submission in the future executor so that the serialization of large TDDs
        // does not block
        // the main thread and sync back to the main thread once submission is completed.
        CompletableFuture.supplyAsync(
                        () -> taskManagerGateway.submitTask(deployment, rpcTimeout), executor)
                .thenCompose(Function.identity())
                .whenCompleteAsync(
                        (ack, failure) -> {
                            if (failure == null) {
                                // Submission succeeded: notify the vertex that the deployment
                                // completed, which removes it from the pending deployments
                                vertex.notifyCompletedDeployment(this);
                            } else {
                                // Unwrap the actual failure cause
                                final Throwable actualFailure =
                                        ExceptionUtils.stripCompletionException(failure);

                                if (actualFailure instanceof TimeoutException) {
                                    String taskname =
                                            vertex.getTaskNameWithSubtaskIndex()
                                                    + " ("
                                                    + attemptId
                                                    + ')';

                                    markFailed(
                                            new Exception(
                                                    "Cannot deploy task "
                                                            + taskname
                                                            + " - TaskManager ("
                                                            + getAssignedResourceLocation()
                                                            + ") not responding after a rpcTimeout of "
                                                            + rpcTimeout,
                                                    actualFailure));
                                } else {
                                    markFailed(actualFailure);
                                }
                            }
                        },
                        // Run the completion callback on the JobMaster main thread
                        jobMasterMainThreadExecutor);

    } catch (Throwable t) {
        markFailed(t);
    }
}
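The SCHEDULED to DEPLOYING guard at the top of deploy() can be tried out in isolation. The following is a minimal sketch, not Flink code: the class DeployStateSketch, its reduced State enum, and the use of an AtomicReference with compareAndSet are all assumptions made for illustration, and Flink's own transitionState may be implemented differently. It only shows why a second deployment attempt is rejected.

```java
import java.util.concurrent.atomic.AtomicReference;

public class DeployStateSketch {

    /** Reduced, hypothetical state set; Flink's ExecutionState has more values. */
    enum State { CREATED, SCHEDULED, DEPLOYING, CANCELED }

    // Assumption: an AtomicReference stands in for the state field of Execution.
    private final AtomicReference<State> state;

    DeployStateSketch(State initial) {
        this.state = new AtomicReference<>(initial);
    }

    /** Atomically moves from expected to target; returns false if the state changed meanwhile. */
    boolean transitionState(State expected, State target) {
        return state.compareAndSet(expected, target);
    }

    /** Mirrors the guard in deploy(): only a SCHEDULED execution may become DEPLOYING. */
    void deploy() {
        State previous = state.get();
        if (previous != State.SCHEDULED) {
            throw new IllegalStateException(
                    "The vertex must be in SCHEDULED state to be deployed. Found state " + previous);
        }
        if (!transitionState(previous, State.DEPLOYING)) {
            throw new IllegalStateException("Cannot deploy task: Concurrent deployment call race.");
        }
    }

    State current() {
        return state.get();
    }

    public static void main(String[] args) {
        DeployStateSketch execution = new DeployStateSketch(State.SCHEDULED);
        execution.deploy();
        System.out.println(execution.current()); // prints DEPLOYING
        try {
            execution.deploy(); // the second call finds DEPLOYING and is rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the check and the transition are separate steps, the compare-and-set is what guarantees that exactly one caller wins if two ever observe SCHEDULED at the same time.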
Finally, the call reaches the submitTask method of the TaskExecutor through RPC. At this point the deployment is complete on the JobMaster side.
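The asynchronous submission at the end of deploy() follows a general CompletableFuture pattern: run the call on a separate executor, flatten the nested future with thenCompose(Function.identity()), then handle the result on a designated thread. The sketch below reproduces that pattern with the JDK only. Everything in it is hypothetical: AsyncSubmitSketch, its submitTask stand-in, the simulateTimeout flag, and the returned strings are illustrative and not part of Flink. It uses handleAsync rather than whenCompleteAsync so that the outcome can be returned as a value.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

public class AsyncSubmitSketch {

    /** Hypothetical stand-in for an RPC gateway call that returns a future acknowledgement. */
    static CompletableFuture<String> submitTask(String descriptor, boolean simulateTimeout) {
        CompletableFuture<String> ack = new CompletableFuture<>();
        if (simulateTimeout) {
            ack.completeExceptionally(new TimeoutException("no ack for " + descriptor));
        } else {
            ack.complete("ACK:" + descriptor);
        }
        return ack;
    }

    /** Submits on one executor and handles the outcome on another, as deploy() does. */
    static String deploy(String descriptor, boolean simulateTimeout) {
        // Assumption: two single-thread pools play the roles of the future executor
        // and the JobMaster main-thread executor.
        ExecutorService ioExecutor = Executors.newSingleThreadExecutor();
        ExecutorService mainThreadExecutor = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture
                    // Yields a future of a future, because submitTask itself is asynchronous
                    .supplyAsync(() -> submitTask(descriptor, simulateTimeout), ioExecutor)
                    // Flatten the nested future into a single one
                    .thenCompose(Function.identity())
                    .handleAsync(
                            (ack, failure) -> {
                                if (failure == null) {
                                    return "deployed: " + ack;
                                }
                                // Unwrap the CompletionException added by the future chain
                                Throwable actual =
                                        failure instanceof CompletionException
                                                        && failure.getCause() != null
                                                ? failure.getCause()
                                                : failure;
                                if (actual instanceof TimeoutException) {
                                    return "failed: TaskManager not responding";
                                }
                                return "failed: " + actual;
                            },
                            mainThreadExecutor)
                    .join();
        } finally {
            ioExecutor.shutdown();
            mainThreadExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(deploy("tdd-1", false));
        System.out.println(deploy("tdd-2", true));
    }
}
```

Keeping the callback on a single designated thread is what lets the real method call markFailed and notifyCompletedDeployment without extra locking, while the potentially slow serialization runs elsewhere.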