Flink 1.19 Scheduler and Task Deployment

Overview

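The JobMaster starts first. While the JobMaster object is being initialized, the JobGraph is converted into an ExecutionGraph and the Execution topology is generated. The scheduler then schedules according to that topology: it first requests resources from the ResourceManager, and the ResourceManager (RM) in turn requests slots from the TaskExecutors. Once the resources have been allocated, the Flink tasks are deployed.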
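The "allocate first, deploy afterwards" ordering described above can be sketched with plain `CompletableFuture`s. This is a toy model under stated assumptions, not Flink code: `SlotThenDeploySketch`, `DeploymentHandle`, the slot strings, and the subtask names are all hypothetical, and only the method name `waitForSlotsAndDeploy` is borrowed from the call chain in this article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Toy model of "request slots, wait for all of them, then deploy one by one".
// Every name in this class is hypothetical; none of them is a Flink class.
class SlotThenDeploySketch {

    /** Pairs a subtask name with the future that will eventually hold its slot. */
    static final class DeploymentHandle {
        final String subtask;
        final CompletableFuture<String> slotFuture;

        DeploymentHandle(String subtask, CompletableFuture<String> slotFuture) {
            this.subtask = subtask;
            this.slotFuture = slotFuture;
        }
    }

    /** Completes once every slot future is done and each subtask has been "deployed". */
    static CompletableFuture<List<String>> waitForSlotsAndDeploy(List<DeploymentHandle> handles) {
        CompletableFuture<?>[] slots =
                handles.stream().map(h -> h.slotFuture).toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(slots)
                .thenApply(ignored -> {
                    List<String> deployed = new ArrayList<>();
                    for (DeploymentHandle h : handles) {
                        // join() cannot block here: allOf has already seen every future complete.
                        deployed.add(h.subtask + "@" + h.slotFuture.join());
                    }
                    return deployed;
                });
    }

    /** Builds two handles, completes their slots out of order, and returns the result. */
    static List<String> demo() {
        CompletableFuture<String> slotA = new CompletableFuture<>();
        CompletableFuture<String> slotB = new CompletableFuture<>();
        List<DeploymentHandle> handles = List.of(
                new DeploymentHandle("map (1/2)", slotA),
                new DeploymentHandle("map (2/2)", slotB));
        CompletableFuture<List<String>> result = waitForSlotsAndDeploy(handles);
        slotB.complete("tm-2/slot-0"); // slots may arrive in any order
        slotA.complete("tm-1/slot-0");
        return result.join();
    }
}
```

Deployment in this sketch follows the order of the handles, not the order in which the slots arrive, because nothing is deployed until all slot futures have completed.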

Overall architecture flow

jobMaster.onStart()

->startJobExecution()

->startScheduling()

schedulerNG is the scheduler object; its implementation is DefaultScheduler

->startScheduling() starts scheduling: resource requests and task deployment

->startSchedulingInternal()

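schedulingStrategy is the scheduling-strategy interface. The job here runs in streaming mode, so the implementation is PipelinedRegionSchedulingStrategy. Batch mode uses a different implementation.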

->startScheduling()

->maybeScheduleRegions()

->scheduleRegion()

->allocateSlotsAndDeploy()

->allocateSlotsAndDeploy()

->waitForSlotsAndDeploy() task deployment begins at this step

->deployAll(deploymentHandles)

A for loop deploys the subtasks one by one

->deployOrHandleError(deploymentHandle)

If there is no exception ->deployTaskSafe(execution)

->executionOperations.deploy(execution)

->execution.deploy()

execution.deploy() is fairly long, so its source is reproduced here with explanatory comments:

public void deploy() throws JobException {
    // Assert that we are running in the JobMaster main thread
    assertRunningInJobMasterMainThread();

    // Get the assigned resource, i.e. the logical slot
    final LogicalSlot slot = assignedResource;

    // Fail with an exception if no resource has been assigned yet
    checkNotNull(
            slot,
            "In order to deploy the execution we first have to assign a resource via tryAssignResource.");

    // Check if the TaskManager died in the meantime
    // This only speeds up the response to TaskManagers failing concurrently to deployments.
    // The more general check is the rpcTimeout of the deployment call
    if (!slot.isAlive()) {
        throw new JobException("Target slot (TaskManager) for deployment is no longer alive.");
    }

    // make sure exactly one deployment call happens from the correct state
    ExecutionState previous = this.state;
    if (previous == SCHEDULED) {
        // A successful state transition means this execution starts deploying
        if (!transitionState(previous, DEPLOYING)) {
            // race condition, someone else beat us to the deploying call.
            // this should actually not happen and indicates a race somewhere else
            // the state transition failed, so throw
            throw new IllegalStateException(
                    "Cannot deploy task: Concurrent deployment call race.");
        }
    } else {
        // vertex may have been cancelled, or it was already scheduled
        // the previous state was not SCHEDULED, so throw
        throw new IllegalStateException(
                "The vertex must be in SCHEDULED state to be deployed. Found state "
                        + previous);
    }
    // Check that this execution is the payload assigned to the slot
    if (this != slot.getPayload()) {
        throw new IllegalStateException(
                String.format(
                        "The execution %s has not been assigned to the assigned slot.", this));
    }

    try {

        // race double check, did we fail/cancel and do we need to release the slot?
        // Check whether the state is still DEPLOYING
        if (this.state != DEPLOYING) {
            // Release the slot
            slot.releaseSlot(
                    new FlinkException(
                            "Actual state of execution "
                                    + this
                                    + " ("
                                    + state
                                    + ") does not match expected state DEPLOYING."));
            return;
        }
      
        LOG.info(
                "Deploying {} (attempt #{}) with attempt id {} and vertex id {} to {} with allocation id {}",
                vertex.getTaskNameWithSubtaskIndex(),
                getAttemptNumber(),
                attemptId,
                vertex.getID(),
                getAssignedResourceLocation(),
                slot.getAllocationId());
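        // Use the TaskDeploymentDescriptor factory to create a new deployment descriptor;
        // the deployment variable holds the created instance
        final TaskDeploymentDescriptor deployment =
                vertex.getExecutionGraphAccessor()
                        .getTaskDeploymentDescriptorFactory()
                        .createDeploymentDescriptor(
                                this, // this Execution, i.e. one attempt of a subtask
                                slot.getAllocationId(), // allocation ID of the slot assigned to the task
                                taskRestore, // task restore information (the field is nulled right after this call)
                                producedPartitions.values()); // the produced partitions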

        // null taskRestore to let it be GC'ed
        taskRestore = null;
        // Get the TaskManager gateway, used to communicate with the TaskManager
        final TaskManagerGateway taskManagerGateway = slot.getTaskManagerGateway();
        // Get the JobMaster main-thread executor, used to run work on the main thread
        final ComponentMainThreadExecutor jobMasterMainThreadExecutor =
                vertex.getExecutionGraphAccessor().getJobMasterMainThreadExecutor();
        // Notify the vertex that this execution has a pending deployment
        getVertex().notifyPendingDeployment(this);
        // We run the submission in the future executor so that the serialization of large TDDs
        // (TaskDeploymentDescriptors) does not block the main thread, and sync back to the
        // main thread once submission is completed.
        CompletableFuture.supplyAsync(
                        () -> taskManagerGateway.submitTask(deployment, rpcTimeout), executor)
                .thenCompose(Function.identity())
                .whenCompleteAsync(
                        (ack, failure) -> {
                            if (failure == null) {
                                // Submission succeeded: notify the vertex that the deployment
                                // completed, so the pending entry is removed
                                vertex.notifyCompletedDeployment(this);
                            } else {
                                // Strip the CompletionException wrapper to get the actual failure
                                final Throwable actualFailure =
                                        ExceptionUtils.stripCompletionException(failure);

                                if (actualFailure instanceof TimeoutException) {
                                   
                                    String taskname =
                                            vertex.getTaskNameWithSubtaskIndex()
                                                    + " ("
                                                    + attemptId
                                                    + ')';

                                    markFailed(
                                            new Exception(
                                                    "Cannot deploy task "
                                                            + taskname
                                                            + " - TaskManager ("
                                                            + getAssignedResourceLocation()
                                                            + ") not responding after a rpcTimeout of "
                                                            + rpcTimeout,
                                                    actualFailure));
                                } else {
    
                                    markFailed(actualFailure);
                                }
                            }
                        },
                        // Run the completion callback on the JobMaster main-thread executor
                        jobMasterMainThreadExecutor);
    } catch (Throwable t) {
        markFailed(t);
    }
}
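
The state guard at the top of deploy() can be looked at in isolation. The following is a minimal sketch, not Flink's implementation: it assumes an `AtomicReference` and `compareAndSet` to model "exactly one caller moves SCHEDULED to DEPLOYING", whereas the real `Execution` class relies on its own `transitionState` and on running in the JobMaster main thread. The class name `DeployStateGuardSketch` and the reduced `State` enum are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy state guard: only a transition from SCHEDULED to DEPLOYING is allowed,
// and only one caller can win it. Names are hypothetical, not Flink's.
class DeployStateGuardSketch {

    enum State { CREATED, SCHEDULED, DEPLOYING, CANCELED }

    private final AtomicReference<State> state;

    DeployStateGuardSketch(State initial) {
        this.state = new AtomicReference<>(initial);
    }

    /** Moves to target only if the current state still equals expected. */
    boolean transitionState(State expected, State target) {
        return state.compareAndSet(expected, target);
    }

    /** Mirrors the guard in deploy(): wrong state or a lost race both throw. */
    void deploy() {
        State previous = state.get();
        if (previous != State.SCHEDULED) {
            throw new IllegalStateException(
                    "The vertex must be in SCHEDULED state to be deployed. Found state " + previous);
        }
        if (!transitionState(previous, State.DEPLOYING)) {
            throw new IllegalStateException("Cannot deploy task: Concurrent deployment call race.");
        }
    }

    State current() {
        return state.get();
    }
}
```

Calling `deploy()` a second time on the same object throws, because the state is DEPLOYING by then and no longer SCHEDULED.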
Finally, the submitTask call reaches the TaskExecutor's submitTask method over RPC, which completes task deployment on the JobMaster side.
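
The submission pattern used at the end of deploy() (run the call on a separate executor, flatten the nested future with `thenCompose(Function.identity())`, then handle the outcome on the main-thread executor) can be tried outside Flink. This is a sketch under assumptions: `Gateway`, `AsyncSubmitSketch`, and the thread names are hypothetical stand-ins, and two single-thread executors play the roles of the future executor and the JobMaster main thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

class AsyncSubmitSketch {

    /** Hypothetical stand-in for an RPC gateway that acknowledges asynchronously. */
    interface Gateway {
        CompletableFuture<String> submitTask(String descriptor);
    }

    /** Submits on the io executor and reports the outcome from the mainThread executor. */
    static CompletableFuture<String> submit(
            Gateway gateway, String descriptor, ExecutorService io, ExecutorService mainThread) {
        CompletableFuture<String> outcome = new CompletableFuture<>();
        CompletableFuture.supplyAsync(() -> gateway.submitTask(descriptor), io)
                .thenCompose(Function.identity()) // flatten future-of-future
                .whenCompleteAsync(
                        (ack, failure) -> {
                            String thread = Thread.currentThread().getName();
                            if (failure == null) {
                                outcome.complete("deployed on " + thread);
                            } else {
                                // Unwrap the CompletionException added by the future chain.
                                Throwable cause =
                                        failure instanceof CompletionException && failure.getCause() != null
                                                ? failure.getCause()
                                                : failure;
                                outcome.complete(
                                        "failed(" + cause.getClass().getSimpleName() + ") on " + thread);
                            }
                        },
                        mainThread);
        return outcome;
    }

    /** Runs one submission; when timeout is true the gateway fails with a TimeoutException. */
    static String demo(boolean timeout) {
        ExecutorService io = Executors.newSingleThreadExecutor(r -> new Thread(r, "io"));
        ExecutorService mainThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "job-master-main"));
        try {
            Gateway gateway = d -> {
                if (timeout) {
                    return CompletableFuture.failedFuture(new TimeoutException("no ack"));
                }
                return CompletableFuture.completedFuture("ack");
            };
            return submit(gateway, "tdd", io, mainThread).join();
        } finally {
            io.shutdown();
            mainThread.shutdown();
        }
    }
}
```

In both the success and the failure case the callback runs on the thread named `job-master-main`, which is the property the original code depends on when it calls `markFailed` or `notifyCompletedDeployment` from the callback.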