Flink 1.19 Scheduler and Task Deployment

BigDataLover520

Last modified 2024-08-15 15:55:49

Tags: flink java

First published 2024-08-15 15:55:04

Article link: https://blog.csdn.net/m0_73904819/article/details/141222895

Overview

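The JobMaster starts first. When the JobMaster object is initialized, the JobGraph is converted into an ExecutionGraph and the execution topology is generated. The scheduler then schedules according to that topology: it requests resources from the ResourceManager, the RM in turn requests slots from the TaskExecutors, and once the resources have been allocated the Flink tasks are deployed.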
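
The flow described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not Flink code: `SchedulingFlowSketch`, `ToyResourceManager`, `expandToSubtasks`, and `scheduleAndDeploy` are hypothetical names, slots are plain strings, and each subtask takes its own slot (real Flink can share one slot between subtasks).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class SchedulingFlowSketch {

    /** Hypothetical stand-in for the ResourceManager: tracks free slots by name. */
    static final class ToyResourceManager {
        private final Deque<String> freeSlots = new ArrayDeque<>();

        /** A TaskExecutor registers and offers numSlots slots. */
        void registerTaskExecutor(String name, int numSlots) {
            for (int i = 0; i < numSlots; i++) {
                freeSlots.add(name + "#slot-" + i);
            }
        }

        /** Hands out one free slot, or fails if none is left. */
        String requestSlot() {
            String slot = freeSlots.poll();
            if (slot == null) {
                throw new IllegalStateException("no free slot available");
            }
            return slot;
        }
    }

    /** Expands logical vertices into parallel subtasks, e.g. "Map (1/2)". */
    static List<String> expandToSubtasks(String[] vertices, int[] parallelism) {
        List<String> subtasks = new ArrayList<>();
        for (int v = 0; v < vertices.length; v++) {
            for (int i = 1; i <= parallelism[v]; i++) {
                subtasks.add(vertices[v] + " (" + i + "/" + parallelism[v] + ")");
            }
        }
        return subtasks;
    }

    /** Acquires a slot for every subtask first, then deploys each subtask to its slot. */
    static List<String> scheduleAndDeploy(List<String> subtasks, ToyResourceManager rm) {
        List<String> slots = new ArrayList<>();
        for (int i = 0; i < subtasks.size(); i++) {
            slots.add(rm.requestSlot());
        }
        List<String> deployments = new ArrayList<>();
        for (int i = 0; i < subtasks.size(); i++) {
            deployments.add(subtasks.get(i) + " -> " + slots.get(i));
        }
        return deployments;
    }

    public static void main(String[] args) {
        ToyResourceManager rm = new ToyResourceManager();
        rm.registerTaskExecutor("tm-1", 2);
        rm.registerTaskExecutor("tm-2", 2);
        List<String> subtasks =
                expandToSubtasks(new String[] {"Source", "Map"}, new int[] {1, 2});
        for (String deployment : scheduleAndDeploy(subtasks, rm)) {
            System.out.println(deployment);
        }
    }
}
```

The point of the sketch is the ordering: the topology is expanded into subtasks, all slots are acquired, and only then does deployment begin.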

Overall Architecture and Flow

jobMaster.onStart()

->startJobExecution()

->startScheduling()

schedulerNG is the scheduler object; its implementation is DefaultScheduler

->startScheduling() starts scheduling: resource requests and task deployment

->startSchedulingInternal()

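schedulingStrategy is the scheduling strategy interface. In the current streaming mode, PipelinedRegionSchedulingStrategy is the implementation in use; batch processing uses a different implementation.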

->startScheduling()

->maybeScheduleRegions()

->scheduleRegion()

->allocateSlotsAndDeploy()

->waitForAllSlotsAndDeploy() task deployment begins at this step

->deployAll(deploymentHandles)

A for loop deploys the subtasks one by one

->deployOrHandleError(deploymentHandle)

If there is no exception -> deployTaskSafe(execution)

->executionOperations.deploy(execution)

->execution.deploy()
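
The tail of this call chain (loop over the subtasks, deploy each one, handle an error without stopping the others) can be sketched as follows. This is a simplified illustration rather than the Flink implementation: `DeployLoopSketch`, `ToyExecution`, and the `failures` list are hypothetical, and the real code works with deployment handles and futures instead of a plain list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeployLoopSketch {

    /** Hypothetical stand-in for an Execution (one attempt of one subtask). */
    interface ToyExecution {
        String name();

        void deploy() throws Exception;
    }

    /** Deploys every execution in order; a failure in one does not stop the rest. */
    static List<String> deployAll(List<ToyExecution> executions) {
        List<String> failures = new ArrayList<>();
        for (ToyExecution execution : executions) {
            deployTaskSafe(execution, failures);
        }
        return failures;
    }

    /** Calls deploy() and records any error instead of letting it propagate. */
    static void deployTaskSafe(ToyExecution execution, List<String> failures) {
        try {
            execution.deploy();
        } catch (Throwable t) {
            failures.add(execution.name() + ": " + t.getMessage());
        }
    }

    /** Small factory so the example stays compact. */
    static ToyExecution execution(String name, boolean shouldFail) {
        return new ToyExecution() {
            @Override
            public String name() {
                return name;
            }

            @Override
            public void deploy() throws Exception {
                if (shouldFail) {
                    throw new IllegalStateException("slot is no longer alive");
                }
                System.out.println("deployed " + name);
            }
        };
    }

    public static void main(String[] args) {
        List<ToyExecution> executions =
                Arrays.asList(
                        execution("Map (1/2)", false),
                        execution("Map (2/2)", true),
                        execution("Sink (1/1)", false));
        System.out.println("failures: " + deployAll(executions));
    }
}
```

Catching `Throwable` per subtask is the design choice worth noting: one failing deployment is turned into a recorded failure while the loop continues with the remaining subtasks.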

The execution.deploy() method is fairly long, so its source is pasted below with explanatory comments

public void deploy() throws JobException {
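    // Assert that we are running in the JobMaster main thread
    assertRunningInJobMasterMainThread();

    // Get the assigned resource, i.e. the logical slot
    final LogicalSlot slot = assignedResource;

    // Verify that a resource has been assigned; checkNotNull throws otherwise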
    checkNotNull(
            slot,
            "In order to deploy the execution we first have to assign a resource via tryAssignResource.");

    // Check if the TaskManager died in the meantime
    // This only speeds up the response to TaskManagers failing concurrently to deployments.
    // The more general check is the rpcTimeout of the deployment call
    if (!slot.isAlive()) {
        throw new JobException("Target slot (TaskManager) for deployment is no longer alive.");
    }

    // make sure exactly one deployment call happens from the correct state
    ExecutionState previous = this.state;
    if (previous == SCHEDULED) {
        // If the state transition succeeds, this execution starts deploying
        if (!transitionState(previous, DEPLOYING)) {
            // race condition, someone else beat us to the deploying call.
            // this should actually not happen and indicates a race somewhere else
            // The transition failed, so throw
            throw new IllegalStateException(
                    "Cannot deploy task: Concurrent deployment call race.");
        }
    } else {
        // vertex may have been cancelled, or it was already scheduled
        // The previous state was not SCHEDULED, so throw
        throw new IllegalStateException(
                "The vertex must be in SCHEDULED state to be deployed. Found state "
                        + previous);
    }
    // Verify that this execution is the payload assigned to the slot
    if (this != slot.getPayload()) {
        throw new IllegalStateException(
                String.format(
                        "The execution %s has not been assigned to the assigned slot.", this));
    }

    try {

        // race double check, did we fail/cancel and do we need to release the slot?
        // Check whether the state is still DEPLOYING
        if (this.state != DEPLOYING) {
            // Not DEPLOYING any more: release the slot
            slot.releaseSlot(
                    new FlinkException(
                            "Actual state of execution "
                                    + this
                                    + " ("
                                    + state
                                    + ") does not match expected state DEPLOYING."));
            return;
        }
      
        LOG.info(
                "Deploying {} (attempt #{}) with attempt id {} and vertex id {} to {} with allocation id {}",
                vertex.getTaskNameWithSubtaskIndex(),
                getAttemptNumber(),
                attemptId,
                vertex.getID(),
                getAssignedResourceLocation(),
                slot.getAllocationId());
        // Get the TaskDeploymentDescriptor factory and use it to create a new descriptor;
        // the deployment variable holds the created instance
        final TaskDeploymentDescriptor deployment =
                vertex.getExecutionGraphAccessor()
                        .getTaskDeploymentDescriptorFactory()
                        .createDeploymentDescriptor(
                                this, // the current Execution
                                slot.getAllocationId(), // ID of the slot allocation assigned to the task
                                taskRestore, // task restore state (nulled out below so it can be GC'ed)
                                producedPartitions.values()); // the produced result partitions

        // null taskRestore to let it be GC'ed
        taskRestore = null;
        // Get the TaskManager gateway used to communicate with the TaskManager
        final TaskManagerGateway taskManagerGateway = slot.getTaskManagerGateway();
        // Get the JobMaster main-thread executor, used to run actions on the main thread
        final ComponentMainThreadExecutor jobMasterMainThreadExecutor =
                vertex.getExecutionGraphAccessor().getJobMasterMainThreadExecutor();
        // Notify the vertex of a pending deployment
        getVertex().notifyPendingDeployment(this);
        // We run the submission in the future executor so that the serialization of large TDDs
        // (TaskDeploymentDescriptors) does not block the main thread, and sync back to the
        // main thread once submission is completed.
        CompletableFuture.supplyAsync(
                        () -> taskManagerGateway.submitTask(deployment, rpcTimeout), executor)
                .thenCompose(Function.identity())
                .whenCompleteAsync(
                        (ack, failure) -> {
                            if (failure == null) {
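                                // Submission succeeded: notify the vertex that deployment completed,
                                // which removes the previously pending deployment
                                vertex.notifyCompletedDeployment(this);
                            } else {
                                // Extract the actual failure cause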
                                final Throwable actualFailure =
                                        ExceptionUtils.stripCompletionException(failure);

                                if (actualFailure instanceof TimeoutException) {
                                   
                                    String taskname =
                                            vertex.getTaskNameWithSubtaskIndex()
                                                    + " ("
                                                    + attemptId
                                                    + ')';

                                    markFailed(
                                            new Exception(
                                                    "Cannot deploy task "
                                                            + taskname
                                                            + " - TaskManager ("
                                                            + getAssignedResourceLocation()
                                                            + ") not responding after a rpcTimeout of "
                                                            + rpcTimeout,
                                                    actualFailure));
                                } else {
    
                                    markFailed(actualFailure);
                                }
                            }
                        },
                        // Handle the completion callback on the JobMaster main-thread executor
                        jobMasterMainThreadExecutor);
    } catch (Throwable t) {
        markFailed(t);
    }
}
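
The submission pattern at the end of deploy() can be tried in isolation with plain JDK classes. The sketch below is a minimal illustration, not Flink code: `AsyncSubmitSketch`, `submit`, `unwrap`, and the `events` list are hypothetical names, and ordinary `Executor`s stand in for Flink's future executor and the JobMaster main-thread executor. It shows the three steps: run the call off the main thread with `supplyAsync`, flatten the nested future with `thenCompose(Function.identity())`, and handle the outcome on the main-thread executor with `whenCompleteAsync`.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;
import java.util.function.Supplier;

final class AsyncSubmitSketch {

    /** Runs rpcCall on ioExecutor and handles its outcome on mainThreadExecutor. */
    static CompletableFuture<String> submit(
            Supplier<CompletableFuture<String>> rpcCall,
            Executor ioExecutor,
            Executor mainThreadExecutor,
            List<String> events) {
        return CompletableFuture.supplyAsync(rpcCall, ioExecutor)
                // supplyAsync yields a future of a future; flatten it into one future
                .thenCompose(Function.identity())
                .whenCompleteAsync(
                        (ack, failure) -> {
                            if (failure == null) {
                                events.add("deployed: " + ack);
                                return;
                            }
                            // The failure may arrive wrapped in a CompletionException
                            Throwable actual = unwrap(failure);
                            if (actual instanceof TimeoutException) {
                                events.add("failed: timeout");
                            } else {
                                events.add("failed: " + actual.getMessage());
                            }
                        },
                        mainThreadExecutor);
    }

    /** Returns the cause of a CompletionException, or the throwable itself. */
    static Throwable unwrap(Throwable t) {
        if (t instanceof CompletionException && t.getCause() != null) {
            return t.getCause();
        }
        return t;
    }

    public static void main(String[] args) {
        ExecutorService io = Executors.newSingleThreadExecutor();
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        List<String> events = new CopyOnWriteArrayList<>();
        try {
            submit(() -> CompletableFuture.completedFuture("ack"), io, mainThread, events)
                    .join();

            CompletableFuture<String> timedOut = new CompletableFuture<>();
            timedOut.completeExceptionally(new TimeoutException("no response"));
            // exceptionally(...) only keeps join() from rethrowing in this demo
            submit(() -> timedOut, io, mainThread, events).exceptionally(t -> null).join();

            System.out.println(events);
        } finally {
            io.shutdown();
            mainThread.shutdown();
        }
    }
}
```

Handling the result on a single main-thread executor means the callback can touch scheduler state without extra locking, which is the reason the original code hops back to jobMasterMainThreadExecutor.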