[Flink Source Code] Revisiting the Flink Job Submission Flow (Part 3)

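Picking up where we left off: at the end of [Flink Source Code] Revisiting the Flink Job Submission Flow (Part 2), we finished exploring how the JobManager requests slots.
According to the Flink job submission flow, the TaskManager is the next thing to start.
Let's keep reading.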


TaskManager Startup

First, the TaskManager goes through an initialization phase. We start from the main method used in the YARN environment.

YarnTaskExecutorRunner.java

public static void main(String[] args) {
    EnvironmentInformation.logEnvironmentInfo(LOG, "YARN TaskExecutor runner", args);
    SignalHandler.register(LOG);
    JvmShutdownSafeguard.installAsShutdownHook(LOG);

    runTaskManagerSecurely(args);
}

private static void runTaskManagerSecurely(String[] args) {
    Configuration configuration = null;

    try {
        LOG.debug("All environment variables: {}", ENV);

        final String currDir = ENV.get(Environment.PWD.key());
        LOG.info("Current working Directory: {}", currDir);

        configuration = TaskManagerRunner.loadConfiguration(args);
        setupAndModifyConfiguration(configuration, currDir, ENV);
    } catch (Throwable t) {
        LOG.error("YARN TaskManager initialization failed.", t);
        System.exit(INIT_ERROR_EXIT_CODE);
    }

    TaskManagerRunner.runTaskManagerProcessSecurely(Preconditions.checkNotNull(configuration));
}

This calls TaskManagerRunner's runTaskManagerProcessSecurely method, so we follow it.

TaskManagerRunner.java

public static void runTaskManagerProcessSecurely(Configuration configuration) {
    FlinkSecurityManager.setFromConfiguration(configuration);
    final PluginManager pluginManager =
            PluginUtils.createPluginManagerFromRootFolder(configuration);
    FileSystem.initialize(configuration, pluginManager);

    StateChangelogStorageLoader.initialize(pluginManager);

    int exitCode;
    Throwable throwable = null;

    ClusterEntrypointUtils.configureUncaughtExceptionHandler(configuration);
    try {
        SecurityUtils.install(new SecurityConfiguration(configuration));

        exitCode =
                SecurityUtils.getInstalledContext()
                        .runSecured(() -> runTaskManager(configuration, pluginManager));
    } catch (Throwable t) {
        throwable = ExceptionUtils.stripException(t, UndeclaredThrowableException.class);
        exitCode = FAILURE_EXIT_CODE;
    }

    if (throwable != null) {
        LOG.error("Terminating TaskManagerRunner with exit code {}.", exitCode, throwable);
    } else {
        LOG.info("Terminating TaskManagerRunner with exit code {}.", exitCode);
    }

    System.exit(exitCode);
}

public static int runTaskManager(Configuration configuration, PluginManager pluginManager)
        throws Exception {
    final TaskManagerRunner taskManagerRunner;

    try {
        // Build a TaskManagerRunner
        taskManagerRunner =
                new TaskManagerRunner(
                        configuration,
                        pluginManager,
                        TaskManagerRunner::createTaskExecutorService);
        // Start the TaskManagerRunner
        taskManagerRunner.start();
    } catch (Exception exception) {
        throw new FlinkException("Failed to start the TaskManagerRunner.", exception);
    }

    try {
        return taskManagerRunner.getTerminationFuture().get().getExitCode();
    } catch (Throwable t) {
        throw new FlinkException(
                "Unexpected failure during runtime of TaskManagerRunner.",
                ExceptionUtils.stripExecutionException(t));
    }
}

runTaskManager creates and starts a TaskManagerRunner, then blocks on its termination future to obtain the exit code. Next we look at the start method.

TaskManagerRunner.java

public void start() throws Exception {
    synchronized (lock) {
        startTaskManagerRunnerServices();
        taskExecutorService.start();
    }
}

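The start method acquires a lock and, while holding it, starts the TaskManagerRunner's services and then the TaskExecutor service.
In fact, all the preparation needed to bring up a TaskManager happens in this TaskManagerRunner, through startTaskManagerRunnerServices.
You are probably curious about what exactly that preparation involves, so we take a side trip to find out and then return to the main line at taskExecutorService.start().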

Initializing the TaskManager's Base Services

TaskManagerRunner.start brings up the TaskManagerRunner's services through startTaskManagerRunnerServices.
Let's take a closer look.

TaskManagerRunner.java

private void startTaskManagerRunnerServices() throws Exception {
    synchronized (lock) {
        rpcSystem = RpcSystem.load(configuration);

        // Internal TaskManager thread pool, used for IO work of the components inside this worker node
        // The pool size equals the number of CPU cores on this node
        this.executor =
                Executors.newScheduledThreadPool(
                        Hardware.getNumberCPUCores(),
                        new ExecutorThreadFactory("taskmanager-future"));

        // High-availability services
        highAvailabilityServices =
                HighAvailabilityServicesUtils.createHighAvailabilityServices(
                        configuration,
                        executor,
                        AddressResolution.NO_ADDRESS_RESOLUTION,
                        rpcSystem,
                        this);
        // JMX service, which exposes monitoring information
        JMXService.startInstance(configuration.getString(JMXServerOptions.JMX_SERVER_PORT));
        // Start the RPC service; internally it is an Akka-style ActorSystem
        rpcService = createRpcService(configuration, highAvailabilityServices, rpcSystem);
        // Generate a ResourceID for the TaskManager
        this.resourceId =
                getTaskManagerResourceID(
                        configuration, rpcService.getAddress(), rpcService.getPort());

        this.workingDirectory =
                ClusterEntrypointUtils.createTaskManagerWorkingDirectory(
                        configuration, resourceId);

        LOG.info("Using working directory: {}", workingDirectory);
        // Initialize the heartbeat services, mainly the heartbeat interval and heartbeat timeout settings
        HeartbeatServices heartbeatServices =
                HeartbeatServices.fromConfiguration(configuration);

        metricRegistry =
                new MetricRegistryImpl(
                        MetricRegistryConfiguration.fromConfiguration(
                                configuration,
                                rpcSystem.getMaximumMessageSizeInBytes(configuration)),
                        ReporterSetup.fromConfiguration(configuration, pluginManager));

        final RpcService metricQueryServiceRpcService =
                MetricUtils.startRemoteMetricsRpcService(
                        configuration,
                        rpcService.getAddress(),
                        configuration.getString(TaskManagerOptions.BIND_HOST),
                        rpcSystem);
        metricRegistry.startQueryService(metricQueryServiceRpcService, resourceId.unwrap());
        // The BlobServer was already started when the master node came up
        // When a worker node starts, it launches a BlobCacheService that acts as a file cache
        blobCacheService =
                BlobUtils.createBlobCacheService(
                        configuration,
                        Reference.borrowed(workingDirectory.unwrap().getBlobStorageDirectory()),
                        highAvailabilityServices.createBlobStore(),
                        null);

        final ExternalResourceInfoProvider externalResourceInfoProvider =
                ExternalResourceUtils.createStaticExternalResourceInfoProviderFromConfig(
                        configuration, pluginManager);
        // Create a TaskExecutorService; it wraps the TaskExecutor, whose construction also happens inside
        taskExecutorService =
                taskExecutorServiceFactory.createTaskExecutor(
                        this.configuration,
                        this.resourceId.unwrap(),
                        rpcService,
                        highAvailabilityServices,
                        heartbeatServices,
                        metricRegistry,
                        blobCacheService,
                        false,
                        externalResourceInfoProvider,
                        workingDirectory.unwrap(),
                        this);

        handleUnexpectedTaskExecutorServiceTermination();

        MemoryLogger.startIfConfigured(
                LOG, configuration, terminationFuture.thenAccept(ignored -> {}));
    }
}

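To summarize the work done here:

  1. Initializes the TaskManager's internal thread pool, used for IO work of the components inside this worker node; the pool size equals the number of CPU cores on the node
  2. Builds the high-availability services
  3. Initializes the JMX service, which exposes monitoring information
  4. Starts the RPC service, internally an Akka-style ActorSystem
  5. Generates a ResourceID for the TaskManager
  6. Initializes the heartbeat services, reading the heartbeat interval and heartbeat timeout from the configuration
  7. Initializes the metric services
  8. Starts the BlobCacheService, which caches files
  9. Builds a TaskExecutorService, which wraps the TaskExecutor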
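
Step 1 of the list above can be sketched with plain JDK classes. This is a minimal illustration, not Flink's code: NamedDaemonThreadFactory is a hypothetical stand-in for Flink's ExecutorThreadFactory, and the thread naming scheme is an assumption made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a named thread factory; names threads "<prefix>-thread-<n>".
class NamedDaemonThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedDaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-thread-" + counter.getAndIncrement());
        t.setDaemon(true); // do not keep the JVM alive because of pool threads
        return t;
    }
}

class IoPoolSketch {
    // A scheduled pool with one core thread per available CPU core.
    static ScheduledExecutorService createIoExecutor() {
        int cores = Runtime.getRuntime().availableProcessors();
        return Executors.newScheduledThreadPool(
                cores, new NamedDaemonThreadFactory("taskmanager-future"));
    }

    // Runs a trivial task on the pool and returns the name of the thread that executed it.
    static String nameOfWorkerThread() {
        ScheduledExecutorService executor = createIoExecutor();
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return executor.submit(task).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("task ran on " + nameOfWorkerThread());
    }
}
```

Giving the pool threads a recognizable name prefix makes them easy to spot in thread dumps, which is presumably why the original code passes a pool name as well.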

Initializing BlobCacheService

Of the initialization steps above, we look at BlobCacheService first.

BlobUtils.java

public static BlobCacheService createBlobCacheService(
        Configuration configuration,
        Reference<File> fallbackStorageDirectory,
        BlobView blobView,
        @Nullable InetSocketAddress serverAddress)
        throws IOException {
    final Reference<File> storageDirectory =
            createBlobStorageDirectory(configuration, fallbackStorageDirectory);
    return new BlobCacheService(configuration, storageDirectory, blobView, serverAddress);
}

Evidently this creates a storage directory and then calls the BlobCacheService constructor. We keep going.

BlobCacheService.java

public BlobCacheService(
        final Configuration blobClientConfig,
        final Reference<File> storageDir,
        final BlobView blobView,
        @Nullable final InetSocketAddress serverAddress)
        throws IOException {

    /*
     * Initializes two file services:
     * 1. the permanent Blob cache service
     * 2. the transient Blob cache service
     * On startup, each of them launches a timer task
     * that deletes the resources of jobs that have expired.
     */
    this(
            // permanent
            new PermanentBlobCache(blobClientConfig, storageDir, blobView, serverAddress),
            // transient
            new TransientBlobCache(blobClientConfig, storageDir, serverAddress));
}

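This constructor mainly does two things:

  1. Initializes a permanent Blob cache service
  2. Initializes a transient Blob cache service

On startup, each of the two services launches a timer task that deletes the resources of jobs that have expired.
Taking the permanent Blob cache as an example, we step into the PermanentBlobCache constructor.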

PermanentBlobCache.java

public PermanentBlobCache(
        final Configuration blobClientConfig,
        final Reference<File> storageDir,
        final BlobView blobView,
        @Nullable final InetSocketAddress serverAddress,
        BlobCacheSizeTracker blobCacheSizeTracker)
        throws IOException {
    super(
            blobClientConfig,
            storageDir,
            blobView,
            LoggerFactory.getLogger(PermanentBlobCache.class),
            serverAddress);

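    // Initializing the clean up task
    this.cleanupTimer = new Timer(true);

    // Cleanup interval: configured in seconds (default 3600, i.e. 1 hour), converted to milliseconds
    this.cleanupInterval = blobClientConfig.getLong(BlobServerOptions.CLEANUP_INTERVAL) * 1000;
    // Start the timer task; with the default setting it runs once per hour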
    this.cleanupTimer.schedule(
            new PermanentBlobCleanupTask(), cleanupInterval, cleanupInterval);

    this.blobCacheSizeTracker = blobCacheSizeTracker;

    registerDetectedJobs();
}

It first reads the cleanup interval (1 hour by default) and then starts a timer that runs PermanentBlobCleanupTask once per interval. Here is that class's run method.

PermanentBlobCleanupTask.java

class PermanentBlobCleanupTask extends TimerTask {
    /** Cleans up BLOBs which are not referenced anymore. */
    @Override
    public void run() {
        // Walk the per-job reference counters to find what each job still holds
        synchronized (jobRefCounters) {
            Iterator<Map.Entry<JobID, RefCount>> entryIter =
                    jobRefCounters.entrySet().iterator();
            final long currentTimeMillis = System.currentTimeMillis();
            // Iterate over all jobs
            while (entryIter.hasNext()) {
                Map.Entry<JobID, RefCount> entry = entryIter.next();
                RefCount ref = entry.getValue();
                // Check whether the job is unreferenced and has expired
                if (ref.references <= 0
                        && ref.keepUntil > 0
                        && currentTimeMillis >= ref.keepUntil) {
                    JobID jobId = entry.getKey();

                    final File localFile =
                            new File(
                                    BlobUtils.getStorageLocationPath(
                                            storageDir.deref().getAbsolutePath(), jobId));

                    /*
                        * NOTE: normally it is not required to acquire the write lock to delete the job's
                        *       storage directory since there should be no one accessing it with the ref
                        *       counter being 0 - acquire it just in case, to always be on the safe side
                        */
                    readWriteLock.writeLock().lock();

                    boolean success = false;
                    try {
                        blobCacheSizeTracker.untrackAll(jobId);
                        // Delete the job's resource directory
                        FileUtils.deleteDirectory(localFile);
                        success = true;
                    } catch (Throwable t) {
                        log.warn(
                                "Failed to locally delete job directory "
                                        + localFile.getAbsolutePath(),
                                t);
                    } finally {
                        readWriteLock.writeLock().unlock();
                    }

                    // let's only remove this directory from cleanup if the cleanup was
                    // successful
                    // (does not need the write lock)
                    if (success) {
                        entryIter.remove();
                    }
                }
            }
        }
    }
}

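As we can see, the task does the following:

  1. Walks the reference counters to find the resources held for each job
  2. Iterates over these entries and checks whether each one has expired
  3. If an entry has expired, deletes that job's resource directory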
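
The three steps above can be sketched as follows. This is a simplified model under stated assumptions: job IDs are plain strings, the RefCount class is a hypothetical reduction of the real one, and the directory deletion is replaced by removing the map entry.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class RefCountCleanupSketch {
    // Hypothetical simplification of the per-job reference counter.
    static final class RefCount {
        final int references; // number of users that still reference the job's BLOBs
        final long keepUntil; // time after which an unreferenced job may be cleaned; -1 = not set

        RefCount(int references, long keepUntil) {
            this.references = references;
            this.keepUntil = keepUntil;
        }
    }

    // Removes every job that is unreferenced and expired; returns how many were removed.
    static int cleanup(Map<String, RefCount> jobRefCounters, long nowMillis) {
        int removed = 0;
        synchronized (jobRefCounters) {
            Iterator<Map.Entry<String, RefCount>> it = jobRefCounters.entrySet().iterator();
            while (it.hasNext()) {
                RefCount ref = it.next().getValue();
                if (ref.references <= 0 && ref.keepUntil > 0 && nowMillis >= ref.keepUntil) {
                    // The real task deletes the job's storage directory at this point
                    // and only removes the entry if that deletion succeeded.
                    it.remove();
                    removed++;
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, RefCount> counters = new HashMap<>();
        counters.put("job-running", new RefCount(1, -1));
        counters.put("job-expired", new RefCount(0, 1_000L));
        counters.put("job-not-yet", new RefCount(0, 5_000L));
        System.out.println("removed: " + cleanup(counters, 2_000L));
    }
}
```

Removing through the iterator is what allows entries to be dropped while the map is being traversed; calling remove on the map itself inside the loop would throw a ConcurrentModificationException.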

The transient Blob cache does the same kind of work.

TransientBlobCache.java

public TransientBlobCache(
        final Configuration blobClientConfig,
        final Reference<File> storageDir,
        @Nullable final InetSocketAddress serverAddress)
        throws IOException {

    super(
            blobClientConfig,
            storageDir,
            new VoidBlobStore(),
            LoggerFactory.getLogger(TransientBlobCache.class),
            serverAddress);

    // Initializing the clean up task
    this.cleanupTimer = new Timer(true);
    // Cleanup interval, 1 hour by default
    this.cleanupInterval = blobClientConfig.getLong(BlobServerOptions.CLEANUP_INTERVAL) * 1000;
    this.cleanupTimer.schedule(
            // The timer task
            new TransientBlobCleanupTask(blobExpiryTimes, this::deleteInternal, log),
            cleanupInterval,
            cleanupInterval);

    registerBlobExpiryTimes();
}
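
Both caches rely on the same java.util.Timer pattern: a daemon timer whose first run and every later run are one interval apart. A minimal sketch of that pattern, where the class and method names are hypothetical and the interval is shortened to milliseconds so the effect is visible:

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class CleanupTimerSketch {
    // Schedules the task on a daemon timer with initial delay == period.
    static Timer scheduleCleanup(Runnable task, long intervalMillis) {
        Timer timer = new Timer(true); // true = daemon thread, does not block JVM exit
        timer.schedule(
                new TimerTask() {
                    @Override
                    public void run() {
                        task.run();
                    }
                },
                intervalMillis,
                intervalMillis);
        return timer;
    }

    // Returns true if the task fired at least `times` times before the timeout elapsed.
    static boolean firesRepeatedly(long intervalMillis, int times, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(times);
        Timer timer = scheduleCleanup(latch::countDown, intervalMillis);
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel(); // stop the timer thread once we are done
        }
    }

    public static void main(String[] args) {
        System.out.println("fired three times: " + firesRepeatedly(20, 3, 5_000));
    }
}
```

Because the initial delay equals the period, nothing is cleaned immediately at startup; the first pass only happens after one full interval.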

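That covers how BlobCacheService is built.
Next we focus on how TaskExecutorService is constructed.

Constructing TaskExecutorService

We start from the createTaskExecutor call seen earlier. The factory behind it is the method reference TaskManagerRunner::createTaskExecutorService that was passed to the TaskManagerRunner constructor.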

public static TaskExecutorService createTaskExecutorService(
        Configuration configuration,
        ResourceID resourceID,
        RpcService rpcService,
        HighAvailabilityServices highAvailabilityServices,
        HeartbeatServices heartbeatServices,
        MetricRegistry metricRegistry,
        BlobCacheService blobCacheService,
        boolean localCommunicationOnly,
        ExternalResourceInfoProvider externalResourceInfoProvider,
        FatalErrorHandler fatalErrorHandler)
        throws Exception {

    // Create the TaskExecutor
    final TaskExecutor taskExecutor =
            startTaskManager(
                    configuration,
                    resourceID,
                    rpcService,
                    highAvailabilityServices,
                    heartbeatServices,
                    metricRegistry,
                    blobCacheService,
                    localCommunicationOnly,
                    externalResourceInfoProvider,
                    fatalErrorHandler);

    /*
     * Wrap the TaskExecutor:
     * TaskExecutor is a field of TaskExecutorToServiceAdapter,
     * and TaskExecutorToServiceAdapter is a field of TaskManagerRunner.
     */

    return TaskExecutorToServiceAdapter.createFor(taskExecutor);
}

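This is where a TaskExecutor is actually initialized and then wrapped. We look at the TaskExecutor initialization first and step into startTaskManager.
That method again initializes a number of base services.
First it builds the resource specification, which describes the hardware resources configured for this TaskExecutor.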

// Build the resource specification from the configuration
final TaskExecutorResourceSpec taskExecutorResourceSpec = TaskExecutorResourceUtils.resourceSpecFromConfig(configuration);

Next it loads the configuration

// Load the configuration (args and flink-conf)
TaskManagerServicesConfiguration taskManagerServicesConfiguration =
        TaskManagerServicesConfiguration.fromConfiguration(
                configuration,
                resourceID,
                externalAddress,
                localCommunicationOnly,
                taskExecutorResourceSpec);

Here TaskManagerServices initializes a number of core services

// Initialize the core services
TaskManagerServices taskManagerServices =
        TaskManagerServices.fromConfiguration(
                taskManagerServicesConfiguration,
                blobCacheService.getPermanentBlobService(),
                taskManagerMetricGroup.f1,
                ioExecutor,
                fatalErrorHandler);

We continue into the fromConfiguration method

public static TaskManagerServices fromConfiguration(
        TaskManagerServicesConfiguration taskManagerServicesConfiguration,
        PermanentBlobService permanentBlobService,
        MetricGroup taskManagerMetricGroup,
        ExecutorService ioExecutor,
        FatalErrorHandler fatalErrorHandler,
        WorkingDirectory workingDirectory)
        throws Exception {

    // pre-start checks
    checkTempDirs(taskManagerServicesConfiguration.getTmpDirPaths());
    // Event dispatcher for task events
    final TaskEventDispatcher taskEventDispatcher = new TaskEventDispatcher();

    // start the I/O manager, it will create some temp directories.
    final IOManager ioManager =
            new IOManagerAsync(taskManagerServicesConfiguration.getTmpDirPaths());
    // Handles shuffle-related work while jobs are running
    final ShuffleEnvironment<?, ?> shuffleEnvironment =
            createShuffleEnvironment(
                    taskManagerServicesConfiguration,
                    taskEventDispatcher,
                    taskManagerMetricGroup,
                    ioExecutor);
    final int listeningDataPort = shuffleEnvironment.start();
    
    // State management service
    final KvStateService kvStateService =
            KvStateService.fromConfiguration(taskManagerServicesConfiguration);
    kvStateService.start();

    final UnresolvedTaskManagerLocation unresolvedTaskManagerLocation =
            new UnresolvedTaskManagerLocation(
                    taskManagerServicesConfiguration.getResourceID(),
                    taskManagerServicesConfiguration.getExternalAddress(),
                    // we expose the task manager location with the listening port
                    // iff the external data port is not explicitly defined
                    taskManagerServicesConfiguration.getExternalDataPort() > 0
                            ? taskManagerServicesConfiguration.getExternalDataPort()
                            : listeningDataPort,
                    taskManagerServicesConfiguration.getNodeId());
    
    // Broadcast variable manager
    final BroadcastVariableManager broadcastVariableManager = new BroadcastVariableManager();
    // The most important field inside TaskExecutor:
    // a table that holds the TaskSlots
    final TaskSlotTable<Task> taskSlotTable =
            createTaskSlotTable(
                    taskManagerServicesConfiguration.getNumberOfSlots(),
                    taskManagerServicesConfiguration.getTaskExecutorResourceSpec(),
                    taskManagerServicesConfiguration.getTimerServiceShutdownTimeout(),
                    taskManagerServicesConfiguration.getPageSize(),
                    ioExecutor);

    final JobTable jobTable = DefaultJobTable.create();
    
    // Tracks the leader address of the master node (JobManager)
    final JobLeaderService jobLeaderService =
            new DefaultJobLeaderService(
                    unresolvedTaskManagerLocation,
                    taskManagerServicesConfiguration.getRetryingRegistrationConfiguration());

    final TaskExecutorLocalStateStoresManager taskStateManager =
            new TaskExecutorLocalStateStoresManager(
                    taskManagerServicesConfiguration.isLocalRecoveryEnabled(),
                    taskManagerServicesConfiguration.getLocalRecoveryStateDirectories(),
                    ioExecutor);

    final TaskExecutorStateChangelogStoragesManager changelogStoragesManager =
            new TaskExecutorStateChangelogStoragesManager();

    final boolean failOnJvmMetaspaceOomError =
            taskManagerServicesConfiguration
                    .getConfiguration()
                    .getBoolean(CoreOptions.FAIL_ON_USER_CLASS_LOADING_METASPACE_OOM);
    final boolean checkClassLoaderLeak =
            taskManagerServicesConfiguration
                    .getConfiguration()
                    .getBoolean(CoreOptions.CHECK_LEAKED_CLASSLOADER);
    final LibraryCacheManager libraryCacheManager =
            new BlobLibraryCacheManager(
                    permanentBlobService,
                    BlobLibraryCacheManager.defaultClassLoaderFactory(
                            taskManagerServicesConfiguration.getClassLoaderResolveOrder(),
                            taskManagerServicesConfiguration
                                    .getAlwaysParentFirstLoaderPatterns(),
                            failOnJvmMetaspaceOomError ? fatalErrorHandler : null,
                            checkClassLoaderLeak));

    final SlotAllocationSnapshotPersistenceService slotAllocationSnapshotPersistenceService;

    if (taskManagerServicesConfiguration.isLocalRecoveryEnabled()) {
        slotAllocationSnapshotPersistenceService =
                new FileSlotAllocationSnapshotPersistenceService(
                        workingDirectory.getSlotAllocationSnapshotDirectory());
    } else {
        slotAllocationSnapshotPersistenceService =
                NoOpSlotAllocationSnapshotPersistenceService.INSTANCE;
    }

    return new TaskManagerServices(
            unresolvedTaskManagerLocation,
            taskManagerServicesConfiguration.getManagedMemorySize().getBytes(),
            ioManager,
            shuffleEnvironment,
            kvStateService,
            broadcastVariableManager,
            taskSlotTable,
            jobTable,
            jobLeaderService,
            taskStateManager,
            changelogStoragesManager,
            taskEventDispatcher,
            ioExecutor,
            libraryCacheManager,
            slotAllocationSnapshotPersistenceService);
}

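This initializes the event dispatcher, the IOManager, the ShuffleEnvironment, the state management service, the broadcast variable service, the TaskSlotTable, the service that monitors the JobManager leader address, and more. Here we focus on the TaskSlotTable.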

TaskSlotTable

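What does TaskSlotTable, a field of TaskExecutor, actually do?
It is the component that helps the TaskExecutor with every slot-related operation.
The ResourceManager has a counterpart for this role, the SlotManager.
When a JobMaster requests resources, the SlotManager inside the ResourceManager performs the allocation. After allocating, the SlotManager sends an RPC request to the TaskExecutor, and the TaskExecutor then reports back to the ResourceManager that the allocation is done.
First, let's look at a few of its important fields.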

TaskSlotTableImpl.java

/**
 * The list of all task slots. When the TaskManager starts, it reports its own slots to the
 * ResourceManager and wraps each slot as a TaskSlot.
 */
private final Map<Integer, TaskSlot<T>> taskSlots;

/**
 * Mapping from allocation id to task slot: every slot that has been allocated, keyed by its
 * allocation ID.
 */
private final Map<AllocationID, TaskSlot<T>> allocatedSlots;
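
  • taskSlots holds all slots of the current node. When the TaskManager on this node starts, it reports its own slots to the ResourceManager and wraps each slot as a TaskSlot
  • allocatedSlots holds the information about every slot that has already been allocated, maintaining the relationship between allocation IDs and TaskSlots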
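
The relationship between the two maps can be sketched with a deliberately simplified model. Everything here is an assumption made for illustration: the Slot class, the use of UUID as allocation ID, and the choice to create all slots up front; the real TaskSlotTableImpl tracks considerably more state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

class SlotTableSketch {
    // Hypothetical, minimal slot: an index plus the allocation that currently owns it.
    static final class Slot {
        final int index;
        UUID allocationId; // null while the slot is free

        Slot(int index) {
            this.index = index;
        }
    }

    private final Map<Integer, Slot> taskSlots = new HashMap<>();   // all slots, by index
    private final Map<UUID, Slot> allocatedSlots = new HashMap<>(); // allocated slots, by allocation ID

    SlotTableSketch(int numberOfSlots) {
        for (int i = 0; i < numberOfSlots; i++) {
            taskSlots.put(i, new Slot(i));
        }
    }

    // Binds a free slot to an allocation ID; returns false if the slot is unknown or taken.
    boolean allocate(int index, UUID allocationId) {
        Slot slot = taskSlots.get(index);
        if (slot == null || slot.allocationId != null || allocatedSlots.containsKey(allocationId)) {
            return false;
        }
        slot.allocationId = allocationId;
        allocatedSlots.put(allocationId, slot);
        return true;
    }

    // Releases the slot owned by the allocation; returns false if there is no such allocation.
    boolean free(UUID allocationId) {
        Slot slot = allocatedSlots.remove(allocationId);
        if (slot == null) {
            return false;
        }
        slot.allocationId = null;
        return true;
    }

    // Returns the slot index for an allocation, or -1 if the allocation is unknown.
    int indexOf(UUID allocationId) {
        Slot slot = allocatedSlots.get(allocationId);
        return slot == null ? -1 : slot.index;
    }

    int numberOfFreeSlots() {
        return taskSlots.size() - allocatedSlots.size();
    }

    public static void main(String[] args) {
        SlotTableSketch table = new SlotTableSketch(2);
        UUID allocation = UUID.randomUUID();
        table.allocate(0, allocation);
        System.out.println("free slots: " + table.numberOfFreeSlots());
    }
}
```

Keeping a second map keyed by allocation ID means a request that only carries an allocation ID can find its slot without scanning every slot.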

Initializing TaskExecutor

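Now we return to TaskManagerRunner.startTaskManager and look at its last step, initializing the TaskExecutor.
The constructor shows that TaskExecutor extends RpcEndpoint. Once the TaskExecutor has been initialized and started, its own onStart method is called; at this point we are still in the initialization phase.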

public TaskExecutor(
        RpcService rpcService,
        TaskManagerConfiguration taskManagerConfiguration,
        HighAvailabilityServices haServices,
        TaskManagerServices taskExecutorServices,
        ExternalResourceInfoProvider externalResourceInfoProvider,
        HeartbeatServices heartbeatServices,
        TaskManagerMetricGroup taskManagerMetricGroup,
        @Nullable String metricQueryServiceAddress,
        TaskExecutorBlobService taskExecutorBlobService,
        FatalErrorHandler fatalErrorHandler,
        TaskExecutorPartitionTracker partitionTracker) {

    // TaskExecutor is a subclass of RpcEndpoint; call the parent constructor
    super(rpcService, RpcServiceUtils.createRandomName(TASK_MANAGER_NAME));

    checkArgument(
            taskManagerConfiguration.getNumberSlots() > 0,
            "The number of slots has to be larger than 0.");

    this.taskManagerConfiguration = checkNotNull(taskManagerConfiguration);
    this.taskExecutorServices = checkNotNull(taskExecutorServices);
    this.haServices = checkNotNull(haServices);
    this.fatalErrorHandler = checkNotNull(fatalErrorHandler);
    this.partitionTracker = partitionTracker;
    this.taskManagerMetricGroup = checkNotNull(taskManagerMetricGroup);
    this.taskExecutorBlobService = checkNotNull(taskExecutorBlobService);
    this.metricQueryServiceAddress = metricQueryServiceAddress;
    this.externalResourceInfoProvider = checkNotNull(externalResourceInfoProvider);

    this.libraryCacheManager = taskExecutorServices.getLibraryCacheManager();
    this.taskSlotTable = taskExecutorServices.getTaskSlotTable();
    this.jobTable = taskExecutorServices.getJobTable();
    this.jobLeaderService = taskExecutorServices.getJobLeaderService();
    this.unresolvedTaskManagerLocation =
            taskExecutorServices.getUnresolvedTaskManagerLocation();
    this.localStateStoresManager = taskExecutorServices.getTaskManagerStateStore();
    this.changelogStoragesManager = taskExecutorServices.getTaskManagerChangelogManager();
    this.shuffleEnvironment = taskExecutorServices.getShuffleEnvironment();
    this.kvStateService = taskExecutorServices.getKvStateService();
    this.ioExecutor = taskExecutorServices.getIOExecutor();
    this.resourceManagerLeaderRetriever = haServices.getResourceManagerLeaderRetriever();

    this.hardwareDescription =
            HardwareDescription.extractFromSystem(taskExecutorServices.getManagedMemorySize());
    this.memoryConfiguration =
            TaskExecutorMemoryConfiguration.create(taskManagerConfiguration.getConfiguration());

    this.resourceManagerAddress = null;
    this.resourceManagerConnection = null;
    this.currentRegistrationTimeoutId = null;

    final ResourceID resourceId =
            taskExecutorServices.getUnresolvedTaskManagerLocation().getResourceID();
    // Create two heartbeat managers
    // Heartbeats between the TaskExecutor and the JobMaster
    this.jobManagerHeartbeatManager =
            createJobManagerHeartbeatManager(heartbeatServices, resourceId);
    // Heartbeats between the TaskExecutor and the ResourceManager
    this.resourceManagerHeartbeatManager =
            createResourceManagerHeartbeatManager(heartbeatServices, resourceId);

    ExecutorThreadFactory sampleThreadFactory =
            new ExecutorThreadFactory.Builder()
                    .setPoolName("flink-thread-info-sampler")
                    .build();
    ScheduledExecutorService sampleExecutor =
            Executors.newSingleThreadScheduledExecutor(sampleThreadFactory);
    this.threadInfoSampleService = new ThreadInfoSampleService(sampleExecutor);

    this.slotAllocationSnapshotPersistenceService =
            taskExecutorServices.getSlotAllocationSnapshotPersistenceService();
}

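The constructor first initializes a number of fields and then creates two heartbeat managers:

  1. The manager for heartbeats between the TaskExecutor and the JobMaster
  2. The manager for heartbeats between the TaskExecutor and the ResourceManager

Each of these is a HeartbeatManagerImpl object. The heartbeat manager created on the ResourceManager side is a HeartbeatManagerSenderImpl instead; as the name suggests, it is the one that sends heartbeat requests. It runs a periodic task that, every 10 s by default, iterates over all registered heartbeat targets and sends a heartbeat request to each of them.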

private HeartbeatManager<AllocatedSlotReport, TaskExecutorToJobManagerHeartbeatPayload>
        createJobManagerHeartbeatManager(
                HeartbeatServices heartbeatServices, ResourceID resourceId) {
    return heartbeatServices.createHeartbeatManager(
            resourceId, new JobManagerHeartbeatListener(), getMainThreadExecutor(), log);
}

HeartbeatServices.java

public <I, O> HeartbeatManager<I, O> createHeartbeatManager(
        ResourceID resourceId,
        HeartbeatListener<I, O> heartbeatListener,
        ScheduledExecutor mainThreadExecutor,
        Logger log) {

    return new HeartbeatManagerImpl<>(
            heartbeatTimeout,
            failedRpcRequestsUntilUnreachable,
            resourceId,
            heartbeatListener,
            mainThreadExecutor,
            log);
}
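
The sender side described earlier (a periodic task that walks all registered targets) can be sketched like this. It is a simplified model, not HeartbeatManagerSenderImpl itself: the HeartbeatTarget interface, string IDs, and method names are assumptions for illustration, and timeout tracking is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class HeartbeatSenderSketch {
    // Hypothetical target: something that can be asked for a heartbeat.
    interface HeartbeatTarget {
        void requestHeartbeat(String senderId);
    }

    private final String ownId;
    private final Map<String, HeartbeatTarget> targets = new ConcurrentHashMap<>();

    HeartbeatSenderSketch(String ownId) {
        this.ownId = ownId;
    }

    void monitorTarget(String targetId, HeartbeatTarget target) {
        targets.put(targetId, target);
    }

    void unmonitorTarget(String targetId) {
        targets.remove(targetId);
    }

    // One heartbeat round: ask every registered target for a heartbeat.
    void requestHeartbeats() {
        for (HeartbeatTarget target : targets.values()) {
            target.requestHeartbeat(ownId);
        }
    }

    // Repeats the round at a fixed interval on the given executor.
    ScheduledFuture<?> start(ScheduledExecutorService executor, long intervalMillis) {
        return executor.scheduleAtFixedRate(
                this::requestHeartbeats, 0L, intervalMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        HeartbeatSenderSketch sender = new HeartbeatSenderSketch("resource-manager");
        sender.monitorTarget("tm-1", from -> System.out.println("tm-1 asked by " + from));
        sender.monitorTarget("tm-2", from -> System.out.println("tm-2 asked by " + from));
        sender.requestHeartbeats(); // a single round, triggered by hand
    }
}
```

The receiving side, which is the role the TaskExecutor plays here, does not need its own periodic task: it only answers requests and watches for the sender going silent.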

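That completes the TaskExecutor initialization.

Summary:
TaskExecutor initialization flow:

  1. A TaskManagerRunner is built first; calling its start method carries out the preparation for the TaskManager and then starts it
  2. Inside TaskManagerRunner, a TaskManagerServices object is initialized to set up the base services the TaskExecutor needs
  3. Inside TaskManagerServices, base services such as the IO manager, ShuffleEnvironment, state manager and TaskSlotTable are initialized first
  4. Once the base services are ready, the TaskExecutor itself is initialized. It creates two heartbeat managers, one for the JobMaster and one for the ResourceManager. Because TaskExecutor extends RpcEndpoint, it has the lifecycle method onStart
  5. TaskExecutor initialization is complete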

TaskExecutor Startup Flow

Back on the main line, we continue with how the TaskExecutor starts.
Earlier we saw that Flink starts the TaskExecutor service through taskExecutorService.start() inside TaskManagerRunner.start.
We start digging from that method.

TaskManagerRunner.java

public interface TaskExecutorService extends AutoCloseableAsync {
    void start();

    CompletableFuture<Void> getTerminationFuture();
}

This is a nested interface. Its implementing class is TaskExecutorToServiceAdapter.

TaskExecutorToServiceAdapter.java

public void start() {
    taskExecutor.start();
}

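In the end this calls TaskExecutor's start method, which in turn triggers the TaskExecutor's onStart lifecycle method.
Recall the TaskManager initialization flow described earlier: it ends with the TaskExecutor being initialized, and this is the point where the TaskExecutor is started and onStart runs.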

TaskExecutor.java

public void onStart() throws Exception {
    try {
        // Start the worker node's services; this also registers them
        startTaskExecutorServices();
    } catch (Throwable t) {
        final TaskManagerException exception =
                new TaskManagerException(
                        String.format("Could not start the TaskExecutor %s", getAddress()), t);
        onFatalError(exception);
        throw exception;
    }
    // Start the registration timeout; if the registration above succeeds, stopRegistrationTimeout is called back
    startRegistrationTimeout();
}

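Two things happen here:

  1. The worker node's services are started and registered
  2. The registration timeout service is started

We look at the registration timeout service first.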

private void startRegistrationTimeout() {
    final Duration maxRegistrationDuration =
            taskManagerConfiguration.getMaxRegistrationDuration();

    if (maxRegistrationDuration != null) {
        final UUID newRegistrationTimeoutId = UUID.randomUUID();
        currentRegistrationTimeoutId = newRegistrationTimeoutId;
        // Submit an asynchronous delayed task; it runs when the time is up unless it has been invalidated
        scheduleRunAsync(
                () -> registrationTimeout(newRegistrationTimeoutId), maxRegistrationDuration);
    }
}

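This registration timeout service submits an asynchronous delayed task, which can be thought of as a countdown. If registration succeeds, stopRegistrationTimeout is called to disarm the countdown; if registration has still not succeeded within the allowed time, the timeout method runs and raises a fatal error.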

private void registrationTimeout(@Nonnull UUID registrationTimeoutId) {
    if (registrationTimeoutId.equals(currentRegistrationTimeoutId)) {
        final Duration maxRegistrationDuration =
                taskManagerConfiguration.getMaxRegistrationDuration();
        // Registration timed out: raise a fatal error
        onFatalError(
                new RegistrationTimeoutException(
                        String.format(
                                "Could not register at the ResourceManager within the specified maximum "
                                        + "registration duration %s. This indicates a problem with this instance. Terminating now.",
                                maxRegistrationDuration)));
    }
}
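
The ID comparison in registrationTimeout is what makes the countdown safe to leave scheduled. A minimal sketch of that guard, assuming that stopping the timeout simply clears the current ID; the class and method names are hypothetical and the scheduling itself is left to the caller:

```java
import java.util.UUID;

class RegistrationTimeoutSketch {
    private UUID currentTimeoutId; // null when no timeout is armed
    private boolean fatal;

    // Arms a new timeout and returns its ID; the caller would schedule onTimeout(id)
    // to run after the maximum registration duration.
    UUID startTimeout() {
        UUID id = UUID.randomUUID();
        currentTimeoutId = id;
        return id;
    }

    // Called when registration succeeds: the pending countdown becomes a no-op.
    void stopTimeout() {
        currentTimeoutId = null;
    }

    // Runs when a countdown elapses; only the most recently armed one may fail the process.
    void onTimeout(UUID timeoutId) {
        if (timeoutId.equals(currentTimeoutId)) {
            fatal = true; // the real code reports a fatal error here
        }
    }

    boolean isFatal() {
        return fatal;
    }

    public static void main(String[] args) {
        RegistrationTimeoutSketch sketch = new RegistrationTimeoutSketch();
        UUID id = sketch.startTimeout();
        sketch.stopTimeout(); // registration succeeded in time
        sketch.onTimeout(id); // stale countdown fires later and is ignored
        System.out.println("fatal: " + sketch.isFatal());
    }
}
```

With this design the scheduled task never has to be cancelled explicitly: a stale countdown still fires, but its ID no longer matches and it does nothing.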

Next we look at startTaskExecutorServices, the method onStart uses to start the worker node's services.

TaskExecutor.java

private void startTaskExecutorServices() throws Exception {
    try {
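        // Monitor the ResourceManager
        /*
         * 1. Obtain the ResourceManager's address and register a listener for it
         * 2. Once the address is known, the TaskExecutor being started can register
         * 3. After registering, it receives a registration response
         * 4. If registration succeeds:
         *      keep a heartbeat with the ResourceManager
         *      report its slot resources
         */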
        // start by connecting to the ResourceManager
        resourceManagerLeaderRetriever.start(new ResourceManagerLeaderListener());
        
        // tell the task slot table who's responsible for the task slot actions
        // Start the taskSlotTable
        taskSlotTable.start(new SlotActionsImpl(), getMainThreadExecutor());

        // start the job leader service
        // Monitor the JobMaster
        jobLeaderService.start(
                getAddress(), getRpcService(), haServices, new JobLeaderListenerImpl());
        // File cache service
        fileCache =
                new FileCache(
                        taskManagerConfiguration.getTmpDirectories(),
                        taskExecutorBlobService.getPermanentBlobService());

        tryLoadLocalAllocationSnapshots();
    } catch (Exception e) {
        handleStartTaskExecutorServicesException(e);
    }
}

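This method starts several core components of the TaskExecutor:

  1. The service that monitors the ResourceManager
  2. The taskSlotTable service
  3. The service that monitors the JobMaster
  4. The file cache service

Once these four steps are done, the TaskManager is up.
Below we go through the first three in turn.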

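Monitoring the ResourceManager

Starting the listener service

We start from the resourceManagerLeaderRetriever.start method.
As the name suggests, it retrieves the address of the ResourceManager leader and registers a listener for changes to it.
Once the address is known, the TaskExecutor registers itself with the ResourceManager. If registration does not succeed within the maximum registration duration, a fatal error is reported and the JVM is shut down. If it succeeds, the TaskExecutor keeps up a heartbeat with the ResourceManager and reports its own slot resources to it.
Now for the source code.
Look up DefaultLeaderRetrievalService, the implementation of the LeaderRetrievalService interface.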

DefaultLeaderRetrievalService.java

public void start(LeaderRetrievalListener listener) throws Exception {
    checkNotNull(listener, "Listener must not be null.");
    Preconditions.checkState(
            leaderListener == null,
            "DefaultLeaderRetrievalService can " + "only be started once.");

    synchronized (lock) {
        // Set the leader listener
        leaderListener = listener;
        // Everything that has to look up leader information (e.g. in ZooKeeper) is wrapped in a LeaderRetrievalDriver
        leaderRetrievalDriver =
                leaderRetrievalDriverFactory.createLeaderRetrievalDriver(
                        this, new LeaderRetrievalFatalErrorHandler());
        LOG.info("Starting DefaultLeaderRetrievalService with {}.", leaderRetrievalDriver);

        running = true;
    }
}

This method stores the listener and wraps the retrieval of leader information from ZooKeeper in a LeaderRetrievalDriver object.
Let's step into createLeaderRetrievalDriver and pick the ZooKeeper implementation.

ZooKeeperLeaderRetrievalDriverFactory.java

public ZooKeeperLeaderRetrievalDriver createLeaderRetrievalDriver(
        LeaderRetrievalEventHandler leaderEventHandler, FatalErrorHandler fatalErrorHandler)
        throws Exception {
        return new ZooKeeperLeaderRetrievalDriver(
                client,
                retrievalPath,
                leaderEventHandler,
                leaderInformationClearancePolicy,
                fatalErrorHandler);
}

public ZooKeeperLeaderRetrievalDriver(
            CuratorFramework client,
            String path,
            LeaderRetrievalEventHandler leaderRetrievalEventHandler,
            LeaderInformationClearancePolicy leaderInformationClearancePolicy,
            FatalErrorHandler fatalErrorHandler)
            throws Exception {
        // CuratorFramework is the client of Curator, a ZooKeeper library; it wraps a ZooKeeper client internally
        this.client = checkNotNull(client, "CuratorFramework client");
        // The Curator cache created below plays the role of a ZooKeeper Watcher
        this.connectionInformationPath = ZooKeeperUtils.generateConnectionInformationPath(path);
        this.cache =
                ZooKeeperUtils.createTreeCache(
                        client,
                        connectionInformationPath,
                        this::retrieveLeaderInformationFromZooKeeper);

        this.leaderRetrievalEventHandler = checkNotNull(leaderRetrievalEventHandler);
        this.leaderInformationClearancePolicy = leaderInformationClearancePolicy;
        this.fatalErrorHandler = checkNotNull(fatalErrorHandler);

        // Start listening
        // The cache keeps a local copy of the node data; when it differs from the data in ZooKeeper, the callback registered above is invoked
        cache.start();

        client.getConnectionStateListenable().addListener(connectionStateListener);

        running = true;
}

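Two things happen here:

  1. A Curator cache is created on the connection-information path. In the constructor shown it is built with ZooKeeperUtils.createTreeCache; the nodeChanged callback shown next belongs to the NodeCache-based variant of this class. Either way, the cache plays the role of a ZooKeeper watcher that observes changes to the znode's data
  2. Listening is started. The cache keeps a local copy of the znode data; when that copy differs from the data in ZooKeeper, the registered callback fires and ends up in retrieveLeaderInformationFromZooKeeper

Let's look at the nodeChanged method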

@Override
public void nodeChanged() {
        // Retrieve the leader information from ZooKeeper
        retrieveLeaderInformationFromZooKeeper();
}

private void retrieveLeaderInformationFromZooKeeper() {
        try {
                LOG.debug("Leader node has changed.");

                // Read the znode data
                final ChildData childData = cache.getCurrentData();

                // If there is data
                if (childData != null) {
                        final byte[] data = childData.getData();
                        if (data != null && data.length > 0) {
                                ByteArrayInputStream bais = new ByteArrayInputStream(data);
                                ObjectInputStream ois = new ObjectInputStream(bais);

                                final String leaderAddress = ois.readUTF();
                                final UUID leaderSessionID = (UUID) ois.readObject();
                                // Announce the address we obtained
                                leaderRetrievalEventHandler.notifyLeaderAddress(
                                        LeaderInformation.known(leaderSessionID, leaderAddress));
                                return;
                        }
                }
                // If there is no data, announce an empty LeaderInformation
                leaderRetrievalEventHandler.notifyLeaderAddress(LeaderInformation.empty());
        } catch (Exception e) {
                fatalErrorHandler.onFatalError(
                        new LeaderRetrievalException("Could not handle node changed event.", e));
                ExceptionUtils.checkInterrupted(e);
        }
}

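Three things happen here:

  1. Read the znode's data from the cache
  2. If there is data, call notifyLeaderAddress to pass on the leader address and session ID that were read
  3. If there is no data, still call notifyLeaderAddress, but with an empty LeaderInformation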
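
The read side above implies a byte layout: a UTF string written with writeUTF followed by a serialized UUID. The round trip below is a small sketch of that layout using only the JDK. The writer side is an assumption inferred from the reader, and LeaderInfoCodecSketch, Decoded and the address "rm-host:6123" are made-up names for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.UUID;

class LeaderInfoCodecSketch {
    /** Simple holder for the decoded pair. */
    static final class Decoded {
        final String address;
        final UUID sessionId;

        Decoded(String address, UUID sessionId) {
            this.address = address;
            this.sessionId = sessionId;
        }
    }

    /** Writes the address with writeUTF, then the session id with writeObject. */
    static byte[] encode(String leaderAddress, UUID leaderSessionId) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeUTF(leaderAddress);
            oos.writeObject(leaderSessionId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    /** Reads the pair back in the same order; returns null for missing or empty data. */
    static Decoded decode(byte[] data) {
        if (data == null || data.length == 0) {
            return null; // corresponds to the "empty" notification
        }
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            String address = ois.readUTF();
            UUID sessionId = (UUID) ois.readObject();
            return new Decoded(address, sessionId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        Decoded decoded = decode(encode("rm-host:6123", id));
        System.out.println(decoded.address + " / " + decoded.sessionId.equals(id));
    }
}
```

The order of the two reads must match the order of the two writes, which is why the address is read before the session ID.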

Next, the notifyLeaderAddress method

@Override
@GuardedBy("lock")
public void notifyLeaderAddress(LeaderInformation leaderInformation) {
        final UUID newLeaderSessionID = leaderInformation.getLeaderSessionID();
        final String newLeaderAddress = leaderInformation.getLeaderAddress();
        synchronized (lock) {
            if (running) {
                if (!Objects.equals(newLeaderAddress, lastLeaderAddress)
                        || !Objects.equals(newLeaderSessionID, lastLeaderSessionID)) {
                    if (LOG.isDebugEnabled()) {
                        if (newLeaderAddress == null && newLeaderSessionID == null) {
                            LOG.debug(
                                    "Leader information was lost: The listener will be notified accordingly.");
                        } else {
                            LOG.debug(
                                    "New leader information: Leader={}, session ID={}.",
                                    newLeaderAddress,
                                    newLeaderSessionID);
                        }
                    }

                    lastLeaderAddress = newLeaderAddress;
                    lastLeaderSessionID = newLeaderSessionID;

                    // Notify the listener only when the leader is truly changed.
                    // If this is the ResourceManager leader being retrieved, the call lands in the ResourceManagerLeaderListener implementation inside TaskExecutor
                    leaderListener.notifyLeaderAddress(newLeaderAddress, newLeaderSessionID);
                }
            } else {
                if (LOG.isDebugEnabled()) {
                    LOG.debug(
                            "Ignoring notification since the {} has already been closed.",
                            leaderRetrievalDriver);
                }
            }
        }
}
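
The change check in notifyLeaderAddress can be isolated into a few lines. The sketch below assumes a plain BiConsumer as the listener and leaves out the running flag and logging. LeaderChangeFilterSketch is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.UUID;
import java.util.function.BiConsumer;

class LeaderChangeFilterSketch {
    private final Object lock = new Object();
    private final BiConsumer<String, UUID> listener;
    private String lastAddress;
    private UUID lastSessionId;

    LeaderChangeFilterSketch(BiConsumer<String, UUID> listener) {
        this.listener = Objects.requireNonNull(listener);
    }

    /** Forwards the pair to the listener only if the address or the session id changed. */
    void notifyLeaderAddress(String newAddress, UUID newSessionId) {
        synchronized (lock) {
            if (!Objects.equals(newAddress, lastAddress)
                    || !Objects.equals(newSessionId, lastSessionId)) {
                lastAddress = newAddress;
                lastSessionId = newSessionId;
                listener.accept(newAddress, newSessionId);
            }
        }
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        LeaderChangeFilterSketch filter =
                new LeaderChangeFilterSketch((address, id) -> seen.add(String.valueOf(address)));
        UUID id = UUID.randomUUID();
        filter.notifyLeaderAddress("rm-1", id);
        filter.notifyLeaderAddress("rm-1", id); // duplicate, not forwarded
        filter.notifyLeaderAddress(null, null); // leader lost, forwarded
        System.out.println(seen);
    }
}
```

Objects.equals handles the null case, so losing the leader (both values null) counts as a change and is forwarded once.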

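The core line of notifyLeaderAddress is:

// Notify the listener only when the leader is truly changed.
// If this is the ResourceManager leader being retrieved, the call lands in the ResourceManagerLeaderListener implementation inside TaskExecutor
leaderListener.notifyLeaderAddress(newLeaderAddress, newLeaderSessionID);

This call hands the ResourceManager leader information to the listener. Since we are inside the TaskExecutor retrieving the ResourceManager's information, the implementation to follow is the ResourceManagerLeaderListener inside TaskExecutor.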

TaskExecutor.java

public void notifyLeaderAddress(final String leaderAddress, final UUID leaderSessionID) {
        // Listener callback: receives the leader address
        runAsync(
                () ->
                        notifyOfNewResourceManagerLeader(
                                leaderAddress,
                                ResourceManagerId.fromUuidOrNull(leaderSessionID)));
}

This is an asynchronous listener callback. Let's step into notifyOfNewResourceManagerLeader

private void notifyOfNewResourceManagerLeader(
            String newLeaderAddress, ResourceManagerId newResourceManagerId) {
        // Wrap the data read from ZooKeeper into the actual ResourceManager address
        resourceManagerAddress =
                createResourceManagerAddress(newLeaderAddress, newResourceManagerId);
        // Connect to this address
        // It is named "reconnect" because it is invoked every time the ResourceManager address changes:
        // first close the connection to the old ResourceManager, then open one to the new ResourceManager
        reconnectToResourceManager(
                new FlinkException(
                        String.format(
                                "ResourceManager leader changed to new address %s",
                                resourceManagerAddress)));
}

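The ResourceManager information read from ZooKeeper is wrapped into a ResourceManager address object, and the TaskExecutor then connects to that address.
The method name contains "reconnect" because it runs every time the ResourceManager address changes, re-establishing the connection each time.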

private void reconnectToResourceManager(Exception cause) {
        // Close the connection to the old ResourceManager
        closeResourceManagerConnection(cause);
        // Start the registration timeout timer
        startRegistrationTimeout();
        // Connect to the new ResourceManager
        tryConnectToResourceManager();
}

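Three things happen here:

  1. Close the connection to the old ResourceManager
  2. Start the registration timeout timer
  3. Connect to the new ResourceManager

At this point the listener service is running and the address of the ResourceManager leader has been obtained through ZooKeeper. Next come connecting to that node and registering with it.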

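Registering the TaskExecutor with the ResourceManager

Picking up where we left off: once the listener service is running, the TaskExecutor has the address of the ResourceManager leader. This section covers how the connection to that node is made.
Let's step into tryConnectToResourceManager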

TaskExecutor.java

private void tryConnectToResourceManager() {
        if (resourceManagerAddress != null) {
                connectToResourceManager();
        }
}

private void connectToResourceManager() {
        assert (resourceManagerAddress != null);
        assert (establishedResourceManagerConnection == null);
        assert (resourceManagerConnection == null);

        log.info("Connecting to ResourceManager {}.", resourceManagerAddress);

        // Bundle the worker's information to send to the master for registration; note that this object is not the one actually sent when registering
        final TaskExecutorRegistration taskExecutorRegistration =
                new TaskExecutorRegistration(
                        getAddress(),
                        getResourceID(),
                        unresolvedTaskManagerLocation.getDataPort(),
                        JMXService.getPort().orElse(-1),
                        hardwareDescription,
                        memoryConfiguration,
                        taskManagerConfiguration.getDefaultSlotResourceProfile(),
                        taskManagerConfiguration.getTotalResourceProfile(),
                        unresolvedTaskManagerLocation.getNodeId());
        // Create the connection to the ResourceManager
        resourceManagerConnection =
                new TaskExecutorToResourceManagerConnection(
                        log,
                        getRpcService(),
                        taskManagerConfiguration.getRetryingRegistrationConfiguration(),
                        resourceManagerAddress.getAddress(),
                        resourceManagerAddress.getResourceManagerId(),
                        getMainThreadExecutor(),
                        new ResourceManagerRegistrationListener(),
                        taskExecutorRegistration);
        // Start registering
        resourceManagerConnection.start();
}

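Three things happen here:

  1. Bundle the worker's information in preparation for registering with the master
  2. Create the connection to the ResourceManager
  3. Start the connection, which kicks off the registration

Let's step into resourceManagerConnection.start()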

public void start() {
        checkState(!closed, "The RPC connection is already closed");
        checkState(
                !isConnected() && pendingRegistration == null,
                "The RPC connection is already started");

        // Create the registration object
        final RetryingRegistration<F, G, S, R> newRegistration = createNewRegistration();

        if (REGISTRATION_UPDATER.compareAndSet(this, null, newRegistration)) {
                // Start registering; the callback that runs after registration completes is set up in createNewRegistration()
                newRegistration.startRegistration();
        } else {
                // concurrent start operation
                newRegistration.cancel();
        }
}

The method creates the object used for registration and then registers with it
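
The compareAndSet in start() decides which of several concurrent callers gets to run its registration attempt. The sketch below shows that pattern with the JDK's AtomicReferenceFieldUpdater. It leaves out the checkState calls so that a second call reaches the compare-and-set, and StartOnceSketch, Attempt and PENDING_UPDATER are illustrative names.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

class StartOnceSketch {
    /** Stand-in for a registration attempt. */
    static final class Attempt {
        volatile boolean started;
        volatile boolean cancelled;

        void start() {
            started = true;
        }

        void cancel() {
            cancelled = true;
        }
    }

    // The updater performs atomic operations on the volatile field named "pending".
    private static final AtomicReferenceFieldUpdater<StartOnceSketch, Attempt> PENDING_UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(StartOnceSketch.class, Attempt.class, "pending");

    private volatile Attempt pending;

    /** Only the caller that swaps null for its attempt starts it; any other attempt is cancelled. */
    Attempt start() {
        Attempt candidate = new Attempt();
        if (PENDING_UPDATER.compareAndSet(this, null, candidate)) {
            candidate.start();
        } else {
            candidate.cancel(); // another start() won the race
        }
        return candidate;
    }

    public static void main(String[] args) {
        StartOnceSketch connection = new StartOnceSketch();
        Attempt first = connection.start();
        Attempt second = connection.start();
        System.out.println("first started: " + first.started);
        System.out.println("second cancelled: " + second.cancelled);
    }
}
```

A field updater avoids allocating a separate AtomicReference per connection object, at the cost of naming the field by string.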

private RetryingRegistration<F, G, S, R> createNewRegistration() {
        // Build the registration object
        RetryingRegistration<F, G, S, R> newRegistration = checkNotNull(generateRegistration());

        CompletableFuture<RetryingRegistration.RetryingRegistrationResult<G, S, R>> future =
                newRegistration.getFuture();

        // Callback that runs once registration completes, whether it succeeded or failed
        future.whenCompleteAsync(
                (RetryingRegistration.RetryingRegistrationResult<G, S, R> result,
                        Throwable failure) -> {
                    // If registration failed
                    if (failure != null) {
                        // If it failed because the registration was cancelled
                        if (failure instanceof CancellationException) {
                            // Do not raise an error; only write a debug log
                            // we ignore cancellation exceptions because they originate from
                            // cancelling
                            // the RetryingRegistration
                            log.debug(
                                    "Retrying registration towards {} was cancelled.",
                                    targetAddress);
                        } else {
                            // If it failed for any other reason, invoke this method
                            // this future should only ever fail if there is a bug, not if the
                            // registration is declined
                            onRegistrationFailure(failure);
                        }
                    } else {
                        // Registration succeeded
                        if (result.isSuccess()) {
                            targetGateway = result.getGateway();
                            // Invoke this callback
                            onRegistrationSuccess(result.getSuccess());
                        } else if (result.isRejection()) {
                            onRegistrationRejection(result.getRejection());
                        } else {
                            throw new IllegalArgumentException(
                                    String.format(
                                            "Unknown retrying registration response: %s.", result));
                        }
                    }
                },
                executor);

        return newRegistration;
}

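The main points here:

  1. Build the registration object
  2. Attach a callback that runs once registration completes
  3. In the callback, if registration failed because it was cancelled, no error is raised; only a debug log is written
  4. In the callback, if registration failed for any other reason, onRegistrationFailure is invoked
  5. In the callback, if registration succeeded, onRegistrationSuccess is invoked
  6. In the callback, if registration was rejected, onRegistrationRejection is invoked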
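
The four outcomes listed above can be reproduced with a plain CompletableFuture. This is a sketch rather than Flink's code: it uses handle instead of whenCompleteAsync so that the chosen branch can be returned as a string, and RegistrationOutcomeSketch and Result are made-up types.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;

class RegistrationOutcomeSketch {
    /** Hypothetical result: a completed registration is either accepted or rejected. */
    static final class Result {
        final boolean success;

        private Result(boolean success) {
            this.success = success;
        }

        static Result success() {
            return new Result(true);
        }

        static Result rejection() {
            return new Result(false);
        }
    }

    /** Returns the name of the branch the completion callback would take. */
    static String classify(CompletableFuture<Result> future) {
        return future
                .handle(
                        (result, failure) -> {
                            if (failure != null) {
                                if (failure instanceof CancellationException) {
                                    return "cancelled"; // ignored apart from a debug log
                                }
                                return "failure"; // would call onRegistrationFailure
                            }
                            // would call onRegistrationSuccess or onRegistrationRejection
                            return result.success ? "success" : "rejection";
                        })
                .join();
    }

    public static void main(String[] args) {
        CompletableFuture<Result> cancelled = new CompletableFuture<>();
        cancelled.cancel(true);
        CompletableFuture<Result> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("bug"));

        System.out.println(classify(CompletableFuture.completedFuture(Result.success())));
        System.out.println(classify(CompletableFuture.completedFuture(Result.rejection())));
        System.out.println(classify(cancelled));
        System.out.println(classify(failed));
    }
}
```

Note that a rejection arrives as a normal result, not as an exception, which is why it is handled in the non-failure branch.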

First, how the registration object is built. Step into generateRegistration:

@Override
protected RetryingRegistration<
                ResourceManagerId,
                ResourceManagerGateway,
                TaskExecutorRegistrationSuccess,
                TaskExecutorRegistrationRejection>
            generateRegistration() {
        // Create the actual registration object
        return new TaskExecutorToResourceManagerConnection.ResourceManagerRegistration(
                log,
                rpcService,
                getTargetAddress(),
                getTargetLeaderId(),
                retryingRegistrationConfiguration,
                taskExecutorRegistration);
}

Once the registration object has been initialized, it is used to carry out the registration with the ResourceManager.
Look up the newRegistration.startRegistration method

public void startRegistration() {
        if (canceled) {
            // we already got canceled
            return;
        }

        try {
            // trigger resolution of the target address to a callable gateway
            final CompletableFuture<G> rpcGatewayFuture;

            // The RpcGateway here acts as a reference to the master (like an ActorRef); the registration below goes through it
            if (FencedRpcGateway.class.isAssignableFrom(targetType)) {
                rpcGatewayFuture =
                        (CompletableFuture<G>)
                                rpcService.connect(
                                        targetAddress,
                                        fencingToken,
                                        targetType.asSubclass(FencedRpcGateway.class));
            } else {
                // Gateways that are not fenced connect without a fencing token
                rpcGatewayFuture = rpcService.connect(targetAddress, targetType);
            }

            // upon success, start the registration attempts
            // Once the connection is established and the RpcGateway has been obtained
            CompletableFuture<Void> rpcGatewayAcceptFuture =
                    // Register asynchronously
                    rpcGatewayFuture.thenAcceptAsync(
                            (G rpcGateway) -> {
                                // Register through this gateway reference
                                log.info("Resolved {} address, beginning registration", targetName);
                                register(
                                        rpcGateway,
                                        1,
                                        retryingRegistrationConfiguration
                                                .getInitialRegistrationTimeoutMillis());
                            },
                            rpcService.getExecutor());

            // upon failure, retry, unless this is cancelled
            // Callback for the asynchronous registration step
            rpcGatewayAcceptFuture.whenCompleteAsync(
                    (Void v, Throwable failure) -> {
                        // If it failed and the registration was not cancelled manually
                        if (failure != null && !canceled) {
                            final Throwable strippedFailure =
                                    ExceptionUtils.stripCompletionException(failure);
                            if (log.isDebugEnabled()) {
                                log.debug(
                                        "Could not resolve {} address {}, retrying in {} ms.",
                                        targetName,
                                        targetAddress,
                                        retryingRegistrationConfiguration.getErrorDelayMillis(),
                                        strippedFailure);
                            } else {
                                log.info(
                                        "Could not resolve {} address {}, retrying in {} ms: {}",
                                        targetName,
                                        targetAddress,
                                        retryingRegistrationConfiguration.getErrorDelayMillis(),
                                        strippedFailure.getMessage());
                            }

                            // On failure, try again after a delay; the delay is set via cluster.registration.error-delay (default 10 s)
                            startRegistrationLater(
                                    retryingRegistrationConfiguration.getErrorDelayMillis());
                        }
                    },
                    rpcService.getExecutor());
        } catch (Throwable t) {
            completionFuture.completeExceptionally(t);
            cancel();
        }
}
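
The overall shape of startRegistration is: try to connect, continue on success, and schedule another attempt after a fixed delay on failure. The sketch below reproduces that shape with JDK classes only. The flaky connect supplier, the String gateway and the name RetryingConnectSketch are assumptions made for the illustration; the real method goes on to call register(...) where the sketch simply completes a future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class RetryingConnectSketch {
    private final ScheduledExecutorService scheduler;
    private final Supplier<CompletableFuture<String>> connect;
    private final long errorDelayMillis;
    final CompletableFuture<String> completion = new CompletableFuture<>();
    private volatile boolean canceled;

    RetryingConnectSketch(
            ScheduledExecutorService scheduler,
            Supplier<CompletableFuture<String>> connect,
            long errorDelayMillis) {
        this.scheduler = scheduler;
        this.connect = connect;
        this.errorDelayMillis = errorDelayMillis;
    }

    void start() {
        if (canceled) {
            return;
        }
        try {
            connect.get()
                    .whenCompleteAsync(
                            (gateway, failure) -> {
                                if (failure == null) {
                                    completion.complete(gateway); // stands in for register(...)
                                } else if (!canceled) {
                                    // retry after the error delay
                                    scheduler.schedule(
                                            this::start, errorDelayMillis, TimeUnit.MILLISECONDS);
                                }
                            },
                            scheduler);
        } catch (Throwable t) {
            completion.completeExceptionally(t);
            canceled = true;
        }
    }

    /** Runs the sketch against a connect call that fails a given number of times first. */
    static int attemptsUntilConnected(int failuresBeforeSuccess, long errorDelayMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger attempts = new AtomicInteger();
        Supplier<CompletableFuture<String>> flakyConnect =
                () -> {
                    CompletableFuture<String> future = new CompletableFuture<>();
                    if (attempts.incrementAndGet() <= failuresBeforeSuccess) {
                        future.completeExceptionally(new IllegalStateException("not resolvable yet"));
                    } else {
                        future.complete("gateway");
                    }
                    return future;
                };
        try {
            RetryingConnectSketch sketch =
                    new RetryingConnectSketch(scheduler, flakyConnect, errorDelayMillis);
            sketch.start();
            sketch.completion.join(); // wait until a connect attempt succeeds
            return attempts.get();
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("attempts: " + attemptsUntilConnected(2, 10));
    }
}
```

With two initial failures the third attempt connects, so the helper reports three attempts.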

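That is as far as we go on the registration flow. Interested readers can dig deeper on their own, but be warned that the logic from here on is rather convoluted.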

Starting the taskSlotTable

TaskSlotTableImpl.java

public void start(
        SlotActions initialSlotActions, ComponentMainThreadExecutor mainThreadExecutor) {
        Preconditions.checkState(
                state == State.CREATED,
                "The %s has to be just created before starting",
                TaskSlotTableImpl.class.getSimpleName());
        this.slotActions = Preconditions.checkNotNull(initialSlotActions);
        this.mainThreadExecutor = Preconditions.checkNotNull(mainThreadExecutor);

        timerService.start(this);
        // Update the state flag
        state = State.RUNNING;
}

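The logic here is simple: check the state, store the slot actions and the main-thread executor, start the timer service, and set the state flag to RUNNING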
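
The state check at the top of start is a small lifecycle guard. A minimal sketch of the same guard, with an illustrative class name and a plain IllegalStateException in place of Preconditions.checkState:

```java
class LifecycleGuardSketch {
    enum State {
        CREATED,
        RUNNING,
        CLOSED
    }

    private State state = State.CREATED;

    /** Allows the transition CREATED -> RUNNING exactly once. */
    void start() {
        if (state != State.CREATED) {
            throw new IllegalStateException("The component has to be just created before starting");
        }
        state = State.RUNNING;
    }

    State state() {
        return state;
    }

    public static void main(String[] args) {
        LifecycleGuardSketch guard = new LifecycleGuardSketch();
        guard.start();
        System.out.println(guard.state());
        try {
            guard.start(); // a second start is rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```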

Monitoring the JobMaster

DefaultJobLeaderService.java

public void start(
        final String initialOwnerAddress,
        final RpcService initialRpcService,
        final HighAvailabilityServices initialHighAvailabilityServices,
        final JobLeaderListener initialJobLeaderListener) {

        if (DefaultJobLeaderService.State.CREATED != state) {
            throw new IllegalStateException("The service has already been started.");
        } else {
            LOG.info("Start job leader service.");

            this.ownerAddress = Preconditions.checkNotNull(initialOwnerAddress);
            this.rpcService = Preconditions.checkNotNull(initialRpcService);
            this.highAvailabilityServices =
                    Preconditions.checkNotNull(initialHighAvailabilityServices);
            this.jobLeaderListener = Preconditions.checkNotNull(initialJobLeaderListener);
            state = DefaultJobLeaderService.State.STARTED;
        }
}

Likewise, this only checks the state, stores the arguments, and updates the state.
At this point the TaskExecutor has finished starting up.

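Summary:
The startup of resourceManagerLeaderRetriever sets the following in motion:

  1. The TaskExecutor registers with the ResourceManager
  2. The TaskExecutor keeps up a heartbeat with the ResourceManager
  3. The TaskExecutor reports its slots to the ResourceManager

With that, the analysis of the Flink job submission flow is complete!
That's a wrap!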
