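Starting Flink mainly means starting the JobManager process and the TaskManager processes. This chapter summarizes the TaskManager startup flow:
TaskManager startup flow:
The TaskManager startup class is org.apache.flink.runtime.taskexecutor.TaskManagerRunner
Configuring TaskManager startup:
TaskManager entry point: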
public static void main(String[] args) throws Exception {
    // startup checks and logging
    EnvironmentInformation.logEnvironmentInfo(LOG, "TaskManager", args);
    SignalHandler.register(LOG);
    JvmShutdownSafeguard.installAsShutdownHook(LOG);
    long maxOpenFileHandles = EnvironmentInformation.getOpenFileHandlesLimit();
    if (maxOpenFileHandles != -1L) {
        LOG.info("Maximum number of open file descriptors is {}.", maxOpenFileHandles);
    } else {
        LOG.info("Cannot determine the maximum number of open file descriptors");
    }
    runTaskManagerSecurely(args, ResourceID.generate());
}
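The file-descriptor check in main can be approximated with the JDK alone. The following is a minimal sketch, not Flink's EnvironmentInformation code: it assumes the limit can be read reflectively from com.sun.management.UnixOperatingSystemMXBean and falls back to -1 when it cannot, which matches the -1L sentinel that main tests for. The class name OpenFileLimitSketch is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.reflect.Method;

// Hypothetical helper class, for illustration only.
class OpenFileLimitSketch {

    /** Returns the maximum number of open file descriptors, or -1 if it cannot be determined. */
    static long getOpenFileHandlesLimit() {
        OperatingSystemMXBean osBean = ManagementFactory.getOperatingSystemMXBean();
        try {
            // The Unix-specific bean only exists on some JVMs and platforms, so look it up reflectively.
            Class<?> unixBean = Class.forName("com.sun.management.UnixOperatingSystemMXBean");
            if (unixBean.isInstance(osBean)) {
                Method method = unixBean.getMethod("getMaxFileDescriptorCount");
                return (Long) method.invoke(osBean);
            }
        } catch (Throwable t) {
            // Any failure means the limit is unknown; fall through to the sentinel.
        }
        return -1L;
    }

    public static void main(String[] args) {
        long limit = getOpenFileHandlesLimit();
        if (limit != -1L) {
            System.out.println("Maximum number of open file descriptors is " + limit + ".");
        } else {
            System.out.println("Cannot determine the maximum number of open file descriptors");
        }
    }
}
```

Returning a sentinel instead of throwing keeps the startup path simple: the caller only logs the value and never depends on it.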
TaskManager startup:
public static void runTaskManagerSecurely(String[] args, ResourceID resourceID) {
    try {
        final Configuration configuration = loadConfiguration(args);
        final PluginManager pluginManager = PluginUtils.createPluginManagerFromRootFolder(configuration);
        FileSystem.initialize(configuration, pluginManager);
        SecurityUtils.install(new SecurityConfiguration(configuration));
        SecurityUtils.getInstalledContext().runSecured(() -> {
            runTaskManager(configuration, resourceID, pluginManager);
            return null;
        });
    } catch (Throwable t) {
        // ... (the catch body is cut off in this excerpt)
    }
}
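The `return null;` inside the lambda passed to runSecured looks odd at first. It is there because the work is handed over as a java.util.concurrent.Callable, whose call() method must return a value. The sketch below illustrates the pattern with plain JDK types. The SecurityContext interface, NoOpSecurityContext, and RunSecuredSketch here are simplified stand-ins written for this example, and the no-op context is an assumption about what an unsecured setup might look like.

```java
import java.util.concurrent.Callable;

// Hypothetical class, for illustration only.
class RunSecuredSketch {

    /** Simplified stand-in: a context that runs a unit of work under some security setup. */
    interface SecurityContext {
        <T> T runSecured(Callable<T> securedCallable) throws Exception;
    }

    /** A context that adds no security and simply runs the callable. */
    static final class NoOpSecurityContext implements SecurityContext {
        @Override
        public <T> T runSecured(Callable<T> securedCallable) throws Exception {
            return securedCallable.call();
        }
    }

    static int started = 0;

    /** Placeholder for the real startup work. */
    static void runTaskManager() {
        started++;
    }

    /** Runs the startup work inside the given context; returns true on success. */
    static boolean startSecurely(SecurityContext context) {
        try {
            context.runSecured(() -> {
                runTaskManager();
                return null; // Callable.call() is not void, so a value must be returned
            });
            return true;
        } catch (Throwable t) {
            System.err.println("startup failed: " + t);
            return false;
        }
    }

    public static void main(String[] args) {
        startSecurely(new NoOpSecurityContext());
        System.out.println("started = " + started); // prints "started = 1"
    }
}
```

Wrapping the startup call this way lets the same code path run under different security setups, since only the context implementation changes.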
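Initializing services:
TaskManagerRunner is the executable entry point for the task manager in YARN or standalone mode. It constructs the related components (network, I/O manager, memory manager, RPC service, HA service) and starts them.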
public TaskManagerRunner(Configuration configuration, ResourceID resourceId, PluginManager pluginManager) throws Exception {
    this.configuration = checkNotNull(configuration);
    this.resourceId = checkNotNull(resourceId);
    timeout = AkkaUtils.getTimeoutAsTime(configuration);
    this.executor = java.util.concurrent.Executors.newScheduledThreadPool(
        Hardware.getNumberCPUCores(),
        new ExecutorThreadFactory("taskmanager-future"));
    // Gives access to all services needed for a highly available setup: registries, distributed counters, and leader election
    highAvailabilityServices = HighAvailabilityServicesUtils.createHighAvailabilityServices(
        configuration,
        executor,
        HighAvailabilityServicesUtils.AddressResolution.NO_ADDRESS_RESOLUTION);
    // The RPC service starts Akka actors that receive RPC invocations made through an RpcGateway
    rpcService = createRpcService(configuration, highAvailabilityServices);
    HeartbeatServices heartbeatServices = HeartbeatServices.fromConfiguration(configuration);
    // metricRegistry: keeps track of all registered metrics and connects MetricGroup with MetricReporter
    metricRegistry = new MetricRegistryImpl(
        MetricRegistryConfiguration.fromConfiguration(configuration),
        ReporterSetup.fromConfiguration(configuration, pluginManager));
    final RpcService metricQueryServiceRpcService = MetricUtils.startRemoteMetricsRpcService(configuration, rpcService.getAddress());
    metricRegistry.startQueryService(metricQueryServiceRpcService, resourceId);
    // The BLOB cache provides access to the BLOB service for permanent and transient BLOBs
    blobCacheService = new BlobCacheService(
        configuration, highAvailabilityServices.createBlobStore(), null
    );
    // Provides information about external resources
    final ExternalResourceInfoProvider externalResourceInfoProvider =
        ExternalResourceUtils.createStaticExternalResourceInfoProvider(
            ExternalResourceUtils.getExternalResourceAmountMap(configuration),
            ExternalResourceUtils.externalResourceDriversFromConfig(configuration, pluginManager));
    // Create the TaskExecutor, which is responsible for executing multiple tasks
    taskManager = startTaskManager(
        this.configuration,
        this.resourceId,
        rpcService,
        highAvailabilityServices,
        heartbeatServices,
        metricRegistry,
        blobCacheService,
        false,
        externalResourceInfoProvider,
        this);
    this.terminationFuture = new CompletableFuture<>();
    this.shutdown = false;
    MemoryLogger.startIfConfigured(LOG, configuration, terminationFuture);
}
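The first thing the constructor builds is a scheduled thread pool sized to the number of CPU cores, with a thread factory that carries the name "taskmanager-future". A rough JDK-only equivalent is sketched below. It assumes the factory names its threads "<prefix>-thread-<n>" and marks them as daemon threads. That naming scheme, the namedThreadFactory helper, and the NamedExecutorSketch class are assumptions for illustration, not Flink's ExecutorThreadFactory or Hardware classes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class, for illustration only.
class NamedExecutorSketch {

    /** Thread factory that names threads "<prefix>-thread-<n>" and marks them as daemons (assumed scheme). */
    static ThreadFactory namedThreadFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger(1);
        return runnable -> {
            Thread thread = new Thread(runnable, prefix + "-thread-" + counter.getAndIncrement());
            thread.setDaemon(true);
            return thread;
        };
    }

    /** One pool thread per available CPU core, similar in spirit to the constructor above. */
    static ScheduledExecutorService createExecutor() {
        int cores = Runtime.getRuntime().availableProcessors();
        return Executors.newScheduledThreadPool(cores, namedThreadFactory("taskmanager-future"));
    }

    /** Submits a task and reports the name of the pool thread that ran it. */
    static String nameOfWorkerThread() {
        ScheduledExecutorService executor = createExecutor();
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return executor.submit(task).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("task did not complete", e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOfWorkerThread());
    }
}
```

Named threads make thread dumps and log lines much easier to attribute, which matters because this one executor is shared by several of the services created afterwards.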
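Service details:
highAvailabilityServices:
Gives access to all services needed for a highly available setup, including registries, distributed counters, and leader election.
rpcService:
The RPC service starts Akka actors that receive RPC invocations made through an RpcGateway.
HeartbeatServices:
metricRegistry:
Keeps track of all registered metrics and serves as the connection between MetricGroup and MetricReporter.
blobCacheService:
Provides access to the BLOB service for permanent and transient BLOBs.
TaskExecutor:
Responsible for executing multiple tasks.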
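To make the rpcService description more concrete, here is a minimal sketch of the general idea: callers program against a gateway interface, and each call is forwarded to the endpoint's own thread. This is not Flink's implementation. It swaps Akka actors for a single-threaded executor and uses a JDK dynamic proxy. GreeterGateway, GreeterEndpoint, connect, and the thread name "rpc-main-thread" are all hypothetical names made up for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical class, for illustration only.
class RpcGatewaySketch {

    /** The interface that callers program against. */
    interface GreeterGateway {
        String greet(String name);
    }

    /** The object that actually does the work. */
    static final class GreeterEndpoint implements GreeterGateway {
        final List<String> handledOn = new ArrayList<>();

        @Override
        public String greet(String name) {
            handledOn.add(Thread.currentThread().getName());
            return "hello " + name;
        }
    }

    /** Returns a proxy that forwards every gateway call to the endpoint's single main thread. */
    static <G> G connect(Class<G> gatewayType, G endpoint, ExecutorService mainThread) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Each invocation becomes a task for the main thread, standing in for an actor's mailbox.
            Callable<Object> call = () -> method.invoke(endpoint, args);
            Future<Object> result = mainThread.submit(call);
            return result.get();
        };
        Object proxy = Proxy.newProxyInstance(
            gatewayType.getClassLoader(), new Class<?>[] {gatewayType}, handler);
        return gatewayType.cast(proxy);
    }

    static String demo() {
        ExecutorService mainThread = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "rpc-main-thread");
            thread.setDaemon(true);
            return thread;
        });
        try {
            GreeterEndpoint endpoint = new GreeterEndpoint();
            GreeterGateway gateway = connect(GreeterGateway.class, endpoint, mainThread);
            return gateway.greet("flink") + " (handled on " + endpoint.handledOn.get(0) + ")";
        } finally {
            mainThread.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hello flink (handled on rpc-main-thread)"
    }
}
```

Because every call runs on the same thread, the endpoint needs no locks around its own state, which is the main appeal of the actor-style design.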