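Related chapter: Flink 1.13 Source Code Analysis: TaskManager Startup Process, Initializing the TaskExecutor
Related chapter: Flink 1.13 Source Code Analysis: Overview of the TaskManager Startup Process
Related chapter: Flink 1.13 Source Code Analysis: JobManager Startup Process, Starting the ResourceManager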
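Preface
When we analyzed the startup of the ResourceManager, we saw that one of its steps starts two heartbeat services: one maintains the heartbeat between the ResourceManager and the JobMaster, the other maintains the heartbeat between the ResourceManager and the TaskManager. In the previous chapter on TaskManager startup, the TaskManager likewise started two heartbeat services: one between the TaskManager and the JobMaster, and one between the TaskManager and the ResourceManager. In this chapter, we follow the heartbeat interaction flow to analyze Flink's heartbeat mechanism.
I. The ResourceManager Starts Its Heartbeat Services
When the ResourceManager starts, it starts two heartbeat services. Let's go to the startHeartbeatServices method of the ResourceManager class: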
private void startHeartbeatServices() {
    // TODO ResourceManager (master node) maintains the heartbeat with the worker nodes
    // TODO ResourceManager (logical JobManager) maintains the heartbeat with the TaskExecutor (TaskManager)
    taskManagerHeartbeatManager =
            heartbeatServices.createHeartbeatManagerSender(
                    resourceId,
                    new TaskManagerHeartbeatListener(),
                    getMainThreadExecutor(),
                    log);
    // TODO ResourceManager maintains the heartbeat with the JobMaster (the job's controlling program)
    jobManagerHeartbeatManager =
            heartbeatServices.createHeartbeatManagerSender(
                    resourceId,
                    new JobManagerHeartbeatListener(),
                    getMainThreadExecutor(),
                    log);
}
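As the variable names suggest, these two heartbeat services are:
1. The heartbeat service between the ResourceManager and the TaskManager
2. The heartbeat service between the ResourceManager and the JobMaster
Taking the heartbeat service between the ResourceManager and the TaskManager as an example, let's step into heartbeatServices.createHeartbeatManagerSender: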
public <I, O> HeartbeatManager<I, O> createHeartbeatManagerSender(
        ResourceID resourceId,
        HeartbeatListener<I, O> heartbeatListener,
        ScheduledExecutor mainThreadExecutor,
        Logger log) {
    // TODO Build the sender-side heartbeat manager
    return new HeartbeatManagerSenderImpl<>(
            heartbeatInterval,
            heartbeatTimeout,
            resourceId,
            heartbeatListener,
            mainThreadExecutor,
            log);
}
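As we can see, a heartbeat sender object is built here. Let's step into the constructor of HeartbeatManagerSenderImpl. The six-argument constructor called above delegates to the seven-argument one shown below, which additionally takes a HeartbeatMonitor factory: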
HeartbeatManagerSenderImpl(
        long heartbeatPeriod,
        long heartbeatTimeout,
        ResourceID ownResourceID,
        HeartbeatListener<I, O> heartbeatListener,
        ScheduledExecutor mainThreadExecutor,
        Logger log,
        HeartbeatMonitor.Factory<O> heartbeatMonitorFactory) {
    super(
            heartbeatTimeout,
            ownResourceID,
            heartbeatListener,
            mainThreadExecutor,
            log,
            heartbeatMonitorFactory);
    this.heartbeatPeriod = heartbeatPeriod;
    // TODO The executor schedules this.run(); since the delay is 0L, it runs immediately
    mainThreadExecutor.schedule(this, 0L, TimeUnit.MILLISECONDS);
}
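This code was covered in the earlier chapter on ResourceManager startup; let's review it here. The constructor runs for the first time when the ResourceManager starts. It submits a delayed task to the main-thread executor, but because the delay is 0, the run method of this is executed right away. Let's look at run: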
@Override
public void run() {
    if (!stopped) {
        log.debug("Trigger heartbeat request.");
        for (HeartbeatMonitor<O> heartbeatMonitor : getHeartbeatTargets().values()) {
            // TODO Send a heartbeat RPC request to every registered worker node, each wrapped in a HeartbeatMonitor
            requestHeartbeat(heartbeatMonitor);
        }
        // After heartbeatPeriod (10s by default), run this.run() again, so the for loop above
        // repeats every 10s and the heartbeat continues indefinitely
        getMainThreadExecutor().schedule(this, heartbeatPeriod, TimeUnit.MILLISECONDS);
    }
}
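Here the method iterates over all nodes registered with the ResourceManager. Each node is wrapped in a HeartbeatMonitor object and kept in a collection. When the code first reaches this point, it iterates over all monitors and triggers one heartbeat request for each registered node. After all requests have been sent, it schedules a delayed task that calls this same run method again after heartbeatPeriod (10 seconds by default). In other words, every 10 seconds a heartbeat request is sent to every registered node, and this self-rescheduling is what keeps the heartbeat going indefinitely.
At this point, however, no TaskManager has registered yet, so the collection of heartbeat targets contains no wrapped TaskExecutor. In the previous chapter we saw that the TaskExecutor registers with the ResourceManager during startup, so let's go back and look at some of the steps in that registration.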
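The self-rescheduling loop described in this section can be sketched outside of Flink with a plain ScheduledExecutorService. This is only a minimal illustration under stated assumptions: HeartbeatSenderSketch, countRequests, and the "tm-1" id are hypothetical names, a Runnable stands in for the heartbeat RPC, and none of this is Flink's actual implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a sender-side heartbeat manager; not Flink's class. */
final class HeartbeatSenderSketch implements Runnable {
    private final long periodMs;
    private final ScheduledExecutorService executor;
    // Target id -> action that "sends" a heartbeat request to that target.
    private final Map<String, Runnable> targets = new ConcurrentHashMap<>();
    private volatile boolean stopped;

    HeartbeatSenderSketch(long periodMs, ScheduledExecutorService executor) {
        this.periodMs = periodMs;
        this.executor = executor;
        // Delay 0: the first round runs as soon as the executor picks it up.
        executor.schedule(this, 0L, TimeUnit.MILLISECONDS);
    }

    /** Targets may be added at any time; the next round picks them up. */
    void monitorTarget(String id, Runnable requestHeartbeat) {
        targets.putIfAbsent(id, requestHeartbeat);
    }

    void stop() {
        stopped = true;
    }

    @Override
    public void run() {
        if (!stopped) {
            for (Runnable request : targets.values()) {
                request.run();
            }
            // Re-submit ourselves: this is what turns one round into a loop.
            executor.schedule(this, periodMs, TimeUnit.MILLISECONDS);
        }
    }

    /** Runs a sender with one target for runForMs and returns how many requests it sent. */
    static int countRequests(long periodMs, long runForMs) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger requests = new AtomicInteger();
        HeartbeatSenderSketch sender = new HeartbeatSenderSketch(periodMs, executor);
        sender.monitorTarget("tm-1", requests::incrementAndGet);
        try {
            Thread.sleep(runForMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            sender.stop();
            executor.shutdownNow();
        }
        return requests.get();
    }

    public static void main(String[] args) {
        System.out.println("requests sent: " + countRequests(20, 400));
    }
}
```

Note that the first round may find the target map empty, just as the ResourceManager's first round does before any TaskManager has registered; the target is simply picked up by a later round.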
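II. The TaskManager Registers with the ResourceManager's Heartbeat Service
In the previous chapter we saw that when the TaskExecutor registers with the ResourceManager, it obtains the ResourceManager's proxy (gateway) object and calls registerTaskExecutor on it, which triggers the ResourceManager's own registerTaskExecutor method. Let's look at that method: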
@Override
public CompletableFuture<RegistrationResponse> registerTaskExecutor(
        final TaskExecutorRegistration taskExecutorRegistration, final Time timeout) {
    // TODO Obtain the TaskExecutor's proxy (gateway) in preparation for replying to the registration
    CompletableFuture<TaskExecutorGateway> taskExecutorGatewayFuture =
            getRpcService()
                    .connect(
                            taskExecutorRegistration.getTaskExecutorAddress(),
                            TaskExecutorGateway.class);
    taskExecutorGatewayFutures.put(
            taskExecutorRegistration.getResourceId(), taskExecutorGatewayFuture);
    return taskExecutorGatewayFuture.handleAsync(
            (TaskExecutorGateway taskExecutorGateway, Throwable throwable) -> {
                final ResourceID resourceId = taskExecutorRegistration.getResourceId();
                if (taskExecutorGatewayFuture == taskExecutorGatewayFutures.get(resourceId)) {
                    taskExecutorGatewayFutures.remove(resourceId);
                    if (throwable != null) {
                        return new RegistrationResponse.Failure(throwable);
                    } else {
                        // TODO The actual registration logic
                        return registerTaskExecutorInternal(
                                taskExecutorGateway, taskExecutorRegistration);
                    }
                } else {
                    log.debug(
                            "Ignoring outdated TaskExecutorGateway connection for {}.",
                            resourceId.getStringWithMetadata());
                    return new RegistrationResponse.Failure(
                            new FlinkException("Decline outdated task executor registration."));
                }
            },
            getMainThreadExecutor());
}
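The identity check on taskExecutorGatewayFutures in registerTaskExecutor above is worth isolating: only the most recent connection attempt for a given resource ID may complete the registration. The sketch below reproduces that pattern with plain CompletableFuture objects. It is an illustration only: LatestRegistrationSketch, the string results, and the gateway names are hypothetical, and it uses handle on a single thread where Flink uses handleAsync on the main-thread executor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of "only the latest attempt wins"; not Flink's class. */
final class LatestRegistrationSketch {
    // Worker id -> the most recent pending connection attempt for that worker.
    private final Map<String, CompletableFuture<String>> pending = new HashMap<>();

    /** The returned future reports whether this attempt was still the latest one when it completed. */
    CompletableFuture<String> register(String workerId, CompletableFuture<String> gatewayFuture) {
        pending.put(workerId, gatewayFuture); // overwrites any older attempt
        return gatewayFuture.handle(
                (gateway, failure) -> {
                    // Identity comparison: is this future still the one stored in the map?
                    if (gatewayFuture == pending.get(workerId)) {
                        pending.remove(workerId);
                        return failure != null ? "failed" : "accepted:" + gateway;
                    }
                    return "declined-outdated";
                });
    }

    /** Two attempts for the same worker; the older one completes first. */
    static String[] demo() {
        LatestRegistrationSketch rm = new LatestRegistrationSketch();
        CompletableFuture<String> first = new CompletableFuture<>();
        CompletableFuture<String> second = new CompletableFuture<>();
        CompletableFuture<String> firstResult = rm.register("tm-1", first);
        CompletableFuture<String> secondResult = rm.register("tm-1", second);
        first.complete("gateway-A");
        second.complete("gateway-B");
        return new String[] {firstResult.join(), secondResult.join()};
    }

    public static void main(String[] args) {
        String[] results = demo();
        System.out.println(results[0]); // prints "declined-outdated"
        System.out.println(results[1]); // prints "accepted:gateway-B"
    }
}
```

Because the second call to register overwrites the map entry, the first future no longer matches when it completes and is declined, which corresponds to the "Decline outdated task executor registration." branch.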
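In registerTaskExecutor, the ResourceManager first obtains the TaskExecutor's proxy object so that it can reply with the registration response. The actual registration happens in registerTaskExecutorInternal, so let's step into it: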
private RegistrationResponse registerTaskExecutorInternal(
        TaskExecutorGateway taskExecutorGateway,
        TaskExecutorRegistration taskExecutorRegistration) {
    // TODO The ResourceID of the TaskExecutor
    ResourceID taskExecutorResourceId = taskExecutorRegistration.getResourceId();
    // TODO Look up the TaskExecutor's registration; if one exists, it registered before and must be replaced
    WorkerRegistration<WorkerType> oldRegistration =
            taskExecutors.remove(taskExecutorResourceId);
    // TODO If there is an old registration
    if (oldRegistration != null) {
        // TODO :: suggest old taskExecutor to stop itself
        log.debug(
                "Replacing old registration of TaskExecutor {}.",
                taskExecutorResourceId.getStringWithMetadata());
        // TODO First unregister the old TaskManager, then register the new one
        // remove old task manager registration from slot manager
        slotManager.unregisterTaskManager(
                oldRegistration.getInstanceID(),
                new ResourceManagerException(
                        String.format(
                                "TaskExecutor %s re-connected to the ResourceManager.",
                                taskExecutorResourceId.getStringWithMetadata())));
    }
    final WorkerType newWorker = workerStarted(taskExecutorResourceId);
    String taskExecutorAddress = taskExecutorRegistration.getTaskExecutorAddress();
    if (newWorker == null) {
        log.warn(
                "Discard registration from TaskExecutor {} at ({}) because the framework did "
                        + "not recognize it",
                taskExecutorResourceId.getStringWithMetadata(),
                taskExecutorAddress);
        return new TaskExecutorRegistrationRejection(
                "The ResourceManager does not recognize this TaskExecutor.");
    } else {
        // Create the registration object
        WorkerRegistration<WorkerType> registration =
                new WorkerRegistration<>(
                        taskExecutorGateway,
                        newWorker,
                        taskExecutorRegistration.getDataPort(),
                        taskExecutorRegistration.getJmxPort(),
                        taskExecutorRegistration.getHardwareDescription(),
                        taskExecutorRegistration.getMemoryConfiguration(),
                        taskExecutorRegistration.getTotalResourceProfile(),
                        taskExecutorRegistration.getDefaultSlotResourceProfile());
        log.info(
                "Registering TaskManager with ResourceID {} ({}) at ResourceManager",
                taskExecutorResourceId.getStringWithMetadata(),
                taskExecutorAddress);
        // TODO Complete the registration; taskExecutors is a map from ResourceID to registration object
        taskExecutors.put(taskExecutorResourceId, registration);
        // TODO The registration logic is complete at this point
        // TODO The TaskManager heartbeat manager stores the ResourceID of the registered TaskExecutor
        // TODO together with the heartbeat target that wraps this TaskExecutor
        taskManagerHeartbeatManager.monitorTarget(
                taskExecutorResourceId,
                new HeartbeatTarget<Void>() {
                    @Override
                    public void receiveHeartbeat(ResourceID resourceID, Void payload) {
                        // the ResourceManager will always send heartbeat requests to the
                        // TaskManager
                    }

                    @Override
                    public void requestHeartbeat(ResourceID resourceID, Void payload) {
                        // TODO ResourceManager sends a heartbeat RPC request to the TaskExecutor
                        taskExecutorGateway.heartbeatFromResourceManager(resourceID);
                    }
                });
        // TODO Return the registration-success message to the TaskExecutor
        return new TaskExecutorRegistrationSuccess(
                registration.getInstanceID(), resourceId, clusterInformation);
    }
}
Let's focus on this part of the code:
// TODO The TaskManager heartbeat manager stores the ResourceID of the registered TaskExecutor
// TODO together with the heartbeat target that wraps this TaskExecutor
taskManagerHeartbeatManager.monitorTarget(
        taskExecutorResourceId,
        new HeartbeatTarget<Void>() {
            @Override
            public void receiveHeartbeat(ResourceID resourceID, Void payload) {
                // the ResourceManager will always send heartbeat requests to the
                // TaskManager
            }

            @Override
            public void requestHeartbeat(ResourceID resourceID, Void payload) {
                // TODO ResourceManager sends a heartbeat RPC request to the TaskExecutor
                taskExecutorGateway.heartbeatFromResourceManager(resourceID);
            }
        });
Let's step into monitorTarget, choosing the HeartbeatManagerImpl implementation:
@Override
public void monitorTarget(ResourceID resourceID, HeartbeatTarget<O> heartbeatTarget) {
    if (!stopped) {
        if (heartbeatTargets.containsKey(resourceID)) {
            log.debug(
                    "The target with resource ID {} is already been monitored.",
                    resourceID.getStringWithMetadata());
        } else {
            // TODO Create a HeartbeatMonitor from the HeartbeatTarget and register it in the heartbeatTargets map
            HeartbeatMonitor<O> heartbeatMonitor =
                    heartbeatMonitorFactory.createHeartbeatMonitor(
                            resourceID,
                            heartbeatTarget,
                            mainThreadExecutor,
                            heartbeatListener,
                            heartbeatTimeoutIntervalMs);
            // TODO Add it to the collection of heartbeat targets
            heartbeatTargets.put(resourceID, heartbeatMonitor);
            // check if we have stopped in the meantime (concurrent stop operation)
            // TODO If this HeartbeatManagerImpl has been stopped, cancel the heartbeat timeout task
            if (stopped) {
                heartbeatMonitor.cancel();
                heartbeatTargets.remove(resourceID);
            }
        }
    }
}
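As we can see, the ResourceManager's TaskManager heartbeat manager wraps the newly registered worker node in a HeartbeatMonitor and adds it to its own collection of heartbeat targets. Nothing is sent at this moment. Once the monitor is in the collection, the next execution of the sender's run method picks it up and calls the requestHeartbeat method of the HeartbeatTarget that was passed in one level up: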
@Override
public void requestHeartbeat(ResourceID resourceID, Void payload) {
    // TODO ResourceManager sends a heartbeat RPC request to the TaskExecutor
    taskExecutorGateway.heartbeatFromResourceManager(resourceID);
}
This calls heartbeatFromResourceManager on the TaskExecutor's proxy object to send the heartbeat request. Let's look at that method, choosing the TaskExecutor implementation:
@Override
public void heartbeatFromResourceManager(ResourceID resourceID) {
    // TODO The TaskExecutor receives the heartbeat request sent by the ResourceManager
    resourceManagerHeartbeatManager.requestHeartbeat(resourceID, null);
}
This triggers requestHeartbeat on the TaskExecutor's ResourceManager heartbeat manager. Let's look at that method, choosing the HeartbeatManagerImpl implementation:
@Override
public void requestHeartbeat(final ResourceID requestOrigin, I heartbeatPayload) {
    if (!stopped) {
        log.debug("Received heartbeat request from {}.", requestOrigin);
        // TODO Report the heartbeat
        // TODO When the TaskExecutor calls this, it is recording for itself the time of its
        // TODO most recent heartbeat with the ResourceManager
        final HeartbeatTarget<O> heartbeatTarget = reportHeartbeat(requestOrigin);
        if (heartbeatTarget != null) {
            if (heartbeatPayload != null) {
                heartbeatListener.reportPayload(requestOrigin, heartbeatPayload);
            }
            // TODO Reply to the master node's heartbeat and report the load (payload)
            heartbeatTarget.receiveHeartbeat(
                    getOwnResourceID(), heartbeatListener.retrievePayload(requestOrigin));
        }
    }
}
Here, reportHeartbeat is called first to record the heartbeat time. Let's step into it:
HeartbeatTarget<O> reportHeartbeat(ResourceID resourceID) {
    if (heartbeatTargets.containsKey(resourceID)) {
        HeartbeatMonitor<O> heartbeatMonitor = heartbeatTargets.get(resourceID);
        // TODO Record the heartbeat
        // TODO On a worker node replying to the master's heartbeat, this HeartbeatMonitor represents the master node
        // TODO On the master node handling a worker's reply, this HeartbeatMonitor represents that worker node
        heartbeatMonitor.reportHeartbeat();
        return heartbeatMonitor.getHeartbeatTarget();
    } else {
        return null;
    }
}
Next, let's step into heartbeatMonitor.reportHeartbeat():
@Override
public void reportHeartbeat() {
    // TODO Record the time of the last heartbeat
    lastHeartbeat = System.currentTimeMillis();
    // TODO Reset the heartbeat timeout
    resetHeartbeatTimeout(heartbeatTimeoutIntervalMs);
}
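As we can see, it first records the current timestamp and then passes the timeout interval heartbeatTimeoutIntervalMs (not the timestamp) to resetHeartbeatTimeout. Let's step into that method: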
void resetHeartbeatTimeout(long heartbeatTimeout) {
    // TODO Check whether the state of this HeartbeatMonitor is RUNNING
    if (state.get() == State.RUNNING) {
        // TODO First cancel the pending timeout task
        cancelTimeout();
        // TODO Schedule the delayed timeout task again
        futureTimeout =
                scheduledExecutor.schedule(this, heartbeatTimeout, TimeUnit.MILLISECONDS);
        // Double check for concurrent accesses (e.g. a firing of the scheduled future)
        if (state.get() != State.RUNNING) {
            cancelTimeout();
        }
    }
}
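As we can see, three things happen here:
1. Check the running state of the current heartbeat monitor.
2. Cancel the currently pending delayed timeout task.
3. Schedule a new delayed timeout task. If no further heartbeat arrives within heartbeatTimeout, this task fires and the monitor's own run method is executed.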
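The cancel-and-reschedule pattern from these three steps can be tried in isolation. The following is a minimal sketch under stated assumptions: TimeoutMonitorSketch and its methods are hypothetical names, a boolean flag replaces Flink's state machine and listener callback, and the timings are arbitrary values chosen for the demo.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a heartbeat timeout that is pushed back by every heartbeat. */
final class TimeoutMonitorSketch implements Runnable {
    private final ScheduledExecutorService executor;
    private final long timeoutMs;
    private ScheduledFuture<?> futureTimeout;
    private long lastHeartbeat;
    private boolean timedOut;

    TimeoutMonitorSketch(ScheduledExecutorService executor, long timeoutMs) {
        this.executor = executor;
        this.timeoutMs = timeoutMs;
        resetTimeout(); // the countdown starts as soon as the target is monitored
    }

    synchronized void reportHeartbeat() {
        lastHeartbeat = System.currentTimeMillis();
        resetTimeout();
    }

    private synchronized void resetTimeout() {
        if (!timedOut) {                              // 1. only while still running
            if (futureTimeout != null) {
                futureTimeout.cancel(true);           // 2. cancel the pending timeout
            }
            futureTimeout =                           // 3. schedule a fresh one
                    executor.schedule(this, timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    /** Fires only if no heartbeat arrived for timeoutMs. */
    @Override
    public synchronized void run() {
        timedOut = true;
    }

    synchronized boolean isTimedOut() {
        return timedOut;
    }

    synchronized long lastHeartbeat() {
        return lastHeartbeat;
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns {timed out while heartbeats kept arriving, timed out after they stopped}. */
    static boolean[] demo() {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        try {
            TimeoutMonitorSketch monitor = new TimeoutMonitorSketch(executor, 500);
            for (int i = 0; i < 5; i++) {
                sleep(50);
                monitor.reportHeartbeat(); // each heartbeat pushes the deadline back
            }
            boolean whileAlive = monitor.isTimedOut();
            sleep(1500); // silence for longer than the timeout
            return new boolean[] {whileAlive, monitor.isTimedOut()};
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("timed out while alive: " + result[0]);
        System.out.println("timed out after silence: " + result[1]);
    }
}
```

The design point is that the monitor never polls: liveness is expressed entirely by whether the single pending timeout task gets cancelled before it fires.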
Let's return to requestHeartbeat and look at how the TaskExecutor replies to the ResourceManager's heartbeat:
// TODO Reply to the master node's heartbeat and report the load (payload)
heartbeatTarget.receiveHeartbeat(
        getOwnResourceID(), heartbeatListener.retrievePayload(requestOrigin));
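On the TaskExecutor side, this heartbeatTarget wraps the ResourceManager's gateway, so the reply travels over RPC and ends up in the ResourceManager's heartbeat manager. Let's step into receiveHeartbeat, choosing the HeartbeatManagerImpl implementation: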
@Override
public void receiveHeartbeat(ResourceID heartbeatOrigin, I heartbeatPayload) {
    if (!stopped) {
        log.debug("Received heartbeat from {}.", heartbeatOrigin);
        // TODO The heartbeat report from the TaskExecutor has been received
        reportHeartbeat(heartbeatOrigin);
        // TODO If the payload reported by the TaskExecutor this time is null, keep using the previous report
        // TODO If it is not null, record it
        if (heartbeatPayload != null) {
            heartbeatListener.reportPayload(heartbeatOrigin, heartbeatPayload);
        }
    }
}
In this method, the ResourceManager first records the heartbeat report via reportHeartbeat. Let's step into it:
HeartbeatTarget<O> reportHeartbeat(ResourceID resourceID) {
    if (heartbeatTargets.containsKey(resourceID)) {
        HeartbeatMonitor<O> heartbeatMonitor = heartbeatTargets.get(resourceID);
        // TODO Record the heartbeat
        // TODO On a worker node replying to the master's heartbeat, this HeartbeatMonitor represents the master node
        // TODO On the master node handling a worker's reply, this HeartbeatMonitor represents that worker node
        heartbeatMonitor.reportHeartbeat();
        return heartbeatMonitor.getHeartbeatTarget();
    } else {
        return null;
    }
}
We have arrived at this method again: reportHeartbeat updates the last heartbeat time, which we won't go through a second time. Back to the calling code:
// TODO If the payload reported by the TaskExecutor this time is null, keep using the previous report
// TODO If it is not null, record it
if (heartbeatPayload != null) {
    heartbeatListener.reportPayload(heartbeatOrigin, heartbeatPayload);
}
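Here the TaskExecutor also reports its load (the heartbeat payload) to the ResourceManager. If the payload in this report is null, the ResourceManager keeps using the result of the previous report for this node. Let's step into reportPayload, choosing the implementation inside ResourceManager: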
@Override
public void reportPayload(
        final ResourceID resourceID, final TaskExecutorHeartbeatPayload payload) {
    validateRunsInMainThread();
    // TODO Look up the TaskExecutor's registration
    final WorkerRegistration<WorkerType> workerRegistration = taskExecutors.get(resourceID);
    if (workerRegistration == null) {
        log.debug(
                "Received slot report from TaskManager {} which is no longer registered.",
                resourceID.getStringWithMetadata());
    } else {
        InstanceID instanceId = workerRegistration.getInstanceID();
        // TODO Report the slot status of the TaskExecutor
        slotManager.reportSlotStatus(instanceId, payload.getSlotReport());
        clusterPartitionTracker.processTaskExecutorClusterPartitionReport(
                resourceID, payload.getClusterPartitionReport());
    }
}
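This brings us to the resource registration step covered in the previous chapter. The method first looks up the TaskExecutor's WorkerRegistration to obtain its InstanceID, then reports the resources through the slotManager. Let's step into SlotManager's reportSlotStatus method, choosing the SlotManagerImpl implementation: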
@Override
public boolean reportSlotStatus(InstanceID instanceId, SlotReport slotReport) {
    checkInit();
    TaskManagerRegistration taskManagerRegistration = taskManagerRegistrations.get(instanceId);
    if (null != taskManagerRegistration) {
        LOG.debug("Received slot report from instance {}: {}.", instanceId, slotReport);
        // TODO Report the status of every slot of the TaskExecutor
        for (SlotStatus slotStatus : slotReport) {
            // TODO Update the slot status
            updateSlot(
                    slotStatus.getSlotID(),
                    slotStatus.getAllocationID(),
                    slotStatus.getJobID());
        }
        return true;
    } else {
        LOG.debug(
                "Received slot report for unknown task manager with instance id {}. Ignoring this report.",
                instanceId);
        return false;
    }
}
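Here the method iterates over every slot reported by the TaskExecutor and updates each slot's status through updateSlot. This was covered in the previous chapter, so we won't repeat it. At this point, the first heartbeat round after the TaskManager has started and registered is complete.
After registration, the TaskManager never sends heartbeats to the ResourceManager on its own initiative. Instead, it replies whenever a heartbeat request from the ResourceManager arrives, and includes its own load information in the reply. Because the TaskManager was wrapped in a heartbeat monitor during registration and stored in the ResourceManager's TaskManager heartbeat manager, the ResourceManager's heartbeat service, as described at the beginning, keeps iterating over all TaskManager monitors and sending heartbeat requests. Let's return to the run method of the heartbeat sender that we saw at the start: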
@Override
public void run() {
    if (!stopped) {
        log.debug("Trigger heartbeat request.");
        for (HeartbeatMonitor<O> heartbeatMonitor : getHeartbeatTargets().values()) {
            // TODO Send a heartbeat RPC request to every registered worker node, each wrapped in a HeartbeatMonitor
            requestHeartbeat(heartbeatMonitor);
        }
        // After heartbeatPeriod (10s by default), run this.run() again, so the for loop above
        // repeats every 10s and the heartbeat continues indefinitely
        getMainThreadExecutor().schedule(this, heartbeatPeriod, TimeUnit.MILLISECONDS);
    }
}
Let's step into the requestHeartbeat method called inside the loop:
private void requestHeartbeat(HeartbeatMonitor<O> heartbeatMonitor) {
    O payload = getHeartbeatListener().retrievePayload(heartbeatMonitor.getHeartbeatTargetId());
    final HeartbeatTarget<O> heartbeatTarget = heartbeatMonitor.getHeartbeatTarget();
    // TODO Send the heartbeat request to the target
    heartbeatTarget.requestHeartbeat(getOwnResourceID(), payload);
}
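This retrieves the payload to send and the heartbeat target wrapped by the monitor held in the ResourceManager's TaskManager heartbeat manager, then calls the target's requestHeartbeat method. Let's look at that method: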
@Override
public void requestHeartbeat(ResourceID resourceID, Void payload) {
    // TODO ResourceManager sends a heartbeat RPC request to the TaskExecutor
    taskExecutorGateway.heartbeatFromResourceManager(resourceID);
}
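We are back here again. The rest of the flow has already been covered above, so we won't repeat it. This concludes the heartbeat interaction between the ResourceManager and the TaskManager.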
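To tie the pieces of this section together, here is a small single-threaded model of one complete round trip: the sender asks, the receiver records the request and replies with its payload, and the sender records the reply. It is a sketch under stated assumptions, not Flink code: HeartbeatRoundTripSketch, Manager, Target, triggerRound, and the integer "slot count" payload are all hypothetical, and direct method calls stand in for the RPC gateways.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical in-memory model of a request/reply heartbeat round; not Flink's classes. */
final class HeartbeatRoundTripSketch {

    /** The two callbacks a heartbeat target exposes to its peer. */
    interface Target<P> {
        void requestHeartbeat(String origin, P payload);

        void receiveHeartbeat(String origin, P payload);
    }

    /** One endpoint: I is the payload type it receives, O the payload type it sends. */
    static final class Manager<I, O> {
        final String ownId;
        final Supplier<O> ownPayload;
        final Map<String, Target<O>> targets = new HashMap<>();
        final Map<String, Integer> heartbeatsSeen = new HashMap<>();
        final Map<String, I> lastPayload = new HashMap<>();

        Manager(String ownId, Supplier<O> ownPayload) {
            this.ownId = ownId;
            this.ownPayload = ownPayload;
        }

        void monitorTarget(String id, Target<O> target) {
            targets.putIfAbsent(id, target); // an already monitored id is left untouched
        }

        /** Sender side: ask every monitored target for a heartbeat. */
        void triggerRound() {
            for (Target<O> target : targets.values()) {
                target.requestHeartbeat(ownId, ownPayload.get());
            }
        }

        /** Receiver side: record the request, then reply with our own payload. */
        void requestHeartbeat(String origin, I payload) {
            Target<O> target = targets.get(origin);
            if (target != null) {
                record(origin, payload);
                target.receiveHeartbeat(ownId, ownPayload.get());
            }
        }

        /** Sender side again: the reply has arrived. */
        void receiveHeartbeat(String origin, I payload) {
            if (targets.containsKey(origin)) {
                record(origin, payload);
            }
        }

        private void record(String origin, I payload) {
            heartbeatsSeen.merge(origin, 1, Integer::sum);
            if (payload != null) {
                lastPayload.put(origin, payload); // a null payload keeps the previous report
            }
        }
    }

    /** Returns {heartbeats the RM saw from tm-1, heartbeats tm-1 saw from the RM, last payload}. */
    static int[] demo() {
        Manager<Integer, Void> rm = new Manager<>("rm", () -> null);
        Manager<Void, Integer> tm = new Manager<>("tm-1", () -> 4); // 4 = pretend slot count

        rm.monitorTarget("tm-1", new Target<Void>() {
            public void requestHeartbeat(String origin, Void payload) {
                tm.requestHeartbeat(origin, payload); // stands in for the RPC to the worker
            }

            public void receiveHeartbeat(String origin, Void payload) {
                // the master only sends requests to this target
            }
        });
        tm.monitorTarget("rm", new Target<Integer>() {
            public void requestHeartbeat(String origin, Integer payload) {
                // the worker never initiates a heartbeat
            }

            public void receiveHeartbeat(String origin, Integer payload) {
                rm.receiveHeartbeat(origin, payload); // stands in for the RPC back to the master
            }
        });

        for (int i = 0; i < 3; i++) {
            rm.triggerRound();
        }
        return new int[] {
            rm.heartbeatsSeen.get("tm-1"), tm.heartbeatsSeen.get("rm"), rm.lastPayload.get("tm-1")
        };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "3 3 4"
    }
}
```

Both endpoints use the same Manager class and differ only in which of the two Target callbacks they leave empty, which mirrors how a single heartbeat manager implementation serves both the requesting and the replying side in the walkthrough above.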
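Summary
Flink's heartbeat mechanism differs from that of HDFS. Once the Flink source code analysis is finished, I plan to analyze the YARN and HDFS source code, and we will look at the differences in detail then. For now, here is a brief outline:
1. Flink's heartbeat: the ResourceManager starts first and then starts a periodic task that sends heartbeat requests to all heartbeat targets. When a TaskExecutor comes online and registers successfully, a HeartbeatMonitor is created for it and added to the collection of heartbeat targets, after which the ResourceManager sends heartbeat requests to all TaskExecutors alike. When a TaskExecutor receives a heartbeat request, it updates its last heartbeat time and resets its heartbeat timeout task. If the timeout fires, the TaskExecutor initiates a connection to a new ResourceManager.
2. HDFS's heartbeat: the NameNode starts first and then starts a timeout-checking service. DataNodes register after they start, and once registration succeeds, each DataNode runs a periodic heartbeat task. In this model, it is the worker node, the DataNode, that takes the initiative!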