Checkpoint Call Chain Analysis
JobMaster.triggerSavepoint
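When JobMaster triggers a savepoint, a checkpoint is started. JobMaster delegates the work to schedulerNG, an instance of SchedulerNG, the interface for scheduling Flink jobs.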
// Parameters: the savepoint target directory, whether to cancel the job afterwards,
// and a timeout, which is not forwarded to the scheduler
@Override
public CompletableFuture<String> triggerSavepoint(
        @Nullable final String targetDirectory, final boolean cancelJob, final Time timeout) {
    return schedulerNG.triggerSavepoint(targetDirectory, cancelJob);
}
Next, let's look at SchedulerBase, one implementation of SchedulerNG.
SchedulerBase.triggerSavepoint
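Main flow:
- Get the checkpoint coordinator, checkpointCoordinator, from executionGraph
- If the job is to be cancelled, stop the periodic checkpoint scheduler
- Trigger one savepoint
- If the savepoint failed and the job was to be cancelled, restart the checkpoint scheduler; in either case rethrow the exception
- If the savepoint succeeded and the job is to be cancelled, cancel the job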
@Override
public CompletableFuture<String> triggerSavepoint(
        final String targetDirectory, final boolean cancelJob) {
    mainThreadExecutor.assertRunningInMainThread();

    // Get the checkpointCoordinator
    final CheckpointCoordinator checkpointCoordinator =
            executionGraph.getCheckpointCoordinator();
    if (checkpointCoordinator == null) {
        throw new IllegalStateException(
                String.format("Job %s is not a streaming job.", jobGraph.getJobID()));
    } else if (targetDirectory == null
            && !checkpointCoordinator.getCheckpointStorage().hasDefaultSavepointLocation()) {
        log.info(
                "Trying to cancel job {} with savepoint, but no savepoint directory configured.",
                jobGraph.getJobID());

        throw new IllegalStateException(
                "No savepoint directory configured. You can either specify a directory "
                        + "while cancelling via -s :targetDirectory or configure a cluster-wide "
                        + "default via key '"
                        + CheckpointingOptions.SAVEPOINT_DIRECTORY.key()
                        + "'.");
    }

    log.info(
            "Triggering {}savepoint for job {}.",
            cancelJob ? "cancel-with-" : "",
            jobGraph.getJobID());

    // When the job is to be cancelled, stop the periodic checkpoint scheduler first
    if (cancelJob) {
        stopCheckpointScheduler();
    }

    // Run the savepoint, which is in effect an aligned checkpoint, then return
    // the path the savepoint was written to
    return checkpointCoordinator
            .triggerSavepoint(targetDirectory)
            .thenApply(CompletedCheckpoint::getExternalPointer)
            .handleAsync(
                    (path, throwable) -> {
                        if (throwable != null) {
                            if (cancelJob) {
                                startCheckpointScheduler();
                            }
                            throw new CompletionException(throwable);
                        } else if (cancelJob) {
                            log.info(
                                    "Savepoint stored in {}. Now cancelling {}.",
                                    path,
                                    jobGraph.getJobID());
                            cancel();
                        }
                        return path;
                    },
                    mainThreadExecutor);
}
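The control flow of the handleAsync callback above can be tried out in isolation. The following is a minimal sketch, not Flink code: SavepointFlowSketch and its events list are hypothetical names introduced for illustration, the savepoint result is passed in as an already-built future, and handle is used in place of handleAsync with mainThreadExecutor so that the example stays single-threaded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical stand-in for the scheduler; none of these names are Flink APIs.
final class SavepointFlowSketch {
    // Records the order of side effects so the flow can be inspected.
    final List<String> events = new ArrayList<>();

    CompletableFuture<String> triggerSavepoint(
            CompletableFuture<String> savepointPath, boolean cancelJob) {
        if (cancelJob) {
            events.add("stopCheckpointScheduler");
        }
        return savepointPath.handle(
                (path, throwable) -> {
                    if (throwable != null) {
                        // Savepoint failed: the job keeps running, so resume scheduling.
                        if (cancelJob) {
                            events.add("startCheckpointScheduler");
                        }
                        throw new CompletionException(throwable);
                    } else if (cancelJob) {
                        // Savepoint succeeded: now the job can be cancelled.
                        events.add("cancel");
                    }
                    return path;
                });
    }

    public static void main(String[] args) {
        SavepointFlowSketch ok = new SavepointFlowSketch();
        String path =
                ok.triggerSavepoint(CompletableFuture.completedFuture("/tmp/savepoint-1"), true)
                        .join();
        System.out.println(path + " " + ok.events);

        SavepointFlowSketch failed = new SavepointFlowSketch();
        CompletableFuture<String> broken = new CompletableFuture<>();
        broken.completeExceptionally(new IllegalStateException("storage unavailable"));
        CompletableFuture<String> result = failed.triggerSavepoint(broken, true);
        System.out.println(result.isCompletedExceptionally() + " " + failed.events);
    }
}
```

In the failure case the returned future still completes exceptionally, so the caller sees the error even though the scheduler has been restarted.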
CheckpointCoordinator
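CheckpointCoordinator coordinates the distributed snapshots of all operators and their state. It sends messages to the relevant tasks to trigger a snapshot, then collects their acknowledgements (acks) once the snapshots succeed.
CheckpointCoordinator.createActivatorDeactivator creates a job status listener that watches for changes in the job status.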
// Listens for job status changes in order to start or stop checkpointing for the job
public JobStatusListener createActivatorDeactivator() {
    synchronized (lock) {
        if (shutdown) {
            throw new IllegalArgumentException("Checkpoint coordinator is shut down");
        }

        if (jobStatusListener == null) {
            jobStatusListener = new CheckpointCoordinatorDeActivator(this);
        }

        return jobStatusListener;
    }
}
JobStatusListener is an interface; the implementation used here is CheckpointCoordinatorDeActivator. Its jobStatusChanges method is as follows:
// Start periodic checkpoint scheduling when the job becomes RUNNING;
// stop it on any other status
@Override
public void jobStatusChanges(
        JobID jobId, JobStatus newJobStatus, long timestamp, Throwable error) {
    if (newJobStatus == JobStatus.RUNNING) {
        // start the checkpoint scheduler
        coordinator.startCheckpointScheduler();
    } else {
        // anything else should stop the trigger for now
        coordinator.stopCheckpointScheduler();
    }
}
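The listener above is a two-way switch: RUNNING turns periodic scheduling on, every other status turns it off. A minimal sketch of that behaviour follows; ActivatorSketch, Status and Coordinator are hypothetical names for illustration, and Status lists only a small assumed subset of job states rather than Flink's JobStatus enum.

```java
// Hypothetical sketch of the activate/deactivate switch; not Flink code.
final class ActivatorSketch {
    // Assumed subset of job states, for illustration only.
    enum Status { CREATED, RUNNING, FAILING, FINISHED }

    // Hypothetical stand-in for the coordinator: tracks whether scheduling is on.
    static final class Coordinator {
        boolean periodicScheduling = false;

        void startCheckpointScheduler() {
            periodicScheduling = true;
        }

        void stopCheckpointScheduler() {
            periodicScheduling = false;
        }
    }

    // Mirrors the shape of jobStatusChanges: only RUNNING enables scheduling.
    static void jobStatusChanges(Coordinator coordinator, Status newStatus) {
        if (newStatus == Status.RUNNING) {
            coordinator.startCheckpointScheduler();
        } else {
            coordinator.stopCheckpointScheduler();
        }
    }

    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator();
        for (Status status : Status.values()) {
            jobStatusChanges(coordinator, status);
            System.out.println(status + " -> scheduling=" + coordinator.periodicScheduling);
        }
    }
}
```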
Next, let's look at startCheckpointScheduler:
public void startCheckpointScheduler() {
    synchronized (lock) {
        if (shutdown) {
            throw new IllegalArgumentException("Checkpoint coordinator is shut down");
        }
        Preconditions.checkState(
                isPeriodicCheckpointingConfigured(),
                "Can not start checkpoint scheduler, if no periodic checkpointing is configured");

        // make sure all prior timers are cancelled
        stopCheckpointScheduler();

        // Create a new periodic trigger with a random initial delay, drawn from the range
        // [minimum pause between checkpoints, checkpoint interval + 1), upper bound exclusive
        periodicScheduling = true;
        currentPeriodicTrigger = scheduleTriggerWithDelay(getRandomInitDelay());
    }
}
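The random initial delay described in the comment above can be sketched with ThreadLocalRandom. This is an illustration under assumptions, not the body of getRandomInitDelay: InitDelaySketch and its parameter names are hypothetical, and the sketch assumes the minimum pause is not larger than the interval, because nextLong(origin, bound) rejects an empty range.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of drawing a randomized initial delay; not Flink code.
final class InitDelaySketch {
    // Returns a value in [minPauseMillis, baseIntervalMillis], both ends included.
    // nextLong(origin, bound) excludes the bound, hence the + 1.
    static long randomInitDelay(long minPauseMillis, long baseIntervalMillis) {
        return ThreadLocalRandom.current().nextLong(minPauseMillis, baseIntervalMillis + 1L);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(randomInitDelay(500L, 1000L));
        }
    }
}
```

A randomized first delay presumably keeps jobs that start at the same moment from all triggering their first checkpoint at once.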
The scheduleTriggerWithDelay method starts a timer. The logic that runs on each tick lives in ScheduledTrigger, an inner class of CheckpointCoordinator.
private ScheduledFuture<?> scheduleTriggerWithDelay(long initDelay) {
    return timer.scheduleAtFixedRate(
            new ScheduledTrigger(), initDelay, baseInterval, TimeUnit.MILLISECONDS);
}

private final class ScheduledTrigger implements Runnable {

    @Override
    public void run() {
        try {
            triggerCheckpoint(true);
        } catch (Exception e) {
            LOG.error("Exception while triggering checkpoint for job {}.", job, e);
        }
    }
}
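The try/catch inside ScheduledTrigger.run matters: with ScheduledExecutorService.scheduleAtFixedRate, an exception that escapes one execution suppresses all later executions. The sketch below demonstrates this with a task that fails on every run. PeriodicTriggerSketch, survivesFailures and failingTrigger are hypothetical names, and the 10 ms period and 500 ms wait are arbitrary values chosen for the demonstration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; shows why a periodic task should catch its own exceptions.
final class PeriodicTriggerSketch {
    // Stand-in for a trigger call that always fails.
    static void failingTrigger() {
        throw new IllegalStateException("trigger failed");
    }

    // Schedules a failing task and reports whether it ran `runs` times within 500 ms.
    static boolean survivesFailures(int runs, boolean catchInsideTask) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch latch = new CountDownLatch(runs);
        Runnable trigger =
                () -> {
                    latch.countDown();
                    if (catchInsideTask) {
                        try {
                            failingTrigger();
                        } catch (Exception e) {
                            // Swallow (a real task would log) so the schedule stays alive.
                        }
                    } else {
                        // The exception escapes and later executions are suppressed.
                        failingTrigger();
                    }
                };
        try {
            timer.scheduleAtFixedRate(trigger, 10, 10, TimeUnit.MILLISECONDS);
            return latch.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("with catch:    " + survivesFailures(3, true));
        System.out.println("without catch: " + survivesFailures(3, false));
    }
}
```

Without the catch, the task runs once and is never invoked again, which for a checkpoint trigger would silently end periodic checkpointing.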
Following triggerCheckpoint further down leads to startTriggeringCheckpoint, which does the actual work:
private void startTriggeringCheckpoint(CheckpointTriggerRequest request) {
    try {
        synchronized (lock) {
            preCheckGlobalState(request.isPeriodic);
        }

        // we will actually trigger this checkpoint!
        Preconditions.checkState(!isTriggering);
        isTriggering = true;

        final long timestamp = System.currentTimeMillis();

        // Compute the plan for this checkpoint: which tasks must be triggered, and
        // which tasks must be waited for or committed to
        CompletableFuture<CheckpointPlan> checkpointPlanFuture =
                checkpointPlanCalculator.calculateCheckpointPlan();

        final CompletableFuture<PendingCheckpoint> pendingCheckpointCompletableFuture =
                checkpointPlanFuture
                        .thenApplyAsync(
                                plan -> {
                                    try {
                                        CheckpointIdAndStorageLocation
                                                checkpointIdAndStorageLocation =
                                                        initializeCheckpoint(
                                                                request.props,
                                                                request.externalSavepointLocation);
                                        return new Tuple2<>(
                                                plan, checkpointIdAndStorageLocation);
                                    } catch (Throwable e) {
                                        throw new CompletionException(e);
                                    }
                                },
                                executor)
                        .thenApplyAsync(
                                (checkpointInfo) ->
                                        // A PendingCheckpoint is a checkpoint that has been
                                        // started but not yet acknowledged by all tasks that
                                        // need to acknowledge it. Once all of them have
                                        // acknowledged it, it becomes a CompletedCheckpoint.
                                        createPendingCheckpoint(
                                                timestamp,
                                                request.props,
                                                checkpointInfo.f0,
                                                request.isPeriodic,
                                                checkpointInfo.f1.checkpointId,
                                                checkpointInfo.f1.checkpointStorageLocation,
                                                request.getOnCompletionFuture()),
                                timer);

        final CompletableFuture<?> coordinatorCheckpointsComplete =
                pendingCheckpointCompletableFuture.thenComposeAsync(
                        (pendingCheckpoint) ->
                                OperatorCoordinatorCheckpoints
                                        // Trigger and acknowledge all coordinator checkpoints
                                        .triggerAndAcknowledgeAllCoordinatorCheckpointsWithCompletion(
                                                coordinatorsToCheckpoint,
                                                pendingCheckpoint,
                                                timer),
                        timer);

        // The master hooks run only after the coordinator checkpoints have completed.
        // A master hook notifies external systems before a checkpoint is taken or restored.
        // We have to take the snapshot of the master hooks after the coordinator checkpoints
        // has completed.
        // This is to ensure the tasks are checkpointed after the OperatorCoordinators in case
        // ExternallyInducedSource is used.
        final CompletableFuture<?> masterStatesComplete =
                coordinatorCheckpointsComplete.thenComposeAsync(
                        ignored -> {
                            // If the code reaches here, the pending checkpoint is guaranteed to
                            // be not null.
                            // We use FutureUtils.getWithoutException() to make compiler happy
                            // with checked exceptions in the signature.
                            PendingCheckpoint checkpoint =
                                    FutureUtils.getWithoutException(
                                            pendingCheckpointCompletableFuture);
                            return snapshotMasterState(checkpoint);
                        },
                        timer);

        FutureUtils.assertNoException(
                CompletableFuture.allOf(masterStatesComplete, coordinatorCheckpointsComplete)
                        .handleAsync(
                                (ignored, throwable) -> {
                                    final PendingCheckpoint checkpoint =
                                            FutureUtils.getWithoutException(
                                                    pendingCheckpointCompletableFuture);

                                    Preconditions.checkState(
                                            checkpoint != null || throwable != null,
                                            "Either the pending checkpoint needs to be created or an error must have occurred.");

                                    if (throwable != null) {
                                        // the initialization might not be finished yet
                                        if (checkpoint == null) {
                                            onTriggerFailure(request, throwable);
                                        } else {
                                            onTriggerFailure(checkpoint, throwable);
                                        }
                                    } else {
                                        // This is where the checkpoint trigger requests
                                        // are actually sent out
                                        triggerCheckpointRequest(
                                                request, timestamp, checkpoint);
                                    }
                                    return null;
                                },
                                timer)
                        .exceptionally(
                                error -> {
                                    if (!isShutdown()) {
                                        throw new CompletionException(error);
                                    } else if (findThrowable(
                                                    error, RejectedExecutionException.class)
                                            .isPresent()) {
                                        LOG.debug("Execution rejected during shutdown");
                                    } else {
                                        LOG.warn("Error encountered during shutdown", error);
                                    }
                                    return null;
                                })