Flink Checkpoint Source Code Analysis (1)

Latest recommended article published on 2024-03-28 10:57:21

源码挖掘机


Article link: https://blog.csdn.net/bigdatakenan/article/details/130140733


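Flink's checkpoint mechanism has three phases: triggering, execution, and acknowledgement of completion. It is controlled by the CheckpointCoordinator component in the JobManager. The process involves building the ExecutionGraph, creating the state backend, managing failure tolerance, coordinating through the coordinator, and timed scheduling. When the job status changes to RUNNING, the checkpoint scheduler is started, and checkpoints are triggered and executed through the ExecutionVertex instances.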


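A checkpoint runs in three phases: triggering, execution, and acknowledgement of completion. Triggering is controlled by the CheckpointCoordinator component on the JobManager, which periodically sends checkpoint requests to the source tasks. How often it does so is determined by the configured checkpointInterval parameter. Let's walk through how a checkpoint is executed.

The checkpoint execution flow is outlined in the numbered steps below.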

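1.ExecutionGraphBuilder.buildGraph

When the user enables checkpointing in their code, the checkpoint configuration is first stored in the StreamGraph. It is then converted into a JobCheckpointingSettings data structure stored in the JobGraph and submitted to the cluster together with the JobGraph. Once started, the JobMaster service schedules and executes the checkpoints.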

// configure the state checkpointing
		JobCheckpointingSettings snapshotSettings = jobGraph.getCheckpointingSettings();
		if (snapshotSettings != null) {
			List<ExecutionJobVertex> triggerVertices =
					idToVertex(snapshotSettings.getVerticesToTrigger(), executionGraph);

			List<ExecutionJobVertex> ackVertices =
					idToVertex(snapshotSettings.getVerticesToAcknowledge(), executionGraph);

			List<ExecutionJobVertex> confirmVertices =
					idToVertex(snapshotSettings.getVerticesToConfirm(), executionGraph);

			CompletedCheckpointStore completedCheckpoints;
			CheckpointIDCounter checkpointIdCounter;
			try {
				int maxNumberOfCheckpointsToRetain = jobManagerConfig.getInteger(
						CheckpointingOptions.MAX_RETAINED_CHECKPOINTS);

				if (maxNumberOfCheckpointsToRetain <= 0) {
					// warning and use 1 as the default value if the setting in
					// state.checkpoints.max-retained-checkpoints is not greater than 0.
					log.warn("The setting for '{} : {}' is invalid. Using default value of {}",
							CheckpointingOptions.MAX_RETAINED_CHECKPOINTS.key(),
							maxNumberOfCheckpointsToRetain,
							CheckpointingOptions.MAX_RETAINED_CHECKPOINTS.defaultValue());

					maxNumberOfCheckpointsToRetain = CheckpointingOptions.MAX_RETAINED_CHECKPOINTS.defaultValue();
				}

				completedCheckpoints = recoveryFactory.createCheckpointStore(jobId, maxNumberOfCheckpointsToRetain, classLoader);
				checkpointIdCounter = recoveryFactory.createCheckpointIDCounter(jobId);
			}
			catch (Exception e) {
				throw new JobExecutionException(jobId, "Failed to initialize high-availability checkpoint handler", e);
			}

			// Maximum number of remembered checkpoints
			int historySize = jobManagerConfig.getInteger(WebOptions.CHECKPOINTS_HISTORY_SIZE);

			CheckpointStatsTracker checkpointStatsTracker = new CheckpointStatsTracker(
					historySize,
					ackVertices,
					snapshotSettings.getCheckpointCoordinatorConfiguration(),
					metrics);

			// load the state backend from the application settings
			final StateBackend applicationConfiguredBackend;
			final SerializedValue<StateBackend> serializedAppConfigured = snapshotSettings.getDefaultStateBackend();

			if (serializedAppConfigured == null) {
				applicationConfiguredBackend = null;
			}
			else {
				try {
					applicationConfiguredBackend = serializedAppConfigured.deserializeValue(classLoader);
				} catch (IOException | ClassNotFoundException e) {
					throw new JobExecutionException(jobId,
							"Could not deserialize application-defined state backend.", e);
				}
			}

			final StateBackend rootBackend;
			try {
				rootBackend = StateBackendLoader.fromApplicationOrConfigOrDefault(
						applicationConfiguredBackend, jobManagerConfig, classLoader, log);
			}
			catch (IllegalConfigurationException | IOException | DynamicCodeLoadingException e) {
				throw new JobExecutionException(jobId, "Could not instantiate configured state backend", e);
			}

			// instantiate the user-defined checkpoint hooks

			final SerializedValue<MasterTriggerRestoreHook.Factory[]> serializedHooks = snapshotSettings.getMasterHooks();
			final List<MasterTriggerRestoreHook<?>> hooks;

			if (serializedHooks == null) {
				hooks = Collections.emptyList();
			}
			else {
				final MasterTriggerRestoreHook.Factory[] hookFactories;
				try {
					hookFactories = serializedHooks.deserializeValue(classLoader);
				}
				catch (IOException | ClassNotFoundException e) {
					throw new JobExecutionException(jobId, "Could not instantiate user-defined checkpoint hooks", e);
				}

				final Thread thread = Thread.currentThread();
				final ClassLoader originalClassLoader = thread.getContextClassLoader();
				thread.setContextClassLoader(classLoader);

				try {
					hooks = new ArrayList<>(hookFactories.length);
					for (MasterTriggerRestoreHook.Factory factory : hookFactories) {
						hooks.add(MasterHooks.wrapHook(factory.create(), classLoader));
					}
				}
				finally {
					thread.setContextClassLoader(originalClassLoader);
				}
			}

			final CheckpointCoordinatorConfiguration chkConfig = snapshotSettings.getCheckpointCoordinatorConfiguration();

			executionGraph.enableCheckpointing(
				chkConfig,
				triggerVertices,
				ackVertices,
				confirmVertices,
				hooks,
				checkpointIdCounter,
				completedCheckpoints,
				rootBackend,
				checkpointStatsTracker);
		}

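The main things this code does are:

1. Obtain the triggerVertices, ackVertices, and confirmVertices collections from the snapshotSettings configuration.

2. Create the CompletedCheckpointStore, which stores the metadata of completed checkpoints and retains only a fixed number of them (maxNumberOfCheckpointsToRetain).

3. Create the CheckpointIDCounter, the counter that hands out checkpoint IDs.

4. Create the CheckpointStatsTracker instance, which tracks checkpoint execution and updates. The checkpoint information shown in the web UI comes from it.

5. Create the state backend.

6. Initialize the checkpoint hooks (MasterTriggerRestoreHook).

7. Call ExecutionGraph.enableCheckpointing to enable checkpointing for the execution and scheduling of the job.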
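
The roles of the retained-checkpoint limit and the ID counter in steps 2 and 3 can be illustrated with a small JDK-only model. This is a hedged sketch, not Flink's CompletedCheckpointStore or CheckpointIDCounter: the class RetainedCheckpointsSketch, its method names, the default of 1, and the starting ID of 1 are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified model; not Flink's actual classes. */
final class RetainedCheckpointsSketch {

    /** Mirrors the guard in the excerpt: a non-positive setting falls back to the default. */
    static int sanitizeMaxRetained(int configured, int defaultValue) {
        return configured <= 0 ? defaultValue : configured;
    }

    private final int maxRetained;
    private final Deque<Long> completed = new ArrayDeque<>();
    // Starting at 1 is an assumption of this sketch.
    private final AtomicLong idCounter = new AtomicLong(1);

    RetainedCheckpointsSketch(int configuredMaxRetained) {
        this.maxRetained = sanitizeMaxRetained(configuredMaxRetained, 1);
    }

    /** Hands out the next checkpoint ID (the role of the ID counter). */
    long nextCheckpointId() {
        return idCounter.getAndIncrement();
    }

    /** Records a completed checkpoint and evicts the oldest ones beyond the limit. */
    void addCompleted(long checkpointId) {
        completed.addLast(checkpointId);
        while (completed.size() > maxRetained) {
            completed.removeFirst();
        }
    }

    /** Returns the retained checkpoint IDs, oldest first. */
    List<Long> retained() {
        return new ArrayList<>(completed);
    }

    public static void main(String[] args) {
        RetainedCheckpointsSketch store = new RetainedCheckpointsSketch(2);
        for (int i = 0; i < 3; i++) {
            store.addCompleted(store.nextCheckpointId());
        }
        // Only the two most recent checkpoint IDs remain.
        System.out.println(store.retained());
    }
}
```

Keeping the ID counter separate from the store reflects the split in the excerpt, where the recovery factory creates the two components independently.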

2.ExecutionGraph.enableCheckpointing

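The main logic of this method is:

1. Convert the three ExecutionJobVertex collections verticesToTrigger, verticesToWaitFor, and verticesToCommitTo into ExecutionVertex[] arrays (tasksToTrigger, tasksToWaitFor, tasksToCommitTo).

2. Create the CheckpointFailureManager, which handles fault tolerance during checkpointing.

3. Create the checkpointCoordinatorTimer, which schedules and runs the asynchronous checkpoint threads on a timer.

4. Create the CheckpointCoordinator, which coordinates and manages the checkpoints of the job.

5. Register the CheckpointCoordinatorDeActivator to monitor the job status (only when the checkpoint interval is not Long.MAX_VALUE). When the JobStatus becomes RUNNING, it starts the checkpoint scheduler through startCheckpointScheduler().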

public void enableCheckpointing(
			CheckpointCoordinatorConfiguration chkConfig,
			List<ExecutionJobVertex> verticesToTrigger,
			List<ExecutionJobVertex> verticesToWaitFor,
			List<ExecutionJobVertex> verticesToCommitTo,
			List<MasterTriggerRestoreHook<?>> masterHooks,
			CheckpointIDCounter checkpointIDCounter,
			CompletedCheckpointStore checkpointStore,
			StateBackend checkpointStateBackend,
			CheckpointStatsTracker statsTracker) {

		checkState(state == JobStatus.CREATED, "Job must be in CREATED state");
		checkState(checkpointCoordinator == null, "checkpointing already enabled");

		ExecutionVertex[] tasksToTrigger = collectExecutionVertices(verticesToTrigger);
		ExecutionVertex[] tasksToWaitFor = collectExecutionVertices(verticesToWaitFor);
		ExecutionVertex[] tasksToCommitTo = collectExecutionVertices(verticesToCommitTo);

		final Collection<OperatorCoordinatorCheckpointContext> operatorCoordinators = buildOpCoordinatorCheckpointContexts();

		checkpointStatsTracker = checkNotNull(statsTracker, "CheckpointStatsTracker");

		CheckpointFailureManager failureManager = new CheckpointFailureManager(
			chkConfig.getTolerableCheckpointFailureNumber(),
			new CheckpointFailureManager.FailJobCallback() {
				@Override
				public void failJob(Throwable cause) {
					getJobMasterMainThreadExecutor().execute(() -> failGlobal(cause));
				}

				@Override
				public void failJobDueToTaskFailure(Throwable cause, ExecutionAttemptID failingTask) {
					getJobMasterMainThreadExecutor().execute(() -> failGlobalIfExecutionIsStillRunning(cause, failingTask));
				}
			}
		);

		checkState(checkpointCoordinatorTimer == null);

		checkpointCoordinatorTimer = Executors.newSingleThreadScheduledExecutor(
			new DispatcherThreadFactory(
				Thread.currentThread().getThreadGroup(), "Checkpoint Timer"));

		// create the coordinator that triggers and commits checkpoints and holds the state
		checkpointCoordinator = new CheckpointCoordinator(
			jobInformation.getJobId(),
			chkConfig,
			tasksToTrigger,
			tasksToWaitFor,
			tasksToCommitTo,
			operatorCoordinators,
			checkpointIDCounter,
			checkpointStore,
			checkpointStateBackend,
			ioExecutor,
			new CheckpointsCleaner(),
			new ScheduledExecutorServiceAdapter(checkpointCoordinatorTimer),
			SharedStateRegistry.DEFAULT_FACTORY,
			failureManager);

		// register the master hooks on the checkpoint coordinator
		for (MasterTriggerRestoreHook<?> hook : masterHooks) {
			if (!checkpointCoordinator.addMasterHook(hook)) {
				LOG.warn("Trying to register multiple checkpoint hooks with the name: {}", hook.getIdentifier());
			}
		}

		checkpointCoordinator.setCheckpointStatsTracker(checkpointStatsTracker);

		// interval of max long value indicates disable periodic checkpoint,
		// the CheckpointActivatorDeactivator should be created only if the interval is not max value
		if (chkConfig.getCheckpointInterval() != Long.MAX_VALUE) {
			// the periodic checkpoint scheduler is activated and deactivated as a result of
			// job status changes (running -> on, all other states -> off)
			registerJobStatusListener(checkpointCoordinator.createActivatorDeactivator());
		}

		this.stateBackendName = checkpointStateBackend.getClass().getSimpleName();
	}
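
To make the role of the failure manager in step 2 more concrete, here is a minimal JDK-only sketch of tolerating a bounded number of checkpoint failures before failing the job. It is not Flink's CheckpointFailureManager. The class FailureToleranceSketch is hypothetical, and the rule that failures are counted consecutively and reset on success is an assumption of this sketch that the excerpt does not show.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical, simplified model of a tolerable-failure counter. */
final class FailureToleranceSketch {

    private final int tolerableFailures;
    private final Consumer<Throwable> failJobCallback;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    FailureToleranceSketch(int tolerableFailures, Consumer<Throwable> failJobCallback) {
        this.tolerableFailures = tolerableFailures;
        this.failJobCallback = failJobCallback;
    }

    /** Counts a failure and invokes the callback once the tolerable number is exceeded. */
    void onCheckpointFailure(Throwable cause) {
        if (consecutiveFailures.incrementAndGet() > tolerableFailures) {
            failJobCallback.accept(cause);
        }
    }

    /** A successful checkpoint resets the counter (assumption of this sketch). */
    void onCheckpointSuccess() {
        consecutiveFailures.set(0);
    }

    public static void main(String[] args) {
        FailureToleranceSketch manager = new FailureToleranceSketch(
                1, cause -> System.out.println("failing job: " + cause.getMessage()));
        manager.onCheckpointFailure(new RuntimeException("first failure"));  // tolerated
        manager.onCheckpointFailure(new RuntimeException("second failure")); // exceeds the limit
    }
}
```

The callback plays the part of the FailJobCallback passed in the excerpt, which hands the failure over to the JobMaster main thread.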

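3.CheckpointCoordinatorDeActivator

This class is simply a listener that watches for job status changes. When the job status becomes RUNNING it starts the checkpoint scheduler; for any other status it stops the scheduler.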

public class CheckpointCoordinatorDeActivator implements JobStatusListener {

	private final CheckpointCoordinator coordinator;

	public CheckpointCoordinatorDeActivator(CheckpointCoordinator coordinator) {
		this.coordinator = checkNotNull(coordinator);
	}

	@Override
	public void jobStatusChanges(JobID jobId, JobStatus newJobStatus, long timestamp, Throwable error) {
		if (newJobStatus == JobStatus.RUNNING) {
			// start the checkpoint scheduler
			coordinator.startCheckpointScheduler();
		} else {
			// anything else should stop the trigger for now
			coordinator.stopCheckpointScheduler();
		}
	}
}

	public void startCheckpointScheduler() {
		synchronized (lock) {
			if (shutdown) {
				throw new IllegalArgumentException("Checkpoint coordinator is shut down");
			}

			// make sure all prior timers are cancelled
			stopCheckpointScheduler();

			periodicScheduling = true;
			currentPeriodicTrigger = scheduleTriggerWithDelay(getRandomInitDelay());
		}
	}



private ScheduledFuture<?> scheduleTriggerWithDelay(long initDelay) {
		return timer.scheduleAtFixedRate(
			new ScheduledTrigger(),
			initDelay, baseInterval, TimeUnit.MILLISECONDS);
	}




	private final class ScheduledTrigger implements Runnable {

		@Override
		public void run() {
			try {
				triggerCheckpoint(true);
			}
			catch (Exception e) {
				LOG.error("Exception while triggering checkpoint for job {}.", job, e);
			}
		}
	}




public CompletableFuture<CompletedCheckpoint> triggerCheckpoint(
			CheckpointProperties props,
			@Nullable String externalSavepointLocation,
			boolean isPeriodic,
			boolean advanceToEndOfTime) {

		if (advanceToEndOfTime && !(props.isSynchronous() && props.isSavepoint())) {
			return FutureUtils.completedExceptionally(new IllegalArgumentException(
				"Only synchronous savepoints are allowed to advance the watermark to MAX."));
		}

		CheckpointTriggerRequest request = new CheckpointTriggerRequest(props, externalSavepointLocation, isPeriodic, advanceToEndOfTime);
		chooseRequestToExecute(request).ifPresent(this::startTriggeringCheckpoint);
		return request.onCompletionPromise;
	}

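ScheduledTrigger is likewise an inner class of the CheckpointCoordinator, and it implements the Runnable interface. ScheduledTrigger.run() calls CheckpointCoordinator.triggerCheckpoint() to trigger and execute a checkpoint, and triggerCheckpoint() in turn calls startTriggeringCheckpoint().

4.CheckpointCoordinator.startTriggeringCheckpoint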
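
The interplay between the status listener and the timer can be sketched with the JDK's ScheduledExecutorService. This is a simplified model under stated assumptions, not the Flink classes: PeriodicTriggerSketch, its Status enum, and the fixed initial delay (equal to the interval, in place of a random initial delay) are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical model: start periodic triggering on RUNNING, stop on anything else. */
final class PeriodicTriggerSketch {

    enum Status { CREATED, RUNNING, FAILED }

    private final Object lock = new Object();
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    private final long intervalMillis;
    private final Runnable trigger;
    private ScheduledFuture<?> currentTrigger;

    PeriodicTriggerSketch(long intervalMillis, Runnable trigger) {
        this.intervalMillis = intervalMillis;
        this.trigger = trigger;
    }

    /** Listener callback: only RUNNING turns the scheduler on. */
    void jobStatusChanges(Status newStatus) {
        if (newStatus == Status.RUNNING) {
            start();
        } else {
            stop();
        }
    }

    private void start() {
        synchronized (lock) {
            stop(); // make sure any prior timer is cancelled first
            currentTrigger = timer.scheduleAtFixedRate(
                    trigger, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        }
    }

    private void stop() {
        synchronized (lock) {
            if (currentTrigger != null) {
                currentTrigger.cancel(false);
                currentTrigger = null;
            }
        }
    }

    boolean isScheduling() {
        synchronized (lock) {
            return currentTrigger != null;
        }
    }

    void shutdown() {
        stop();
        timer.shutdownNow();
    }

    /** Helper: reports whether the trigger fires at least `times` times within the timeout. */
    static boolean firesAtLeast(int times, long intervalMillis, long timeoutMillis) {
        CountDownLatch fired = new CountDownLatch(times);
        PeriodicTriggerSketch sketch = new PeriodicTriggerSketch(intervalMillis, fired::countDown);
        sketch.jobStatusChanges(Status.RUNNING);
        try {
            return fired.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            sketch.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fired twice: " + firesAtLeast(2, 10, 5_000));
    }
}
```

Cancelling the previous future before scheduling a new one corresponds to the stopCheckpointScheduler() call at the top of startCheckpointScheduler() in the excerpt.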

private void startTriggeringCheckpoint(CheckpointTriggerRequest request) {
		try {
			synchronized (lock) {
				preCheckGlobalState(request.isPeriodic);
			}

			final Execution[] executions = getTriggerExecutions();
			final Map<ExecutionAttemptID, ExecutionVertex> ackTasks = getAckTasks();

			// we will actually trigger this checkpoint!
			Preconditions.checkState(!isTriggering);
			isTriggering = true;

			final long timestamp = System.currentTimeMillis();
			final CompletableFuture<PendingCheckpoint> pendingCheckpointCompletableFuture =
				initializeCheckpoint(request.props, request.externalSavepointLocation)
					.thenApplyAsync(
						(checkpointIdAndStorageLocation) -> createPendingCheckpoint(
							timestamp,
							request.props,
							ackTasks,
							request.isPeriodic,
							checkpointIdAndStorageLocation.checkpointId,
							checkpointIdAndStorageLocation.checkpointStorageLocation,
							request.getOnCompletionFuture()),
						timer);

			final CompletableFuture<?> coordinatorCheckpointsComplete = pendingCheckpointCompletableFuture
					.thenComposeAsync((pendingCheckpoint) ->
							OperatorCoordinatorCheckpoints.triggerAndAcknowledgeAllCoordinatorCheckpointsWithCompletion(
									coordinatorsToCheckpoint, pendingCheckpoint, timer),
							timer);

			// We have to take the snapshot of the master hooks after the coordinator checkpoints has completed.
			// This is to ensure the tasks are checkpointed after the OperatorCoordinators in case
			// ExternallyInducedSource is used.
			final CompletableFuture<?> masterStatesComplete = coordinatorCheckpointsComplete
				.thenComposeAsync(ignored -> {
					// If the code reaches here, the pending checkpoint is guaranteed to be not null.
					// We use FutureUtils.getWithoutException() to make compiler happy with checked
					// exceptions in the signature.
					PendingCheckpoint checkpoint =
						FutureUtils.getWithoutException(pendingCheckpointCompletableFuture);
					return snapshotMasterState(checkpoint);
				}, timer);

			FutureUtils.assertNoException(
				CompletableFuture.allOf(masterStatesComplete, coordinatorCheckpointsComplete)
					.handleAsync(
						(ignored, throwable) -> {
							final PendingCheckpoint checkpoint =
								FutureUtils.getWithoutException(pendingCheckpointCompletableFuture);

							Preconditions.checkState(
								checkpoint != null || throwable != null,
								"Either the pending checkpoint needs to be created or an error must have been occurred.");

							if (throwable != null) {
								// the initialization might not be finished yet
								if (checkpoint == null) {
									onTriggerFailure(request, throwable);
								} else {
									onTriggerFailure(checkpoint, throwable);
								}
							} else {
								if (checkpoint.isDisposed()) {
									onTriggerFailure(
										checkpoint,
										new CheckpointException(
											CheckpointFailureReason.TRIGGER_CHECKPOINT_FAILURE,
											checkpoint.getFailureCause()));
								} else {
									// no exception, no discarding, everything is OK
									final long checkpointId = checkpoint.getCheckpointId();
									snapshotTaskState(
										timestamp,
										checkpointId,
										checkpoint.getCheckpointStorageLocation(),
										request.props,
										executions,
										request.advanceToEndOfTime);

									coordinatorsToCheckpoint.forEach((ctx) -> ctx.afterSourceBarrierInjection(checkpointId));
									// It is possible that the tasks has finished checkpointing at this point.
									// So we need to complete this pending checkpoint.
									if (!maybeCompleteCheckpoint(checkpoint)) {
										return null;
									}
									onTriggerSuccess();
								}
							}
							return null;
						},
						timer)
					.exceptionally(error -> {
						if (!isShutdown()) {
							throw new CompletionException(error);
						} else if (findThrowable(error, RejectedExecutionException.class).isPresent()) {
							LOG.debug("Execution rejected during shutdown");
						} else {
							LOG.warn("Error encountered during shutdown", error);
						}
						return null;
					}));
		} catch (Throwable throwable) {
			onTriggerFailure(request, throwable);
		}
	}

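1. Check the preconditions (preCheckGlobalState): whether the checkpoint coordinator has been shut down, and whether the request is a periodic one.

2. Collect the Executions to trigger and the ExecutionVertex map of tasks that must acknowledge.

3. Asynchronously initialize the checkpoint and create the PendingCheckpoint.

4. Trigger and execute the checkpoint and collect its result. If any stage fails, the failure is passed to onTriggerFailure.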
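
The asynchronous staging in the steps above can be sketched with plain CompletableFuture calls. This is a simplified model, assuming string placeholders for the pending checkpoint and for the actions taken. TriggerPipelineSketch and its return strings are hypothetical and only label which branch of the real method would run.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical model of the staged, asynchronous trigger pipeline. */
final class TriggerPipelineSketch {

    /** Runs: init ID -> pending checkpoint -> coordinator snapshot -> master snapshot -> outcome. */
    static String trigger(boolean failMasterSnapshot) {
        ExecutorService timer = Executors.newSingleThreadExecutor();
        try {
            // Stage 1: obtain a checkpoint ID, then create the pending checkpoint.
            CompletableFuture<String> pending = CompletableFuture
                    .supplyAsync(() -> 1L, timer)
                    .thenApplyAsync(id -> "pending-" + id, timer);

            // Stage 2: coordinator checkpoints run once the pending checkpoint exists.
            CompletableFuture<Void> coordinatorsDone = pending.thenComposeAsync(
                    p -> CompletableFuture.<Void>completedFuture(null), timer);

            // Stage 3: the master state snapshot runs after the coordinator checkpoints.
            CompletableFuture<Void> masterDone = coordinatorsDone.thenComposeAsync(ignored -> {
                if (failMasterSnapshot) {
                    throw new IllegalStateException("master hook failed");
                }
                return CompletableFuture.<Void>completedFuture(null);
            }, timer);

            // Stage 4: either report the failure or go on to trigger the tasks.
            return CompletableFuture.allOf(masterDone, coordinatorsDone)
                    .handleAsync((ignored, throwable) -> {
                        String checkpoint = pending.getNow(null);
                        if (throwable != null) {
                            return "onTriggerFailure(" + checkpoint + "): "
                                    + unwrap(throwable).getMessage();
                        }
                        return "snapshotTaskState(" + checkpoint + ")";
                    }, timer)
                    .join();
        } finally {
            timer.shutdown();
        }
    }

    /** Strips CompletionException wrappers added by the future chain. */
    private static Throwable unwrap(Throwable t) {
        while (t instanceof CompletionException && t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(trigger(false));
        System.out.println(trigger(true));
    }
}
```

Running every stage on the same single-threaded executor keeps the stages ordered without extra locking, which is one plausible reason the excerpt passes the same timer to each stage.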

private void snapshotTaskState(
		long timestamp,
		long checkpointID,
		CheckpointStorageLocation checkpointStorageLocation,
		CheckpointProperties props,
		Execution[] executions,
		boolean advanceToEndOfTime) {

		final CheckpointOptions checkpointOptions = CheckpointOptions.create(
			props.getCheckpointType(),
			checkpointStorageLocation.getLocationReference(),
			isExactlyOnceMode,
			unalignedCheckpointsEnabled,
			alignmentTimeout);

		// send the messages to the tasks that trigger their checkpoint
		for (Execution execution: executions) {
			if (props.isSynchronous()) {
				execution.triggerSynchronousSavepoint(checkpointID, timestamp, checkpointOptions, advanceToEndOfTime);
			} else {
				execution.triggerCheckpoint(checkpointID, timestamp, checkpointOptions);
			}
		}
	}

For a synchronous savepoint (props.isSynchronous()), execution.triggerSynchronousSavepoint() is called.

Otherwise, execution.triggerCheckpoint() is called.

private void triggerCheckpointHelper(long checkpointId, long timestamp, CheckpointOptions checkpointOptions, boolean advanceToEndOfEventTime) {

		final CheckpointType checkpointType = checkpointOptions.getCheckpointType();
		if (advanceToEndOfEventTime && !(checkpointType.isSynchronous() && checkpointType.isSavepoint())) {
			throw new IllegalArgumentException("Only synchronous savepoints are allowed to advance the watermark to MAX.");
		}

		final LogicalSlot slot = assignedResource;

		if (slot != null) {
			final TaskManagerGateway taskManagerGateway = slot.getTaskManagerGateway();

			taskManagerGateway.triggerCheckpoint(attemptId, getVertex().getJobId(), checkpointId, timestamp, checkpointOptions, advanceToEndOfEventTime);
		} else {
			LOG.debug("The execution has no slot assigned. This indicates that the execution is no longer running.");
		}
	}

1. Get the LogicalSlot assigned to the current Execution and obtain the TaskManagerGateway from the slot.

2. Call triggerCheckpoint() on the TaskManagerGateway.

3. The gateway's triggerCheckpoint() in turn invokes triggerCheckpoint() on the TaskExecutor.
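
The dispatch described in these steps can be sketched as follows. This is a minimal model, not Flink's Execution or TaskManagerGateway: the Gateway interface, the Slot class, and the reduced parameter list are hypothetical.

```java
/** Hypothetical model: forward the trigger only if a slot is still assigned. */
final class TriggerDispatchSketch {

    /** Stand-in for the gateway that forwards the call to the worker process. */
    interface Gateway {
        void triggerCheckpoint(String attemptId, long checkpointId, long timestamp);
    }

    /** Stand-in for the slot assigned to an execution. */
    static final class Slot {
        final Gateway gateway;

        Slot(Gateway gateway) {
            this.gateway = gateway;
        }
    }

    /** Returns true if the request was forwarded, false if no slot is assigned. */
    static boolean trigger(Slot slot, String attemptId, long checkpointId, long timestamp) {
        if (slot == null) {
            // No slot means the execution is no longer running; nothing to trigger.
            return false;
        }
        slot.gateway.triggerCheckpoint(attemptId, checkpointId, timestamp);
        return true;
    }

    public static void main(String[] args) {
        Slot slot = new Slot((attempt, id, ts) ->
                System.out.println("trigger checkpoint " + id + " for " + attempt));
        System.out.println(trigger(slot, "attempt-1", 7L, System.currentTimeMillis()));
        System.out.println(trigger(null, "attempt-2", 7L, System.currentTimeMillis()));
    }
}
```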

5.TaskExecutor.triggerCheckpoint()

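1. Check the checkpointType to ensure that only a synchronous savepoint may advance the watermark to MAX.

2. Look up the Task for the given ExecutionAttemptID in the taskSlotTable.

3. Call task.triggerCheckpointBarrier() to inject the barrier into the stream.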

public CompletableFuture<Acknowledge> triggerCheckpoint(
			ExecutionAttemptID executionAttemptID,
			long checkpointId,
			long checkpointTimestamp,
			CheckpointOptions checkpointOptions,
			boolean advanceToEndOfEventTime) {
		log.debug("Trigger checkpoint {}@{} for {}.", checkpointId, checkpointTimestamp, executionAttemptID);

		final CheckpointType checkpointType = checkpointOptions.getCheckpointType();
		if (advanceToEndOfEventTime && !(checkpointType.isSynchronous() && checkpointType.isSavepoint())) {
			throw new IllegalArgumentException("Only synchronous savepoints are allowed to advance the watermark to MAX.");
		}

		final Task task = taskSlotTable.getTask(executionAttemptID);

		if (task != null) {
			task.triggerCheckpointBarrier(checkpointId, checkpointTimestamp, checkpointOptions, advanceToEndOfEventTime);

			return CompletableFuture.completedFuture(Acknowledge.get());
		} else {
			final String message = "TaskManager received a checkpoint request for unknown task " + executionAttemptID + '.';

			log.debug(message);
			return FutureUtils.completedExceptionally(new CheckpointException(message, CheckpointFailureReason.TASK_CHECKPOINT_FAILURE));
		}
	}
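
The lookup-then-acknowledge pattern of this method can be modelled with a map and CompletableFuture. This is a hedged sketch rather than the TaskExecutor code: TaskLookupSketch, its Task interface, the "ACK" string, and the exception type are hypothetical stand-ins.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model: find the task for an attempt ID and trigger its barrier. */
final class TaskLookupSketch {

    /** Stand-in for a running task that can inject a checkpoint barrier. */
    interface Task {
        void triggerCheckpointBarrier(long checkpointId, long timestamp);
    }

    private final Map<String, Task> tasks = new ConcurrentHashMap<>();

    void register(String attemptId, Task task) {
        tasks.put(attemptId, task);
    }

    /** Completes normally if the task exists, exceptionally if it is unknown. */
    CompletableFuture<String> triggerCheckpoint(String attemptId, long checkpointId, long timestamp) {
        Task task = tasks.get(attemptId);
        if (task == null) {
            CompletableFuture<String> failed = new CompletableFuture<>();
            failed.completeExceptionally(
                    new IllegalStateException("checkpoint request for unknown task " + attemptId));
            return failed;
        }
        task.triggerCheckpointBarrier(checkpointId, timestamp);
        return CompletableFuture.completedFuture("ACK");
    }

    public static void main(String[] args) {
        TaskLookupSketch executor = new TaskLookupSketch();
        executor.register("attempt-1", (id, ts) -> System.out.println("barrier " + id));
        System.out.println(executor.triggerCheckpoint("attempt-1", 3L, 0L).join());
        System.out.println(executor.triggerCheckpoint("missing", 3L, 0L).isCompletedExceptionally());
    }
}
```

Returning a failed future instead of throwing lets the caller on the other side of the RPC handle an unknown task the same way as any other asynchronous failure.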

task.triggerCheckpointBarrier() --> invokable.triggerCheckpointAsync() --> sourceStreamTask.triggerCheckpointAsync() --> sourceStreamTask.triggerCheckpoint()

6.StreamTask.performCheckpoint

1. Perform the checkpoint for the task instance.

2. Align the checkpoint by means of the CheckpointBarrier.

private boolean performCheckpoint(
			CheckpointMetaData checkpointMetaData,
			CheckpointOptions checkpointOptions,
			CheckpointMetricsBuilder checkpointMetrics,
			boolean advanceToEndOfTime) throws Exception {

		LOG.debug("Starting checkpoint ({}) {} on task {}",
			checkpointMetaData.getCheckpointId(), checkpointOptions.getCheckpointType(), getName());

		if (isRunning) {
			actionExecutor.runThrowing(() -> {

				if (checkpointOptions.getCheckpointType().isSynchronous()) {
					setSynchronousSavepointId(checkpointMetaData.getCheckpointId());

					if (advanceToEndOfTime) {
						advanceToEndOfEventTime();
					}
				}

				subtaskCheckpointCoordinator.checkpointState(
					checkpointMetaData,
					checkpointOptions,
					checkpointMetrics,
					operatorChain,
					this::isCanceled);
			});

			return true;
		} else {
			actionExecutor.runThrowing(() -> {
				// we cannot perform our checkpoint - let the downstream operators know that they
				// should not wait for any input from this operator

				// we cannot broadcast the cancellation markers on the 'operator chain', because it may not
				// yet be created
				final CancelCheckpointMarker message = new CancelCheckpointMarker(checkpointMetaData.getCheckpointId());
				recordWriter.broadcastEvent(message);
			});

			return false;
		}
	}
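
Barrier alignment, mentioned in step 2 of this section, is not visible in the excerpt itself. As a rough illustration of the idea, the following sketch tracks which input channels have delivered the barrier for a checkpoint and reports when all of them have. It is a simplified model, not Flink's alignment implementation: BarrierAlignmentSketch and its methods are hypothetical, and buffering of records on blocked channels is left out.

```java
import java.util.BitSet;

/** Hypothetical model of aligning one checkpoint across several input channels. */
final class BarrierAlignmentSketch {

    private final int numChannels;
    private final BitSet received = new BitSet();
    private long currentCheckpointId = -1;
    private long lastCompletedId = -1;

    BarrierAlignmentSketch(int numChannels) {
        this.numChannels = numChannels;
    }

    /** Returns true once the barrier for checkpointId has arrived on every channel. */
    boolean onBarrier(int channel, long checkpointId) {
        if (checkpointId <= lastCompletedId || checkpointId < currentCheckpointId) {
            return false; // stale barrier from an older checkpoint is ignored
        }
        if (checkpointId > currentCheckpointId) {
            // The first barrier of a newer checkpoint starts a fresh alignment.
            currentCheckpointId = checkpointId;
            received.clear();
        }
        received.set(channel);
        if (received.cardinality() == numChannels) {
            lastCompletedId = checkpointId;
            received.clear(); // all channels are unblocked again
            return true;      // the task may now snapshot its state
        }
        return false;
    }

    /** A channel is blocked while its barrier has arrived but alignment is incomplete. */
    boolean isBlocked(int channel) {
        return received.get(channel);
    }

    public static void main(String[] args) {
        BarrierAlignmentSketch aligner = new BarrierAlignmentSketch(2);
        System.out.println(aligner.onBarrier(0, 1L)); // still waiting for channel 1
        System.out.println(aligner.onBarrier(1, 1L)); // alignment complete
    }
}
```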