Flink 1.4 Fault Tolerance Source Code Analysis, Part 4

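Savepoint: A Special Kind of Checkpoint

A savepoint is a special kind of checkpoint, so its implementation is closely tied to the checkpoint implementation.

Because a savepoint involves actor message exchange between the Client and the JobManager, it gets an article of its own.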

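Checkpoint vs. Savepoint

A savepoint is essentially a checkpoint, so a Flink program can resume execution from a savepoint.

Savepoints let you update your program while ensuring that the Flink cluster loses no state.

How a savepoint differs from an ordinary checkpoint:

  • A savepoint is triggered manually, whereas checkpoints are snapshots taken periodically
  • When a new savepoint is created, older savepoints are not invalidated automatically

Note: a savepoint is merely a pointer to a checkpoint.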

image

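The figure above illustrates the difference between the two.

In the example above, job 0xA312Bc has produced checkpoints c1, c2, c3, and c4. The periodic checkpoints c1 and c3 have already been discarded, and c4 is the latest checkpoint.

c2 is special: its state is associated with savepoint s1. It was triggered by the user and does not expire automatically (as the figure shows, c1 and c3 expired automatically once newer checkpoints were produced).

Note that s1 is merely a pointer to checkpoint c2. This means the actual state is not copied for the savepoint, but the state of the associated checkpoint is retained.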
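
The retention behaviour in the figure can be replayed with a small model. This is only a sketch and not Flink code: the class `SavepointRetentionSketch` and its methods are hypothetical, and it assumes that exactly one periodic checkpoint is retained at a time, as in the example above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not Flink code: keeps the newest periodic checkpoint
// plus every checkpoint that a savepoint points to.
class SavepointRetentionSketch {
    private final List<String> retained = new ArrayList<>();            // oldest first
    private final Map<String, String> savepoints = new LinkedHashMap<>(); // savepoint -> checkpoint

    // A checkpoint completed: older checkpoints expire unless a savepoint points to them.
    void completeCheckpoint(String checkpointId) {
        retained.removeIf(c -> !savepoints.containsValue(c));
        retained.add(checkpointId);
    }

    // A user-triggered savepoint: only a pointer is recorded, no state is copied.
    void completeSavepoint(String savepointName, String checkpointId) {
        savepoints.put(savepointName, checkpointId);
        completeCheckpoint(checkpointId);
    }

    List<String> retained() {
        return new ArrayList<>(retained);
    }

    // Replays the sequence from the figure: c1, c2 (savepoint s1), c3, c4.
    static List<String> replayFigure() {
        SavepointRetentionSketch job = new SavepointRetentionSketch();
        job.completeCheckpoint("c1");
        job.completeSavepoint("s1", "c2");
        job.completeCheckpoint("c3");
        job.completeCheckpoint("c4");
        return job.retained();
    }

    public static void main(String[] args) {
        System.out.println(replayFigure()); // prints [c2, c4]
    }
}
```

After the replay only c2 (pinned by s1) and c4 (the latest checkpoint) remain, which matches the figure.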

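Savepoint Trigger Mechanism

As noted above, one notable difference between a savepoint and a checkpoint is that a savepoint is triggered manually by the user, for example through the command-line client.

Client Side

Flink has a separate client module; the trigger code lives in that module's CliFrontend class: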

org.apache.flink.client.CliFrontend

Savepoint trigger call flow on the client side:

Command line  ==>  CliFrontend[main]
              ==>  CliFrontend[parseParameters]
              ==>  CliFrontend[savepoint]
              ==>  CliFrontend[runClusterAction]
              ==>  CliFrontend[triggerSavepoint]
              ==>  ClusterClient[triggerSavepoint]

Key savepoint trigger code on the client side (ClusterClient class):

public CompletableFuture<String> triggerSavepoint(JobID jobId, @Nullable String savepointDirectory) throws FlinkException {
	final ActorGateway jobManager = getJobManagerGateway();

	// Send a TriggerSavepoint message to the JobManager
	Future<Object> response = jobManager.ask(new JobManagerMessages.TriggerSavepoint(jobId, Option.<String>apply(savepointDirectory)),
		new FiniteDuration(1, TimeUnit.HOURS));
	CompletableFuture<Object> responseFuture = FutureUtils.<Object>toJava(response);

	// Map the reply to the return value
	return responseFuture.thenApply((responseMessage) -> {
		if (responseMessage instanceof JobManagerMessages.TriggerSavepointSuccess) {
			JobManagerMessages.TriggerSavepointSuccess success = (JobManagerMessages.TriggerSavepointSuccess) responseMessage;
			return success.savepointPath();
		} else if (responseMessage instanceof JobManagerMessages.TriggerSavepointFailure) {
			JobManagerMessages.TriggerSavepointFailure failure = (JobManagerMessages.TriggerSavepointFailure) responseMessage;
			throw new CompletionException(failure.cause());
		} else {
			throw new CompletionException(
				new IllegalStateException("Unknown JobManager response of type " + responseMessage.getClass()));
		}
	});
}
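
The reply-mapping step of that method can be tried in isolation. The following is a minimal sketch, assuming hypothetical `Success` and `Failure` classes as stand-ins for the `JobManagerMessages` reply types; it does not talk to a real JobManager.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical sketch of the thenApply mapping used by the client; not Flink code.
class SavepointResponseSketch {
    // Stand-in for TriggerSavepointSuccess.
    static final class Success {
        final String path;
        Success(String path) { this.path = path; }
    }

    // Stand-in for TriggerSavepointFailure.
    static final class Failure {
        final Throwable cause;
        Failure(Throwable cause) { this.cause = cause; }
    }

    // Turns the untyped reply into the savepoint path, or fails the returned future.
    static CompletableFuture<String> toSavepointPath(CompletableFuture<Object> response) {
        return response.thenApply(message -> {
            if (message instanceof Success) {
                return ((Success) message).path;
            } else if (message instanceof Failure) {
                throw new CompletionException(((Failure) message).cause);
            } else {
                throw new CompletionException(
                    new IllegalStateException("Unknown response of type " + message.getClass()));
            }
        });
    }

    public static void main(String[] args) {
        CompletableFuture<String> ok =
            toSavepointPath(CompletableFuture.completedFuture(new Success("/tmp/savepoint-demo")));
        System.out.println(ok.join());

        CompletableFuture<String> failed =
            toSavepointPath(CompletableFuture.completedFuture(new Failure(new RuntimeException("disk full"))));
        System.out.println(failed.isCompletedExceptionally());
    }
}
```

Throwing `CompletionException` inside `thenApply` completes the returned future exceptionally, so the caller sees the failure only when it inspects or joins the future.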

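JobManager Side

After the JobManager receives the actor message, the savepoint is triggered through the following call flow:

Receive actor message  ==>  JobManager[handleMessage]
                       ==>  pattern match on the TriggerSavepoint message type
                       ==>  CheckpointCoordinator[triggerSavepoint]
                       ==>  CheckpointCoordinator[triggerCheckpoint]
                       ==>  return a CompletableFuture that completes once the snapshot has actually finished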

Key savepoint trigger code on the JobManager side (CheckpointCoordinator class):

public CompletableFuture<CompletedCheckpoint> triggerSavepoint(
		long timestamp,
		@Nullable String targetLocation) {
	
	CheckpointProperties props = CheckpointProperties.forSavepoint();
	// Delegate to the checkpoint trigger implementation
	CheckpointTriggerResult triggerResult = triggerCheckpoint(
		timestamp,
		props,
		targetLocation,
		false);

	if (triggerResult.isSuccess()) {
		// Return the PendingCheckpoint's CompletableFuture; the client is answered according to the snapshot result
		return triggerResult.getPendingCheckpoint().getCompletionFuture();
	} else {
		Throwable cause = new CheckpointTriggerException("Failed to trigger savepoint.", triggerResult.getFailureReason());
		return FutureUtils.completedExceptionally(cause);
	}
}

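Key savepoint trigger code on the JobManager side (JobManager class):

// Trigger the savepoint asynchronously via CompletableFuture, a class introduced in JDK 1.8,
// because checkpoint coordinator operations may block, e.g. operations involving the state backend or ZooKeeper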
val savepointFuture = checkpointCoordinator.triggerSavepoint(
	System.currentTimeMillis(),
	savepointDirectory.orNull)

savepointFuture.handleAsync[Void](
	new BiFunction[CompletedCheckpoint, Throwable, Void] {
	  override def apply(success: CompletedCheckpoint, cause: Throwable): Void = {
		if (success != null) {
		  if (success.getExternalPointer != null) {
			  // Savepoint succeeded: send a TriggerSavepointSuccess message to the client
			senderRef ! TriggerSavepointSuccess(
			  jobId,
			  success.getCheckpointID,
			  success.getExternalPointer,
			  success.getTimestamp
			)
		  } else {
			  // Savepoint was not persisted (no external pointer): send a TriggerSavepointFailure message to the client
			senderRef ! TriggerSavepointFailure(
			  jobId, new Exception("Savepoint has not been persisted."))
		  }
		} else {
		  senderRef ! TriggerSavepointFailure(
			jobId, new Exception("Failed to complete savepoint", cause))
		}
		null
	  }
	},
	context.dispatcher)

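As the code shows, CheckpointCoordinator[triggerSavepoint] only triggers the savepoint and returns a CompletableFuture.

Once the pending checkpoint becomes a completed checkpoint, the anonymous callback registered through savepointFuture.handleAsync in the JobManager fires. If the savepoint succeeded, a TriggerSavepointSuccess message is sent back to the client;

if the savepoint failed, a TriggerSavepointFailure message is sent back instead.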
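
The three outcomes handled by that callback can be sketched in plain Java. This is an illustration only: `Completed` is a hypothetical stand-in for CompletedCheckpoint, and the reply is returned as a string rather than sent to an actor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical sketch of the handleAsync callback; not Flink code.
class SavepointReplySketch {
    // Stand-in for CompletedCheckpoint; only the external pointer matters here.
    static final class Completed {
        final String externalPointer;
        Completed(String externalPointer) { this.externalPointer = externalPointer; }
    }

    // Decides which reply the client would get once the savepoint future completes.
    static CompletableFuture<String> reply(CompletableFuture<Completed> savepointFuture, Executor executor) {
        return savepointFuture.handleAsync((success, cause) -> {
            if (success != null) {
                if (success.externalPointer != null) {
                    return "TriggerSavepointSuccess(" + success.externalPointer + ")";
                }
                // Completed, but nothing was persisted externally.
                return "TriggerSavepointFailure(Savepoint has not been persisted.)";
            }
            // The future completed exceptionally; cause carries the error.
            return "TriggerSavepointFailure(Failed to complete savepoint: " + cause + ")";
        }, executor);
    }

    public static void main(String[] args) {
        Executor sameThread = Runnable::run; // run the callback on the calling thread

        System.out.println(reply(CompletableFuture.completedFuture(new Completed("file:///sp-1")), sameThread).join());
        System.out.println(reply(CompletableFuture.completedFuture(new Completed(null)), sameThread).join());

        CompletableFuture<Completed> failed = new CompletableFuture<>();
        failed.completeExceptionally(new RuntimeException("timeout"));
        System.out.println(reply(failed, sameThread).join());
    }
}
```

Passing an executor to `handleAsync` keeps the callback off the thread that completes the future, which is the role `context.dispatcher` plays in the JobManager code.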

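Storing and Reading Checkpoint and Savepoint State

The sections above covered how checkpoints and savepoints are triggered; this section briefly describes how their state is stored and read.

CheckpointStorage

Persisting a checkpoint requires storing two kinds of information:

  • Metadata: the actual storage path of the checkpoint, which is kept in ZooKeeper
  • Checkpoint data: the state snapshot taken by the checkpoint

CheckpointStorage covers everything needed for checkpoint storage. The CheckpointStorageLocation introduced later is also created by CheckpointStorage.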

The CheckpointStorage interface is shown below:

public interface CheckpointStorage {

	/**
	 * Checks whether this backend supports highly available storage of data.
	 */
	boolean supportsHighlyAvailableStorage();

	/**
	 * Checks whether this storage has a default savepoint location configured.
	 */
	boolean hasDefaultSavepointLocation();

	/**
	 * Resolves the external pointer into a CompletedCheckpointStorageLocation, which can be used
	 * to read the checkpoint metadata and to dispose of the checkpoint storage.
	 */
	CompletedCheckpointStorageLocation resolveCheckpoint(String externalPointer) throws IOException;

	/**
	 * Initializes a CheckpointStorageLocation for a new checkpoint.
	 */
	CheckpointStorageLocation initializeLocationForCheckpoint(long checkpointId) throws IOException;

	/**
	 * Initializes a CheckpointStorageLocation for a new savepoint.
	 */
	CheckpointStorageLocation initializeLocationForSavepoint(
			long checkpointId,
			@Nullable String externalLocationPointer) throws IOException;

	CheckpointStreamFactory resolveCheckpointStorageLocation(
			long checkpointId,
			CheckpointStorageLocationReference reference) throws IOException;

	CheckpointStateOutputStream createTaskOwnedStateStream() throws IOException;
}

The class diagram of its implementations is shown below:

image

The rest of this section focuses on FsCheckpointStorage, which is also how my company currently implements HA.

FsCheckpointStorage

Code flow:

JobManager[submitJob]  ==>  ExecutionGraphBuilder[buildGraph]  
					   ==>  ExecutionGraph[enableCheckpointing]  
					   ==>  new CheckpointCoordinator(...)
					   ==>  StateBackend[createCheckpointStorage]
					   ==>  FsStateBackend[createCheckpointStorage]
					   ==>  new FsCheckpointStorage(...)

CheckpointStorageLocation

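For one specific checkpoint snapshot, this interface provides persistence of checkpoint data, persistence of metadata, and cleanup.

The storage location this object represents is not necessarily complete. Its createMetadataOutputStream method creates a CheckpointMetadataOutputStream, which in turn produces the storage location of the completed checkpoint (CompletedCheckpointStorageLocation).

Code: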

public interface CheckpointStorageLocation extends CheckpointStreamFactory {

	/**
	 * Creates an output stream for persisting the checkpoint metadata.
	 * The CheckpointMetadataOutputStream is the key object for creating a CompletedCheckpointStorageLocation.
	 */
	CheckpointMetadataOutputStream createMetadataOutputStream() throws IOException;

	/**
	 * Cleans up all data when the snapshot fails.
	 */
	void disposeOnFailure() throws IOException;

	/**
	 * Gets a reference to the storage location. This reference is sent to the
	 * target storage location via checkpoint RPC messages and checkpoint barriers,
	 * in a format avoiding backend-specific classes.
	 */
	CheckpointStorageLocationReference getLocationReference();
}

Code flow:

checkpointStorageLocation = props.isSavepoint() ?
						checkpointStorage.initializeLocationForSavepoint(checkpointID, externalSavepointLocation) :
						checkpointStorage.initializeLocationForCheckpoint(checkpointID);

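Depending on whether the checkpoint properties (props) describe a savepoint, checkpointStorage initializes the CheckpointStorageLocation for either a savepoint or a regular checkpoint.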
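
The branch above can be illustrated with a small stand-alone sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the "chk-" and "savepoint-" directory names are only an example layout, and the fallback to a default savepoint directory is inferred from hasDefaultSavepointLocation() rather than taken from the FsCheckpointStorage source.

```java
// Hypothetical sketch of choosing a storage location; not Flink code.
class StorageLocationChoiceSketch {
    static String initializeLocation(
            boolean isSavepoint,
            long checkpointId,
            String externalSavepointLocation,  // may be null: user gave no target directory
            String defaultSavepointDirectory,  // may be null: no default configured
            String checkpointBaseDirectory) {
        if (!isSavepoint) {
            // Regular checkpoints always go under the configured checkpoint directory.
            return checkpointBaseDirectory + "/chk-" + checkpointId;
        }
        // Savepoints prefer the user-supplied location and fall back to the default (assumed behaviour).
        String base = externalSavepointLocation != null
            ? externalSavepointLocation
            : defaultSavepointDirectory;
        if (base == null) {
            throw new IllegalArgumentException("No savepoint location given and no default configured");
        }
        return base + "/savepoint-" + checkpointId;
    }

    public static void main(String[] args) {
        System.out.println(initializeLocation(false, 7L, null, null, "file:///checkpoints"));
        System.out.println(initializeLocation(true, 8L, "file:///my-savepoints", "file:///default", "file:///checkpoints"));
        System.out.println(initializeLocation(true, 9L, null, "file:///default", "file:///checkpoints"));
    }
}
```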

CompletedCheckpointStorageLocation

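This interface represents the storage location of a completed checkpoint. Like CheckpointStorageLocation, it belongs to one specific checkpoint snapshot; it exposes the external pointer, a handle to the persisted metadata, and cleanup of the storage location.

Code flow: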

JobManager[submitJob]  ==>  ExecutionGraphBuilder[buildGraph]  
					   ==>  ExecutionGraph[enableCheckpointing]  
					   ==>  ExecutionGraph[new CheckpointCoordinator(...)]
					   ==>  CheckpointCoordinator[StateBackend#createCheckpointStorage]
					   ==>  					 [FsStateBackend#createCheckpointStorage]
					   ==>                       [new FsCheckpointStorage(...)]
					   ==>  CheckpointCoordinator[checkpointStorage.initializeLocationForCheckpoint] // create the initial CheckpointStorageLocation
					   ==>  CheckpointCoordinator[new PendingCheckpoint(checkpointStorageLocation)]	 // build the PendingCheckpoint from the not yet completed location
					   ==>  PendingCheckpoint[finalizeCheckpoint]									 // called when the checkpoint completes
					   ==>  PendingCheckpoint[targetLocation.createMetadataOutputStream()]			 // use the location to create the metadata output stream: CheckpointMetadataOutputStream
					   ==>  PendingCheckpoint[CheckpointMetadataOutputStream.closeAndFinalizeCheckpoint()] // location of the completed checkpoint: FsCompletedCheckpointStorageLocation

Code:

public interface CompletedCheckpointStorageLocation extends java.io.Serializable {

	/**
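	 * Gets the external pointer of the checkpoint, which can be used to restore an application
	 * from a savepoint or checkpoint (this is the argument needed when restoring from a savepoint on the command line).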
	 */
	String getExternalPointer();

	/**
	 * Gets the handle to the checkpoint metadata.
	 */
	StreamStateHandle getMetadataHandle();

	/**
	 * Releases the resources held by this storage location.
	 */
	void disposeStorageLocation() throws IOException;
}
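
The hand-off from a pending location to a completed one, as laid out in the code flow above, can be sketched with in-memory stand-ins. All class names here are hypothetical and only mirror the roles of CheckpointStorageLocation, CheckpointMetadataOutputStream, and CompletedCheckpointStorageLocation; nothing is written to a real file system.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical in-memory sketch of the location lifecycle; not Flink code.
class CheckpointLocationLifecycleSketch {
    // Plays the role of CompletedCheckpointStorageLocation.
    static final class CompletedLocation {
        final String externalPointer;
        final byte[] metadata;
        CompletedLocation(String externalPointer, byte[] metadata) {
            this.externalPointer = externalPointer;
            this.metadata = metadata;
        }
    }

    // Plays the role of CheckpointMetadataOutputStream.
    static final class MetadataStream extends ByteArrayOutputStream {
        private final String pointer;
        MetadataStream(String pointer) { this.pointer = pointer; }

        // Closing the stream is what yields the completed location.
        CompletedLocation closeAndFinalize() {
            return new CompletedLocation(pointer, toByteArray());
        }
    }

    // Plays the role of CheckpointStorageLocation for a checkpoint still in progress.
    static final class PendingLocation {
        private final String pointer;
        PendingLocation(String pointer) { this.pointer = pointer; }

        MetadataStream createMetadataOutputStream() {
            return new MetadataStream(pointer);
        }
    }

    // Mirrors the finalize step: write metadata, then close and finalize.
    static CompletedLocation finalizeCheckpoint(PendingLocation location, String metadata) {
        MetadataStream out = location.createMetadataOutputStream();
        byte[] bytes = metadata.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        return out.closeAndFinalize();
    }

    public static void main(String[] args) {
        CompletedLocation done = finalizeCheckpoint(new PendingLocation("file:///checkpoints/chk-1"), "demo-metadata");
        System.out.println(done.externalPointer);
        System.out.println(new String(done.metadata, StandardCharsets.UTF_8));
    }
}
```

The point of the design is that a completed location can only be obtained by finishing the metadata stream, so a location without persisted metadata never appears as completed.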