Flink Source Code, Part 10: Checkpoint Implementation

CheckPoint

Checkpointing is the core mechanism behind Flink's exactly-once semantics. Let's walk through how it is implemented.

CheckpointCoordinator

The checkpoint coordinator.
A single timer thread triggers checkpoints at a fixed interval.

private ScheduledFuture<?> scheduleTriggerWithDelay(long initDelay) {
   return timer.scheduleAtFixedRate(
      new ScheduledTrigger(),
      initDelay, baseInterval, TimeUnit.MILLISECONDS);
}
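The fixed-rate scheduling above can be tried in isolation. The following is a minimal sketch using only the JDK, not Flink's code: the class name `FixedRateTriggerSketch` and the method `runPeriodically` are made up for illustration, and a `CountDownLatch` stands in for the real trigger.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class FixedRateTriggerSketch {

    /** Schedules a task at a fixed rate; returns true once it has fired expectedRuns times. */
    public static boolean runPeriodically(long initDelayMs, long intervalMs, int expectedRuns) {
        // A single thread, so two triggers can never run concurrently.
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch fired = new CountDownLatch(expectedRuns);
        // Same call shape as in the excerpt: task, initial delay, period, unit.
        ScheduledFuture<?> handle = timer.scheduleAtFixedRate(
                fired::countDown, initDelayMs, intervalMs, TimeUnit.MILLISECONDS);
        try {
            // Generous timeout so the sketch also terminates on a slow machine.
            return fired.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            handle.cancel(false);
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("fired 3 times: " + runPeriodically(10, 20, 3));
    }
}
```

The returned `ScheduledFuture` is what allows the periodic trigger to be cancelled later, which is presumably why the original method hands it back to the caller.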

The Runnable that the timer executes on each tick:

private final class ScheduledTrigger implements Runnable {

   @Override
   public void run() {
      try {
         triggerCheckpoint(System.currentTimeMillis(), true);
      }
      catch (Exception e) {
         LOG.error("Exception while triggering checkpoint for job {}.", job, e);
      }
   }
}
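The try/catch in `run()` matters more than it may seem: `scheduleAtFixedRate` suppresses all subsequent executions once a run throws. The sketch below demonstrates that behaviour with plain JDK classes. `GuardedTriggerSketch` and `countRuns` are hypothetical names, and the always-failing task is a stand-in for a trigger that hits an error.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedTriggerSketch {

    /** Schedules a task that fails on every run; returns how often it was invoked. */
    public static int countRuns(boolean guard) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();

        // Stand-in for a trigger attempt that always throws.
        Runnable failing = () -> {
            runs.incrementAndGet();
            throw new IllegalStateException("simulated trigger failure");
        };
        // Same idea as ScheduledTrigger.run(): catch, log, keep the schedule alive.
        Runnable guarded = () -> {
            try {
                failing.run();
            } catch (Exception e) {
                // a real implementation would log here
            }
        };

        timer.scheduleAtFixedRate(guard ? guarded : failing, 0, 10, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            timer.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("unguarded runs: " + countRuns(false));
        System.out.println("guarded runs:   " + countRuns(true));
    }
}
```

Without the guard the task runs exactly once and is then silently dropped from the schedule; with the guard it keeps firing on every tick.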

The overload that does the actual work (abridged):

public CompletableFuture<CompletedCheckpoint> triggerCheckpoint(
      long timestamp,
      CheckpointProperties props,
      @Nullable String externalSavepointLocation,
      boolean isPeriodic,
      boolean advanceToEndOfTime) throws CheckpointException {

   // ... (earlier part of the method omitted)

   // send the messages to the tasks that trigger their checkpoint
   for (Execution execution: executions) {
      if (props.isSynchronous()) {
         execution.triggerSynchronousSavepoint(checkpointID, timestamp, checkpointOptions, advanceToEndOfTime);
      } else {
         // trigger the checkpoint on each execution
         execution.triggerCheckpoint(checkpointID, timestamp, checkpointOptions);
      }
   }

   // ... (rest of the method omitted)
}

Next, the call travels from Execution to Task:

public void triggerCheckpointBarrier(
			final long checkpointID,
			final long checkpointTimestamp,
			final CheckpointOptions checkpointOptions,
			final boolean advanceToEndOfEventTime) {

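		// Familiar territory: this invokable is the concrete task implementation
		final AbstractInvokable invokable = this.invokable;
		// ... (construction of checkpointMetaData from checkpointID and checkpointTimestamp omitted)
		invokable.triggerCheckpointAsync(checkpointMetaData, checkpointOptions, advanceToEndOfEventTime);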

	}

StreamTask

private boolean performCheckpoint(
			CheckpointMetaData checkpointMetaData,
			CheckpointOptions checkpointOptions,
			CheckpointMetrics checkpointMetrics,
			boolean advanceToEndOfTime) throws Exception {

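				// ... (earlier part of the method omitted; checkpointId is taken from checkpointMetaData)

				// Step (1): Prepare the checkpoint, allow operators to do some pre-barrier work.
				// Give every operator a chance to prepare before the barrier is emitted
				operatorChain.prepareSnapshotPreBarrier(checkpointId);

				// Step (2): Send the checkpoint barrier downstream
				// Emit the barrier
				operatorChain.broadcastCheckpointBarrier(
						checkpointId,
						checkpointMetaData.getTimestamp(),
						checkpointOptions);

				// Step (3): Take the state snapshot. This should be largely asynchronous, to not
				//           impact progress of the streaming topology
				// Persist the state; most of this work runs asynchronously so that stream processing is disturbed as little as possible
				checkpointState(checkpointMetaData, checkpointOptions, checkpointMetrics);

				// ... (rest of the method omitted)
	}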
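The ordering of the three steps can be illustrated with a toy model. This is a sketch under stated assumptions, not Flink's implementation: `CheckpointOrderSketch` and `BatchingOperator` are hypothetical, and a plain list serves as a trace of what happens in which order. It shows why step 1 exists: an operator that holds back records can flush them so that they end up ahead of the barrier.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointOrderSketch {

    /** Hypothetical operator that batches records and flushes them ahead of a barrier. */
    static final class BatchingOperator {
        private final List<String> batch = new ArrayList<>();
        private final List<String> trace;

        BatchingOperator(List<String> trace) {
            this.trace = trace;
        }

        void processElement(String record) {
            batch.add(record);
        }

        // Step 1 hook: emit everything that logically belongs before the barrier.
        void prepareSnapshotPreBarrier() {
            for (String record : batch) {
                trace.add("record " + record);
            }
            batch.clear();
        }

        // Step 3 hook: the state left to persist after the flush.
        int snapshotState() {
            return batch.size();
        }
    }

    /** Runs the three steps in order and returns the resulting trace. */
    public static List<String> performCheckpoint(long checkpointId) {
        List<String> trace = new ArrayList<>();
        BatchingOperator operator = new BatchingOperator(trace);
        operator.processElement("a");
        operator.processElement("b");

        operator.prepareSnapshotPreBarrier();          // Step 1: pre-barrier work
        trace.add("barrier " + checkpointId);          // Step 2: barrier goes downstream
        int pending = operator.snapshotState();        // Step 3: snapshot the state
        trace.add("snapshot pending=" + pending);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(performCheckpoint(1));
    }
}
```

Because the barrier is emitted before the snapshot is taken, downstream tasks can start their own handling of the checkpoint while this task is still writing its state.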

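Sending the barrier.
As the code shows, barriers travel in-band: they are written to the same outputs as ordinary records and flow along with the normal event stream.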

public void broadcastCheckpointBarrier(long id, long timestamp, CheckpointOptions checkpointOptions) throws IOException {
		CheckpointBarrier barrier = new CheckpointBarrier(id, timestamp, checkpointOptions);
		for (RecordWriterOutput<?> streamOutput : streamOutputs) {
			streamOutput.broadcastEvent(barrier);
		}
	}
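To make "in-band" concrete, here is a small sketch in which one queue carries both records and barriers. All names (`InBandBarrierSketch`, `Record`, `Barrier`, `recordsCoveredBy`) are invented for illustration and do not correspond to Flink classes. The position of the barrier in the queue is what separates the records belonging to the checkpoint from those that come after it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class InBandBarrierSketch {

    /** Anything that can travel through a channel. */
    interface StreamElement {}

    static final class Record implements StreamElement {
        final String value;
        Record(String value) { this.value = value; }
    }

    static final class Barrier implements StreamElement {
        final long checkpointId;
        Barrier(long checkpointId) { this.checkpointId = checkpointId; }
    }

    /** Returns the records that arrive ahead of the barrier of the given checkpoint. */
    public static List<String> recordsCoveredBy(long checkpointId, Queue<StreamElement> channel) {
        List<String> covered = new ArrayList<>();
        for (StreamElement element : channel) {
            if (element instanceof Barrier && ((Barrier) element).checkpointId == checkpointId) {
                break; // everything after this point belongs to a later checkpoint
            }
            if (element instanceof Record) {
                covered.add(((Record) element).value);
            }
        }
        return covered;
    }

    /** A channel holding r1, r2, the barrier for checkpoint 1, then r3. */
    public static Queue<StreamElement> sampleChannel() {
        Queue<StreamElement> channel = new ArrayDeque<>();
        channel.add(new Record("r1"));
        channel.add(new Record("r2"));
        channel.add(new Barrier(1));
        channel.add(new Record("r3"));
        return channel;
    }

    public static void main(String[] args) {
        System.out.println(recordsCoveredBy(1, sampleChannel()));
    }
}
```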

CheckpointedInputGate
Handles barriers as they arrive on the input:

@Override
	public Optional<BufferOrEvent> pollNext() throws Exception {
		while (true) {
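			// fetch the next element (retrieval of `next` omitted)
			BufferOrEvent bufferOrEvent = next.get();
			// ... (earlier branches omitted)
			// the element is a barrier: hand it to the barrier handler
			else if (bufferOrEvent.getEvent().getClass() == CheckpointBarrier.class) {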
				CheckpointBarrier checkpointBarrier = (CheckpointBarrier) bufferOrEvent.getEvent();
				if (!endOfInputGate) {
					// process barriers only if there is a chance of the checkpoint completing
					if (barrierHandler.processBarrier(checkpointBarrier, offsetChannelIndex(bufferOrEvent.getChannelIndex()), bufferStorage.getPendingBytes())) {
						bufferStorage.rollOver();
					}
				}
			}
			
		}
	}

CheckpointBarrierAligner
Handles barrier alignment:

@Override
	public boolean processBarrier(CheckpointBarrier receivedBarrier, int channelIndex, long bufferedBytes) throws Exception {
		final long barrierId = receivedBarrier.getId();

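		// ... (earlier branches omitted)
		// barrierId is valid (newer than the current checkpoint): start a new alignment
		else if (barrierId > currentCheckpointId) {
			beginNewAlignment(barrierId, channelIndex);
		}
		// ... (remaining branches omitted)

		// check if we have all barriers - since canceled checkpoints always have zero barriers
		// this can only happen on a non canceled checkpoint
		// All barriers have arrived: release the blocked channels and notify the task so that it takes its checkpoint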
		if (numBarriersReceived + numClosedChannels == totalNumberOfInputChannels) {
			// actually trigger checkpoint
			if (LOG.isDebugEnabled()) {
				LOG.debug("{}: Received all barriers, triggering checkpoint {} at {}.",
					taskName,
					receivedBarrier.getId(),
					receivedBarrier.getTimestamp());
			}

			releaseBlocksAndResetBarriers();
			notifyCheckpoint(receivedBarrier, bufferedBytes, latestAlignmentDurationNanos);
			return true;
		}
		return checkpointAborted;
	}

onBarrier
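Called when a barrier is received.
The channel the barrier arrived on is marked as blocked: data that follows on it is not processed but held in a buffer, and the count of received barriers is incremented.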

protected void onBarrier(int channelIndex) throws IOException {
		if (!blockedChannels[channelIndex]) {
			blockedChannels[channelIndex] = true;

			numBarriersReceived++;

			if (LOG.isDebugEnabled()) {
				LOG.debug("{}: Received barrier from channel {}.", taskName, channelIndex);
			}
		}
		else {
			throw new IOException("Stream corrupt: Repeated barrier for same checkpoint on input " + channelIndex);
		}
	}
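The alignment logic can be condensed into a toy model. This is a simplified sketch, not the code of `CheckpointBarrierAligner`: the class `BarrierAlignerSketch`, its `onRecord` method, and the string log are assumptions made for illustration, and cancellation, closed channels and overlapping checkpoints are ignored. Records on a blocked channel are buffered, and once every channel has delivered its barrier the checkpoint is triggered and the buffered records are replayed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class BarrierAlignerSketch {

    private final boolean[] blockedChannels;
    private final Queue<String> buffered = new ArrayDeque<>();
    private final List<String> log = new ArrayList<>();
    private int numBarriersReceived;

    public BarrierAlignerSketch(int numChannels) {
        this.blockedChannels = new boolean[numChannels];
    }

    /** A record is processed right away unless its channel is blocked. */
    public void onRecord(int channelIndex, String record) {
        if (blockedChannels[channelIndex]) {
            buffered.add(record);
        } else {
            log.add("process " + record);
        }
    }

    /** Blocks the channel, counts the barrier, and checkpoints once all have arrived. */
    public void onBarrier(int channelIndex, long checkpointId) {
        if (blockedChannels[channelIndex]) {
            throw new IllegalStateException("Repeated barrier on channel " + channelIndex);
        }
        blockedChannels[channelIndex] = true;
        numBarriersReceived++;

        if (numBarriersReceived == blockedChannels.length) {
            log.add("checkpoint " + checkpointId);
            // Release all blocks and reset the counter for the next checkpoint.
            Arrays.fill(blockedChannels, false);
            numBarriersReceived = 0;
            // Replay what was held back during alignment.
            while (!buffered.isEmpty()) {
                log.add("process " + buffered.poll());
            }
        }
    }

    /** Two channels; channel 0 delivers its barrier first. */
    public static List<String> demo() {
        BarrierAlignerSketch aligner = new BarrierAlignerSketch(2);
        aligner.onRecord(0, "a1");
        aligner.onBarrier(0, 7);
        aligner.onRecord(0, "a2"); // channel 0 is blocked, so this is buffered
        aligner.onRecord(1, "b1"); // channel 1 is still open
        aligner.onBarrier(1, 7);   // last barrier: alignment complete
        aligner.onRecord(0, "a3");
        return aligner.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo, `a2` arrives before `b1` but is processed after the checkpoint, because it came in behind the barrier on its channel. Holding such records back is what keeps them out of the state captured by checkpoint 7.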