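Flink: Understanding allowedLateness

What on earth is it?

WindowOperator has something called allowedLateness. What is it? Put simply, it gives late data a second chance: an element is allowed to be late by a certain amount of time. Within that allowed lateness, every late element that arrives triggers the window computation again. So when does the second chance run out? Let's walk through it.

How allowedLateness works

Let's start with the code below.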


A member variable of WindowOperator:

```java
/**
 * The allowed lateness for elements. This is used for:
 * <ul>
 *     <li>Deciding if an element should be dropped from a window due to lateness.
 *     <li>Clearing the state of a window if the system time passes the
 *         {@code window.maxTimestamp + allowedLateness} landmark.
 * </ul>
 */
protected final long allowedLateness;
```

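According to the comment above, this field is how long an element is allowed to be late. It serves two purposes:

  1. Deciding, based on how late it is, whether an element should be dropped.
  2. Deciding when a window's state is cleared. The threshold is window.maxTimestamp + allowedLateness. Once the system time (the watermark, when event time is used) passes this value, Flink clears the window's state.

To understand the logic, we need to look at the processElement method of WindowOperator.

Here is the code: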

public void processElement(StreamRecord<IN> element) throws Exception {
		final Collection<W> elementWindows = windowAssigner.assignWindows(
			element.getValue(), element.getTimestamp(), windowAssignerContext);

		//if element is handled by none of assigned elementWindows
		boolean isSkippedElement = true;

		final K key = this.<K>getKeyedStateBackend().getCurrentKey();

		if (windowAssigner instanceof MergingWindowAssigner) {
			MergingWindowSet<W> mergingWindows = getMergingWindowSet();

			for (W window: elementWindows) {

				// adding the new window might result in a merge, in that case the actualWindow
				// is the merged window and we work with that. If we don't merge then
				// actualWindow == window
				W actualWindow = mergingWindows.addWindow(window, new MergingWindowSet.MergeFunction<W>() {
					@Override
					public void merge(W mergeResult,
							Collection<W> mergedWindows, W stateWindowResult,
							Collection<W> mergedStateWindows) throws Exception {

						if ((windowAssigner.isEventTime() && mergeResult.maxTimestamp() + allowedLateness <= internalTimerService.currentWatermark())) {
							throw new UnsupportedOperationException("The end timestamp of an " +
									"event-time window cannot become earlier than the current watermark " +
									"by merging. Current watermark: " + internalTimerService.currentWatermark() +
									" window: " + mergeResult);
						} else if (!windowAssigner.isEventTime()) {
							long currentProcessingTime = internalTimerService.currentProcessingTime();
							if (mergeResult.maxTimestamp() <= currentProcessingTime) {
								throw new UnsupportedOperationException("The end timestamp of a " +
									"processing-time window cannot become earlier than the current processing time " +
									"by merging. Current processing time: " + currentProcessingTime +
									" window: " + mergeResult);
							}
						}

						triggerContext.key = key;
						triggerContext.window = mergeResult;

						triggerContext.onMerge(mergedWindows);

						for (W m: mergedWindows) {
							triggerContext.window = m;
							triggerContext.clear();
							deleteCleanupTimer(m);
						}

						// merge the merged state windows into the newly resulting state window
						windowMergingState.mergeNamespaces(stateWindowResult, mergedStateWindows);
					}
				});

				// drop if the window is already late
				if (isWindowLate(actualWindow)) {
					mergingWindows.retireWindow(actualWindow);
					continue;
				}
				isSkippedElement = false;

				W stateWindow = mergingWindows.getStateWindow(actualWindow);
				if (stateWindow == null) {
					throw new IllegalStateException("Window " + window + " is not in in-flight window set.");
				}

				windowState.setCurrentNamespace(stateWindow);
				windowState.add(element.getValue());

				triggerContext.key = key;
				triggerContext.window = actualWindow;

				TriggerResult triggerResult = triggerContext.onElement(element);

				if (triggerResult.isFire()) {
					ACC contents = windowState.get();
					if (contents == null) {
						continue;
					}
					emitWindowContents(actualWindow, contents);
				}

				if (triggerResult.isPurge()) {
					windowState.clear();
				}
				registerCleanupTimer(actualWindow);
			}

			// need to make sure to update the merging state in state
			mergingWindows.persist();
		} else {
			for (W window: elementWindows) {

				// drop if the window is already late
				if (isWindowLate(window)) {
					continue;
				}
				isSkippedElement = false;

				windowState.setCurrentNamespace(window);
				windowState.add(element.getValue());

				triggerContext.key = key;
				triggerContext.window = window;

				TriggerResult triggerResult = triggerContext.onElement(element);

				if (triggerResult.isFire()) {
					ACC contents = windowState.get();
					if (contents == null) {
						continue;
					}
					emitWindowContents(window, contents);
				}

				if (triggerResult.isPurge()) {
					windowState.clear();
				}
				registerCleanupTimer(window);
			}
		}

		// side output input event if
		// element not handled by any window
		// late arriving tag has been set
		// windowAssigner is event time and current timestamp + allowed lateness no less than element timestamp
		if (isSkippedElement && isElementLate(element)) {
			if (lateDataOutputTag != null){
				sideOutput(element);
			} else {
				this.numLateRecordsDropped.inc();
			}
		}
	}

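Let's analyze the code above. Roughly, it proceeds in this order:

  1. Based on the element's timestamp, work out which window(s) the element belongs to.

  2. Check whether the window assigner is a merging one. Here we focus on non-merging windows.

  3. Now the key part: isWindowLate(window) decides whether the window has already expired. Its code is: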

    /**
     * Returns {@code true} if the watermark is after the end timestamp plus the allowed lateness
     * of the given window.
     */
    protected boolean isWindowLate(W window) {
        return (windowAssigner.isEventTime() && (cleanupTime(window) <= internalTimerService.currentWatermark()));
    }

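As the code shows, the comparison only happens when event time is used. It compares window.end - 1 + allowedLateness with the latest watermark. If the watermark has reached or passed window.end - 1 + allowedLateness, the window has expired. If the watermark is still before that point, the window has not expired and can fire again.
As shown in the figure below:

[Figure placeholder: the image description is missing in the source]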
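To make the comparison concrete, here is a minimal sketch in plain Java, with no Flink dependency. The class and method names are hypothetical. It assumes a 3-second tumbling window without offset and an allowed lateness of 2 seconds (the values used in the test further down), and it assumes that the event-time cleanup time is window.maxTimestamp + allowedLateness, as the Javadoc of the field states.

```java
class WindowLatenessSketch {

    /** Start of the tumbling window (no offset) that contains a non-negative timestamp, in ms. */
    static long windowStart(long timestamp, long windowSize) {
        return timestamp - (timestamp % windowSize);
    }

    /** Assumed event-time cleanup time: maxTimestamp (= end - 1) plus the allowed lateness. */
    static long cleanupTime(long windowEnd, long allowedLateness) {
        long maxTimestamp = windowEnd - 1;
        return maxTimestamp + allowedLateness;
    }

    /** Mirrors the comparison in isWindowLate for the event-time case. */
    static boolean isWindowLate(long windowEnd, long allowedLateness, long watermark) {
        return cleanupTime(windowEnd, allowedLateness) <= watermark;
    }

    public static void main(String[] args) {
        long size = 3_000L;      // 3 s tumbling window (assumption taken from the test below)
        long lateness = 2_000L;  // allowed lateness of 2 s

        long start = windowStart(1461756862000L, size);
        long end = start + size;

        System.out.println(start);                                        // 1461756861000
        System.out.println(cleanupTime(end, lateness));                   // 1461756865999
        System.out.println(isWindowLate(end, lateness, 1461756865000L));  // false
        System.out.println(isWindowLate(end, lateness, 1461756866000L));  // true
    }
}
```

With these numbers the window is still open at watermark 1461756865000 and expired at watermark 1461756866000, which matches the point in the test below where late elements stop being added to the window.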

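  4. sideOutput is a second chance of a different kind: it puts elements that are too late aside so that they can be processed later.
    The check it relies on is element.timestamp + lateness <= currentWatermark. Think of a morning meeting: anyone who arrives 5 minutes after the boss counts as late, and being late means a fine. lateness = 0 means you must arrive before the boss.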
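The end of processElement can be sketched without Flink as follows. The class name, the route method, and its string results are hypothetical and serve only as illustration. The comparison itself is the one stated above. The sketch also shows that an element is sent to the side output only when both conditions hold: no window accepted it, and the element itself is late.

```java
class LateElementSketch {

    /** The check described above: timestamp + lateness <= watermark. */
    static boolean isElementLate(long elementTimestamp, long allowedLateness, long watermark) {
        return elementTimestamp + allowedLateness <= watermark;
    }

    /** Hypothetical summary of what happens to an element after the window loop. */
    static String route(boolean skippedByAllWindows, boolean elementLate, boolean hasLateTag) {
        if (!skippedByAllWindows) {
            return "windowed";      // at least one window took the element
        }
        if (!elementLate) {
            return "skipped";       // not added anywhere, but not counted as late either
        }
        return hasLateTag ? "side-output" : "dropped";
    }

    public static void main(String[] args) {
        long lateness = 2_000L;
        long ts = 1461756861000L;

        // Watermark 1461756865000: the element is late by this check,
        // but its window is still open, so it is not skipped.
        boolean lateAt65 = isElementLate(ts, lateness, 1461756865000L);
        System.out.println(lateAt65);                      // true
        System.out.println(route(false, lateAt65, true));  // windowed

        // Watermark 1461756866000: the window has expired, so the element is skipped.
        boolean lateAt66 = isElementLate(ts, lateness, 1461756866000L);
        System.out.println(route(true, lateAt66, true));   // side-output
        System.out.println(route(true, lateAt66, false));  // dropped

        // lateness = 0: an element whose timestamp equals the watermark is already late.
        System.out.println(isElementLate(1_000L, 0L, 1_000L)); // true
    }
}
```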

Below are my test results, together with my explanation of them:

// Input 000001,1461756862000. 1461756862000 is 2016-04-27 19:34:22.000, so it falls into the window [19:34:21.000, 19:34:24.000)
timestamp:000001,1461756862000|2016-04-27 19:34:22.000,1461756862000|2016-04-27 19:34:22.000,Watermark @ -10000
// Input 000001,1461756866000
timestamp:000001,1461756866000|2016-04-27 19:34:26.000,1461756866000|2016-04-27 19:34:26.000,Watermark @ 1461756852000
// Input 000001,1461756872000
timestamp:000001,1461756872000|2016-04-27 19:34:32.000,1461756872000|2016-04-27 19:34:32.000,Watermark @ 1461756856000
// Input 000001,1461756873000
timestamp:000001,1461756873000|2016-04-27 19:34:33.000,1461756873000|2016-04-27 19:34:33.000,Watermark @ 1461756862000
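// Input 000001,1461756874000. 1461756874000 is 2016-04-27 19:34:34.000; minus 10 s that is exactly 19:34:24.000, so the
// computation of window [19:34:21.000, 19:34:24.000) is triggered. Note that only elements that fall into this window
// take part in the computation, so this firing contains a single element, 000001,1461756862000.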
timestamp:000001,1461756874000|2016-04-27 19:34:34.000,1461756874000|2016-04-27 19:34:34.000,Watermark @ 1461756863000
(000001,1461756862000)
6> (000001,1,2016-04-27 19:34:22.000,2016-04-27 19:34:22.000,2016-04-27 19:34:21.000,2016-04-27 19:34:24.000)
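// Because allowedLateness(2s) is set, the arrival of 000001,1461756863000 (also in [19:34:21.000, 19:34:24.000)) triggers the window computation again.
// Note that each late firing emits the whole window again, so when updating state with the result, overwrite the earlier result instead of accumulating onto it.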
timestamp:000001,1461756863000|2016-04-27 19:34:23.000,1461756874000|2016-04-27 19:34:34.000,Watermark @ 1461756864000
(000001,1461756862000)
(000001,1461756863000)
6> (000001,2,2016-04-27 19:34:22.000,2016-04-27 19:34:23.000,2016-04-27 19:34:21.000,2016-04-27 19:34:24.000)
// 000001,1461756861000: same as above
timestamp:000001,1461756861000|2016-04-27 19:34:21.000,1461756874000|2016-04-27 19:34:34.000,Watermark @ 1461756864000
(000001,1461756861000)
(000001,1461756862000)
(000001,1461756863000)
6> (000001,3,2016-04-27 19:34:21.000,2016-04-27 19:34:23.000,2016-04-27 19:34:21.000,2016-04-27 19:34:24.000)
// 000001,1461756875000. 1461756875000 is 2016-04-27 19:34:35.000; the corresponding watermark is 2016-04-27 19:34:25.000
timestamp:000001,1461756875000|2016-04-27 19:34:35.000,1461756875000|2016-04-27 19:34:35.000,Watermark @ 1461756864000
// 000001,1461756861000: same as above
timestamp:000001,1461756861000|2016-04-27 19:34:21.000,1461756875000|2016-04-27 19:34:35.000,Watermark @ 1461756865000
(000001,1461756861000)
(000001,1461756861000)
(000001,1461756862000)
(000001,1461756863000)
6> (000001,4,2016-04-27 19:34:21.000,2016-04-27 19:34:23.000,2016-04-27 19:34:21.000,2016-04-27 19:34:24.000)
// Late data is not dropped until 000001,1461756876000 arrives. 1461756876000 is 2016-04-27 19:34:36.000, and the corresponding
// watermark is 2016-04-27 19:34:26.000 = 2016-04-27 19:34:24.000 + 2s. In other words, once 000001,1461756876000 arrives,
// Flink destroys the content of window [19:34:21.000, 19:34:24.000). With no content left to find, later elements for that window can only be discarded.
timestamp:000001,1461756876000|2016-04-27 19:34:36.000,1461756876000|2016-04-27 19:34:36.000,Watermark @ 1461756865000
// As it happens, we set sideOutputLateData, so 000001,1461756861000 is emitted in the outputStream stream.
timestamp:000001,1461756861000|2016-04-27 19:34:21.000,1461756876000|2016-04-27 19:34:36.000,Watermark @ 1461756866000
6> 000001:outside
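
The sequence above can be replayed with a small simulation in plain Java. This is only a sketch under stated assumptions, not the job that produced the log. The class name and its log format are hypothetical. It tracks only the window [19:34:21.000, 19:34:24.000), assumes the watermark is the largest timestamp seen so far minus 10 s, and assumes it advances right after each element. It also assumes that a late element fires the window immediately when the watermark has already passed the window's max timestamp.

```java
import java.util.ArrayList;
import java.util.List;

class AllowedLatenessReplay {
    static final long SIZE = 3_000L;               // tumbling window size
    static final long LATENESS = 2_000L;           // allowedLateness
    static final long MAX_OUT_OF_ORDER = 10_000L;  // watermark lag (assumption)

    /** Replays timestamps against the single window that starts at windowStart. */
    static List<String> replay(long windowStart, long[] timestamps) {
        long maxTs = windowStart + SIZE - 1;   // window.maxTimestamp
        long cleanup = maxTs + LATENESS;       // point at which the state is cleared
        long watermark = Long.MIN_VALUE;
        long maxSeen = Long.MIN_VALUE;
        int count = 0;                         // elements currently held by the window
        List<String> out = new ArrayList<>();

        for (long ts : timestamps) {
            boolean inWindow = ts >= windowStart && ts <= maxTs;
            if (inWindow) {
                if (cleanup <= watermark) {
                    out.add("side:" + ts);         // window expired: side output
                } else {
                    count++;
                    if (maxTs <= watermark) {
                        out.add("fire:" + count);  // late firing with the updated content
                    }
                }
            }
            // Advance the watermark after the element has been processed.
            maxSeen = Math.max(maxSeen, ts);
            long newWatermark = maxSeen - MAX_OUT_OF_ORDER;
            if (watermark < maxTs && newWatermark >= maxTs && count > 0) {
                out.add("fire:" + count);          // first, on-time firing
            }
            watermark = Math.max(watermark, newWatermark);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] input = {
            1461756862000L, 1461756866000L, 1461756872000L, 1461756873000L,
            1461756874000L, 1461756863000L, 1461756861000L, 1461756875000L,
            1461756861000L, 1461756876000L, 1461756861000L
        };
        // prints [fire:1, fire:2, fire:3, fire:4, side:1461756861000]
        System.out.println(replay(1461756861000L, input));
    }
}
```

The four firings carry 1, 2, 3 and 4 elements, and the last 1461756861000 goes to the side output, which is the same pattern as in the log.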