1. Idle waiting (idleness)
1.1 Idle waiting
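In a Flink job with parallelism greater than one, watermarks propagate by two rules: a task that receives watermarks from several upstream subtasks takes the minimum of them, and a task that emits a watermark broadcasts it to all of its downstream subtasks. If one of the upstream subtasks receives no data, its watermark stays put, so the minimum, and with it the current task's watermark, cannot advance. Windows then cannot fire until the slowest upstream watermark passes the trigger time. To handle this case, Flink provides an idleness setting.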
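The propagation rule above can be illustrated with a small stand-alone model. This is only a sketch, not Flink code: the class CombinedWatermarkSketch and its combine method are hypothetical names, and the model assumes that an input marked idle is simply left out of the minimum.

```java
import java.util.OptionalLong;

/** Illustrative model of how a task combines the watermarks of its inputs. Not Flink code. */
final class CombinedWatermarkSketch {

    /**
     * Returns the minimum watermark over all inputs that are not idle,
     * or an empty result if every input is idle.
     */
    static OptionalLong combine(long[] watermarks, boolean[] idle) {
        long min = Long.MAX_VALUE;
        boolean anyActive = false;
        for (int i = 0; i < watermarks.length; i++) {
            if (!idle[i]) {
                anyActive = true;
                min = Math.min(min, watermarks[i]);
            }
        }
        return anyActive ? OptionalLong.of(min) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Input 0 has advanced to 10 s; input 1 has never seen data (Long.MIN_VALUE).
        long[] watermarks = {10_000L, Long.MIN_VALUE};

        // Without idleness, the silent input holds the combined watermark back.
        System.out.println(combine(watermarks, new boolean[] {false, false}));

        // Once input 1 is marked idle, only input 0 counts and the watermark advances.
        System.out.println(combine(watermarks, new boolean[] {false, true}));
    }
}
```

In the first call the combined watermark is stuck at Long.MIN_VALUE, which is why no window can fire; in the second it follows the only active input.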
1.2 withIdleness
When defining the WatermarkStrategy, append .withIdleness(Duration.ofSeconds(5)), so that an input which produces no events for 5 seconds is marked idle:
WatermarkStrategy<WaterSensor> waterSensorWatermarkStrategy = WatermarkStrategy
        // Custom watermark generator: MyWatermark is constructed with a 3-second duration
        // (presumably its maximum out-of-orderness)
        .<WaterSensor>forGenerator(new WatermarkGeneratorSupplier<WaterSensor>() {
            @Override
            public WatermarkGenerator<WaterSensor> createWatermarkGenerator(Context context) {
                return new MyWatermark<>(Duration.ofSeconds(3));
            }
        })
        // Timestamp assigner: extracts the event timestamp from each record
        .withTimestampAssigner(new SerializableTimestampAssigner<WaterSensor>() {
            @Override
            public long extractTimestamp(WaterSensor element, long recordTimestamp) {
                System.out.println("data=" + element + ",recordTs=" + recordTimestamp);
                // The returned timestamp must be in milliseconds; ts is in seconds
                return element.getTs() * 1000L;
            }
        })
        // Mark this input as idle after 5 seconds without events
        .withIdleness(Duration.ofSeconds(5));
1.3 Source code
The core logic lives in WatermarksWithIdleness:
@Public
public class WatermarksWithIdleness<T> implements WatermarkGenerator<T> {

    private final WatermarkGenerator<T> watermarks;

    private final IdlenessTimer idlenessTimer;

    private boolean isIdleNow = false;

    /**
     * Creates a new WatermarksWithIdleness generator to the given generator idleness detection with
     * the given timeout.
     *
     * @param watermarks The original watermark generator.
     * @param idleTimeout The timeout for the idleness detection.
     */
    public WatermarksWithIdleness(WatermarkGenerator<T> watermarks, Duration idleTimeout) {
        this(watermarks, idleTimeout, SystemClock.getInstance());
    }

    @VisibleForTesting
    WatermarksWithIdleness(WatermarkGenerator<T> watermarks, Duration idleTimeout, Clock clock) {
        checkNotNull(idleTimeout, "idleTimeout");
        checkArgument(
                !(idleTimeout.isZero() || idleTimeout.isNegative()),
                "idleTimeout must be greater than zero");
        this.watermarks = checkNotNull(watermarks, "watermarks");
        this.idlenessTimer = new IdlenessTimer(clock, idleTimeout);
    }

    @Override
    public void onEvent(T event, long eventTimestamp, WatermarkOutput output) {
        watermarks.onEvent(event, eventTimestamp, output);
        idlenessTimer.activity();
        isIdleNow = false;
    }

    @Override
    public void onPeriodicEmit(WatermarkOutput output) {
        if (idlenessTimer.checkIfIdle()) {
            if (!isIdleNow) {
                output.markIdle();
                isIdleNow = true;
            }
        } else {
            watermarks.onPeriodicEmit(output);
        }
    }

    // ------------------------------------------------------------------------

    @VisibleForTesting
    static final class IdlenessTimer {

        /** The clock used to measure elapsed time. */
        private final Clock clock;

        /** Counter to detect change. No problem if it overflows. */
        private long counter;

        /** The value of the counter at the last activity check. */
        private long lastCounter;

        /**
         * The first time (relative to {@link Clock#relativeTimeNanos()}) when the activity check
         * found that no activity happened since the last check. Special value: 0 = no timer.
         */
        private long startOfInactivityNanos;

        /** The duration before the output is marked as idle. */
        private final long maxIdleTimeNanos;

        IdlenessTimer(Clock clock, Duration idleTimeout) {
            this.clock = clock;

            long idleNanos;
            try {
                idleNanos = idleTimeout.toNanos();
            } catch (ArithmeticException ignored) {
                // long integer overflow
                idleNanos = Long.MAX_VALUE;
            }

            this.maxIdleTimeNanos = idleNanos;
        }

        public void activity() {
            counter++;
        }

        public boolean checkIfIdle() {
            if (counter != lastCounter) {
                // activity since the last check. we reset the timer
                lastCounter = counter;
                startOfInactivityNanos = 0L;
                return false;
            } else // timer started but has not yet reached idle timeout
            if (startOfInactivityNanos == 0L) {
                // first time that we see no activity since the last periodic probe
                // begin the timer
                startOfInactivityNanos = clock.relativeTimeNanos();
                return false;
            } else {
                return clock.relativeTimeNanos() - startOfInactivityNanos > maxIdleTimeNanos;
            }
        }
    }
}
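The checkIfIdle() method decides whether the output should be marked idle. It returns true only when no event has arrived since the previous periodic check and more than idleTimeout has elapsed since inactivity was first detected. onPeriodicEmit() then calls output.markIdle(), after which downstream tasks no longer wait for this input when taking the minimum, so their watermark can advance. The next event resets the timer in onEvent() and the output becomes active again.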
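The timing behaviour of checkIfIdle() in the listing above can be explored without a Flink cluster. The following is a minimal stand-alone sketch that copies the timer logic into a hypothetical class, IdlenessTimerSketch, and replaces Flink's Clock with a LongSupplier so that time can be advanced by hand. Both the class name and the manual clock are assumptions made for illustration; they are not part of Flink.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.function.LongSupplier;

/** Stand-alone re-creation of the IdlenessTimer logic for experimentation; not the Flink class. */
final class IdlenessTimerSketch {
    private final LongSupplier nanoClock; // assumption: stands in for Flink's Clock
    private final long maxIdleTimeNanos;
    private long counter;
    private long lastCounter;
    private long startOfInactivityNanos; // 0 means "no timer running"

    IdlenessTimerSketch(LongSupplier nanoClock, Duration idleTimeout) {
        this.nanoClock = nanoClock;
        this.maxIdleTimeNanos = idleTimeout.toNanos();
    }

    /** Called for every event. */
    void activity() {
        counter++;
    }

    /** Called on every periodic emit; same three branches as the original method. */
    boolean checkIfIdle() {
        if (counter != lastCounter) {
            // Events arrived since the last check: reset the timer.
            lastCounter = counter;
            startOfInactivityNanos = 0L;
            return false;
        }
        if (startOfInactivityNanos == 0L) {
            // First quiet check: start the timer, but do not report idle yet.
            startOfInactivityNanos = nanoClock.getAsLong();
            return false;
        }
        return nanoClock.getAsLong() - startOfInactivityNanos > maxIdleTimeNanos;
    }

    /** Runs a scripted timeline against a manual clock and returns the result of each check. */
    static boolean[] demo() {
        long[] now = {1L}; // manual clock in nanoseconds; starts at 1 because 0 means "no timer"
        long second = 1_000_000_000L;
        IdlenessTimerSketch timer = new IdlenessTimerSketch(() -> now[0], Duration.ofSeconds(5));
        boolean[] results = new boolean[5];

        timer.activity();                 // an event arrives
        results[0] = timer.checkIfIdle(); // activity seen since the last check
        now[0] += second;
        results[1] = timer.checkIfIdle(); // first quiet check: timer starts here
        now[0] += 5 * second;
        results[2] = timer.checkIfIdle(); // exactly 5 s elapsed: not strictly greater
        now[0] += second;
        results[3] = timer.checkIfIdle(); // 6 s elapsed: idle
        timer.activity();                 // a new event arrives
        results[4] = timer.checkIfIdle(); // activity resets the timer
        return results;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [false, false, false, true, false]
    }
}
```

One consequence visible in this timeline: the timer starts at the first quiet periodic check rather than at the last event, so an input is marked idle somewhat later than idleTimeout after its last record, by up to roughly one periodic emit interval.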