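1. What is a Flink watermark
1.1 Official definition
A Flink watermark is, in essence, a special element in a DataStream; every watermark carries a timestamp.
A watermark with timestamp T declares that all data with event time t <= T has arrived, i.e. only data with event time t > T should flow in after the watermark.
In other words, the watermark is the criterion Flink uses to decide whether data is late, and it is also the signal that triggers windows.
1.2 At the code level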
public final class Watermark extends StreamElement {
/** The watermark that signifies end-of-event-time. */
public static final Watermark MAX_WATERMARK = new Watermark(Long.MAX_VALUE);
// ------------------------------------------------------------------------
/** The timestamp of the watermark in milliseconds. */
private final long timestamp;
/**
* Creates a new watermark with the given timestamp in milliseconds.
*/
public Watermark(long timestamp) {
this.timestamp = timestamp;
}
/**
* Returns the timestamp associated with this {@link Watermark} in milliseconds.
*/
public long getTimestamp() {
return timestamp;
}
// ------------------------------------------------------------------------
@Override
public boolean equals(Object o) {
return this == o ||
o != null && o.getClass() == Watermark.class && ((Watermark) o).timestamp == this.timestamp;
}
@Override
public int hashCode() {
return (int) (timestamp ^ (timestamp >>> 32));
}
@Override
public String toString() {
return "Watermark @ " + timestamp;
}
}
As the Watermark class shows, it extends StreamElement and has a single timestamp field: it is simply a stream element that carries a timestamp.
A StreamElement is an element in the data stream; it can be a record or a watermark. With this in mind, the official definition above should be easier to follow.
Some background first:
1.3 Time
1.3.1 Processing time:
Processing time refers to the system time of the machine that is executing the respective operation.
1.3.2 Event time:
Event time is the time that each individual event occurred on its producing device.
1.3.3 Ingestion time:
Ingestion time is the time that events enter Flink.
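1.4 Illustrated examples
First, a simple example of how watermarks work together with event time.
1.4.1 Single parallelism
In the figure, each box is a data element and the number inside is its event time; W(x) denotes a watermark with timestamp x. A tumbling window of 4 time units is used. Taking the time unit as seconds, the elements with event times 2 s, 3 s and 1 s fall into the window [1s, 4s], and the element with event time 7 s falls into the window [5s, 8s]. When watermark W(4) arrives, it signals that no more elements with t <= 4s are expected, so the [1s, 4s] window fires and is computed. Likewise, when W(9) arrives, the [5s, 8s] window fires and is computed, and so on.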
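The walk-through of the figure can be replayed with a few lines of plain Java. This is only a sketch under stated assumptions: the class FigureWalkthrough and its methods are hypothetical helpers, not Flink APIs, and the windows follow the figure's convention ([1s, 4s], [5s, 8s], ...) rather than Flink's own window alignment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical helper that replays the figure; it is not part of Flink.
class FigureWalkthrough {
    static final long WINDOW_SIZE = 4; // seconds, as in the figure

    // End of the window an event belongs to, using the figure's convention
    // [1s, 4s], [5s, 8s], ... (event times are assumed to be >= 1).
    static long windowEnd(long eventTime) {
        return ((eventTime - 1) / WINDOW_SIZE + 1) * WINDOW_SIZE;
    }

    // Open windows keyed by their end time, each holding its buffered event times.
    private final TreeMap<Long, List<Long>> openWindows = new TreeMap<>();

    void onElement(long eventTime) {
        openWindows.computeIfAbsent(windowEnd(eventTime), k -> new ArrayList<>()).add(eventTime);
    }

    // A watermark W(t) fires every open window whose end is <= t.
    List<String> onWatermark(long watermark) {
        List<String> fired = new ArrayList<>();
        while (!openWindows.isEmpty() && openWindows.firstKey() <= watermark) {
            long end = openWindows.firstKey();
            List<Long> contents = openWindows.pollFirstEntry().getValue();
            fired.add("[" + (end - WINDOW_SIZE + 1) + "s, " + end + "s] -> " + contents);
        }
        return fired;
    }

    public static void main(String[] args) {
        FigureWalkthrough w = new FigureWalkthrough();
        for (long t : new long[] {2, 3, 1, 7}) {
            w.onElement(t);
        }
        System.out.println("W(4) fires " + w.onWatermark(4));
        System.out.println("W(9) fires " + w.onWatermark(9));
    }
}
```

With the elements 2, 3, 1 and 7 from the figure, W(4) fires only the first window and W(9) fires the second one.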
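1.4.2 Multiple parallelism
Figure: parallel data streams and operators with events and watermarks
Watermarks are generated at, or directly after, source functions. Each parallel subtask of a source function usually generates its watermarks independently. These watermarks define the event time at that particular parallel source.
As watermarks flow through the streaming program, they advance the event time at the operators where they arrive. Whenever an operator advances its event time, it generates a new watermark downstream for its successor operators.
Some operators consume multiple input streams: a union, for example, or operators following a keyBy(…) or partition(…) function. Such an operator's current event time is the minimum of its input streams' event times. As its input streams update their event times, so does the operator.
How should this be read? See the figure below.
When a shuffle such as keyBy(…) or partition(…) follows, the watermark produced by each parallel subtask is broadcast to all downstream subtasks; the code later in this document shows this.
Note: timestamps and watermarks are both specified in milliseconds since the Java epoch (1970-01-01T00:00:00Z).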
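2. How Flink watermarks are generated
2.1 With a timestamp assigner / watermark generator
DataStream.assignTimestampsAndWatermarks(…)
Class dependency diagram for the watermark classes:
2.1.1 How watermark emission is driven
2.1.1.1 AssignerWithPeriodicWatermarks (periodic watermarks)
Watermarks are emitted periodically; the default interval is 200 ms.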
The interval can also be changed with env.getConfig.setAutoWatermarkInterval(3000); the argument is in milliseconds.
2.1.1.1.1 How are periodic watermarks used?
The class diagram above shows that Flink already ships three commonly used periodic watermark assigners.
2.1.1.1.1.1 AscendingTimestampExtractor
/**
* A timestamp assigner and watermark generator for streams where timestamps are monotonously
* ascending. In this case, the local watermarks for the streams are easy to generate, because
* they strictly follow the timestamps.
*
* @param <T> The type of the elements that this function can extract timestamps from
*/
@PublicEvolving
public abstract class AscendingTimestampExtractor<T> implements AssignerWithPeriodicWatermarks<T> {
private static final long serialVersionUID = 1L;
/** The current timestamp. */
private long currentTimestamp = Long.MIN_VALUE;
/** Handler that is called when timestamp monotony is violated. */
private MonotonyViolationHandler violationHandler = new LoggingHandler();
/**
* Extracts the timestamp from the given element. The timestamp must be monotonically increasing.
*
* @param element The element that the timestamp is extracted from.
* @return The new timestamp.
*/
public abstract long extractAscendingTimestamp(T element);
/**
* Sets the handler for violations to the ascending timestamp order.
*
* @param handler The violation handler to use.
* @return This extractor.
*/
public AscendingTimestampExtractor<T> withViolationHandler(MonotonyViolationHandler handler) {
this.violationHandler = requireNonNull(handler);
return this;
}
// ------------------------------------------------------------------------
@Override
public final long extractTimestamp(T element, long elementPrevTimestamp) {
final long newTimestamp = extractAscendingTimestamp(element);
if (newTimestamp >= this.currentTimestamp) {
this.currentTimestamp = newTimestamp;
return newTimestamp;
} else {
violationHandler.handleViolation(newTimestamp, this.currentTimestamp);
return newTimestamp;
}
}
@Override
public final Watermark getCurrentWatermark() {
return new Watermark(currentTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : currentTimestamp - 1);
}
// ------------------------------------------------------------------------
// Handling violations of monotonous timestamps
// ------------------------------------------------------------------------
/**
* Interface for handlers that handle violations of the monotonous ascending timestamps
* property.
*/
public interface MonotonyViolationHandler extends java.io.Serializable {
/**
* Called when the property of monotonously ascending timestamps is violated, i.e.,
* when {@code elementTimestamp < lastTimestamp}.
*
* @param elementTimestamp The timestamp of the current element.
* @param lastTimestamp The last timestamp.
*/
void handleViolation(long elementTimestamp, long lastTimestamp);
}
/**
* Handler that does nothing when timestamp monotony is violated.
*/
public static final class IgnoringHandler implements MonotonyViolationHandler {
private static final long serialVersionUID = 1L;
@Override
public void handleViolation(long elementTimestamp, long lastTimestamp) {}
}
/**
* Handler that fails the program when timestamp monotony is violated.
*/
public static final class FailingHandler implements MonotonyViolationHandler {
private static final long serialVersionUID = 1L;
@Override
public void handleViolation(long elementTimestamp, long lastTimestamp) {
throw new RuntimeException("Ascending timestamps condition violated. Element timestamp "
+ elementTimestamp + " is smaller than last timestamp " + lastTimestamp);
}
}
/**
* Handler that only logs violations of timestamp monotony, on WARN log level.
*/
public static final class LoggingHandler implements MonotonyViolationHandler {
private static final long serialVersionUID = 1L;
private static final Logger LOG = LoggerFactory.getLogger(AscendingTimestampExtractor.class);
@Override
public void handleViolation(long elementTimestamp, long lastTimestamp) {
LOG.warn("Timestamp monotony violated: {} < {}", elementTimestamp, lastTimestamp);
}
}
}
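AscendingTimestampExtractor is intended for streams whose event times are ascending, so that the assigned timestamps and the resulting watermarks are ascending as well. In practice, however, this ordering is not guaranteed.
Monotonically ascending event times rarely match real workloads, so AscendingTimestampExtractor is not used much.
Users override extractAscendingTimestamp() to extract the timestamp. If a timestamp smaller than the current one is produced, a component called MonotonyViolationHandler handles the violation. There are three implementations:
a. Log a warning (the default, LoggingHandler)
b. Throw a RuntimeException and fail the program (FailingHandler)
c. Do nothing, an empty implementation (IgnoringHandler)
How each handler affects the timestamps and watermarks:
With a and c, the offending element keeps its smaller timestamp, so element timestamps are no longer ordered. As the code above shows, currentTimestamp is not rolled back in that case, so the watermark itself never decreases; the offending element may simply end up behind the watermark and be treated as late.
With b, the input event times effectively must be ascending, otherwise the job fails; the watermark timestamps are therefore ascending as well.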
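To make the handler behaviour concrete, here is a minimal plain-Java sketch that mirrors the extractTimestamp / getCurrentWatermark logic quoted above. AscendingSketch is a hypothetical class written for illustration, and the violations counter stands in for a MonotonyViolationHandler; neither is part of Flink.

```java
// Simplified re-implementation for illustration; the class name is hypothetical.
class AscendingSketch {
    private long currentTimestamp = Long.MIN_VALUE;
    int violations = 0; // stand-in for MonotonyViolationHandler.handleViolation

    // Mirrors extractTimestamp: the element always keeps its own timestamp,
    // but currentTimestamp only moves forward.
    long extractTimestamp(long eventTime) {
        if (eventTime >= currentTimestamp) {
            currentTimestamp = eventTime;
        } else {
            violations++;
        }
        return eventTime;
    }

    // Mirrors getCurrentWatermark: one millisecond behind the largest timestamp.
    long currentWatermark() {
        return currentTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : currentTimestamp - 1;
    }

    public static void main(String[] args) {
        AscendingSketch s = new AscendingSketch();
        for (long t : new long[] {1000, 2000, 1500, 3000}) {
            s.extractTimestamp(t);
            System.out.println("event=" + t + " watermark=" + s.currentWatermark());
        }
        System.out.println("violations=" + s.violations);
    }
}
```

The out-of-order element 1500 is counted as a violation, yet the watermark stays at 1999 instead of moving backwards.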
2.1.1.1.1.2 BoundedOutOfOrdernessTimestampExtractor
/**
* This is a {@link AssignerWithPeriodicWatermarks} used to emit Watermarks that lag behind the element with
* the maximum timestamp (in event time) seen so far by a fixed amount of time, <code>t_late</code>. This can
* help reduce the number of elements that are ignored due to lateness when computing the final result for a
* given window, in the case where we know that elements arrive no later than <code>t_late</code> units of time
* after the watermark that signals that the system event-time has advanced past their (event-time) timestamp.
* */
public abstract class BoundedOutOfOrdernessTimestampExtractor<T> implements AssignerWithPeriodicWatermarks<T> {
private static final long serialVersionUID = 1L;
/** The current maximum timestamp seen so far. */
private long currentMaxTimestamp;
/** The timestamp of the last emitted watermark. */
private long lastEmittedWatermark = Long.MIN_VALUE;
/**
* The (fixed) interval between the maximum seen timestamp seen in the records
* and that of the watermark to be emitted.
*/
private final long maxOutOfOrderness;
public BoundedOutOfOrdernessTimestampExtractor(Time maxOutOfOrderness) {
if (maxOutOfOrderness.toMilliseconds() < 0) {
throw new RuntimeException("Tried to set the maximum allowed " +
"lateness to " + maxOutOfOrderness + ". This parameter cannot be negative.");
}
this.maxOutOfOrderness = maxOutOfOrderness.toMilliseconds();
this.currentMaxTimestamp = Long.MIN_VALUE + this.maxOutOfOrderness;
}
public long getMaxOutOfOrdernessInMillis() {
return maxOutOfOrderness;
}
/**
* Extracts the timestamp from the given element.
*
* @param element The element that the timestamp is extracted from.
* @return The new timestamp.
*/
public abstract long extractTimestamp(T element);
@Override
public final Watermark getCurrentWatermark() {
// this guarantees that the watermark never goes backwards.
long potentialWM = currentMaxTimestamp - maxOutOfOrderness;
if (potentialWM >= lastEmittedWatermark) {
lastEmittedWatermark = potentialWM;
}
return new Watermark(lastEmittedWatermark);
}
@Override
public final long extractTimestamp(T element, long previousElementTimestamp) {
long timestamp = extractTimestamp(element);
if (timestamp > currentMaxTimestamp) {
currentMaxTimestamp = timestamp;
}
return timestamp;
}
}
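This is the most common approach. It sets a maximum bounded out-of-orderness: data that arrives out of order within this bound can still enter its window for computation.
The constructor parameter maxOutOfOrderness is the length of the out-of-orderness bound. The watermark actually emitted is the largest timestamp seen so far (as extracted by the overridden extractTimestamp() method) minus this bound, which makes the watermark lag a little behind the data. This is Flink's first safeguard for late data.
The bound must be chosen carefully for the actual environment: too short and more data is dropped as late; too long and windows fire later, which reduces timeliness.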
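The effect of the bound can be seen in a small plain-Java sketch that mirrors the two methods quoted above. BoundedOutOfOrdernessSketch is a hypothetical class for illustration only; the event times and the 3000 ms bound are made-up values.

```java
// Simplified re-implementation for illustration; the class name is hypothetical.
class BoundedOutOfOrdernessSketch {
    private final long maxOutOfOrderness;
    private long currentMaxTimestamp;
    private long lastEmittedWatermark = Long.MIN_VALUE;

    BoundedOutOfOrdernessSketch(long maxOutOfOrdernessMillis) {
        this.maxOutOfOrderness = maxOutOfOrdernessMillis;
        // Same initial value as the original class, so the subtraction cannot underflow.
        this.currentMaxTimestamp = Long.MIN_VALUE + maxOutOfOrdernessMillis;
    }

    // Track the largest event time seen so far.
    void onEvent(long timestamp) {
        currentMaxTimestamp = Math.max(currentMaxTimestamp, timestamp);
    }

    // Watermark = largest timestamp minus the bound, never going backwards.
    long currentWatermark() {
        long potential = currentMaxTimestamp - maxOutOfOrderness;
        if (potential >= lastEmittedWatermark) {
            lastEmittedWatermark = potential;
        }
        return lastEmittedWatermark;
    }

    public static void main(String[] args) {
        BoundedOutOfOrdernessSketch g = new BoundedOutOfOrdernessSketch(3000);
        for (long t : new long[] {10000, 8000, 12000, 6000}) {
            long before = g.currentWatermark();
            g.onEvent(t);
            System.out.println("event=" + t
                + (t <= before ? " (behind the watermark)" : "")
                + " watermark=" + g.currentWatermark());
        }
    }
}
```

The event at 8000 is out of order but still ahead of the watermark (7000), while the event at 6000 arrives after the watermark has reached 9000 and would be treated as late.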
2.1.1.1.1.3 IngestionTimeExtractor
/**
* A timestamp assigner that assigns timestamps based on the machine's wall clock.
*
* <p>If this assigner is used after a stream source, it realizes "ingestion time" semantics.
*
* @param <T> The elements that get timestamps assigned.
*/
public class IngestionTimeExtractor<T> implements AssignerWithPeriodicWatermarks<T> {
private static final long serialVersionUID = -4072216356049069301L;
private long maxTimestamp;
@Override
public long extractTimestamp(T element, long previousElementTimestamp) {
// make sure timestamps are monotonously increasing, even when the system clock re-syncs
final long now = Math.max(System.currentTimeMillis(), maxTimestamp);
maxTimestamp = now;
return now;
}
@Override
public Watermark getCurrentWatermark() {
// make sure timestamps are monotonously increasing, even when the system clock re-syncs
final long now = Math.max(System.currentTimeMillis(), maxTimestamp);
maxTimestamp = now;
return new Watermark(now - 1);
}
}
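The code shows that timestamps are assigned from the system clock. The timestamp of each record is the system time at which it enters Flink (never smaller than the previously assigned one), and the watermark is the current system time minus 1 ms at the moment getCurrentWatermark() is called.
This is how ingestion time, one of the three notions of time, is implemented.
2.1.1.1.2 How does Flink emit watermarks periodically under the hood?
Let's go straight to the source.
First, when is extractTimestamp called?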
public void processElement(StreamRecord<T> element) throws Exception {
final long newTimestamp = userFunction.extractTimestamp(element.getValue(),
element.hasTimestamp() ? element.getTimestamp() : Long.MIN_VALUE);
output.collect(element.replace(element.getValue(), newTimestamp));
}
extractTimestamp is called once for every incoming record; the record is stamped with its event time and forwarded downstream.
Next, when is getCurrentWatermark called?
public void open() throws Exception {
    super.open();
    // initialize the watermark
    currentWatermark = Long.MIN_VALUE;
    // read the periodic interval from the execution config
    watermarkInterval = getExecutionConfig().getAutoWatermarkInterval();
    if (watermarkInterval > 0) {
        // get the current processing time
        long now = getProcessingTimeService().getCurrentProcessingTime();
        // register a timer that will call onProcessingTime()
        getProcessingTimeService().registerTimer(now + watermarkInterval, this);
    }
}
public void onProcessingTime(long timestamp) throws Exception {
    // fetch the latest watermark from the user function
    Watermark newWatermark = userFunction.getCurrentWatermark();
    // emit the new watermark only if it is non-null and larger than the previous one
    if (newWatermark != null && newWatermark.getTimestamp() > currentWatermark) {
        currentWatermark = newWatermark.getTimestamp();
        // emit watermark
        output.emitWatermark(newWatermark);
    }
    // get the current processing time (system time)
    long now = getProcessingTimeService().getCurrentProcessingTime();
    // register the timer for the next watermark check
    getProcessingTimeService().registerTimer(now + watermarkInterval, this);
}
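open() is called when the operator is initialized, which is why initialization work is placed there.
The source shows that TimestampsAndPeriodicWatermarksOperator.onProcessingTime is invoked periodically by a processing-time timer and calls the user-defined getCurrentWatermark() to obtain the latest watermark.
Next, look at output.emitWatermark(newWatermark).
When the downstream operator sits behind a shuffle such as keyBy: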
public void emitWatermark(Watermark mark) {
    // update the metrics gauge with the latest watermark; this is the value shown in the job's web UI
    watermarkGauge.setCurrentWatermark(mark.getTimestamp());
    // hand the watermark to the serialization delegate
    serializationDelegate.setInstance(mark);
    // check whether the stream is currently active
    if (streamStatusProvider.getStreamStatus().isActive()) {
        try {
            // broadcast the serialized watermark
            recordWriter.broadcastEmit(serializationDelegate);
        } catch (Exception e) {
            throw new RuntimeException(e.getMessage(), e);
        }
    }
}
recordWriter.broadcastEmit(serializationDelegate)
/**
* This is used to broadcast Streaming Watermarks in-band with records. This ignores
* the {@link ChannelSelector}.
*/
public void broadcastEmit(T record) throws IOException, InterruptedException {
    checkErroneous();
    serializer.serializeRecord(record);
    boolean pruneAfterCopying = false;
    // copy the serialized record to every downstream channel
    for (int channel : broadcastChannels) {
        // copy, and remember whether the buffer may be pruned afterwards
        if (copyFromSerializerToTargetChannel(channel)) {
            pruneAfterCopying = true;
        }
    }
    // Make sure we don't hold onto the large intermediate serialization buffer for too long
    if (pruneAfterCopying) {
        // prune once serialization has completed, to shrink the intermediate serialization buffer
        serializer.prune();
    }
}
When the downstream operator can be chained (for example map):
output.emitWatermark(newWatermark)
public void emitWatermark(Watermark mark) {
    try {
        // update the metrics gauge with the latest watermark; this is the value shown in the job's web UI
        watermarkGauge.setCurrentWatermark(mark.getTimestamp());
        if (streamStatusProvider.getStreamStatus().isActive()) {
            // pass the watermark directly to the chained downstream operator
            operator.processWatermark(mark);
        }
    }
    catch (Exception e) {
        throw new ExceptionInChainedOperatorException(e);
    }
}
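2.1.1.2 AssignerWithPunctuatedWatermarks (punctuated watermarks, triggered by properties of the events themselves)
2.1.1.2.1 Brief introduction
AssignerWithPunctuatedWatermarks suits cases where the decision to emit a watermark depends on some property of the event itself. Flink first calls extractTimestamp(…) to assign a timestamp to the element, and then immediately calls checkAndGetNextWatermark(…) on that element.
In effect, watermark emission is controlled entirely by the user through checkAndGetNextWatermark.
2.1.1.2.2 Interface source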
public interface AssignerWithPunctuatedWatermarks<T> extends TimestampAssigner<T> {
/**
* Asks this implementation if it wants to emit a watermark. This method is called right after
* the {@link #extractTimestamp(Object, long)} method.
*
* <p>The returned watermark will be emitted only if it is non-null and its timestamp
* is larger than that of the previously emitted watermark (to preserve the contract of
* ascending watermarks). If a null value is returned, or the timestamp of the returned
* watermark is smaller than that of the last emitted one, then no new watermark will
* be generated.
*
* <p>For an example how to use this method, see the documentation of
* {@link AssignerWithPunctuatedWatermarks this class}.
*
* @return {@code Null}, if no watermark should be emitted, or the next watermark to emit.
*/
@Nullable
Watermark checkAndGetNextWatermark(T lastElement, long extractedTimestamp);
}
The interface declares a single method, checkAndGetNextWatermark, which lets the user define, based on the incoming element, when a watermark is emitted.
2.1.1.2.3 Official example
class PunctuatedAssigner extends AssignerWithPunctuatedWatermarks[MyEvent] {
override def extractTimestamp(element: MyEvent, previousElementTimestamp: Long): Long = {
element.getCreationTime
}
override def checkAndGetNextWatermark(lastElement: MyEvent, extractedTimestamp: Long): Watermark = {
if (lastElement.hasWatermarkMarker()) new Watermark(extractedTimestamp) else null
}
}
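checkAndGetNextWatermark(…) receives the timestamp assigned in extractTimestamp(…) and decides whether to generate a watermark. Whenever it returns a non-null watermark that is larger than the latest previous watermark, the new watermark is emitted.
2.1.1.2.4 Note
This approach can generate a watermark on every single event. However, each watermark causes some computation downstream, so an excessive number of watermarks degrades performance. A sensible watermark rule is therefore essential.
2.1.1.2.5 How does Flink emit punctuated watermarks under the hood?
Let's look at the source.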
public void processElement(StreamRecord<T> element) throws Exception {
    // get the value of the record
    final T value = element.getValue();
    // call the user function's extractTimestamp to obtain the event time
    final long newTimestamp = userFunction.extractTimestamp(value,
        element.hasTimestamp() ? element.getTimestamp() : Long.MIN_VALUE);
    // forward the StreamRecord downstream with its timestamp attached
    output.collect(element.replace(element.getValue(), newTimestamp));
    // ask the user function whether a watermark should be emitted
    final Watermark nextWatermark = userFunction.checkAndGetNextWatermark(value, newTimestamp);
    // emit the watermark only if it is non-null and larger than the previous one
    if (nextWatermark != null && nextWatermark.getTimestamp() > currentWatermark) {
        currentWatermark = nextWatermark.getTimestamp();
        output.emitWatermark(nextWatermark);
    }
}
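The code makes the flow clear: for every record, extractTimestamp is called to obtain the event time, the record is forwarded downstream with that event time attached, and then checkAndGetNextWatermark is called to decide, by the user-defined rule, whether a watermark should be produced.
If the new watermark is non-null and larger than the previous one, it is emitted downstream.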
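The per-record flow can be sketched in plain Java without Flink. PunctuatedSketch and its Event type are hypothetical names for illustration; the marker field plays the role of hasWatermarkMarker() in the official example, and the method returns the emitted watermark or null when nothing is emitted.

```java
// Simplified re-implementation for illustration; all names here are hypothetical.
class PunctuatedSketch {
    // Hypothetical event type: creationTime is the event time,
    // marker says whether this event should trigger a watermark.
    static final class Event {
        final long creationTime;
        final boolean marker;

        Event(long creationTime, boolean marker) {
            this.creationTime = creationTime;
            this.marker = marker;
        }
    }

    private long currentWatermark = Long.MIN_VALUE;

    // User rule: propose a watermark only for marker events.
    static Long checkAndGetNextWatermark(Event e, long extractedTimestamp) {
        return e.marker ? Long.valueOf(extractedTimestamp) : null;
    }

    // Operator logic: emit only non-null, strictly increasing watermarks.
    // Returns the emitted watermark, or null if none was emitted.
    Long processElement(Event e) {
        long timestamp = e.creationTime; // stands in for extractTimestamp
        Long candidate = checkAndGetNextWatermark(e, timestamp);
        if (candidate != null && candidate > currentWatermark) {
            currentWatermark = candidate;
            return candidate;
        }
        return null;
    }

    public static void main(String[] args) {
        PunctuatedSketch p = new PunctuatedSketch();
        Event[] events = {
            new Event(1000, false), new Event(2000, true),
            new Event(1500, true), new Event(3000, true)
        };
        for (Event e : events) {
            System.out.println("event=" + e.creationTime + " emitted=" + p.processElement(e));
        }
    }
}
```

The marker event at 1500 proposes a watermark, but it is suppressed because 2000 has already been emitted.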
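2.2 Directly in the data stream source
The Kafka source extracts event time and generates watermarks per Kafka partition
When Kafka is used as the stream source, watermarks can be generated per Kafka partition inside the Kafka consumer, and the per-partition watermarks are merged in the same way as watermarks are merged on stream shuffles.
If event timestamps are strictly ascending per Kafka partition, generating per-partition watermarks with the ascending-timestamps watermark generator results in perfect overall watermarks.
The illustration below shows how to use per-Kafka-partition watermark generation, and how watermarks propagate through the streaming dataflow in that case.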
Generating Watermarks with awareness for Kafka-partitions
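Note:
If a watermark assigner depends on records read from Kafka to advance its watermarks (which is commonly the case), all topics and partitions need to have a continuous stream of records.
Otherwise, the watermarks of the whole application cannot advance and all time-based operations, such as time windows or functions with timers, cannot make progress. A single idle Kafka partition causes this behavior.
While a partition is idle, the per-partition watermark mechanism in FlinkKafkaConsumer cannot advance its watermark either. The watermark of an idle partition stays at Long.MIN_VALUE, so the overall minimum watermark across all partitions of that consumer subtask never advances.
A Flink improvement is planned to prevent this from happening (see FLINK-5479: Per-partition watermarks in FlinkKafkaConsumer should consider idle partitions).
In the meantime, a possible workaround is to send heartbeat messages to all consumed partitions, which advance the watermarks of idle partitions.
Usage example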
Properties properties = new Properties();
properties.setProperty("bootstrap.servers", "localhost:9092");
// only required for Kafka 0.8
properties.setProperty("zookeeper.connect", "localhost:2181");
properties.setProperty("group.id", "test");
FlinkKafkaConsumer08<String> myConsumer =
new FlinkKafkaConsumer08<>("topic", new SimpleStringSchema(), properties);
myConsumer.assignTimestampsAndWatermarks(new CustomWatermarkEmitter());
// print() returns a sink rather than a DataStream, so it is called on the stream separately
DataStream<String> stream = env.addSource(myConsumer);
stream.print();
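3. How watermarks are merged under multiple parallelism
Look at processInput(), where a task reads elements from its input channels: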
public boolean processInput() throws Exception {
.........................
while (true) {
if (currentRecordDeserializer != null) {
DeserializationResult result = currentRecordDeserializer.getNextRecord(deserializationDelegate);
if (result.isBufferConsumed()) {
currentRecordDeserializer.getCurrentBuffer().recycleBuffer();
currentRecordDeserializer = null;
}
if (result.isFullRecord()) {
StreamElement recordOrMark = deserializationDelegate.getInstance();
// check the element type; watermarks are handed to the valve
if (recordOrMark.isWatermark()) {
// handle watermark
statusWatermarkValve.inputWatermark(recordOrMark.asWatermark(), currentChannel);
continue;
} else if (recordOrMark.isStreamStatus()) {
// handle stream status
statusWatermarkValve.inputStreamStatus(recordOrMark.asStreamStatus(), currentChannel);
continue;
} else if (recordOrMark.isLatencyMarker()) {
// handle latency marker
synchronized (lock) {
streamOperator.processLatencyMarker(recordOrMark.asLatencyMarker());
}
continue;
} else {
// now we can do the actual processing
StreamRecord<IN> record = recordOrMark.asRecord();
synchronized (lock) {
numRecordsIn.inc();
streamOperator.setKeyContextElement1(record);
streamOperator.processElement(record);
}
return true;
}
}
}
..................................................
}
}
inputWatermark
/**
* Feed a {@link Watermark} into the valve. If the input triggers the valve to output a new Watermark,
* {@link ValveOutputHandler#handleWatermark(Watermark)} will be called to process the new Watermark.
*
* @param watermark the watermark to feed to the valve
* @param channelIndex the index of the channel that the fed watermark belongs to (index starting from 0)
*/
public void inputWatermark(Watermark watermark, int channelIndex) {
    // ignore the input watermark if its input channel, or all input channels are idle (i.e. overall the valve is idle).
    if (lastOutputStreamStatus.isActive() && channelStatuses[channelIndex].streamStatus.isActive()) {
        long watermarkMillis = watermark.getTimestamp();
        // if the input watermark's value is less than the last received watermark for its input channel, ignore it also.
        if (watermarkMillis > channelStatuses[channelIndex].watermark) {
            channelStatuses[channelIndex].watermark = watermarkMillis;
            // previously unaligned input channels are now aligned if its watermark has caught up
            if (!channelStatuses[channelIndex].isWatermarkAligned && watermarkMillis >= lastOutputWatermark) {
                channelStatuses[channelIndex].isWatermarkAligned = true;
            }
            // now, attempt to find a new min watermark across all aligned channels
            findAndOutputNewMinWatermarkAcrossAlignedChannels();
        }
    }
}
findAndOutputNewMinWatermarkAcrossAlignedChannels
private void findAndOutputNewMinWatermarkAcrossAlignedChannels() {
    long newMinWatermark = Long.MAX_VALUE;
    boolean hasAlignedChannels = false;
    // determine new overall watermark by considering only watermark-aligned channels across all channels
    for (InputChannelStatus channelStatus : channelStatuses) {
        if (channelStatus.isWatermarkAligned) {
            hasAlignedChannels = true;
            // keep the smallest watermark seen across the aligned channels
            newMinWatermark = Math.min(channelStatus.watermark, newMinWatermark);
        }
    }
    // we acknowledge and output the new overall watermark if it really is aggregated
    // from some remaining aligned channel, and is also larger than the last output watermark
    if (hasAlignedChannels && newMinWatermark > lastOutputWatermark) {
        lastOutputWatermark = newMinWatermark;
        outputHandler.handleWatermark(new Watermark(lastOutputWatermark));
    }
}
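The merging rule can be tried out with a stripped-down, plain-Java model of the valve. This is a sketch under stated assumptions: MinWatermarkValveSketch is a hypothetical class, every channel is assumed to be active and aligned (stream status and alignment handling are left out), and the method returns the newly forwarded watermark or null.

```java
import java.util.Arrays;

// Simplified model of the min-watermark rule; the class name is hypothetical
// and idle / unaligned channels are deliberately not modelled.
class MinWatermarkValveSketch {
    private final long[] channelWatermarks;
    private long lastOutputWatermark = Long.MIN_VALUE;

    // numChannels is assumed to be at least 1.
    MinWatermarkValveSketch(int numChannels) {
        channelWatermarks = new long[numChannels];
        Arrays.fill(channelWatermarks, Long.MIN_VALUE);
    }

    // Returns the newly forwarded watermark, or null if the output does not advance.
    Long inputWatermark(long watermark, int channel) {
        if (watermark <= channelWatermarks[channel]) {
            return null; // not an advance for this channel, ignore it
        }
        channelWatermarks[channel] = watermark;
        long min = Arrays.stream(channelWatermarks).min().getAsLong();
        if (min > lastOutputWatermark) {
            lastOutputWatermark = min;
            return min;
        }
        return null;
    }

    public static void main(String[] args) {
        MinWatermarkValveSketch valve = new MinWatermarkValveSketch(2);
        long[][] inputs = {{5, 0}, {3, 1}, {4, 1}, {9, 1}, {7, 0}};
        for (long[] in : inputs) {
            System.out.println("W(" + in[0] + ") on channel " + in[1]
                + " -> forwarded " + valve.inputWatermark(in[0], (int) in[1]));
        }
    }
}
```

Nothing is forwarded until every channel has reported a watermark, and afterwards the forwarded value always follows the slowest channel.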
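4. How does Flink handle late data?
4.1 Allowed lateness for windows
When working with event-time windows, elements can arrive late, i.e. the watermark that Flink uses to track event-time progress is already past the end timestamp of the window the element belongs to. See the Flink documentation on event time, and on late elements in particular, for a more thorough discussion of how Flink deals with event time.
By default, late elements are dropped once the watermark is past the end of the window. However, Flink allows you to specify a maximum allowed lateness for window operators. Allowed lateness specifies by how much time elements can be late before they are dropped, and its default value is 0.
Elements that arrive after the watermark has passed the end of the window, but before it passes the end of the window plus the allowed lateness, are still added to the window. Depending on the trigger used, a late but not dropped element may cause the window to fire again. This is the case for EventTimeTrigger.
To make this work, Flink keeps the state of windows until their allowed lateness expires. Once this happens, Flink removes the window and deletes its state, as described in the Window Lifecycle section of the Flink documentation.
By default, the allowed lateness is set to 0. That is, elements that arrive behind the watermark will be dropped.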
4.2 How is it used in practice?
An official example gives a feel for it:
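// the tag that identifies the late-data side output
final OutputTag<T> lateData = new OutputTag<T>("late-data"){};
DataStream<T> input = ...;
SingleOutputStreamOperator<T> result = input
    .keyBy(<key selector>)
    .window(<window assigner>)
    .allowedLateness(<time>)
    .sideOutputLateData(lateData)
    .<windowed transformation>(<window function>);
// elements that were too late for their window end up here
DataStream<T> lateStream = result.getSideOutput(lateData);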
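In this code, allowedLateness(<time>) sets the maximum allowed lateness, and sideOutputLateData(lateData) sends elements that arrive even later to a side output, which is read back with getSideOutput(lateData).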
4.3 How Flink implements this under the hood
Look at how WindowOperator.processElement handles an element:
public void processElement(StreamRecord<IN> element) throws Exception {
final Collection<W> elementWindows = windowAssigner.assignWindows(
element.getValue(), element.getTimestamp(), windowAssignerContext);
//if element is handled by none of assigned elementWindows
boolean isSkippedElement = true;
final K key = this.<K>getKeyedStateBackend().getCurrentKey();
if (windowAssigner instanceof MergingWindowAssigner) {
MergingWindowSet<W> mergingWindows = getMergingWindowSet();
for (W window: elementWindows) {
..............................
if (triggerResult.isPurge()) {
windowState.clear();
}
registerCleanupTimer(actualWindow);
}
// need to make sure to update the merging state in state
mergingWindows.persist();
} else {
for (W window: elementWindows) {
..............................
if (triggerResult.isPurge()) {
windowState.clear();
}
registerCleanupTimer(window);
}
}
Next, look at registerCleanupTimer(window):
/**
* Registers a timer to cleanup the content of the window.
* @param window
* the window whose state to discard
*/
protected void registerCleanupTimer(W window) {
long cleanupTime = cleanupTime(window);
if (cleanupTime == Long.MAX_VALUE) {
// don't set a GC timer for "end of time"
return;
}
if (windowAssigner.isEventTime()) {
triggerContext.registerEventTimeTimer(cleanupTime);
} else {
triggerContext.registerProcessingTimeTimer(cleanupTime);
}
}
The first line of this function obtains the cleanup time, which suggests we are close to the value set by allowedLateness(Time.seconds(2L)) in our code.
Continue into cleanupTime(window):
/**
* Returns the cleanup time for a window, which is
* {@code window.maxTimestamp + allowedLateness}. In
* case this leads to a value greater than {@link Long#MAX_VALUE}
* then a cleanup time of {@link Long#MAX_VALUE} is
* returned.
*
* @param window the window whose cleanup time we are computing.
*/
private long cleanupTime(W window) {
if (windowAssigner.isEventTime()) {
long cleanupTime = window.maxTimestamp() + allowedLateness;
return cleanupTime >= window.maxTimestamp() ? cleanupTime : Long.MAX_VALUE;
} else {
return window.maxTimestamp();
}
}
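And here is the answer: the cleanup time of an event-time window is window.maxTimestamp() + allowedLateness, so the window's state is kept until the watermark passes that point.
An element that arrives after its window's cleanup time has passed is flagged as late. Without a side output, such elements are simply skipped and dropped; with a side output configured, they are emitted through that side output instead.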
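The three possible outcomes for an element can be summarised in a short plain-Java sketch. AllowedLatenessSketch and its labels are hypothetical; cleanupTime mirrors the method quoted above, while the comparison against the current watermark is an assumption about how the operator decides that a window is too late, not code copied from Flink.

```java
// Illustration only; class, method and label names are hypothetical.
class AllowedLatenessSketch {
    // Mirrors cleanupTime() for event-time windows, including the overflow guard.
    static long cleanupTime(long windowMaxTimestamp, long allowedLateness) {
        long cleanup = windowMaxTimestamp + allowedLateness;
        return cleanup >= windowMaxTimestamp ? cleanup : Long.MAX_VALUE;
    }

    // Classifies an element by the state of its window at arrival time.
    // Assumption: a window counts as too late once the watermark has
    // reached its cleanup time.
    static String classify(long windowMaxTimestamp, long allowedLateness, long currentWatermark) {
        if (cleanupTime(windowMaxTimestamp, allowedLateness) <= currentWatermark) {
            return "TOO_LATE";          // dropped, or sent to the side output
        }
        if (windowMaxTimestamp <= currentWatermark) {
            return "LATE_BUT_ACCEPTED"; // window already fired, may fire again
        }
        return "ON_TIME";
    }

    public static void main(String[] args) {
        long windowMax = 4999;   // window [0, 5000) in milliseconds
        long lateness = 2000;    // allowedLateness of 2 s
        for (long watermark : new long[] {4000, 5000, 6998, 6999}) {
            System.out.println("watermark=" + watermark + " -> "
                + classify(windowMax, lateness, watermark));
        }
    }
}
```

For a window [0, 5000) with 2 s of allowed lateness, elements are still accepted while the watermark is below 6999.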
To be continued...
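5. A worked example
5.1 Periodic watermark demo code
This demo generates watermarks periodically. It uses a 5 s tumbling event-time window, an out-of-orderness bound of 3 s, an allowed lateness of 2 s and a watermark interval of 3 s.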
package test.watermark_demo
import org.apache.commons.lang3.time.FastDateFormat
import org.apache.flink.api.java.tuple.Tuple
import org.apache.flink.api.scala._
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks
import org.apache.flink.streaming.api.scala.function.WindowFunction
import org.apache.flink.streaming.api.scala.{DataStream, OutputTag, StreamExecutionEnvironment}
import org.apache.flink.streaming.api.watermark.Watermark
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector
import scala.collection.mutable.ArrayBuffer
object PeriodicWaterMarkDemo {
  // thread-safe date formatter
  val sdf: FastDateFormat = FastDateFormat.getInstance("yyyy-MM-dd HH:mm:ss")
  def main(args: Array[String]): Unit = {
    val delimiter = '\n'
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    /* use event time as the stream time characteristic */
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
    env.setParallelism(1)
    /* emit periodic watermarks every 3000 ms */
    env.getConfig.setAutoWatermarkInterval(3000)
    val streams: DataStream[String] = env.socketTextStream("192.168.32.110", 7000, delimiter)
    val data = streams.map(logEtl).filter(data => !data._1.equals("0") && data._2 != 0L)
    // assign timestamps to the elements and create watermarks periodically to track event-time progress
    val waterStream: DataStream[(String, Long)] = data.assignTimestampsAndWatermarks(new MyAssignerWithPeriodicWatermarks())
    /* output tag for late data */
    val lateData = new OutputTag[(String, Long)]("late")
    val result: DataStream[String] = waterStream.keyBy(0) // key by the name field
      .window(TumblingEventTimeWindows.of(Time.seconds(5L))) // 5 s tumbling event-time window
      .allowedLateness(Time.seconds(2L)) // maximum allowed lateness
      .sideOutputLateData(lateData) // send late data to the side output
      .apply(new MywindowFunction()) // window function that processes the window contents
    result.print("window result")
    /* print the late data */
    val late = result.getSideOutput(lateData)
    late.print("late data")
    env.execute(this.getClass.getName)
  }
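  /* parse a "name:timestamp" line; malformed lines or non-positive timestamps map to ("0", 0L) and are filtered out later */
  val logEtl = (data: String) => {
    // println(s"log ${data}")
    val items = data.split(":")
    // Try guards against a non-numeric timestamp, which would otherwise fail the job
    if (items.length == 2 && scala.util.Try(items(1).toLong).toOption.exists(_ > 0))
      (items(0), items(1).toLong)
    else ("0", 0L)
  }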
  /* AssignerWithPeriodicWatermarks */
  class MyAssignerWithPeriodicWatermarks extends AssignerWithPeriodicWatermarks[(String, Long)] {
    // largest event time seen so far
    var currentMaxTimestamp = 0L
    /* maximum out-of-orderness in milliseconds */
    val maxOutOfOrderness = 3000L
    /* timestamp of the last emitted watermark */
    var lastEmittedWatermark: Long = Long.MinValue
    /* Returns the current watermark */
    /* called periodically to obtain the new watermark value */
    override def getCurrentWatermark: Watermark = {
      // make sure the watermark never goes backwards
      if (currentMaxTimestamp - maxOutOfOrderness >= lastEmittedWatermark) {
        lastEmittedWatermark = currentMaxTimestamp - maxOutOfOrderness
      }
      val waterMark = new Watermark(lastEmittedWatermark)
      println(s"getCurrentWatermark:\t" + s"newTime:${sdf.format(System.currentTimeMillis())}\t" + s" waterMark:${sdf.format(waterMark.getTimestamp)}")
      waterMark
    }
    /* Assigns a timestamp to an element, in milliseconds since the Epoch */
    /* called for every record; also tracks the largest event time seen so far */
    override def extractTimestamp(element: (String, Long), previousElementTimestamp: Long): Long = {
      // use the element's time field as its timestamp
      val time = element._2
      if (time > currentMaxTimestamp) {
        currentMaxTimestamp = time
      }
      println(s"extractTimestamp:\t" + s"key: ${element._1}\t" + s"EventTime: ${sdf.format(time)}\t")
      time
    }
  }
  /* window function */
  class MywindowFunction extends WindowFunction[(String, Long), String, Tuple, TimeWindow] {
    override def apply(key: Tuple, window: TimeWindow, input: Iterable[(String, Long)], out: Collector[String]): Unit = {
      // collect the formatted event time of every element in the window
      val timeArr = ArrayBuffer[String]()
      val iterator = input.iterator
      while (iterator.hasNext) {
        val tup2 = iterator.next()
        timeArr.append(sdf.format(tup2._2))
      }
      val outData = s"key:${key.toString}\tdata:${timeArr.mkString("-")}\tstartTime:${sdf.format(window.getStart)}\tendTime:${sdf.format(window.getEnd)}"
      out.collect(outData)
    }
  }
}
5.2 Test data
flink:1590280321000
flink:1590280323000
flink:1590280324000
flink:1590280326000
flink:1590280329000
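What the demo should do with this test data can be worked out by hand; the plain-Java sketch below does the same arithmetic. DemoReplay is a hypothetical helper, not part of the demo. It assumes a window offset of 0 and positive timestamps, so the window start is simply the timestamp rounded down to a multiple of the window size, and it assumes a window can fire once the watermark reaches the window's last millisecond.

```java
// Hypothetical helper that replays the test data by hand; not part of the demo job.
class DemoReplay {
    static final long WINDOW_SIZE = 5000L;          // 5 s tumbling window
    static final long MAX_OUT_OF_ORDERNESS = 3000L; // 3 s bound, as in the demo

    // Window start for a positive timestamp, assuming a window offset of 0.
    static long windowStart(long timestamp) {
        return timestamp - (timestamp % WINDOW_SIZE);
    }

    // Index of the first event after which the watermark (max timestamp minus bound)
    // has reached the last millisecond of the first event's window; -1 if never.
    static int firstFiringIndex(long[] events) {
        long firstWindowMax = windowStart(events[0]) + WINDOW_SIZE - 1;
        long maxTimestamp = Long.MIN_VALUE;
        for (int i = 0; i < events.length; i++) {
            maxTimestamp = Math.max(maxTimestamp, events[i]);
            if (maxTimestamp - MAX_OUT_OF_ORDERNESS >= firstWindowMax) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long[] events = {1590280321000L, 1590280323000L, 1590280324000L,
                         1590280326000L, 1590280329000L};
        long maxTimestamp = Long.MIN_VALUE;
        for (long e : events) {
            maxTimestamp = Math.max(maxTimestamp, e);
            System.out.println("event=" + e
                + " windowStart=" + windowStart(e)
                + " watermark=" + (maxTimestamp - MAX_OUT_OF_ORDERNESS));
        }
        System.out.println("first window can fire after event index " + firstFiringIndex(events));
    }
}
```

The first three records fall into the window starting at 1590280320000, and only the fifth record pushes the watermark (1590280326000) past that window's end. Because the demo emits watermarks every 3 s, the window fires at the next periodic emission after that record.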