Watermarks and Timestamps in flink-1.11.1: A Look at the Source Code


SystemProcessingTimeService

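The getCurrentProcessingTime() method returns System.currentTimeMillis().

ScheduledFuture<?> registerTimer(long timestamp, ProcessingTimeCallback callback) registers a task that will run no earlier than timestamp, where timestamp is the time at which the task should fire. The delay is computed as max(timestamp - now, 0) + 1 milliseconds: if the requested time is already in the past, the task runs one millisecond from now, otherwise it runs one millisecond after the requested time. The extra millisecond aligns timers with watermark semantics, since a watermark T means that no more elements with a timestamp less than or equal to T will arrive, so a timer for T must fire only after T has passed. The task is scheduled on Java's ScheduledThreadPoolExecutor class. The ProcessingTimeCallback interface has a single onProcessingTime method, which is what the scheduled task eventually calls.

ScheduledTask implements the Runnable interface and is the wrapper that is handed to the executor.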
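
To make the delay rule concrete, here is a minimal sketch of a timer service built on ScheduledThreadPoolExecutor. The names SimpleProcessingTimeService, TimerCallback and computeDelay are made up for this illustration and are not Flink classes; the delay formula follows my reading of the 1.11.1 source and should be treated as an assumption if you are on another version.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for Flink's ProcessingTimeCallback. */
interface TimerCallback {
    void onProcessingTime(long timestamp) throws Exception;
}

/** Simplified sketch of a processing-time timer service; not Flink's actual class. */
class SimpleProcessingTimeService {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);

    long getCurrentProcessingTime() {
        return System.currentTimeMillis();
    }

    /** Milliseconds to wait before a timer for {@code timestamp} fires, including the extra 1 ms. */
    static long computeDelay(long timestamp, long now) {
        return Math.max(timestamp - now, 0) + 1;
    }

    ScheduledFuture<?> registerTimer(long timestamp, TimerCallback callback) {
        long delay = computeDelay(timestamp, getCurrentProcessingTime());
        // Wrap the callback in a Runnable, which is the role ScheduledTask plays in the text above.
        Runnable task = () -> {
            try {
                callback.onProcessingTime(timestamp);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        };
        return executor.schedule(task, delay, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

With now = 5000, a timer requested for 1000 gets a delay of 1 ms, and a timer requested for 5200 gets a delay of 201 ms.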


TimestampsAndWatermarksOperator

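This operator works on a single stream: it extracts timestamps from events and generates watermarks. Initialization happens in open(). The watermarkInterval read there comes from the ExecutionConfig auto-watermark interval, which defaults to 0. Only when it is greater than 0 does open() register a timer through registerTimer, and each firing of that timer triggers periodic watermark generation.

private transient TimestampAssigner<T> timestampAssigner; // extracts the timestamp from each event

private transient WatermarkGenerator<T> watermarkGenerator; // generates the watermarks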

	public TimestampsAndWatermarksOperator(
			WatermarkStrategy<T> watermarkStrategy) {

		this.watermarkStrategy = checkNotNull(watermarkStrategy);
		this.chainingStrategy = ChainingStrategy.ALWAYS;
	}

	@Override
	public void open() throws Exception {
		super.open();

		timestampAssigner = watermarkStrategy.createTimestampAssigner(this::getMetricGroup);
		watermarkGenerator = watermarkStrategy.createWatermarkGenerator(this::getMetricGroup);

		wmOutput = new WatermarkEmitter(output, getContainingTask().getStreamStatusMaintainer());

		watermarkInterval = getExecutionConfig().getAutoWatermarkInterval();
		if (watermarkInterval > 0) {
			final long now = getProcessingTimeService().getCurrentProcessingTime();
			getProcessingTimeService().registerTimer(now + watermarkInterval, this);
		}
	}

	@Override
	public void processElement(final StreamRecord<T> element) throws Exception {
		final T event = element.getValue();
		final long previousTimestamp = element.hasTimestamp() ? element.getTimestamp() : Long.MIN_VALUE;
		final long newTimestamp = timestampAssigner.extractTimestamp(event, previousTimestamp);

		element.setTimestamp(newTimestamp);
		output.collect(element);
		watermarkGenerator.onEvent(event, newTimestamp, wmOutput);
	}

	@Override
	public void onProcessingTime(long timestamp) throws Exception {
		watermarkGenerator.onPeriodicEmit(wmOutput);

		final long now = getProcessingTimeService().getCurrentProcessingTime();
		getProcessingTimeService().registerTimer(now + watermarkInterval, this);
	}

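Note that watermarkStrategy is supplied by the user (it is passed in through assignTimestampsAndWatermarks).

open() uses this watermarkStrategy to initialize the two fields mentioned above.

Note also that registerTimer is called in both open() and onProcessingTime(), and both calls use the configured auto-watermark interval as the delay. My reading is that the call in open() is one-time initialization and runs only once, while onProcessingTime() re-registers itself every time it fires, so it ends up being scheduled repeatedly.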
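
The re-registration pattern described above can be sketched without Flink. The example below uses a hand-advanced clock so the behaviour is deterministic; ManualTimerService and PeriodicEmitter are hypothetical names for this illustration only, and the sketch leaves out the extra millisecond the real timer service adds.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical, manually advanced clock and timer queue used only for this illustration. */
class ManualTimerService {
    interface Callback {
        void onProcessingTime(long timestamp);
    }

    private static final class Timer {
        final long timestamp;
        final Callback callback;

        Timer(long timestamp, Callback callback) {
            this.timestamp = timestamp;
            this.callback = callback;
        }
    }

    private final PriorityQueue<Timer> timers =
            new PriorityQueue<>(Comparator.comparingLong((Timer t) -> t.timestamp));
    private long now;

    long getCurrentProcessingTime() {
        return now;
    }

    void registerTimer(long timestamp, Callback callback) {
        timers.add(new Timer(timestamp, callback));
    }

    /** Moves the clock forward, firing every timer that becomes due on the way. */
    void advanceTo(long target) {
        while (!timers.isEmpty() && timers.peek().timestamp <= target) {
            Timer timer = timers.poll();
            now = timer.timestamp;
            timer.callback.onProcessingTime(timer.timestamp);
        }
        now = target;
    }
}

/** Mirrors the open()/onProcessingTime() pattern of the operator shown above. */
class PeriodicEmitter implements ManualTimerService.Callback {
    private final ManualTimerService timerService;
    private final long watermarkInterval;
    final List<Long> firedAt = new ArrayList<>();

    PeriodicEmitter(ManualTimerService timerService, long watermarkInterval) {
        this.timerService = timerService;
        this.watermarkInterval = watermarkInterval;
    }

    /** Registers the first timer once, and only if periodic emission is enabled. */
    void open() {
        if (watermarkInterval > 0) {
            long now = timerService.getCurrentProcessingTime();
            timerService.registerTimer(now + watermarkInterval, this);
        }
    }

    @Override
    public void onProcessingTime(long timestamp) {
        firedAt.add(timestamp); // stands in for watermarkGenerator.onPeriodicEmit(wmOutput)
        long now = timerService.getCurrentProcessingTime();
        timerService.registerTimer(now + watermarkInterval, this); // schedule the next round
    }
}
```

With an interval of 200 and the clock advanced from 0 to 1000, the callback fires at 200, 400, 600, 800 and 1000. With an interval of 0, open() registers nothing and the callback never fires.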


WatermarkStrategy

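WatermarkStrategy extends TimestampAssignerSupplier and WatermarkGeneratorSupplier. Both follow the supplier (factory) pattern and return a TimestampAssigner and a WatermarkGenerator respectively.

Since Flink 1.11, assignTimestampsAndWatermarks accepts a WatermarkStrategy object for generating watermarks. Below is a simplified reproduction of the related interfaces (for example, getMetricGroup() returns an int here rather than a MetricGroup). The way interfaces are used here is well worth studying.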

@PublicEvolving
@FunctionalInterface
interface TimestampAssignerSupplier<T> {

    TimestampAssigner<T> createTimestampAssigner(Context context);

    static <T> TimestampAssignerSupplier<T> of(SerializableTimestampAssigner<T> assigner) {
        return new SupplierFromSerializableTimestampAssigner<>(assigner);
    }

    interface Context {
        int getMetricGroup();
    }

    class SupplierFromSerializableTimestampAssigner<T> implements TimestampAssignerSupplier<T> {

        private final SerializableTimestampAssigner<T> assigner;

        public SupplierFromSerializableTimestampAssigner(SerializableTimestampAssigner<T> assigner) {
            this.assigner = assigner;
        }

        @Override
        public TimestampAssigner<T> createTimestampAssigner(Context context) {
            return assigner;
        }
    }
}

@PublicEvolving
@FunctionalInterface
interface WatermarkGeneratorSupplier<T> {

    WatermarkGenerator<T> createWatermarkGenerator(Context context);

    interface Context {
        int getMetricGroup();
    }
}

@Public
@FunctionalInterface
interface TimestampAssigner<T> {
    long extractTimestamp(T element, long recordTimestamp);
}

@PublicEvolving
@FunctionalInterface
interface SerializableTimestampAssigner<T> extends TimestampAssigner<T>, Serializable {
}

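// Simplified stand-in for Flink's WatermarkOutput, which emits Watermark objects rather than raw longs.
interface WatermarkOutput {
    void emitWatermark(long timestamp);
}

@Public
interface WatermarkGenerator<T> {
    void onEvent(T event, long eventTimestamp, WatermarkOutput output);

    void onPeriodicEmit(WatermarkOutput output);
}

@Public
class BoundedOutOfOrdernessWatermarks<T> implements WatermarkGenerator<T> {
    // Largest event timestamp seen so far.
    private long maxTimestamp;
    private final long outOfOrdernessMillis;

    public BoundedOutOfOrdernessWatermarks(Duration maxOutOfOrderness) {
        this.outOfOrdernessMillis = maxOutOfOrderness.toMillis();
        // Start far enough above Long.MIN_VALUE that the subtraction in onPeriodicEmit cannot underflow.
        this.maxTimestamp = Long.MIN_VALUE + outOfOrdernessMillis + 1;
    }

    @Override
    public void onEvent(T event, long eventTimestamp, WatermarkOutput output) {
        maxTimestamp = Math.max(maxTimestamp, eventTimestamp);
    }

    @Override
    public void onPeriodicEmit(WatermarkOutput output) {
        // Emit through the output; assigning to a long parameter would be invisible to the caller.
        output.emitWatermark(maxTimestamp - outOfOrdernessMillis - 1);
    }
}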

@Public
final class WatermarkStrategyWithTimestampAssigner<T> implements WatermarkStrategy<T> {

    private final WatermarkStrategy<T> baseStrategy;
    private final TimestampAssignerSupplier<T> timestampAssigner;

    WatermarkStrategyWithTimestampAssigner(
            WatermarkStrategy<T> baseStrategy,
            TimestampAssignerSupplier<T> timestampAssigner) {
        this.baseStrategy = baseStrategy;
        this.timestampAssigner = timestampAssigner;
    }

    @Override
    public TimestampAssigner<T> createTimestampAssigner(TimestampAssignerSupplier.Context context) {
        return timestampAssigner.createTimestampAssigner(context);
    }

    @Override
    public WatermarkGenerator<T> createWatermarkGenerator(WatermarkGeneratorSupplier.Context context) {
        return baseStrategy.createWatermarkGenerator(context);
    }
}

@Public
interface WatermarkStrategy<T> extends TimestampAssignerSupplier<T>, WatermarkGeneratorSupplier<T> {

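    @Override
    default TimestampAssigner<T> createTimestampAssigner(TimestampAssignerSupplier.Context context) {
        // Flink returns a RecordTimestampAssigner here, which keeps the timestamp the record already carries.
        return (element, recordTimestamp) -> recordTimestamp;
    }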

    @Override
    WatermarkGenerator<T> createWatermarkGenerator(WatermarkGeneratorSupplier.Context context);

    static <T> WatermarkStrategy<T> forMonotonousTimestamps() {
        return (context) -> new BoundedOutOfOrdernessWatermarks<>(Duration.ofMillis(0));
    }

    default WatermarkStrategy<T> withTimestampAssigner(TimestampAssignerSupplier<T> timestampAssigner) {
        return new WatermarkStrategyWithTimestampAssigner<>(this, timestampAssigner);
    }

    default WatermarkStrategy<T> withTimestampAssigner(SerializableTimestampAssigner<T> timestampAssigner) {
        return new WatermarkStrategyWithTimestampAssigner<>(this,
                TimestampAssignerSupplier.of(timestampAssigner));
    }
}
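
The interface design above combines three Java features: a single abstract method so that a strategy can be written as a lambda, default methods that derive a new strategy from an existing one, and a static factory. The sketch below isolates that pattern. Assigner, SerializableAssigner, Generator and Strategy are shortened, hypothetical names that only mirror the shape of the Flink interfaces; they are not the real API.

```java
import java.io.Serializable;

/** Hypothetical counterpart of TimestampAssigner. */
interface Assigner<T> {
    long extractTimestamp(T element, long recordTimestamp);
}

/** Hypothetical counterpart of SerializableTimestampAssigner. */
interface SerializableAssigner<T> extends Assigner<T>, Serializable {}

/** Hypothetical, reduced counterpart of WatermarkGenerator. */
interface Generator<T> {
    void onEvent(T event, long eventTimestamp);

    long currentWatermark();
}

/** Hypothetical counterpart of WatermarkStrategy. */
interface Strategy<T> {
    // The only abstract method, so a Strategy can be written as a lambda.
    Generator<T> createGenerator();

    // Default assigner: keep whatever timestamp the record already carries.
    default Assigner<T> createAssigner() {
        return (element, recordTimestamp) -> recordTimestamp;
    }

    // Returns a new strategy that reuses this generator but swaps in another assigner.
    default Strategy<T> withTimestampAssigner(SerializableAssigner<T> assigner) {
        Strategy<T> base = this;
        return new Strategy<T>() {
            @Override
            public Generator<T> createGenerator() {
                return base.createGenerator();
            }

            @Override
            public Assigner<T> createAssigner() {
                return assigner;
            }
        };
    }

    // Static factory: the watermark trails the largest timestamp seen by 1 ms.
    static <T> Strategy<T> forMonotonousTimestamps() {
        return () -> new Generator<T>() {
            private long max = Long.MIN_VALUE + 1;

            @Override
            public void onEvent(T event, long eventTimestamp) {
                max = Math.max(max, eventTimestamp);
            }

            @Override
            public long currentWatermark() {
                return max - 1;
            }
        };
    }
}
```

A caller can then chain the factory and the default method, for example Strategy.<String>forMonotonousTimestamps().withTimestampAssigner((e, recordTs) -> Long.parseLong(e)), which has the same shape as the calls made on WatermarkStrategy in user code.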

 


WatermarkGenerator

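This interface generates watermarks either per event or periodically (at a fixed interval).

onEvent is called for every event and lets the watermark generator inspect and remember the event timestamp.

onPeriodicEmit is called periodically and may emit a new watermark. In BoundedOutOfOrdernessWatermarks, the emitted value is derived from the largest timestamp recorded so far and the allowed out-of-orderness.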
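
A small worked example may help with the arithmetic. The sketch below follows the bounded-out-of-orderness logic described in this post; SimpleWatermarkOutput and BoundedOutOfOrdernessSketch are hypothetical names, and the output takes a plain long where Flink passes a Watermark object.

```java
import java.time.Duration;

/** Hypothetical, simplified watermark sink that receives raw millisecond values. */
interface SimpleWatermarkOutput {
    void emitWatermark(long timestamp);
}

/** Sketch of a generator that tolerates events arriving up to a fixed bound out of order. */
class BoundedOutOfOrdernessSketch<T> {
    private final long outOfOrdernessMillis;
    private long maxTimestamp;

    BoundedOutOfOrdernessSketch(Duration maxOutOfOrderness) {
        this.outOfOrdernessMillis = maxOutOfOrderness.toMillis();
        // Chosen so that the first periodic emit yields Long.MIN_VALUE without underflow.
        this.maxTimestamp = Long.MIN_VALUE + outOfOrdernessMillis + 1;
    }

    /** Called per event: only remembers the largest timestamp, emits nothing. */
    void onEvent(T event, long eventTimestamp, SimpleWatermarkOutput output) {
        maxTimestamp = Math.max(maxTimestamp, eventTimestamp);
    }

    /** Called periodically: emits the largest timestamp minus the bound minus 1 ms. */
    void onPeriodicEmit(SimpleWatermarkOutput output) {
        output.emitWatermark(maxTimestamp - outOfOrdernessMillis - 1);
    }
}
```

With a bound of 500 ms and events carrying the timestamps 1000, 3000 and 2000 (in that arrival order), the next periodic emit produces 3000 - 500 - 1 = 2499. An event that arrives later with a timestamp of 2499 or less is behind the watermark and counts as late.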


TimestampAssigner

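This interface has a single method, long extractTimestamp(T element, long recordTimestamp). element is the object that is about to be assigned a timestamp, and recordTimestamp is the timestamp assigned by a previous timestamp assigner, or Long.MIN_VALUE if none has been assigned.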
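
As an illustration of how the two parameters can be used together, here is a sketch of an assigner that prefers a time carried by the event and otherwise falls back to recordTimestamp. SensorReading, SimpleTimestampAssigner and SensorTimestampAssigner are hypothetical names; the interface is redeclared only so the example compiles without Flink on the classpath.

```java
/** Hypothetical event type used only for this illustration. */
class SensorReading {
    final String sensorId;
    final Long eventTimeMillis; // null when the producer did not set an event time

    SensorReading(String sensorId, Long eventTimeMillis) {
        this.sensorId = sensorId;
        this.eventTimeMillis = eventTimeMillis;
    }
}

/** Same shape as the TimestampAssigner interface described above. */
interface SimpleTimestampAssigner<T> {
    long extractTimestamp(T element, long recordTimestamp);
}

class SensorTimestampAssigner implements SimpleTimestampAssigner<SensorReading> {
    @Override
    public long extractTimestamp(SensorReading element, long recordTimestamp) {
        if (element.eventTimeMillis != null) {
            return element.eventTimeMillis;
        }
        // Fall back to the timestamp already attached to the record,
        // which is Long.MIN_VALUE when nothing was assigned upstream.
        return recordTimestamp;
    }
}
```

Whether falling back to recordTimestamp is the right choice depends on the job; an event that ends up with Long.MIN_VALUE as its timestamp will be far behind any watermark.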

 

 

 

 

 

 

 
