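Calling assignTimestampsAndWatermarks on a DataStreamSource (the method is defined on DataStream) lets you define custom rules for extracting timestamps and generating watermarks. Before Flink 1.11, Flink shipped the following 4 built-in timestamp assigners: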
Based on the AssignerWithPeriodicWatermarks interface
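The AssignerWithPeriodicWatermarks interface extends the TimestampAssigner interface. Its extractTimestamp method defines how the timestamp is extracted from each element, and its getCurrentWatermark method defines how the watermark is generated. Flink calls getCurrentWatermark periodically, at the interval configured with setAutoWatermarkInterval, while extractTimestamp is called once per element.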
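To make the interplay between the two methods concrete, here is a minimal sketch in plain Java with no Flink dependency. The names `PeriodicWatermarkSketch`, `BoundedAssigner`, and `simulate` are hypothetical and exist only for illustration. It also assumes a simplification: the watermark is polled after every `pollEvery` elements rather than on a timer, whereas Flink polls on the auto-watermark interval. As in Flink, a polled watermark is only emitted when it is larger than the last one emitted.

```java
import java.util.ArrayList;
import java.util.List;

public class PeriodicWatermarkSketch {

    /** Hypothetical stand-in for an assigner: tracks the max timestamp seen so far. */
    static final class BoundedAssigner {
        private final long delay;                    // allowed lateness
        private long maxTimestamp = Long.MIN_VALUE;  // largest event time seen so far

        BoundedAssigner(long delay) {
            this.delay = delay;
        }

        /** Called once per element: remember the max and return the event time. */
        long extractTimestamp(long eventTime) {
            maxTimestamp = Math.max(maxTimestamp, eventTime);
            return eventTime;
        }

        /** Called periodically: watermark = max timestamp seen minus the delay. */
        long getCurrentWatermark() {
            return maxTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : maxTimestamp - delay;
        }
    }

    /**
     * Feeds the event times to the assigner and polls the watermark after every
     * pollEvery elements (an assumption that replaces Flink's timer-based polling).
     * Returns the watermarks that would actually be emitted.
     */
    static List<Long> simulate(long[] eventTimes, long delay, int pollEvery) {
        BoundedAssigner assigner = new BoundedAssigner(delay);
        List<Long> emitted = new ArrayList<>();
        long lastEmitted = Long.MIN_VALUE;
        for (int i = 0; i < eventTimes.length; i++) {
            assigner.extractTimestamp(eventTimes[i]);
            if ((i + 1) % pollEvery == 0) {
                long watermark = assigner.getCurrentWatermark();
                // Only a strictly larger watermark is emitted, so watermarks never go backwards.
                if (watermark > lastEmitted) {
                    emitted.add(watermark);
                    lastEmitted = watermark;
                }
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        long[] inOrder = {1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L};
        System.out.println(simulate(inOrder, 0L, 2));    // prints [2, 4, 6, 8]

        long[] outOfOrder = {5L, 3L, 9L, 7L};
        System.out.println(simulate(outOfOrder, 2L, 1)); // prints [3, 7]
    }
}
```

In the second call, the late elements 3 and 7 do not move the watermark, because it is derived from the largest timestamp seen so far rather than from the most recent element.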
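// Example: extracting timestamps and generating periodic watermarks with
// AssignerWithPeriodicWatermarks inside assignTimestampsAndWatermarks
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;

public class Test {
    public static void main(String[] args) throws Exception {
        // Create the stream execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Use event-time semantics
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        // Generate watermarks periodically, every 10 milliseconds
        env.getConfig().setAutoWatermarkInterval(10L);
        // Parallelism of 1
        env.setParallelism(1);
        // Demo data
        DataStreamSource<ClickEvent> mySource = env.fromElements(
            new ClickEvent("user1", 1L, 1),
            new ClickEvent("user1", 2L, 2),
            new ClickEvent("user1", 3L, 3),
            new ClickEvent("user1", 4L, 4),
            new ClickEvent("user1", 5L, 5),
            new ClickEvent("user1", 6L, 6),
            new ClickEvent("user1", 7L, 7),
            new ClickEvent("user1", 8L, 8)
        );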
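        // Generate watermarks periodically with AssignerWithPeriodicWatermarks
        SingleOutputStreamOperator<ClickEvent> streamTS = mySource.assignTimestampsAndWatermarks(
            new AssignerWithPeriodicWatermarks<ClickEvent>() {
                // Largest event timestamp seen so far
                private long maxTimestamp = 0L;
                // Allowed delay
                private long delay = 0L;

                // Custom timestamp extraction rule
                @Override
                public long extractTimestamp(ClickEvent event, long previousElementTimestamp) {
                    try {
                        // Slow down processing; otherwise only a single watermark might be generated
                        Thread.sleep(100L);
                    } catch (InterruptedException ex) {
                        // Restore the interrupt flag instead of silently swallowing the exception
                        Thread.currentThread().interrupt();
                    }
                    // Compare the current event time with maxTimestamp and keep the larger one
                    maxTimestamp = Math.max(event.getDateTime(), maxTimestamp);
                    System.out.println("Time: " + event.getDateTime());
                    // Return the extracted timestamp
                    return event.getDateTime();
                }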