For basic usage, refer to the official documentation:
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/connectors/kafka.html
Below are a few customizations from my own usage.
1. Use the data source's timestamp as event time and register watermarks
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Use event time as the stream time characteristic
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
// Add the data source (Kafka consumer)
SingleOutputStreamOperator<BaseMessage> timeData = env.addSource(consumer)
        // Assign timestamps and watermarks, allowing 60 seconds of out-of-orderness
        .assignTimestampsAndWatermarks(new MyAssignerWithPeriodicWatermarks<>(60)).name("water_marker");
MyAssignerWithPeriodicWatermarks.java (generic, so it works with any subclass of my Kafka message type BaseMessage)
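import javax.annotation.Nullable;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

public class MyAssignerWithPeriodicWatermarks<T extends BaseMessage> implements AssignerWithPeriodicWatermarks<T> {
    // Largest event timestamp seen so far, in milliseconds
    private long currentMaxTimestamp = 0L;
    // Maximum allowed out-of-orderness, in milliseconds
    private final long maxOutOfOrderness;

    public MyAssignerWithPeriodicWatermarks(long second) {
        maxOutOfOrderness = 1000 * second;
    }

    @Nullable
    @Override
    public Watermark getCurrentWatermark() {
        // The watermark trails the largest timestamp by the allowed out-of-orderness
        return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
    }

    @Override
    public long extractTimestamp(T msg, long previousElementTimestamp) {
        long timestamp = msg.getTime();
        currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
        return timestamp;
    }
}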
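To make the watermark behaviour concrete, here is a minimal sketch that reproduces the same bookkeeping without any Flink dependency. The class name BoundedLatenessSketch and its methods observe and currentWatermark are hypothetical names used only for illustration; the sample timestamps are made up.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the assigner above, so the logic can run without Flink.
public class BoundedLatenessSketch {
    private long currentMaxTimestamp = 0L;
    private final long maxOutOfOrderness;

    public BoundedLatenessSketch(long seconds) {
        this.maxOutOfOrderness = 1000 * seconds;
    }

    // Same bookkeeping as extractTimestamp: remember the largest timestamp seen.
    public long observe(long timestamp) {
        currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
        return timestamp;
    }

    // Same arithmetic as getCurrentWatermark, returned as a plain long.
    public long currentWatermark() {
        return currentMaxTimestamp - maxOutOfOrderness;
    }

    public static void main(String[] args) {
        BoundedLatenessSketch sketch = new BoundedLatenessSketch(60);
        // Made-up event times in milliseconds; the third one arrives out of order.
        List<Long> eventTimes = Arrays.asList(100_000L, 160_000L, 130_000L, 200_000L);
        for (long t : eventTimes) {
            sketch.observe(t);
            System.out.println("event=" + t + " watermark=" + sketch.currentWatermark());
        }
    }
}
```

The out-of-order event at 130000 does not move the watermark backwards: it stays at 100000 until a larger timestamp arrives. Events that lag the maximum by more than 60 seconds fall behind the watermark and are treated as late.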
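2. Fixing the 8-hour window offset problem
Windows are aligned to the epoch (UTC) by default, so in UTC+8 their boundaries are off by 8 hours. After keyBy, replace .timeWindow(Time.minutes(1)) with .window(TumblingEventTimeWindows.of(Time.minutes(1), Time.hours(-8)))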
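// keyBy(String...) keys by a Tuple; window(...) returns a WindowedStream, the aggregation step is omitted here
WindowedStream<BaseMessage, Tuple, TimeWindow> dsPf = timeData.keyBy("field1", "field2")
        .window(TumblingEventTimeWindows.of(Time.minutes(windowInterval), Time.hours(-8)));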
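The effect of the offset is easiest to see with a daily window. The sketch below reimplements the window-start arithmetic in plain Java so it runs without Flink. The formula is my understanding of what Flink 1.9's TimeWindow.getWindowStartWithOffset computes, so treat it as an assumption; the class name WindowOffsetSketch and the sample timestamp are illustrative only.

```java
import java.time.Instant;

public class WindowOffsetSketch {
    public static final long MINUTE = 60_000L;
    public static final long HOUR = 60 * MINUTE;
    public static final long DAY = 24 * HOUR;

    // Start of the tumbling window that contains `timestamp` (all values in milliseconds).
    public static long windowStart(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    public static void main(String[] args) {
        // 2019-12-31T16:30:00Z, which is 2020-01-01T00:30:00 in UTC+8
        long ts = 1_577_809_800_000L;

        // Without an offset the daily window starts at UTC midnight (08:00 local time).
        System.out.println(Instant.ofEpochMilli(windowStart(ts, 0, DAY)));         // 2019-12-31T00:00:00Z

        // With -8 hours the daily window starts at local midnight in UTC+8.
        System.out.println(Instant.ofEpochMilli(windowStart(ts, -8 * HOUR, DAY))); // 2019-12-31T16:00:00Z
    }
}
```

With this arithmetic, a window size that evenly divides 8 hours (such as 1 minute) gets identical boundaries with and without the offset, so the shift only matters for larger windows such as hourly-multiple or daily ones. As far as I know, Flink also validates the offset against the window size in the assigner, so it is worth checking how the version in use treats an offset larger than the window.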
3. To fire the window computation and sink at a regular interval, you need to write your own Trigger, modeled on EventTimeTrigger and ProcessingTimeTrigger
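// window(...).trigger(...) still returns a WindowedStream; the aggregation step is omitted here
WindowedStream<BaseMessage, Tuple, TimeWindow> dsPf = timeData.keyBy("field1", "field2")
        .window(TumblingEventTimeWindows.of(Time.minutes(windowInterval), Time.hours(-8)))
        .trigger(EventProcessTimeTrigger.create(triggerInterval, "stat_name"));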
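The periodic firing relies on rounding a timestamp down to an interval boundary. The sketch below shows that arithmetic in isolation, assuming the next firing time is the following boundary (the approach Flink's built-in ContinuousProcessingTimeTrigger takes). The class IntervalAlignmentSketch and the helper nextFireTime are hypothetical names and are not part of the trigger in this post.

```java
public class IntervalAlignmentSketch {
    // Next firing time after `timestamp`, aligned to a multiple of intervalMs.
    // Assumes non-negative timestamps and a positive interval.
    public static long nextFireTime(long timestamp, long intervalMs) {
        long start = timestamp - (timestamp % intervalMs); // round down to the boundary
        return start + intervalMs;                         // the following boundary
    }

    public static void main(String[] args) {
        long intervalMs = 10 * 1000; // a 10-second interval, chosen for illustration
        System.out.println(nextFireTime(12_345L, intervalMs)); // 20000
        System.out.println(nextFireTime(19_999L, intervalMs)); // 20000
        System.out.println(nextFireTime(20_000L, intervalMs)); // 30000
    }
}
```

Because every element in the same interval maps to the same boundary, registering a timer at that time once per interval is enough, which is why the trigger only registers one when no firing time is stored yet.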
The custom trigger, which handles both event time and processing time:
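import org.apache.flink.api.common.state.ReducingState;
import org.apache.flink.api.common.state.ReducingStateDescriptor;
import org.apache.flink.api.common.typeutils.base.LongSerializer;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

public class EventProcessTimeTrigger extends Trigger<Object, TimeWindow> {
    // State descriptor (statName must be unique)
    private final ReducingStateDescriptor<Long> stateDesc;
    // Firing interval in milliseconds (the constructor takes seconds)
    private final long interval;

    private EventProcessTimeTrigger(long interval, String statName) {
        this.interval = interval * 1000;
        this.stateDesc =
                new ReducingStateDescriptor<>(statName, (v1, v2) -> Math.min(v1, v2), LongSerializer.INSTANCE);
    }

    @Override
    public boolean canMerge() {
        return super.canMerge();
    }

    @Override
    public void onMerge(TimeWindow window, OnMergeContext ctx) throws Exception {
        long windowMaxTimestamp = window.maxTimestamp();
        if (windowMaxTimestamp > ctx.getCurrentWatermark()) {
            ctx.registerEventTimeTimer(windowMaxTimestamp);
        }
    }

    @Override
    public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) throws Exception {
        if (window.maxTimestamp() <= ctx.getCurrentWatermark()) {
            // The watermark has already passed the end of the window: fire immediately
            return TriggerResult.FIRE;
        } else {
            ctx.registerEventTimeTimer(window.maxTimestamp());
            ReducingState<Long> fireTimestamp = ctx.getPartitionedState(stateDesc);
            // Register a firing time only for messages less than one interval behind the current wall-clock time
            if (fireTimestamp.get() == null && (System.currentTimeMillis() - timestamp < interval)) {
                long start = timestamp - (timestamp % interval);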