Flink: Implementing a CountWindow with a Timeout Using a Custom Trigger

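This article shows how to implement a custom window trigger in Apache Flink that combines the behavior of time windows and count windows: the window fires either when the number of elements reaches a preset threshold or when the timeout expires. This suits real-time stream processing scenarios.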

Flink has two basic kinds of window: TimeWindow and CountWindow.
A TimeWindow fires when its time is up; a CountWindow fires when it has collected a given number of elements.

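Suppose the window should fire when its time is up, but should also fire early if enough elements have accumulated before then. Put differently, if the count threshold is not reached within the time limit, the window logic should still run at the deadline. Neither basic window does both, so a custom trigger is needed.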
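Before looking at the Flink code, the rule itself can be sketched in plain Java with no Flink dependency. This is only an illustration of the decision logic, not the trigger itself: the class `CountOrTimeoutRule`, the `Decision` enum, and the method names are hypothetical and do not exist in Flink.

```java
/** Plain-Java sketch of the "count or timeout" rule; not Flink code. */
final class CountOrTimeoutRule {

    enum Decision { CONTINUE, FIRE_ON_COUNT, FIRE_ON_TIMEOUT }

    private final int maxCount;          // count threshold
    private final long windowEndMillis;  // deadline of the current window
    private long count = 0;              // elements seen since the last fire

    CountOrTimeoutRule(int maxCount, long windowEndMillis) {
        this.maxCount = maxCount;
        this.windowEndMillis = windowEndMillis;
    }

    /** Called once per arriving element: fire early when the threshold is reached. */
    Decision onElement() {
        count++;
        if (count >= maxCount) {
            count = 0; // reset, like purging the window
            return Decision.FIRE_ON_COUNT;
        }
        return Decision.CONTINUE;
    }

    /** Called when the clock advances: fire whatever has accumulated at the deadline. */
    Decision onTime(long nowMillis) {
        if (nowMillis >= windowEndMillis) {
            count = 0;
            return Decision.FIRE_ON_TIMEOUT;
        }
        return Decision.CONTINUE;
    }

    public static void main(String[] args) {
        CountOrTimeoutRule rule = new CountOrTimeoutRule(3, 10_000L);
        System.out.println(rule.onElement());     // CONTINUE
        System.out.println(rule.onElement());     // CONTINUE
        System.out.println(rule.onElement());     // FIRE_ON_COUNT
        System.out.println(rule.onElement());     // CONTINUE
        System.out.println(rule.onTime(10_000L)); // FIRE_ON_TIMEOUT
    }
}
```

The Flink trigger below follows the same shape: `onElement` handles the count, and the timer callbacks handle the deadline.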

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.common.state.ReducingState;
import org.apache.flink.api.common.state.ReducingStateDescriptor;
import org.apache.flink.api.common.typeutils.base.LongSerializer;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;


/**
 * Count window trigger with a timeout.
 */
public class CountTriggerWithTimeout<T> extends Trigger<T, TimeWindow> {
    private static final Logger LOG = LoggerFactory.getLogger(CountTriggerWithTimeout.class);


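    /**
     * Maximum number of elements in the window before it fires early.
     */
    private final int maxCount;
    /**
     * Time characteristic in use: event time or processing time.
     */
    private final TimeCharacteristic timeType;
    /**
     * Descriptor of the state that stores the current element count of the window.
     */
    private final ReducingStateDescriptor<Long> countStateDescriptor =
            new ReducingStateDescriptor<>("counter", new Sum(), LongSerializer.INSTANCE);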




    public CountTriggerWithTimeout(int maxCount, TimeCharacteristic timeType) {


        this.maxCount = maxCount;
        this.timeType = timeType;
    }




    private TriggerResult fireAndPurge(TimeWindow window, TriggerContext ctx) throws Exception {
        clear(window, ctx);
        return TriggerResult.FIRE_AND_PURGE;
    }




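    @Override
    public TriggerResult onElement(T element, long timestamp, TimeWindow window, TriggerContext ctx) throws Exception {
        ReducingState<Long> countState = ctx.getPartitionedState(countStateDescriptor);
        countState.add(1L);

        // Count threshold reached: fire early and reset the counter.
        if (countState.get() >= maxCount) {
            LOG.info("fire with count: {}", countState.get());
            return fireAndPurge(window, ctx);
        }

        // Otherwise make sure a timer exists for the end of the window (the timeout).
        // An element assigned to this window always has timestamp < window.getEnd(),
        // so the timeout is handled in the timer callbacks rather than checked here.
        // Registering the same timestamp more than once has no additional effect.
        if (timeType == TimeCharacteristic.EventTime) {
            ctx.registerEventTimeTimer(window.maxTimestamp());
        } else {
            ctx.registerProcessingTimeTimer(window.maxTimestamp());
        }
        return TriggerResult.CONTINUE;
    }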


    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) throws Exception {
        if (timeType != TimeCharacteristic.ProcessingTime) {
            return TriggerResult.CONTINUE;
        }

        // The end-of-window timer fires at window.maxTimestamp(), which is window.getEnd() - 1.
        if (time == window.maxTimestamp()) {
            LOG.info("fire with processing time: {}", time);
            return fireAndPurge(window, ctx);
        }
        return TriggerResult.CONTINUE;
    }


    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) throws Exception {
        if (timeType != TimeCharacteristic.EventTime) {
            return TriggerResult.CONTINUE;
        }

        // The end-of-window timer fires at window.maxTimestamp(), which is window.getEnd() - 1.
        if (time == window.maxTimestamp()) {
            LOG.info("fire with event time: {}", time);
            return fireAndPurge(window, ctx);
        }
        return TriggerResult.CONTINUE;
    }


    @Override
    public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
        ReducingState<Long> countState = ctx.getPartitionedState(countStateDescriptor);
        countState.clear();
    }


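    /**
     * Reduce function that adds up the per-element counts.
     * Declared static so that it does not hold a reference to the enclosing trigger.
     */
    private static class Sum implements ReduceFunction<Long> {

        @Override
        public Long reduce(Long value1, Long value2) throws Exception {
            return value1 + value2;
        }
    }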
}
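The timer callbacks above depend on where a time window ends. The following plain-Java sketch (no Flink dependency) works through the arithmetic for a 10-second tumbling window. It relies on one fact about Flink, that `TimeWindow.maxTimestamp()` returns `getEnd() - 1`. The class `WindowBounds` and its helper methods are hypothetical names, and the start formula is a simplification that assumes a non-negative timestamp and a zero window offset.

```java
/** Plain-Java sketch of tumbling-window bounds; helper names are illustrative only. */
final class WindowBounds {

    /** Start of the tumbling window containing the timestamp (assumes timestamp >= 0, offset 0). */
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - (timestampMillis % sizeMillis);
    }

    /** Exclusive end of the window. */
    static long windowEnd(long startMillis, long sizeMillis) {
        return startMillis + sizeMillis;
    }

    /** Largest timestamp that still belongs to the window: end - 1. */
    static long maxTimestamp(long endMillis) {
        return endMillis - 1;
    }

    public static void main(String[] args) {
        long size = 10_000L;      // 10-second window
        long timestamp = 12_345L; // an element arriving at 12.345 s

        long start = windowStart(timestamp, size);
        long end = windowEnd(start, size);
        long max = maxTimestamp(end);

        System.out.println("start=" + start);      // start=10000
        System.out.println("end=" + end);          // end=20000
        System.out.println("maxTimestamp=" + max); // maxTimestamp=19999
    }
}
```

Because the end-of-window timer is set for `maxTimestamp`, the `time` value seen in `onProcessingTime` and `onEventTime` is one millisecond less than `window.getEnd()`.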

Usage example (timeout of 10 seconds, count limit of 1000):

stream
        .timeWindowAll(Time.seconds(10))
        .trigger(
                new CountTriggerWithTimeout<>(1000, TimeCharacteristic.ProcessingTime)
        )
        .process(new XxxxWindowProcessFunction())
        .addSink(new XxxSinkFunction())
        .name("Xxx");

That is all it takes.

To sum the data in a window with a custom trigger, you first need to know how a custom trigger is written. You create one by extending the abstract `Trigger` class and implementing `onElement`, `onEventTime`, `onProcessingTime`, and `clear`; `canMerge` and `onMerge` are only needed if the trigger has to support merging windows. A trigger controls when a window is evaluated, which makes Flink's window operations more flexible[^1].

The following example sums window data using a custom trigger:

```java
import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

// Custom trigger
class CustomSumTrigger extends Trigger<Tuple2<String, Integer>, TimeWindow> {
    @Override
    public TriggerResult onElement(Tuple2<String, Integer> element, long timestamp, TimeWindow window, TriggerContext ctx) throws Exception {
        // When an element is added to the window, check whether to fire the window computation
        if (element.f1 > 100) {
            return TriggerResult.FIRE;
        }
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) throws Exception {
        // When a processing-time timer for the window fires, fire the window computation
        return TriggerResult.FIRE;
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) throws Exception {
        return TriggerResult.CONTINUE;
    }

    @Override
    public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
        // Clean up trigger state (this trigger keeps none)
    }
}

// Custom aggregate function that computes the sum
class SumAggregateFunction implements AggregateFunction<Tuple2<String, Integer>, Integer, Integer> {
    @Override
    public Integer createAccumulator() {
        return 0;
    }

    @Override
    public Integer add(Tuple2<String, Integer> value, Integer accumulator) {
        return accumulator + value.f1;
    }

    @Override
    public Integer getResult(Integer accumulator) {
        return accumulator;
    }

    @Override
    public Integer merge(Integer a, Integer b) {
        return a + b;
    }
}

// Custom window process function: receives the single pre-aggregated sum per window
class SumProcessWindowFunction extends ProcessWindowFunction<Integer, Tuple2<String, Integer>, String, TimeWindow> {
    @Override
    public void process(String key, Context context, Iterable<Integer> elements, Collector<Tuple2<String, Integer>> out) throws Exception {
        int sum = elements.iterator().next();
        out.collect(new Tuple2<>(key, sum));
    }
}

public class CustomTriggerSumExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);

        // Simulated input stream
        DataStream<Tuple2<String, Integer>> input = env.fromElements(
                new Tuple2<>("key1", 10),
                new Tuple2<>("key1", 20),
                new Tuple2<>("key1", 30),
                new Tuple2<>("key2", 40),
                new Tuple2<>("key2", 50)
        );

        // Apply the window with the custom trigger and the aggregate function
        DataStream<Tuple2<String, Integer>> result = input
                .keyBy(value -> value.f0)
                .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
                .trigger(new CustomSumTrigger())
                .aggregate(new SumAggregateFunction(), new SumProcessWindowFunction());

        result.print();

        env.execute("Custom Trigger Sum Example");
    }
}
```

In the code above, `CustomSumTrigger` is the custom trigger, `SumAggregateFunction` sums the data in the window, and `SumProcessWindowFunction` handles the result of the window computation.
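The accumulator lifecycle behind an aggregate function such as `SumAggregateFunction` can be illustrated without Flink. The sketch below is a simplified stand-in: `MiniAggregate` is a hypothetical interface that only mirrors the four method names of Flink's `AggregateFunction`, and `fold` is an illustrative helper showing the order in which the methods would be called for one window. It is not how Flink itself is implemented.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of the create / add / merge / getResult lifecycle; not Flink code. */
final class SumAccumulatorSketch {

    /** Hypothetical interface mirroring the method names of Flink's AggregateFunction. */
    interface MiniAggregate<IN, ACC, OUT> {
        ACC createAccumulator();
        ACC add(IN value, ACC accumulator);
        OUT getResult(ACC accumulator);
        ACC merge(ACC a, ACC b);
    }

    /** Sum over integers, analogous to summing the f1 field of each tuple. */
    static final MiniAggregate<Integer, Integer, Integer> SUM =
            new MiniAggregate<Integer, Integer, Integer>() {
                @Override public Integer createAccumulator() { return 0; }
                @Override public Integer add(Integer value, Integer acc) { return acc + value; }
                @Override public Integer getResult(Integer acc) { return acc; }
                @Override public Integer merge(Integer a, Integer b) { return a + b; }
            };

    /** One accumulator per window: create once, add each element, read the result on fire. */
    static <IN, ACC, OUT> OUT fold(MiniAggregate<IN, ACC, OUT> fn, List<IN> values) {
        ACC acc = fn.createAccumulator();
        for (IN v : values) {
            acc = fn.add(v, acc);
        }
        return fn.getResult(acc);
    }

    public static void main(String[] args) {
        int key1Sum = fold(SUM, Arrays.asList(10, 20, 30));
        int key2Sum = fold(SUM, Arrays.asList(40, 50));
        System.out.println("key1 -> " + key1Sum); // key1 -> 60
        System.out.println("key2 -> " + key2Sum); // key2 -> 90
    }
}
```

Because only the accumulator is kept per window, the process function receives a single pre-aggregated value rather than every element.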