GlobalWindow with a Custom Trigger (Part 1)

Preface

I noticed earlier that GlobalWindow requires you to define your own trigger, so I wrote a small test case as a simple implementation.

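Background

The previous post covered windows. Most of the time we use the windows the API already provides, such as sliding and tumbling windows. In some special scenarios, though, we have to define both the window and the way it fires ourselves.
For example: how do we output the value of a 1min window every 10s? Say, within the 10:00:00-10:01:00 window, output the window's running sum every 10s.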
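To make the requirement concrete, here is a minimal plain-Java sketch of the expected behaviour. It does not use Flink at all; the class name `PeriodicWindowSum`, the method `emit`, and the sample events are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the requirement; no Flink classes are involved.
final class PeriodicWindowSum {
    static final long WINDOW_MS = 60_000L; // 1min window
    static final long OUTPUT_MS = 10_000L; // emit every 10s

    /**
     * events: {timestampMs, value} pairs sorted by timestamp.
     * Returns the running sum at every 10s boundary up to endMs;
     * the sum restarts whenever a boundary is also a minute boundary.
     */
    static List<Long> emit(long[][] events, long endMs) {
        List<Long> out = new ArrayList<>();
        if (events.length == 0) {
            return out;
        }
        long first = events[0][0];
        // First boundary: round the first timestamp down to 10s, then add 10s.
        long nextFire = first - first % OUTPUT_MS + OUTPUT_MS;
        long sum = 0;
        int i = 0;
        while (nextFire <= endMs) {
            // Add every event that arrived strictly before this boundary.
            while (i < events.length && events[i][0] < nextFire) {
                sum += events[i][1];
                i++;
            }
            out.add(sum);
            if (nextFire % WINDOW_MS == 0) {
                sum = 0; // end of the 1min window: emit, then start over
            }
            nextFire += OUTPUT_MS;
        }
        return out;
    }

    public static void main(String[] args) {
        long[][] events = {
            {45_000, 7}, {49_000, 5}, {52_000, 3}, {58_000, 4}, {61_000, 2}, {72_000, 6}
        };
        System.out.println(emit(events, 80_000)); // prints [12, 19, 2, 8]
    }
}
```

The sum grows from 12 to 19 inside the first minute, is reset at the 60s boundary, and starts again at 2 in the next minute. The custom trigger later in this post aims for the same pattern.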

Trigger

Today's post focuses on the following three methods:

	/**
	 * Called for every element that gets added to a pane. The result of this will determine
	 * whether the pane is evaluated to emit results.
	 *
	 * @param element The element that arrived.
	 * @param timestamp The timestamp of the element that arrived.
	 * @param window The window to which the element is being added.
	 * @param ctx A context object that can be used to register timer callbacks.
	 */
	public abstract TriggerResult onElement(T element, long timestamp, W window, TriggerContext ctx) throws Exception;

	/**
	 * Called when a processing-time timer that was set using the trigger context fires.
	 *
	 * @param time The timestamp at which the timer fired.
	 * @param window The window for which the timer fired.
	 * @param ctx A context object that can be used to register timer callbacks.
	 */
	public abstract TriggerResult onProcessingTime(long time, W window, TriggerContext ctx) throws Exception;

	/**
	 * Called when an event-time timer that was set using the trigger context fires.
	 *
	 * @param time The timestamp at which the timer fired.
	 * @param window The window for which the timer fired.
	 * @param ctx A context object that can be used to register timer callbacks.
	 */
	public abstract TriggerResult onEventTime(long time, W window, TriggerContext ctx) throws Exception;

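onElement: called for every incoming record. The timestamp parameter should be the timestamp assigned through assignTimestampsAndWatermarks; if none was assigned, it cannot be used.
onProcessingTime: called when a registered processing-time timer fires; time is the time at which the timer fired.
onEventTime: same as above, but for event-time timers.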
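Each of these callbacks returns a TriggerResult, which decides whether the window emits a result and whether its contents are kept. The sketch below models the four outcomes in plain Java, as I understand them. `TriggerResultModel` and `Decision` are hypothetical stand-ins, not Flink classes, and the buffer is a simplification of real window state.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of what a trigger decision does to a window; not Flink code.
final class TriggerResultModel {
    // Hypothetical stand-in for the four outcomes a trigger callback can return.
    enum Decision {
        CONTINUE(false, false),      // do nothing
        FIRE(true, false),           // emit a result, keep the contents
        PURGE(false, true),          // drop the contents without emitting
        FIRE_AND_PURGE(true, true);  // emit a result, then drop the contents

        final boolean fire;
        final boolean purge;

        Decision(boolean fire, boolean purge) {
            this.fire = fire;
            this.purge = purge;
        }
    }

    private final List<Integer> buffer = new ArrayList<>();
    final List<Integer> emitted = new ArrayList<>();

    void add(int value) {
        buffer.add(value);
    }

    void apply(Decision decision) {
        if (decision.fire) {
            // Emit the sum of everything currently in the window.
            emitted.add(buffer.stream().mapToInt(Integer::intValue).sum());
        }
        if (decision.purge) {
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        TriggerResultModel window = new TriggerResultModel();
        window.add(7);
        window.add(5);
        window.apply(Decision.FIRE);           // emits 12, keeps 7 and 5
        window.add(3);
        window.apply(Decision.CONTINUE);       // emits nothing
        window.apply(Decision.FIRE_AND_PURGE); // emits 15, then clears
        window.add(4);
        window.apply(Decision.FIRE);           // emits 4
        System.out.println(window.emitted);    // prints [12, 15, 4]
    }
}
```

FIRE gives the running total, while FIRE_AND_PURGE closes one period and lets the next one start from zero.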

Test code:

package com.realtime.flink.trigger;

import com.realtime.flink.dto.OrderDto;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.common.state.ReducingState;
import org.apache.flink.api.common.state.ReducingStateDescriptor;
import org.apache.flink.api.common.typeutils.base.LongSerializer;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.Window;

// Emits the result every 10s within a 1min window
public class MyTrigger extends Trigger<OrderDto, Window> {
    private static final long serialVersionUID = 1L;
    // 1min window
    private final long windowStep = 60000L;
    // Fire every 10s
    private final long outputStep = 10000L;
    // Stores the next fire time; with the Sum reducer, adding outputStep advances it
    private final ReducingStateDescriptor<Long> stateDesc =
            new ReducingStateDescriptor<>("count", new Sum(), LongSerializer.INSTANCE);

    // Called for every incoming record
    @Override
    public TriggerResult onElement(OrderDto element, long timestamp, Window window, TriggerContext ctx) throws Exception {
        // No timestamps are assigned in this job, so take the time from the record itself
        timestamp = element.getOrderTime();
        ReducingState<Long> reducerState = ctx.getPartitionedState(stateDesc);
        if (reducerState.get() == null) {
            // First record: start is the timestamp rounded down to a multiple of outputStep
            long start = timestamp - timestamp % outputStep;
            // nextFire is the next fire time
            long nextFire = outputStep + start;
            reducerState.add(nextFire);
            System.out.println("SSSSSSSSSSSSSS" + timestamp + "-->" + nextFire);
            ctx.registerProcessingTimeTimer(nextFire);
        }
        return TriggerResult.CONTINUE;
    }

    // Called when a processing-time timer fires
    @Override
    public TriggerResult onProcessingTime(long time, Window window, TriggerContext ctx) throws Exception {
        ReducingState<Long> reducerState = ctx.getPartitionedState(stateDesc);
        Long fireTime = reducerState.get();
        if (fireTime != null && fireTime == time) {
            long nextFire = outputStep + time;
            // Move the stored fire time forward by outputStep and register the next timer
            reducerState.add(outputStep);
            ctx.registerProcessingTimeTimer(nextFire);
            System.out.println("KKKKKKKKKKKKKKKK" + nextFire + "-->" + time % windowStep);
            // On a minute boundary (e.g. 10:01:00) the window must fire and be cleared
            if (time % windowStep == 0) {
                // Emit the result and clear the window data
                return TriggerResult.FIRE_AND_PURGE;
            }
            // Otherwise emit the result and keep the window data
            return TriggerResult.FIRE;
        }
        // Return CONTINUE rather than null, so the caller always gets a valid result
        return TriggerResult.CONTINUE;
    }

    // Called when an event-time timer fires; unused here
    @Override
    public TriggerResult onEventTime(long time, Window window, TriggerContext ctx) throws Exception {
        return TriggerResult.CONTINUE;
    }

    // Called when the window is removed: delete the pending timer and the stored fire time
    @Override
    public void clear(Window window, TriggerContext ctx) throws Exception {
        ReducingState<Long> reducerState = ctx.getPartitionedState(stateDesc);
        Long fireTime = reducerState.get();
        if (fireTime != null) {
            ctx.deleteProcessingTimeTimer(fireTime);
            reducerState.clear();
        }
    }

    private static class Sum implements ReduceFunction<Long> {
        private static final long serialVersionUID = 1L;

        @Override
        public Long reduce(Long value1, Long value2) throws Exception {
            return value1 + value2;
        }
    }
}
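The fire-time arithmetic in onElement and onProcessingTime can be checked in isolation. The sketch below uses plain Java with hypothetical names (`FireTimes`, `nextFire`, `isWindowEnd`). It plugs in the first timestamp that appears in the output further down and assumes non-negative epoch milliseconds.

```java
// Stand-alone check of the alignment arithmetic used by the trigger.
final class FireTimes {
    // Round the timestamp down to a multiple of stepMs, then move one step ahead.
    static long nextFire(long timestampMs, long stepMs) {
        long start = timestampMs - timestampMs % stepMs;
        return start + stepMs;
    }

    // A fire time closes the window when it is a multiple of the window length.
    static boolean isWindowEnd(long fireTimeMs, long windowMs) {
        return fireTimeMs % windowMs == 0;
    }

    public static void main(String[] args) {
        long first = nextFire(1616949527985L, 10_000L);
        System.out.println(first);                                // 1616949530000
        System.out.println(isWindowEnd(first, 60_000L));          // false (remainder 50000)
        System.out.println(isWindowEnd(first + 10_000L, 60_000L)); // true
    }
}
```

The first timer lands on 1616949530000, which is 50s into the minute, so it only fires. The next one, 1616949540000, is a multiple of 60000, so it fires and purges.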

Test class:

package com.realtime.flink.test

import java.time.Duration

import com.realtime.flink.dto.OrderDto
import com.realtime.flink.source.OrderSource
import com.realtime.flink.trigger.MyTrigger
import org.apache.flink.api.common.eventtime.{SerializableTimestampAssigner, WatermarkStrategy}
import org.apache.flink.api.common.functions.AggregateFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.GlobalWindows

object GlobalWindowTest {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
//    env.getConfig.setAutoWatermarkInterval(1000)
    env.setParallelism(1)
    // Event-time setup, left commented out: MyTrigger only uses processing-time timers
//    val strategy = WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofSeconds(2))
//      .withTimestampAssigner(new SerializableTimestampAssigner[OrderDto] {
//        override def extractTimestamp(element: OrderDto, recordTimestamp: Long): Long = {
//          element.getOrderTime
//        }
//      }
//      )
    env.addSource(new OrderSource)
//      .assignTimestampsAndWatermarks(strategy)
      .windowAll(GlobalWindows.create())
      .trigger(new MyTrigger)
      // Accumulator and result are (unused key, sum of order prices)
      .aggregate(new AggregateFunction[OrderDto, (String, Double), (String, Double)] {
        override def createAccumulator(): (String, Double) = {
          ("", 0.0)
        }

        override def add(in: OrderDto, acc: (String, Double)): (String, Double) = {
          ("", in.getOrderPrice + acc._2)
        }

        override def getResult(acc: (String, Double)): (String, Double) = {
          acc
        }

        override def merge(acc: (String, Double), acc1: (String, Double)): (String, Double) = {
          ("", acc._2 + acc1._2)
        }
      }).map(x => {
        // Print only the sum
        println(x._2)
      })
    env.execute("tttt")
  }

}

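Output: we can see that 12 = 5 + 7 and 38 = 7 + 5 + 3 + 3 + 2 + 2 + 1 + 7 + 3 + 5 … In the following minute the data is cleared and the sum starts over, and a result is emitted every 10s, as expected.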

AAAAAAAAAAA数据:2021-03-29 00:38:47-->7
SSSSSSSSSSSSSS1616949527985-->1616949530000
AAAAAAAAAAA数据:2021-03-29 00:38:49-->5
AAAAAAAAAAA数据:2021-03-29 00:38:50-->3
KKKKKKKKKKKKKKKK1616949540000-->50000
12.0
AAAAAAAAAAA数据:2021-03-29 00:38:51-->3
AAAAAAAAAAA数据:2021-03-29 00:38:52-->2
AAAAAAAAAAA数据:2021-03-29 00:38:53-->0
AAAAAAAAAAA数据:2021-03-29 00:38:54-->2
AAAAAAAAAAA数据:2021-03-29 00:38:55-->1
AAAAAAAAAAA数据:2021-03-29 00:38:56-->7
AAAAAAAAAAA数据:2021-03-29 00:38:57-->3
AAAAAAAAAAA数据:2021-03-29 00:38:58-->5
AAAAAAAAAAA数据:2021-03-29 00:38:59-->0
KKKKKKKKKKKKKKKK1616949550000-->0
38.0
AAAAAAAAAAA数据:2021-03-29 00:39:00-->1
AAAAAAAAAAA数据:2021-03-29 00:39:01-->4
AAAAAAAAAAA数据:2021-03-29 00:39:02-->1
AAAAAAAAAAA数据:2021-03-29 00:39:03-->3
AAAAAAAAAAA数据:2021-03-29 00:39:04-->2
AAAAAAAAAAA数据:2021-03-29 00:39:05-->3
AAAAAAAAAAA数据:2021-03-29 00:39:06-->8
AAAAAAAAAAA数据:2021-03-29 00:39:07-->1
AAAAAAAAAAA数据:2021-03-29 00:39:08-->1
AAAAAAAAAAA数据:2021-03-29 00:39:09-->9
KKKKKKKKKKKKKKKK1616949560000-->10000
33.0
AAAAAAAAAAA数据:2021-03-29 00:39:10-->1
AAAAAAAAAAA数据:2021-03-29 00:39:11-->1
AAAAAAAAAAA数据:2021-03-29 00:39:12-->0
AAAAAAAAAAA数据:2021-03-29 00:39:13-->9
AAAAAAAAAAA数据:2021-03-29 00:39:14-->3
AAAAAAAAAAA数据:2021-03-29 00:39:15-->3
AAAAAAAAAAA数据:2021-03-29 00:39:16-->0
AAAAAAAAAAA数据:2021-03-29 00:39:17-->9
AAAAAAAAAAA数据:2021-03-29 00:39:18-->3
AAAAAAAAAAA数据:2021-03-29 00:39:19-->4
KKKKKKKKKKKKKKKK1616949570000-->20000
66.0
AAAAAAAAAAA数据:2021-03-29 00:39:20-->8
AAAAAAAAAAA数据:2021-03-29 00:39:21-->9
AAAAAAAAAAA数据:2021-03-29 00:39:22-->1
AAAAAAAAAAA数据:2021-03-29 00:39:23-->9
AAAAAAAAAAA数据:2021-03-29 00:39:24-->7
AAAAAAAAAAA数据:2021-03-29 00:39:25-->1
AAAAAAAAAAA数据:2021-03-29 00:39:26-->6
AAAAAAAAAAA数据:2021-03-29 00:39:27-->0
AAAAAAAAAAA数据:2021-03-29 00:39:28-->7
AAAAAAAAAAA数据:2021-03-29 00:39:29-->9
KKKKKKKKKKKKKKKK1616949580000-->30000
123.0
AAAAAAAAAAA数据:2021-03-29 00:39:30-->2
AAAAAAAAAAA数据:2021-03-29 00:39:31-->9
AAAAAAAAAAA数据:2021-03-29 00:39:32-->1
AAAAAAAAAAA数据:2021-03-29 00:39:33-->3
AAAAAAAAAAA数据:2021-03-29 00:39:34-->8
AAAAAAAAAAA数据:2021-03-29 00:39:35-->7
AAAAAAAAAAA数据:2021-03-29 00:39:36-->2
AAAAAAAAAAA数据:2021-03-29 00:39:37-->9
AAAAAAAAAAA数据:2021-03-29 00:39:38-->7
AAAAAAAAAAA数据:2021-03-29 00:39:39-->7
KKKKKKKKKKKKKKKK1616949590000-->40000
178.0
AAAAAAAAAAA数据:2021-03-29 00:39:40-->6
AAAAAAAAAAA数据:2021-03-29 00:39:41-->7
AAAAAAAAAAA数据:2021-03-29 00:39:42-->6
AAAAAAAAAAA数据:2021-03-29 00:39:43-->6
AAAAAAAAAAA数据:2021-03-29 00:39:44-->9
AAAAAAAAAAA数据:2021-03-29 00:39:45-->1
AAAAAAAAAAA数据:2021-03-29 00:39:46-->6
AAAAAAAAAAA数据:2021-03-29 00:39:47-->2
AAAAAAAAAAA数据:2021-03-29 00:39:48-->9
AAAAAAAAAAA数据:2021-03-29 00:39:49-->6
KKKKKKKKKKKKKKKK1616949600000-->50000
236.0
AAAAAAAAAAA数据:2021-03-29 00:39:50-->5
AAAAAAAAAAA数据:2021-03-29 00:39:51-->1
AAAAAAAAAAA数据:2021-03-29 00:39:52-->7
AAAAAAAAAAA数据:2021-03-29 00:39:53-->2
AAAAAAAAAAA数据:2021-03-29 00:39:54-->4
AAAAAAAAAAA数据:2021-03-29 00:39:55-->9
AAAAAAAAAAA数据:2021-03-29 00:39:56-->4
AAAAAAAAAAA数据:2021-03-29 00:39:57-->5
AAAAAAAAAAA数据:2021-03-29 00:39:58-->5
AAAAAAAAAAA数据:2021-03-29 00:39:59-->3
KKKKKKKKKKKKKKKK1616949610000-->0
281.0
AAAAAAAAAAA数据:2021-03-29 00:40:00-->0
AAAAAAAAAAA数据:2021-03-29 00:40:01-->3
AAAAAAAAAAA数据:2021-03-29 00:40:02-->6
AAAAAAAAAAA数据:2021-03-29 00:40:03-->4
AAAAAAAAAAA数据:2021-03-29 00:40:04-->1
AAAAAAAAAAA数据:2021-03-29 00:40:05-->3
AAAAAAAAAAA数据:2021-03-29 00:40:06-->2
AAAAAAAAAAA数据:2021-03-29 00:40:07-->8
AAAAAAAAAAA数据:2021-03-29 00:40:08-->1
AAAAAAAAAAA数据:2021-03-29 00:40:09-->9
KKKKKKKKKKKKKKKK1616949620000-->10000
37.0
AAAAAAAAAAA数据:2021-03-29 00:40:10-->6
AAAAAAAAAAA数据:2021-03-29 00:40:11-->8
AAAAAAAAAAA数据:2021-03-29 00:40:12-->4

Process finished with exit code -1
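The printed totals can be re-derived from the logged values. The sketch below is a plain-Java check with hypothetical names (`LogTotals`, `totals`). The values are copied from the first five intervals of the log above, grouped by the total they contribute to. The record stamped 00:38:50 is counted in the second interval, since 12 = 7 + 5 does not include it.

```java
import java.util.ArrayList;
import java.util.List;

// Recomputes the totals printed in the log from the logged order values.
final class LogTotals {
    static List<Integer> totals(int[][] groups, long firstFireMs, long stepMs, long windowMs) {
        List<Integer> out = new ArrayList<>();
        int sum = 0;
        long fire = firstFireMs;
        for (int[] group : groups) {
            for (int value : group) {
                sum += value;
            }
            out.add(sum);
            if (fire % windowMs == 0) {
                sum = 0; // minute boundary: the window was purged after firing
            }
            fire += stepMs;
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] groups = {
            {7, 5},
            {3, 3, 2, 0, 2, 1, 7, 3, 5, 0},
            {1, 4, 1, 3, 2, 3, 8, 1, 1, 9},
            {1, 1, 0, 9, 3, 3, 0, 9, 3, 4},
            {8, 9, 1, 9, 7, 1, 6, 0, 7, 9}
        };
        // First fire time taken from the SSSS line of the log.
        System.out.println(totals(groups, 1616949530000L, 10_000L, 60_000L));
        // prints [12, 38, 33, 66, 123]
    }
}
```

The reset between 38 and 33 falls on the timer at 1616949540000, the only minute boundary among these five fire times.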

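Summary

My work is mostly report statistics, so I rarely use custom triggers; I suspect they are more relevant to online business development. Later posts will cover the remaining Trigger methods and how to handle late data.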
