Problems Found When Using Flink ProcessFunction
Contents
Feature to implement
Code
Testing
Problems
Official documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.10/zh/dev/stream/operators/process_function.html
The ProcessFunction is a low-level stream processing operation, giving access to the basic building blocks of all (acyclic) streaming applications:
events (stream elements)
state (fault-tolerant, consistent, only on keyed stream)
timers (event time and processing time, only on keyed stream)
The ProcessFunction can be thought of as a FlatMapFunction with access to keyed state and timers. It handles events by being invoked for each event received in the input stream(s).
For fault-tolerant state, the ProcessFunction gives access to Flink’s keyed state, accessible via the RuntimeContext, similar to the way other stateful functions can access keyed state.
The timers allow applications to react to changes in processing time and in event time. Every call to the function processElement(…) gets a Context object which gives access to the element’s event time timestamp, and to the TimerService. The TimerService can be used to register callbacks for future event-/processing-time instants. With event-time timers, the onTimer(…) method is called when the current watermark is advanced up to or beyond the timestamp of the timer, while with processing-time timers, onTimer(…) is called when wall clock time reaches the specified time. During that call, all states are again scoped to the key with which the timer was created, allowing timers to manipulate keyed state.
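In short, ProcessFunction is a low-level stream processing operation that can access the basic building blocks of a streaming program:
1. Events (stream elements).
2. State (fault-tolerant, consistent, only on keyed streams).
3. Timers (event time and processing time, only on keyed streams).
Because state and timers are only available on keyed streams, we start with how to use KeyedProcessFunction.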
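The interplay of keyed state and timers described in the quoted documentation can be sketched roughly as follows. This is a minimal illustration, not the code used later in this post: the class name IdleNotifier, the state name "idle-timer", the (String, Double) input type and the timeoutMs parameter are all assumptions made for the example. It assumes Flink 1.10 with the Scala API and a stream keyed by a String field.

```scala
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.util.Collector

// Hypothetical example: emits a message once a key has received no events
// for `timeoutMs` milliseconds of processing time.
class IdleNotifier(val timeoutMs: Long)
    extends KeyedProcessFunction[String, (String, Double), String] {

  // Keyed state: timestamp of the timer currently registered for this key.
  @transient private var timerState: ValueState[java.lang.Long] = _

  override def open(parameters: Configuration): Unit = {
    // Keyed state is obtained through the RuntimeContext.
    timerState = getRuntimeContext.getState(
      new ValueStateDescriptor[java.lang.Long]("idle-timer", classOf[java.lang.Long]))
  }

  override def processElement(
      value: (String, Double),
      ctx: KeyedProcessFunction[String, (String, Double), String]#Context,
      out: Collector[String]): Unit = {
    val timers = ctx.timerService()
    // Cancel the previous timer for this key, if one is registered.
    val previous = timerState.value()
    if (previous != null) timers.deleteProcessingTimeTimer(previous)
    // Register a new timer `timeoutMs` after the current processing time.
    val next = timers.currentProcessingTime() + timeoutMs
    timers.registerProcessingTimeTimer(next)
    timerState.update(next)
  }

  override def onTimer(
      timestamp: Long,
      ctx: KeyedProcessFunction[String, (String, Double), String]#OnTimerContext,
      out: Collector[String]): Unit = {
    // State is scoped to the key that registered the timer.
    out.collect(s"No events for key ${ctx.getCurrentKey} in the last $timeoutMs ms")
    timerState.clear()
  }
}
```

Under these assumptions it would be attached with something like stream.keyBy(_._1).process(new IdleNotifier(5000L)). Keying by a function rather than by field position keeps the key type a plain String; keying by position yields a Tuple key instead.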
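Feature to implement
Read data from port 9999 via socketTextStream and compute the total sales amount per product category over a period of time. If the sales amount for a category stays at 0, a timer fires to notify the boss that the staff selling that category may be slacking off. (This is only a functional demo; adapt it to your own business needs, for example counting UV.)
Code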
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.api.java.tuple.Tuple
import org.apache.flink.api.scala.typeutils.Types
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector
object ProcessFuncationScala {
def main(args: Array[String]): Unit =