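I. Flink Windows
1. What is a window
Flink is mainly used to process unbounded data streams. One way to do this is to cut the unbounded stream into finite "blocks" of data and process each block on its own. Such a block is a window.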
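The idea of cutting a stream into finite blocks can be sketched in plain Java, without any Flink classes. This is only an illustration under simple assumptions: timestamps are non-negative milliseconds, windows have a fixed size, and the names WindowChunkSketch, windowStart and chunk are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; no Flink classes are involved.
class WindowChunkSketch {

    // Start of the fixed-size window that contains the timestamp (non-negative milliseconds).
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Groups event timestamps into blocks keyed by window start, in ascending order of start.
    static Map<Long, List<Long>> chunk(long[] timestampsMs, long sizeMs) {
        Map<Long, List<Long>> windows = new TreeMap<>();
        for (long ts : timestampsMs) {
            windows.computeIfAbsent(windowStart(ts, sizeMs), k -> new ArrayList<>()).add(ts);
        }
        return windows;
    }

    public static void main(String[] args) {
        long[] events = {500, 1200, 2900, 3000, 4100, 7500};
        long size = 3000;
        // With 3-second windows the events fall into [0, 3000), [3000, 6000) and [6000, 9000).
        chunk(events, size).forEach((start, block) ->
                System.out.println("[" + start + ", " + (start + size) + ") -> " + block));
    }
}
```

Each block is finite, so it can be sorted, summed or ranked even though the stream itself never ends.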
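2. Window types
Tumbling windows, sliding windows, and session windows.
Tumbling window: timeWindow(Time.seconds(3)); the count-based variant is the count tumbling window.
Sliding window: timeWindow(Time.seconds(10), Time.seconds(5)), a 10-second window that advances every 5 seconds; the count-based variant is the count sliding window.
Note: timeWindow has been deprecated since Flink 1.12; newer code passes an explicit assigner to window(...), as the example in section 4 does.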
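How the three window types differ in assigning an element can be sketched in plain Java. This is a simplified model, not Flink's implementation: timestamps are non-negative milliseconds, window offsets are ignored, and a new session starts when two consecutive events are more than the gap apart. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; no Flink classes are involved.
class WindowTypesSketch {

    // Tumbling: every element belongs to exactly one window [start, start + size).
    static long tumblingStart(long ts, long size) {
        return ts - (ts % size);
    }

    // Sliding: start times of all windows [start, start + size) that contain ts.
    // When size is a multiple of slide, an element belongs to size / slide windows.
    static List<Long> slidingStarts(long ts, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        long lastStart = ts - (ts % slide);
        for (long start = lastStart; start > ts - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }

    // Session: consecutive events stay in one session until the pause between them exceeds gap.
    static List<List<Long>> sessions(long[] timestamps, long gap) {
        long[] sorted = timestamps.clone();
        Arrays.sort(sorted);
        List<List<Long>> result = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (long ts : sorted) {
            if (!current.isEmpty() && ts - current.get(current.size() - 1) > gap) {
                result.add(current);
                current = new ArrayList<>();
            }
            current.add(ts);
        }
        if (!current.isEmpty()) {
            result.add(current);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("tumbling start: " + tumblingStart(7000, 3000));
        System.out.println("sliding starts: " + slidingStarts(12000, 10000, 5000));
        System.out.println("sessions: " + sessions(new long[]{1000, 2000, 9000, 9500}, 3000));
    }
}
```

A tumbling window is the special case of a sliding window whose slide equals its size, which is why the two-argument timeWindow call above produces overlapping windows and the one-argument call does not.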
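3. What problem windows solve (why use windows)
Flink is a computing framework that unifies stream and batch processing. An aggregation such as a sum or a Top N over an unbounded stream never finishes, so windows are introduced to bound the computation: each window is a finite data set that can be processed in a batch-like way.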
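A minimal sketch of this batch-like processing, in plain Java rather than Flink: orders are summed per region inside each fixed-size window, so every window yields a finite result. The Order class, its fields and the method sumPerWindow are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; no Flink classes are involved.
class WindowedSumSketch {

    // Hypothetical order event: region, event time in milliseconds, order amount.
    static final class Order {
        final String region;
        final long timestampMs;
        final double amount;

        Order(String region, long timestampMs, double amount) {
            this.region = region;
            this.timestampMs = timestampMs;
            this.amount = amount;
        }
    }

    // Returns windowStart -> (region -> total amount inside that window).
    static Map<Long, Map<String, Double>> sumPerWindow(List<Order> orders, long sizeMs) {
        Map<Long, Map<String, Double>> result = new TreeMap<>();
        for (Order o : orders) {
            long start = o.timestampMs - (o.timestampMs % sizeMs);
            result.computeIfAbsent(start, k -> new TreeMap<>())
                  .merge(o.region, o.amount, Double::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("east", 500, 10.0),
                new Order("east", 2500, 5.5),
                new Order("west", 2900, 4.0),
                new Order("east", 3100, 1.0));
        // One finite result per 3-second window.
        sumPerWindow(orders, 3000).forEach((start, totals) ->
                System.out.println("window starting at " + start + " -> " + totals));
    }
}
```

In a real streaming job the per-window result is emitted when the window closes instead of after all input has been read, but the per-window computation is the same kind of bounded aggregation.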
4. Tumbling window (Top 3 companies by total spending in each region)
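import com.alibaba.fastjson.JSON; // assumption: fastjson; the original does not show its imports
import com.alibaba.fastjson.JSONObject;
import org.apache.flink.api.common.eventtime.SerializableTimestampAssigner;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.util.Collector;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.Duration;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// MyKafkaUtil and SinkPG are project-specific classes that are not shown here.
public class CityShopNameTopN {
    public static void main(String[] args) throws Exception {
        // Create the Flink streaming execution environment
        StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
        // Set the parallelism
        environment.setParallelism(1);
        // Random suffix so that every run reads with a fresh consumer group
        String uu = UUID.randomUUID().toString().substring(0, 6).replace("-", "");
        String groupId = "ware_goods_group" + uu;
        FlinkKafkaConsumer<String> kafkaSource = MyKafkaUtil.getKafkaSource("dwd_foo_order_detail", groupId);
        DataStreamSource<String> order_detail = environment.addSource(kafkaSource);
        SingleOutputStreamOperator<JSONObject> map1 = order_detail.map(d -> JSON.parseObject(d));
        // Watermarks: tolerate events that arrive up to 3 seconds out of order
        SingleOutputStreamOperator<JSONObject> watermarks = map1.assignTimestampsAndWatermarks(
                WatermarkStrategy.<JSONObject>forBoundedOutOfOrderness(Duration.ofSeconds(3))
                        .withTimestampAssigner(new SerializableTimestampAssigner<JSONObject>() {
                            @Override
                            public long extractTimestamp(JSONObject element, long recordTimestamp) {
                                // Falls back to 0 when createTime cannot be parsed
                                long time = 0;
                                try {
                                    time = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").parse(element.getString("createTime")).getTime();
                                } catch (ParseException e) {
                                    e.printStackTrace();
                                }
                                return time;
                            }
                        }));
        // Map each order line to (regionName, cityName, goodsNum * goodsPrice)
        SingleOutputStreamOperator<Tuple3<String, String, Double>> map = watermarks.map(new MapFunction<JSONObject, Tuple3<String, String, Double>>() {
            @Override
            public Tuple3<String, String, Double> map(JSONObject value) throws Exception {
                String goodsNum = value.getString("goodsNum");
                String goodsPrice = value.getString("goodsPrice");
                return new Tuple3<>(value.getString("regionName"), value.getString("cityName"), Integer.parseInt(goodsNum) * Double.parseDouble(goodsPrice));
            }
        });
        // Running total per (region, city): sum(2) emits an updated total for every incoming record.
        // The stream is then re-keyed by region only, so that each window ranks the entries of one region against each other.
        SingleOutputStreamOperator<Tuple3<String, String, Double>> process = map.keyBy(data -> data.f0 + "," + data.f1).sum(2).keyBy(data -> data.f0)
                // Processing-time tumbling window of 1 second. The watermarks assigned above only
                // drive event-time assigners such as TumblingEventTimeWindows.
                .window(TumblingProcessingTimeWindows.of(Time.seconds(1))).process(new ProcessWindowFunction<Tuple3<String, String, Double>, Tuple3<String, String, Double>, String, TimeWindow>() {
                    @Override
                    public void process(String region, Context context, Iterable<Tuple3<String, String, Double>> iterable, Collector<Tuple3<String, String, Double>> collector) throws Exception {
                        // Keep only the most recent running total of each cityName seen in this window
                        Map<String, Tuple3<String, String, Double>> latest = new HashMap<>();
                        for (Tuple3<String, String, Double> value : iterable) {
                            latest.put(value.f1, value);
                        }
                        ArrayList<Tuple3<String, String, Double>> list = new ArrayList<>(latest.values());
                        // Sort by total, descending. Double.compare stays correct for differences
                        // smaller than 1, which a cast of the difference to int would truncate to 0.
                        list.sort(new Comparator<Tuple3<String, String, Double>>() {
                            @Override
                            public int compare(Tuple3<String, String, Double> o1, Tuple3<String, String, Double> o2) {
                                return Double.compare(o2.f2, o1.f2);
                            }
                        });
                        // Emit at most the top 3 entries
                        for (int i = 0; i < list.size() && i < 3; i++) {
                            collector.collect(list.get(i));
                        }
                    }
                });
        process.print();
        // Persist the results
        process.addSink(new SinkPG());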