Flink Development: Global Windows (GlobalWindows)

知白zz

Category: Flink. Tags: real-time big data, flink, java

Article link: https://blog.csdn.net/qq884121435/article/details/119669974

1.Non-Keyed Count Windows
2.Keyed Count Windows

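A global window has no end boundary, and its default Trigger is NeverTrigger, which never fires. Unless you specify a trigger for a global window yourself, the window never triggers a computation.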

1.Non-Keyed Count Windows

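A count window divides the stream by the number of records received, independent of time. A non-keyed stream has only a single global window. Count windows are built on GlobalWindows with a CountTrigger attached: countWindowAll(n) is shorthand for windowAll(GlobalWindows.create()).trigger(PurgingTrigger.of(CountTrigger.of(n))), so the window contents are purged after every firing. Until the specified number of records has arrived, the window is not triggered.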
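To make the firing rule concrete, here is a minimal plain-Java sketch of the count-window behaviour described above. It does not use the Flink API; the class and method names (CountWindowSketch, windowSums) are hypothetical and only simulate a count trigger that purges the window after each firing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation (no Flink dependency) of a count window over one global window.
final class CountWindowSketch {

    // Emits one sum per full group of `size` elements; leftover elements never fire.
    static List<Integer> windowSums(List<Integer> input, int size) {
        List<Integer> fired = new ArrayList<>();
        int count = 0; // elements seen since the last firing (what a count trigger tracks)
        int sum = 0;   // running aggregate of the current window contents
        for (int value : input) {
            sum += value;
            count++;
            if (count == size) { // the trigger fires
                fired.add(sum);
                count = 0;       // purge: the next window starts empty
                sum = 0;
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        System.out.println(windowSums(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), 5)); // [15, 40]
        System.out.println(windowSums(Arrays.asList(1, 2, 3, 4), 5));                    // []
    }
}
```

The second call shows the point made above: with only four records and a window size of five, nothing is ever emitted.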

1.1 aggregates: incremental aggregation

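    import org.apache.flink.streaming.api.datastream.AllWindowedStream;
    import org.apache.flink.streaming.api.datastream.DataStreamSource;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;

    // Class name is illustrative; the original snippet shows only main().
    public class CountWindowAllSum {
        public static void main(String[] args) throws Exception {
            // countWindowAll is a count window built on a GlobalWindow
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStreamSource<String> socketStream = env.socketTextStream("localhost", 8888);
            SingleOutputStreamOperator<Integer> mapStream = socketStream.map(Integer::parseInt);
            // Non-keyed countWindowAll: the window operator runs with parallelism 1
            AllWindowedStream<Integer, GlobalWindow> windowAll = mapStream.countWindowAll(5);
            // Sum the elements of each window (position 0 refers to the Integer value itself)
            SingleOutputStreamOperator<Integer> sum = windowAll.sum(0);
            sum.print();
            env.execute("");
        }
    }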

Input:

C:\Users\zhibai>nc -lp 8888
1
2
3
4
5
6
7
8
9
10

Output (the number before ">" is the index of the print sink subtask that emitted the line):

4> 15
5> 40

1.2 reduce: incremental aggregation

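    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.streaming.api.datastream.AllWindowedStream;
    import org.apache.flink.streaming.api.datastream.DataStreamSource;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;

    // Class name is illustrative; the original snippet shows only main().
    public class CountWindowAllReduce {
        public static void main(String[] args) throws Exception {
            // In local mode the default parallelism is the number of logical cores of the machine
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStreamSource<String> socketStream = env.socketTextStream("localhost", 8888);
            SingleOutputStreamOperator<Integer> mapStream = socketStream.map(Integer::parseInt);
            // Non-keyed countWindowAll: the window operator runs with parallelism 1
            AllWindowedStream<Integer, GlobalWindow> windowAll = mapStream.countWindowAll(5);
            // Incremental aggregation: the result is updated as each element arrives and only
            // the intermediate value is kept in state, which is more efficient and saves resources.
            SingleOutputStreamOperator<Integer> reduce = windowAll.reduce(new ReduceFunction<Integer>() {
                @Override
                public Integer reduce(Integer t1, Integer t2) throws Exception {
                    return t1 + t2;
                }
            });
            reduce.print();
            env.execute("");
        }
    }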

Input:

C:\Users\zhibai>nc -lp 8888
1
2
3
4
5
6
7
8
9
10

Output:

1> 15
2> 40

1.3 apply: full-window aggregation

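At run time the elements of a window are first buffered in window state. Once the trigger condition is met, the buffered elements are read back from state and the computation runs over all of them.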

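    import org.apache.flink.streaming.api.datastream.AllWindowedStream;
    import org.apache.flink.streaming.api.datastream.DataStreamSource;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.windowing.AllWindowFunction;
    import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;
    import org.apache.flink.util.Collector;

    // Class name is illustrative; the original snippet shows only main().
    public class CountWindowAllApply {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStreamSource<String> socketStream = env.socketTextStream("localhost", 8888);
            SingleOutputStreamOperator<Integer> mapStream = socketStream.map(Integer::parseInt);
            // Non-keyed countWindowAll: the window operator runs with parallelism 1
            AllWindowedStream<Integer, GlobalWindow> windowAll = mapStream.countWindowAll(5);
            SingleOutputStreamOperator<Integer> apply = windowAll.apply(new AllWindowFunction<Integer, Integer, GlobalWindow>() {
                @Override
                public void apply(GlobalWindow window, Iterable<Integer> values, Collector<Integer> out) throws Exception {
                    // Called once per firing with all buffered elements of the window
                    int sum = 0;
                    for (Integer value : values) {
                        sum += value;
                    }
                    out.collect(sum);
                }
            });
            apply.print();
            env.execute("");
        }
    }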

Input:

C:\Users\zhibai>nc -lp 8888
1
2
3
4
5
6
7
8
9
10

Output:

4> 15
5> 40
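The difference between the incremental style of section 1.2 and the full-window style of this section can be sketched in plain Java. This is only an illustration of what each style keeps in state, not Flink code; the class names (AggregationStyleSketch, IncrementalWindow, BufferingWindow) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java illustration (no Flink dependency); all names here are hypothetical.
final class AggregationStyleSketch {

    // Incremental style, as with reduce or sum: each arriving element is folded
    // into a single running value, so state does not grow with the window size.
    static final class IncrementalWindow {
        private int acc;

        void add(int value) { acc += value; }

        int fire() {
            int result = acc;
            acc = 0; // purge after firing
            return result;
        }
    }

    // Full-window style, as with apply: every element is buffered until the
    // window fires, which is what makes operations such as sorting possible.
    static final class BufferingWindow {
        private final List<Integer> buffer = new ArrayList<>();

        void add(int value) { buffer.add(value); }

        int bufferedCount() { return buffer.size(); }

        List<Integer> fireSorted() {
            List<Integer> sorted = new ArrayList<>(buffer);
            Collections.sort(sorted);
            buffer.clear(); // purge after firing
            return sorted;
        }
    }

    public static void main(String[] args) {
        IncrementalWindow incremental = new IncrementalWindow();
        BufferingWindow buffering = new BufferingWindow();
        for (int value : Arrays.asList(5, 4, 3, 2, 1)) {
            incremental.add(value);
            buffering.add(value);
        }
        System.out.println(buffering.bufferedCount()); // 5 elements held in state
        System.out.println(incremental.fire());        // 15
        System.out.println(buffering.fireSorted());    // [1, 2, 3, 4, 5]
    }
}
```

The trade-off is the usual one: the buffering variant can answer questions that need every element, at the cost of state that grows with the window size.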

Because apply processes the full contents of the window, we can also use it for operations such as sorting the elements within a window.

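    import java.util.ArrayList;
    import java.util.Comparator;
    import org.apache.flink.streaming.api.datastream.AllWindowedStream;
    import org.apache.flink.streaming.api.datastream.DataStreamSource;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.windowing.AllWindowFunction;
    import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;
    import org.apache.flink.util.Collector;

    // Class name is illustrative; the original snippet shows only main().
    public class CountWindowAllSort {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStreamSource<String> socketStream = env.socketTextStream("localhost", 8888);
            SingleOutputStreamOperator<Integer> mapStream = socketStream.map(Integer::parseInt);
            // Non-keyed countWindowAll: the window operator runs with parallelism 1
            AllWindowedStream<Integer, GlobalWindow> windowAll = mapStream.countWindowAll(5);
            SingleOutputStreamOperator<Integer> apply = windowAll.apply(new AllWindowFunction<Integer, Integer, GlobalWindow>() {
                @Override
                public void apply(GlobalWindow window, Iterable<Integer> values, Collector<Integer> out) throws Exception {
                    // Copy the buffered elements so they can be sorted
                    ArrayList<Integer> lst = new ArrayList<>();
                    for (Integer value : values) {
                        lst.add(value);
                    }
                    // Ascending order; naturalOrder() avoids the overflow risk of "o1 - o2"
                    lst.sort(Comparator.naturalOrder());
                    for (Integer i : lst) {
                        out.collect(i);
                    }
                }
            });
            // Print with parallelism 1 so the sorted order is preserved in the output
            apply.print().setParallelism(1);
            env.execute("");
        }
    }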

Input:

C:\Users\zhibai>nc -lp 8888
5
4
3
2
1

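Output (not preserved in the source; given the input above, the expected result is the values 1 through 5 in ascending order, one per line):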

2.Keyed Count Windows

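On a keyed stream, the global window assigner puts all records with the same key into a window of their own. Each key has its own global window, the windows are independent of one another, and each key's window fires on its own without waiting for the other keys to reach the count.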

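    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStreamSource;
    import org.apache.flink.streaming.api.datastream.KeyedStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    // Class name is illustrative; the original snippet shows only main().
    public class KeyedCountWindowReduce {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStreamSource<String> socketStream = env.socketTextStream("localhost", 8888);
            // Parse each line of the form "<word> <number>" into a (word, number) tuple
            SingleOutputStreamOperator<Tuple2<String, Integer>> wordAndOne = socketStream.map(new MapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> map(String s) throws Exception {
                    String[] fields = s.split(" ");
                    return Tuple2.of(fields[0], Integer.parseInt(fields[1]));
                }
            });
            // Key the stream by the word
            KeyedStream<Tuple2<String, Integer>, String> keyedStream = wordAndOne.keyBy(new KeySelector<Tuple2<String, Integer>, String>() {
                @Override
                public String getKey(Tuple2<String, Integer> s) throws Exception {
                    return s.f0;
                }
            });
            // Each key's window fires after 3 records for that key
            SingleOutputStreamOperator<Tuple2<String, Integer>> reduce = keyedStream.countWindow(3).reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
                    // Return a new tuple instead of mutating the input
                    return Tuple2.of(value1.f0, value1.f1 + value2.f1);
                }
            });
            reduce.print();
            env.execute("");
        }
    }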

Input:

C:\Users\zhibai>nc -lp 8888
hadoop 2
flink 1
spark 3
hadoop 3
hadoop 1
flink 2
spark 4
flink 4
spark 7

Output:

8> (hadoop,6)
7> (flink,7)
1> (spark,14)
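The per-key firing seen in this output can be reproduced with a small plain-Java sketch. It is a simulation under the assumption that each key keeps its own counter and running sum and that both are cleared when the key's window fires; it is not Flink code, and the names (KeyedCountWindowSketch, run) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation (no Flink dependency) of count windows on a keyed stream.
final class KeyedCountWindowSketch {

    // Each line has the form "<key> <number>"; a key fires once it has seen `size` records.
    static List<String> run(List<String> lines, int size) {
        Map<String, Integer> counts = new HashMap<>(); // per-key record counter
        Map<String, Integer> sums = new HashMap<>();   // per-key running sum
        List<String> fired = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(" ");
            String key = fields[0];
            int value = Integer.parseInt(fields[1]);
            int sum = sums.merge(key, value, Integer::sum);
            int count = counts.merge(key, 1, Integer::sum);
            if (count == size) {   // only this key's window fires
                fired.add("(" + key + "," + sum + ")");
                counts.remove(key); // purge this key's window
                sums.remove(key);
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "hadoop 2", "flink 1", "spark 3", "hadoop 3", "hadoop 1",
                "flink 2", "spark 4", "flink 4", "spark 7");
        System.out.println(run(input, 3)); // [(hadoop,6), (flink,7), (spark,14)]
    }
}
```

Note that hadoop fires as soon as its third record arrives, while flink and spark are still at one record each, which matches the order of the output lines above.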
