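Code example:
This adapts the usage from the previous post (using Flink's CountWindow operator without incremental aggregation); the result is the same.
The only difference is that aggregation is incremental: each element is folded into the result as it arrives, rather than computing everything only once the trigger condition is met. This is more efficient and uses fewer resources.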
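To make the difference concrete, here is a minimal plain-Java sketch that does not use Flink at all. The class name `IncrementalVsBuffered` and both helper methods are hypothetical and exist only for illustration. The buffered version holds every element until the count is reached, while the incremental version keeps a single running value per window.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Plain-Java illustration only; this is NOT Flink code (hypothetical names).
class IncrementalVsBuffered {

    // Buffered style: store every element, compute only when the count is reached.
    static List<Integer> buffered(List<Integer> input, int size) {
        List<Integer> results = new ArrayList<>();
        List<Integer> buffer = new ArrayList<>();
        for (Integer v : input) {
            buffer.add(v);
            if (buffer.size() == size) {
                int sum = 0;
                for (int b : buffer) {
                    sum += b;
                }
                results.add(sum);
                buffer.clear(); // start the next window
            }
        }
        return results;
    }

    // Incremental style: fold each element into one running value on arrival.
    static List<Integer> incremental(List<Integer> input, int size,
                                     BinaryOperator<Integer> reducer) {
        List<Integer> results = new ArrayList<>();
        Integer acc = null; // null means the current window is still empty
        int count = 0;
        for (Integer v : input) {
            acc = (acc == null) ? v : reducer.apply(acc, v);
            count++;
            if (count == size) {
                results.add(acc); // the result is already computed at trigger time
                acc = null;
                count = 0;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);
        System.out.println(buffered(input, 5));                   // [15, 40]
        System.out.println(incremental(input, 5, Integer::sum));  // [15, 40]
    }
}
```

Both versions emit the same sums; the trailing 11 stays in an unfinished window. The incremental version only ever stores one value per window, which is the saving the Flink example that follows relies on.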
package cn._51doit.flink.day04;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AllWindowedStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;
/**
* Global window
* When window or windowAll is called, the argument passed in is the Window Assigner, which decides what type of windows the stream is divided into
*/
public class CountWindowAllReduceDemo {
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());
//1
//2
//3
DataStreamSource<String> lines = env.socketTextStream("Master", 8888);
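//Convert each string to a number (method reference, lambda-style)
//Running locally, the parallelism here is 4 (the local default is the number of CPU cores), so the DataStream returned by map has parallelism 4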
SingleOutputStreamOperator<Integer> nums = lines.map(Integer::parseInt);
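//Divide the stream into windows
//What is the parallelism of a GlobalWindow here? Parallelism is 1: there is a single partition (only one subTask handles this window)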
AllWindowedStream<Integer, GlobalWindow> windowed = nums.countWindowAll(5);
//Aggregate the data in the window
// SingleOutputStreamOperator<Integer> sum = windowed.sum(0);
SingleOutputStreamOperator<Integer> reduced = windowed.reduce(new ReduceFunction<Integer>() {
@Override
public Integer reduce(Integer value1, Integer value2) throws Exception {
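return value1 + value2; //Incremental aggregation: each element is folded in on arrival instead of computing only once the trigger condition is met; more efficient and uses fewer resources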
}
});
reduced.print();
env.execute();
}
}
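Debugging
- First, set a breakpoint on the line return value1 + value2 in the CountWindowAllReduceDemo class
- Then, type a number into the nc -lk 8888 terminal; execution stops at the breakpoint in the debugger
Source code analysis
(1) Press Ctrl+N and search for "HeapReducingState"; the relevant code is at line 126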
@Override
public V apply(V previousState, V value) throws Exception {
return previousState != null ? reduceFunction.reduce(previousState, value) : value;
}
}
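The snippet above can be mimicked in plain Java to see how the state evolves. This is a rough sketch, not Flink's implementation: `ReduceStateSketch`, its fields, and the `reduceCalls` counter are hypothetical names introduced for illustration. It assumes, as the snippet suggests, that a null previous state means the window has not seen an element yet.

```java
import java.util.function.BinaryOperator;

// Hypothetical stand-in for the state update shown above; NOT Flink's class.
class ReduceStateSketch<V> {
    private final BinaryOperator<V> reduceFunction;
    private V state;          // null means "no element stored yet"
    private int reduceCalls;  // how often the user function actually ran

    ReduceStateSketch(BinaryOperator<V> reduceFunction) {
        this.reduceFunction = reduceFunction;
    }

    // Same decision as apply(): first value is stored as-is, later values are reduced.
    void add(V value) {
        if (state != null) {
            reduceCalls++;
            state = reduceFunction.apply(state, value);
        } else {
            state = value;
        }
    }

    V get() {
        return state;
    }

    int reduceCalls() {
        return reduceCalls;
    }

    // Reset for the next window.
    void clear() {
        state = null;
    }

    public static void main(String[] args) {
        ReduceStateSketch<Integer> sketch = new ReduceStateSketch<>(Integer::sum);
        for (int i = 1; i <= 5; i++) {
            sketch.add(i);
        }
        System.out.println(sketch.get());          // 15
        System.out.println(sketch.reduceCalls());  // 4
    }
}
```

Judging by this logic, five elements lead to only four calls of the reduce function, because the first element of a window is stored directly. A breakpoint inside reduce would therefore be expected to fire from the second element of a window onward.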