KeyBy
**DataStream → KeyedStream **
逻辑地将一个流拆分成不相交的分区,每个分区包含具有相同key的元素,在内部以hash的形式实现的。
根据Key,累计统计
package quick;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
public class KeyByExample {
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<Tuple2<Integer, Integer>> dataStream = env
.fromElements(Tuple2.of(1,1),Tuple2.of(1,2),Tuple2.of(2,2),Tuple2.of(2,2))
.keyBy(value -> value.f0)
.sum(1);
dataStream.print();
env.execute("KeyByExample job");
}
}
输出
(1,1)
(1,3)
(2,2)
(2,4)
修改pom
<version>1.0.1</version>
<mainClass>quick.KeyByExample </mainClass>
然后,将打包应用程序提交,Flink 的Web UI来提交作业监控集群的状态和正在运行的作业。