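Preface
In the previous article, "Writing a Simple Flink Demo Aggregation Example: Stream Processing (Part 1)", we implemented a word-count demo with stream processing. Flink also provides batch processing, so in this article we implement the same word count as a batch job. Let's look at the code.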
Code
The pom file is omitted here; see the previous article.
- Source data (txt)
aaa bbb ccc
bbb ddd ee
ccc fff fff gggg
hhh a h gg
As shown in the figure.
- Java implementation
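Before running the batch job, it can help to know what result to expect from the sample file. The following is a minimal plain-Java sketch (no Flink involved) that counts the words in the four sample lines above. The class name `ExpectedWordCount`, the constant `SAMPLE_LINES`, and the method `count` are hypothetical names introduced only for this illustration, and splitting on whitespace is an assumption about how the job tokenizes each line.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the Flink job: computes the expected
// word counts for the sample data so the batch output can be checked.
class ExpectedWordCount {

    // The four lines of the sample txt file shown above.
    static final List<String> SAMPLE_LINES = Arrays.asList(
            "aaa bbb ccc",
            "bbb ddd ee",
            "ccc fff fff gggg",
            "hhh a h gg");

    // Splits each line on whitespace (an assumption) and counts every word.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted for stable printing
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints one "word count" pair per line.
        count(SAMPLE_LINES).forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```

With this sample data, `bbb`, `ccc`, and `fff` each occur twice and every other word occurs once, so a correct batch job should report the same counts (possibly in a different order).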
package com.hy.flinktest;
import org.apache.commons.lang3.StringUtils;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.AggregateOperator;
import org.apache.flink.api.java.operators.DataSource;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.util.Collector;
/**
* ClassName: BatchWordCountJava
* Descript