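In the previous post, "WordCount implemented with Flink", we counted words using Flink's built-in Tuple types. Note that Flink's programming model is not tied to a key/value (K,V) format, so the word count can also be written without K,V-style tuples, for example by carrying each record in a plain object. Building on the environment set up in the previous post, this post implements the same word count with a wrapper object.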
1. Create WordDto
public class WordDto {
    // The word itself; also used as the grouping key in the job
    private String word;
    // Number of occurrences carried by this record
    private Integer count;

    // Public no-argument constructor, needed for Flink to treat this class as a POJO
    public WordDto() {
    }

    public WordDto(String word, Integer count) {
        this.word = word;
        this.count = count;
    }

    public String getWord() {
        return word;
    }

    public void setWord(String word) {
        this.word = word;
    }

    public Integer getCount() {
        return count;
    }

    public void setCount(Integer count) {
        this.count = count;
    }

    @Override
    public String toString() {
        return "word=" + this.getWord() + ",count=" + this.getCount();
    }
}
The main class FlinkWorldCount2 is as follows:
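import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.FlatMapOperator;
import org.apache.flink.api.java.operators.MapOperator;
import org.apache.flink.api.java.operators.ReduceOperator;
import org.apache.flink.api.java.operators.UnsortedGrouping;
import org.apache.flink.util.Collector;

public class FlinkWorldCount2 {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        // Read the input file line by line
        DataSet<String> lines = env.readTextFile("./data/words");
        // Split each line on spaces and emit one record per word
        FlatMapOperator<String, String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            @Override
            public void flatMap(String value, Collector<String> out) throws Exception {
                for (String word : value.split(" ")) {
                    out.collect(word);
                }
            }
        });
        // Wrap each word in a WordDto with an initial count of 1
        MapOperator<String, WordDto> mapOperator = words.map(new MapFunction<String, WordDto>() {
            @Override
            public WordDto map(String word) throws Exception {
                return new WordDto(word, 1);
            }
        });
        // Group by the "word" field of the POJO instead of a tuple position
        UnsortedGrouping<WordDto> grouping = mapOperator.groupBy("word");
        // Sum the counts of the records within each group
        ReduceOperator<WordDto> reduce = grouping.reduce(new ReduceFunction<WordDto>() {
            @Override
            public WordDto reduce(WordDto w1, WordDto w2) throws Exception {
                return new WordDto(w1.getWord(), w1.getCount() + w2.getCount());
            }
        });
        reduce.print();
    }
}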
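To picture what the groupBy("word") and reduce steps compute, here is a minimal plain-Java sketch with no Flink dependency. It only mimics the semantics on an in-memory list and is not how Flink executes the job. The names GroupReduceSketch, WordCount, groupAndReduce and countWords are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class GroupReduceSketch {

    // Hypothetical stand-in for WordDto so the sketch compiles on its own
    static final class WordCount {
        final String word;
        final int count;

        WordCount(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    // Groups records by their word and folds each group pairwise with the reducer,
    // which is the role groupBy("word") followed by reduce(...) plays in the job
    public static Map<String, WordCount> groupAndReduce(List<WordCount> records,
                                                        BinaryOperator<WordCount> reducer) {
        Map<String, WordCount> result = new LinkedHashMap<>();
        for (WordCount record : records) {
            // First record of a group is stored as-is; later ones are merged into it
            result.merge(record.word, record, reducer);
        }
        return result;
    }

    // Splits lines on spaces, maps each word to (word, 1), then groups and reduces
    public static Map<String, Integer> countWords(List<String> lines) {
        List<WordCount> records = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                records.add(new WordCount(word, 1));
            }
        }
        Map<String, WordCount> reduced = groupAndReduce(
                records, (w1, w2) -> new WordCount(w1.word, w1.count + w2.count));
        Map<String, Integer> counts = new LinkedHashMap<>();
        reduced.forEach((word, wc) -> counts.put(word, wc.count));
        return counts;
    }

    public static void main(String[] args) {
        // prints {hello=2, flink=1, world=1}
        System.out.println(countWords(Arrays.asList("hello flink", "hello world")));
    }
}
```

The reducer lambda has the same shape as the ReduceFunction in the job: it takes two records of the same group and returns one combined record.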
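When implementing a Flink job with a wrapper object, keep the following points in mind:
- 1. The class must be declared public.
- 2. The class must have a public no-argument constructor.
- 3. Every field must be either public or, if private, accessible through getter and setter methods.
- 4. The class must be serializable by Flink, that is, every field type must be supported by a registered serializer.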
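The first three rules can be checked mechanically. The following is a rough sketch using plain Java reflection. It is not Flink's own type analysis, which is more thorough, and it does not cover rule 4. The names PojoRuleCheck, violations, GoodWord and BadWord are hypothetical and used only for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class PojoRuleCheck {

    // Returns human-readable violations of rules 1-3; an empty list means none were found
    public static List<String> violations(Class<?> type) {
        List<String> problems = new ArrayList<>();

        // Rule 1: the class must be public
        if (!Modifier.isPublic(type.getModifiers())) {
            problems.add("class is not public");
        }

        // Rule 2: a public no-argument constructor must exist
        try {
            type.getConstructor(); // only finds public constructors
        } catch (NoSuchMethodException e) {
            problems.add("no public no-argument constructor");
        }

        // Rule 3: each non-public field needs a public getter and setter
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            // Assumption of this sketch: static, synthetic and public fields need no accessors
            if (Modifier.isStatic(mods) || field.isSynthetic() || Modifier.isPublic(mods)) {
                continue;
            }
            String name = field.getName();
            String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            boolean hasGetter = hasPublicMethod(type, "get" + suffix);
            boolean hasSetter = hasPublicMethod(type, "set" + suffix, field.getType());
            if (!hasGetter || !hasSetter) {
                problems.add("field '" + name + "' lacks a public getter/setter");
            }
        }
        return problems;
    }

    private static boolean hasPublicMethod(Class<?> type, String name, Class<?>... params) {
        try {
            type.getMethod(name, params); // only finds public methods
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Follows rules 1-3, in the same shape as WordDto
    public static class GoodWord {
        private String word;

        public GoodWord() {
        }

        public String getWord() {
            return word;
        }

        public void setWord(String word) {
            this.word = word;
        }
    }

    // Breaks rule 2 (no no-argument constructor) and rule 3 (no accessors)
    public static class BadWord {
        private String word;

        public BadWord(String word) {
            this.word = word;
        }
    }

    public static void main(String[] args) {
        System.out.println(violations(GoodWord.class)); // prints []
        System.out.println(violations(BadWord.class));
    }
}
```

A check like this could run in a unit test for WordDto, so that a missing constructor or setter is caught before the job is submitted.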
The program output is as follows: