Learn Flink
What is Flink?
Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams.
(From the official website)
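Key trait: true stream processing
Characteristics: low latency, high throughput, accurate results, and good fault tolerance
Event-driven
APIs: SQL/Table API ==> DataStream API ==> ProcessFunction
The layers run from the top down to the lowest level: the higher the layer, the more abstract and simpler it is; the lower the layer, the more concrete and flexible it is.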
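To make "stateful computation over an unbounded stream" a little more concrete, here is a minimal plain-Java analogy. It does not use Flink at all; `RunningCounter` and `onEvent` are hypothetical names chosen for illustration. The point is only that each incoming event updates state that survives between events, and a result is available immediately after every event rather than once at the end of a batch.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java analogy only; RunningCounter and onEvent are hypothetical names, not Flink APIs.
class RunningCounter {
    // State kept between events: the current count for each key.
    private final Map<String, Long> countsByKey = new HashMap<>();

    // Called once per incoming event; updates the state and returns the new count for that key.
    long onEvent(String key) {
        return countsByKey.merge(key, 1L, Long::sum);
    }

    public static void main(String[] args) {
        RunningCounter counter = new RunningCounter();
        String[] events = {"aaa", "bbbb", "aaa"};
        for (String event : events) {
            // A result is produced per event, without waiting for the input to end.
            System.out.println(event + " -> " + counter.onEvent(event));
        }
    }
}
```

In a real Flink job the engine manages this state and makes it fault tolerant; the sketch leaves all of that out.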
Configure the dependencies
<dependencies>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-java</artifactId>
        <version>1.11.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java_2.11</artifactId>
        <version>1.11.2</version>
    </dependency>
</dependencies>
Create a test txt file
aaa aaa
aaa bbbb
ccc ccca
ddd aaa
111 2222
222 3333
5555 4444
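Write the WordCount class
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.DataSource;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class WordCount {
    public static void main(String[] args) throws Exception {
        // 1. Create the execution environment
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        // 2. Read the data from a file
        String inputPath = "F:\\project\\learnflink\\src\\main\\resources\\test.txt";
        DataSource<String> inputDataSet = env.readTextFile(inputPath);
        // 3. Process the data: split on spaces and convert to (word, 1) tuples for counting
        DataSet<Tuple2<String, Integer>> resultSet = inputDataSet.flatMap(new MyFlatMapper())
                .groupBy(0) // group by the word at position 0
                .sum(1);    // sum the values at position 1
        resultSet.print();
    }

    // Custom class implementing the FlatMapFunction interface
    public static class MyFlatMapper implements FlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
            // Split the line on spaces
            String[] words = value.split(" ");
            // Iterate over the words and emit each one as a (word, 1) tuple
            for (String word : words) {
                out.collect(new Tuple2<>(word, 1));
            }
        }
    }
}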
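For readers who want to check what the flatMap, groupBy(0), sum(1) chain computes without starting Flink, the same logic can be sketched in plain Java. This is only an illustration: `WordCountSketch`, `flatMap`, and `countWords` are hypothetical names and not part of the Flink API. The input lines are the contents of the test file above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these names are Flink APIs.
class WordCountSketch {

    // Mirrors the flatMap step: split every line on spaces into individual words.
    static List<String> flatMap(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            words.addAll(Arrays.asList(line.split(" ")));
        }
        return words;
    }

    // Mirrors groupBy(0).sum(1): each word contributes 1 to the total stored under its key.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : flatMap(lines)) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "aaa aaa", "aaa bbbb", "ccc ccca", "ddd aaa",
                "111 2222", "222 3333", "5555 4444");
        // Print in the same (word,count) shape as the Flink job.
        countWords(lines).forEach((word, count) ->
                System.out.println("(" + word + "," + count + ")"));
    }
}
```

The sketch prints its results in sorted key order because it uses a TreeMap; the Flink job gives no such ordering guarantee, so only the counts are comparable, not the line order.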
Running the code above fails with this error:
No ExecutorFactory found to execute the application
After some investigation:
Cause: the flink-clients jar is missing
Solution:
Add the dependency to pom.xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-clients_2.11</artifactId>
    <version>1.11.2</version>
</dependency>
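Why does a missing jar surface as "No ExecutorFactory found"? As far as I understand it, Flink looks up its executor factories at runtime through Java's ServiceLoader mechanism, and the implementations ship in flink-clients. The following plain-Java sketch shows the general pattern under that assumption. `ExecutorFactoryLike` and `findFactory` are hypothetical stand-ins, not Flink types: when no provider is registered on the classpath, the lookup comes back empty and the caller can only fail.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

// Plain-Java sketch; ExecutorFactoryLike is a hypothetical stand-in, not a Flink type.
class ServiceLookupSketch {

    // The service interface that providers on the classpath would implement.
    interface ExecutorFactoryLike {
        String name();
    }

    // Returns the first provider registered on the classpath, or fails if there is none.
    static ExecutorFactoryLike findFactory() {
        Iterator<ExecutorFactoryLike> providers =
                ServiceLoader.load(ExecutorFactoryLike.class).iterator();
        if (!providers.hasNext()) {
            throw new IllegalStateException("No factory found on the classpath");
        }
        return providers.next();
    }

    public static void main(String[] args) {
        try {
            findFactory();
        } catch (IllegalStateException e) {
            // No jar registers a provider for this interface, so the lookup fails.
            System.out.println(e.getMessage());
        }
    }
}
```

Adding the jar that contains the provider registration is what makes the lookup succeed, which matches the fix above.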
Run it again
Console output
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
(222,1)
(3333,1)
(4444,1)
(ccc,1)
(ccca,1)
(ddd,1)
(111,1)
(2222,1)
(5555,1)
(aaa,4)
(bbbb,1)
Process finished with exit code 0
To be continued…