Preface
1. Add the dependency
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka_2.12</artifactId>
    <version>1.14.4</version>
</dependency>
2. Install Kafka
See my earlier blog post for the installation steps.
3. Start a producer
Go into Kafka's /bin directory, start the console producer on topic test, and produce a few messages:
[root@flink-node01 bin]# ./kafka-console-producer.sh --broker-list 172.16.10.159:9092 --topic test
>java,python,c++
>java,python,flink
>
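To sanity-check that the messages actually landed in the topic, you can read them back with the console consumer (same broker address as above):

```shell
# Read the test topic from the beginning to verify the produced messages
./kafka-console-consumer.sh --bootstrap-server 172.16.10.159:9092 --topic test --from-beginning
```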
4. Start the consumer
import java.util.Arrays;
import java.util.Properties;
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.util.Collector;
import org.junit.Test;

@Test
public void fromKafkaTest() throws Exception {
    // Kafka connection properties
    Properties properties = new Properties();
    properties.setProperty("bootstrap.servers", "172.16.10.159:9092");
    properties.setProperty("group.id", "flink-group");
    String topic = "test";
    // Kafka source
    FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>(topic, new SimpleStringSchema(), properties);
    // Flink stream execution environment
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Set the runtime mode to STREAMING
    env.setRuntimeMode(RuntimeExecutionMode.STREAMING);
    // Add the Kafka source
    env.addSource(consumer)
            // Flatten: split each message on commas into individual words
            .flatMap(new FlatMapFunction<String, String>() {
                @Override
                public void flatMap(String value, Collector<String> out) throws Exception {
                    Arrays.stream(value.split(",")).forEach(out::collect);
                }
            })
            // Map: turn each word into a (word, 1) tuple
            .map(new MapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> map(String value) throws Exception {
                    return Tuple2.of(value, 1);
                }
            })
            // Group by word
            .keyBy((KeySelector<Tuple2<String, Integer>, String>) value -> value.f0)
            // Sum the counts per word
            .sum(1)
            // Print the running result
            .print();
    // Submit and run the job
    env.execute("flink streaming from kafka");
}
Output (the N> prefix is the index of the parallel subtask that printed the record):
2> (java,1)
3> (python,1)
3> (c++,1)
7> (flink,1)
2> (java,2)
3> (python,2)
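Stripped of the Flink runtime, the flatMap → map → keyBy → sum chain is just a word count. Here is a plain-Java sketch of the same logic applied to the two messages produced above (the class name is only for illustration):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {
    public static void main(String[] args) {
        // The two messages produced in step 3
        List<String> lines = List.of("java,python,c++", "java,python,flink");
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            // flatMap: split each record on commas into words
            for (String word : line.split(",")) {
                // map + keyBy + sum: accumulate a count per word
                counts.merge(word, 1, Integer::sum);
            }
        }
        System.out.println(counts);
        // prints {java=2, python=2, c++=1, flink=1}
    }
}
```

This also explains the printed results: "java" appears in both messages, so its running count first shows (java,1) and then (java,2) as the second record arrives.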