The previous chapter covered Flink's stream-splitting operations. If streams can be split, surely they can also be merged? Of course they can.
I. Merge scenario
Stream1 and Stream2 need to be merged into a single stream.
II. Merge methods
1. Union
2. Connect
Prerequisite setup
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
env.setParallelism(1);
//Basic Kafka configuration (version 0.8)
Properties properties = new Properties();
properties.setProperty("zookeeper.connect", "zookeeper-address");
properties.put("bootstrap.servers", "kafka-address");
properties.put("group.id", "groupid");
properties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.put("auto.offset.reset", "earliest");
//Consume logs from the first topic
FlinkKafkaConsumer08<String> kafkaSource1 = new FlinkKafkaConsumer08<>(
        "topic1",
        new SimpleStringSchema(), properties);
//Get the first Flink data stream
DataStream<String> logSource1 = env.addSource(kafkaSource1);
//Consume logs from the second topic
FlinkKafkaConsumer08<String> kafkaSource2 = new FlinkKafkaConsumer08<>(
        "topic2",
        new SimpleStringSchema(), properties);
//Get the second Flink data stream
DataStream<String> logSource2 = env.addSource(kafkaSource2);
DataStream<Object> dataStream1 = logSource1.map(new MapFunction<String, Object>() {
    @Override
    public Object map(String s) throws Exception {
        //Deserialize the JSON string into an Object
        ObjectMapper objectMapper = new ObjectMapper();
        objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
        return objectMapper.readValue(s.getBytes(), Object.class);
    }
});
DataStream<Object> dataStream2 = logSource2.map(new MapFunction<String, Object>() {
    @Override
    public Object map(String s) throws Exception {
        //Deserialize the JSON string into an Object
        ObjectMapper objectMapper = new ObjectMapper();
        objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
        return objectMapper.readValue(s.getBytes(), Object.class);
    }
});
Union merge
Note: union can only merge streams of the same element type.
DataStream<Object> union = dataStream1.union(dataStream2);
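As a plain-Java sketch of the same idea (this is an analogy using the JDK's `java.util.stream`, not Flink itself): a union simply lets elements from both sources of the same type flow into one combined stream, preserving each source's order.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class UnionSketch {
    //Merge two streams of the same element type into one, analogous to DataStream.union
    static List<String> union(Stream<String> stream1, Stream<String> stream2) {
        return Stream.concat(stream1, stream2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        //Analogous to dataStream1.union(dataStream2): both sides must be the same type
        List<String> merged = union(Stream.of("a", "b"), Stream.of("c", "d"));
        System.out.println(merged); // prints [a, b, c, d]
    }
}
```

The key constraint carries over: just as `Stream.concat` requires both arguments to share an element type, Flink's union only compiles for streams of the same type.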
Connect merge
Note: connect can merge streams of different types.
ConnectedStreams<Object, Object> connect = dataStream1.connect(dataStream2);
How do we process the connected streams? As shown below, the two input streams are transformed into a single output stream.
//Option 1: process the connected streams with a CoMapFunction
SingleOutputStreamOperator<Object> result = connect.map(new CoMapFunction<Object, Object, Object>() {
    //Processing logic for the first stream
    @Override
    public Object map1(Object object1) throws Exception {
        return object1;
    }
    //Processing logic for the second stream
    @Override
    public Object map2(Object object2) throws Exception {
        return object2;
    }
});
//Option 2: process the connected streams with a CoFlatMapFunction
SingleOutputStreamOperator<Object> result2 = connect.flatMap(new CoFlatMapFunction<Object, Object, Object>() {
    @Override
    public void flatMap1(Object object1, Collector<Object> collector) throws Exception {
        //Transform and emit elements from the first stream
        collector.collect(object1);
    }
    @Override
    public void flatMap2(Object object2, Collector<Object> collector) throws Exception {
        //Transform and emit elements from the second stream
        collector.collect(object2);
    }
});
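The essence of connect can also be sketched in plain Java (again an analogy, not Flink): two inputs of different types are processed by two separate functions, map1 and map2, which both produce the same output type, so the results can share one downstream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class ConnectSketch {
    //Analogous to CoMapFunction<Integer, String, String>:
    //map1 handles the first input type, map2 the second, both yield one output type
    static List<String> coMap(List<Integer> first, List<String> second,
                              Function<Integer, String> map1,
                              Function<String, String> map2) {
        List<String> out = new ArrayList<>();
        for (Integer i : first) out.add(map1.apply(i));   //map1: logic for the first stream
        for (String s : second) out.add(map2.apply(s));   //map2: logic for the second stream
        return out;
    }

    public static void main(String[] args) {
        List<String> result = coMap(List.of(1, 2), List.of("x"),
                i -> "int:" + i, s -> "str:" + s);
        System.out.println(result); // prints [int:1, int:2, str:x]
    }
}
```

This is why connect, unlike union, tolerates differently typed inputs: each side gets its own handler, and only the output type must agree.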