A Simple Example of Flink SplitStream
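The Flink DataStream API provides a split() operator. It attaches one or more tags to each element of a stream, which divides the stream into several tagged streams; select() then returns the stream that carries a given tag.
With an operator like this, each tag can be given its own processing logic, which is very convenient.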
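To make the tag-then-select behavior concrete, here is a minimal plain-Java sketch that does not use Flink at all. It only models the semantics described above: a selector function returns the tags for each element, and select(tag) keeps the elements whose tag list contains that tag. The names SplitSelectSketch, tagsFor and select are made up for illustration and are not part of the Flink API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Plain-Java model of split()/select(); this is NOT Flink code.
// All names here are hypothetical and exist only for illustration.
class SplitSelectSketch {

    // Plays the role of the OutputSelector: returns the tags for one element.
    static List<String> tagsFor(String sex) {
        return Collections.singletonList("male".equals(sex) ? "male" : "female");
    }

    // Plays the role of select(tag): keeps the elements whose tags contain the tag.
    static <T> List<T> select(List<T> input, Function<T, List<String>> selector, String tag) {
        List<T> out = new ArrayList<>();
        for (T element : input) {
            if (selector.apply(element).contains(tag)) {
                out.add(element);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sexes = Arrays.asList("male", "female", "male");
        System.out.println(select(sexes, SplitSelectSketch::tagsFor, "male"));   // [male, male]
        System.out.println(select(sexes, SplitSelectSketch::tagsFor, "female")); // [female]
    }
}
```

Because the selector returns a list, one element may carry several tags and would then show up under each of them. The Flink example in this post always returns exactly one tag per element.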
Here is the code:
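import com.google.gson.Gson;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.collector.selector.OutputSelector;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SplitStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class SplitOperator {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment sEnv = StreamExecutionEnvironment.getExecutionEnvironment();
        sEnv.setParallelism(1);

        // Read JSON strings from the Kafka topic "people"
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "localhost:9092");
        DataStreamSource<String> source = sEnv.addSource(
                new FlinkKafkaConsumer010<String>("people", new SimpleStringSchema(), p));

        SplitStream<People> splitStream = source.map(new MapFunction<String, People>() {
            @Override
            public People map(String value) throws Exception {
                // Parse each JSON message into a People object
                return new Gson().fromJson(value, People.class);
            }
        }).split(new OutputSelector<People>() { // split() divides one stream into several by tagging each element
            @Override
            public Iterable<String> select(People value) {
                List<String> list = new ArrayList<>();
                // Constant-first equals() avoids a NullPointerException when sex() is null
                if ("male".equals(value.sex())) {
                    list.add("male");
                } else {
                    list.add("female");
                }
                return list;
            }
        });

        // select("tag") turns the SplitStream back into a DataStream for that tag
        DataStream<People> male = splitStream.select("male");
        male.print("male:");
        DataStream<People> female = splitStream.select("female");
        female.print("female:");

        // Merge the two streams again
        DataStream<People> union = male.union(female);
        union.print("union:");

        sEnv.execute("SplitOperator");
    }
}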
Output:
…
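The listing relies on a People class that this post does not show. The following is only a guess at what it might look like: the sex() accessor is taken from the call in the selector, while the name field, the constructors and toString() are assumptions added for illustration. Gson fills fields by name through reflection, so under this sketch a Kafka message would need to look something like {"name":"Tom","sex":"male"} (a hypothetical format).

```java
// Hypothetical People class; the real one is not shown in the post.
class People {
    private String name; // assumed field, not confirmed by the post
    private String sex;  // expected to hold "male" or "female"

    // No-arg constructor, convenient for reflection-based mappers such as Gson
    People() {
    }

    People(String name, String sex) {
        this.name = name;
        this.sex = sex;
    }

    String name() {
        return name;
    }

    // Accessor used by the OutputSelector in the listing
    String sex() {
        return sex;
    }

    @Override
    public String toString() {
        return "People{name='" + name + "', sex='" + sex + "'}";
    }

    public static void main(String[] args) {
        People tom = new People("Tom", "male");
        System.out.println(tom); // People{name='Tom', sex='male'}
    }
}
```

A toString() like this one also determines what print() writes for each element, since print() relies on the element's toString().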