Splitting Streams with Flink Side Output

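This article introduces three ways to handle stream-splitting scenarios in Flink: Filter, Split, and the recommended Side Output. Example code shows how to split data with Side Output and highlights its main advantage: a stream produced by a side output can itself be split again.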

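Stream-Splitting Scenarios

In production you often need to split an input source according to some criterion, for example splitting an order stream by order amount, or splitting user access logs by the visitor's geographic location. How should such a requirement be handled?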

Ways to Split a Stream

Depending on the scenario, there are three ways to split a stream:

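Splitting with Filter

The Filter method was covered in an earlier article (Commonly Used Flink DataSet and DataStream APIs). This operator filters elements by a user-supplied condition: every element is passed to filter(), kept if filter() returns true, and dropped otherwise. To split a stream, we can call filter several times on the same source, each call producing a separate stream that holds one subset of the data. See the example below: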

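```java
import java.util.ArrayList;
import java.util.List;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public static void main(String[] args) throws Exception {

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Build the data source (typed list instead of a raw List)
    List<Tuple3<Integer, Integer, Integer>> data = new ArrayList<>();
    data.add(new Tuple3<>(0, 1, 0));
    data.add(new Tuple3<>(0, 1, 1));
    data.add(new Tuple3<>(0, 2, 2));
    data.add(new Tuple3<>(0, 1, 3));
    data.add(new Tup
```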
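The effect of repeated filtering can be modeled without a Flink cluster. The following is a minimal plain-Java sketch, not Flink API: `java.util.stream` stands in for `DataStream.filter`, `Triple` is a hypothetical stand-in for `Tuple3<Integer, Integer, Integer>`, and the choice to split on the second field is an assumption, since the criterion is not shown in the listing. It uses the first four elements from the listing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Plain-Java model of filter-based splitting; this is not Flink API.
final class FilterSplitSketch {

    // Hypothetical stand-in for Flink's Tuple3<Integer, Integer, Integer>.
    static final class Triple {
        final int f0, f1, f2;

        Triple(int f0, int f1, int f2) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
        }

        @Override
        public String toString() {
            return "(" + f0 + "," + f1 + "," + f2 + ")";
        }
    }

    // The first four elements from the article's listing.
    static List<Triple> sampleData() {
        List<Triple> data = new ArrayList<>();
        data.add(new Triple(0, 1, 0));
        data.add(new Triple(0, 1, 1));
        data.add(new Triple(0, 2, 2));
        data.add(new Triple(0, 1, 3));
        return data;
    }

    // One full pass over the input per branch, like one filter() call per output stream.
    static List<Triple> branch(List<Triple> input, Predicate<Triple> keep) {
        return input.stream().filter(keep).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Triple> data = sampleData();
        // Assumed criterion: split on the second field.
        List<Triple> ones = branch(data, t -> t.f1 == 1);
        List<Triple> twos = branch(data, t -> t.f1 == 2);
        System.out.println("f1 == 1: " + ones);
        System.out.println("f1 == 2: " + twos);
    }
}
```

Each call to `branch` scans the whole input again, which is the cost of this approach: N output streams mean N passes over every element.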
Side outputs in Apache Flink let a single operator emit records to several streams in addition to its main output. This is useful when different subsets of the data must go to different sinks, for example different databases or messaging systems.

To use side outputs, define one output tag per additional stream with the OutputTag class. An output tag is a unique, typed identifier for a side output; it identifies the stream, not the sink. Inside a ProcessFunction, route each record by calling ctx.output(tag, value), for example based on the value of a certain field. Then call getSideOutput(tag) on the operator's result to obtain each stream and attach a sink to it. The older split() operator is deprecated in favor of this approach, and DataStream has no tag() or sideOutput() method.

Here is an example of how to use side outputs:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

// Define one output tag per side output (the anonymous subclass preserves the type information)
final OutputTag<String> firstOutputTag = new OutputTag<String>("first-output"){};
final OutputTag<String> secondOutputTag = new OutputTag<String>("second-output"){};

DataStream<String> stream = ... // your data stream

// Route each record to a side output based on some criteria
SingleOutputStreamOperator<String> mainStream = stream
    .process(new ProcessFunction<String, String>() {
        @Override
        public void processElement(String data, Context ctx, Collector<String> out) {
            if (data.startsWith("A")) {
                ctx.output(firstOutputTag, data.toUpperCase());
            } else if (data.startsWith("B")) {
                ctx.output(secondOutputTag, data.toLowerCase());
            } else {
                out.collect(data); // everything else stays on the main output
            }
        }
    })
    .name("Split Stream");

// Write each side output to its own sink
mainStream.getSideOutput(firstOutputTag).addSink(... /* first sink */);
mainStream.getSideOutput(secondOutputTag).addSink(... /* second sink */);
```

In this example, we define two output tags, `firstOutputTag` and `secondOutputTag`, and a single `ProcessFunction` routes each record to one of them in one pass over the stream. We then use `getSideOutput()` to retrieve each side-output stream and attach its sink. Compared with one filter() per branch, each record is examined once, and a stream returned by `getSideOutput()` is an ordinary DataStream that can be processed and split again.
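The routing idea behind side outputs can also be tried without Flink. The following is a minimal plain-Java sketch, not Flink API: a `Map` keyed by tag id stands in for the side outputs, the `"main"` key is a hypothetical label for the main output, and sending unmatched records to the main output is an assumption. The tag ids and the A/B prefix rule are taken from the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java model of side-output routing; this is not Flink API.
final class SideOutputSketch {

    static final String MAIN = "main";            // hypothetical label for the main output
    static final String FIRST = "first-output";   // tag ids taken from the example above
    static final String SECOND = "second-output";

    // Single pass: each record goes to exactly one output, chosen by its prefix.
    static Map<String, List<String>> route(List<String> input) {
        Map<String, List<String>> outputs = new LinkedHashMap<>();
        outputs.put(MAIN, new ArrayList<>());
        outputs.put(FIRST, new ArrayList<>());
        outputs.put(SECOND, new ArrayList<>());
        for (String record : input) {
            if (record.startsWith("A")) {
                outputs.get(FIRST).add(record.toUpperCase(Locale.ROOT));
            } else if (record.startsWith("B")) {
                outputs.get(SECOND).add(record.toLowerCase(Locale.ROOT));
            } else {
                outputs.get(MAIN).add(record); // assumed: unmatched records stay on the main output
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> outputs =
                route(Arrays.asList("Apple", "Banana", "Avocado", "Cherry"));
        System.out.println(outputs);
    }
}
```

Unlike one filter per branch, the loop visits each record once no matter how many outputs exist. Any of the resulting lists could be passed through `route` (or a similar function) again, which mirrors how a side-output stream can be split a second time.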