How Flink can join, left join, and right join 3 real-time streams at once

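I board in a few minutes; right now I am in the departure lounge for my flight from Harbin to Beijing. I will get home late tonight and will be too busy tomorrow to update the blog, so I squeezed in some time to put this together.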

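How do you join 3 real-time streams at once in Flink? The overall idea is:

• Use the same time characteristic for every stream.
• Use the same time window for every join, so that when that window is reached, all 3 streams fire together.

Flink cannot join 3 real-time streams in a single operation, so you first join 2 of the streams and then join that result with the third stream.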
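To see why the same window size matters, here is a minimal plain-Java sketch (no Flink dependency) of how a tumbling window can be picked for a timestamp. The class and method names are hypothetical, and the formula assumes non-negative timestamps and a window offset of 0; it is an illustration of the idea, not Flink's own code.

```java
// Plain-Java sketch (no Flink dependency) of tumbling-window assignment.
class TumblingWindowSketch {

    // Start of the tumbling window that contains timestampMs
    // (assumes timestampMs >= 0 and a window offset of 0).
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // True when every timestamp falls into the same window of the given size.
    static boolean sameWindow(long windowSizeMs, long... timestampsMs) {
        long first = windowStart(timestampsMs[0], windowSizeMs);
        for (long ts : timestampsMs) {
            if (windowStart(ts, windowSizeMs) != first) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long size = 6_000L; // 6-second windows
        // One hypothetical record from each of the three streams
        System.out.println(sameWindow(size, 1_000L, 4_500L, 5_999L)); // true
        System.out.println(sameWindow(size, 1_000L, 4_500L, 6_000L)); // false
    }
}
```

Records at 1000 ms, 4500 ms, and 5999 ms all land in the window [0, 6000), while a record at 6000 ms already belongs to the next window and would not be joined with them.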

import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.watermark.Watermark
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.util.Collector
object FlinkWindow {	
  // Assigns timestamps and emits periodic watermarks for (word, count) elements
  class MyTimeTimestampsAndWatermarks extends AssignerWithPeriodicWatermarks[(String, Int)] with Serializable {
    val maxOutOfOrderness = 3500L // 3.5 seconds
    var currentMaxTimestamp: Long = _

    // Generate the timestamp: the current wall-clock time is used for every element
    override def extractTimestamp(element: (String, Int), previousElementTimestamp: Long): Long = {
      val timestamp = System.currentTimeMillis()
      currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp)
      timestamp
    }

    // Return the watermark as the current highest timestamp minus the out-of-orderness bound
    override def getCurrentWatermark(): Watermark = {
      new Watermark(currentMaxTimestamp - maxOutOfOrderness)
    }
  }
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Every stream uses the same time characteristic
    env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime)

    // Stream 1: words from a socket, mapped to (word, 1)
    val input = env.socketTextStream("localhost", 9001)
    val inputMap = input
      .flatMap(f => f.split("\\W+"))
      .map(word => (word, 1))
      .assignTimestampsAndWatermarks(new MyTimeTimestampsAndWatermarks())
    inputMap.print()

    // Stream 2
    val input1 = env.socketTextStream("localhost", 9002)
    val inputMap1 = input1
      .flatMap(f => f.split("\\W+"))
      .map(word => (word, 1))
      .assignTimestampsAndWatermarks(new MyTimeTimestampsAndWatermarks())
    inputMap1.print()

    // Stream 3
    val input2 = env.socketTextStream("localhost", 9003)
    val inputMap2 = input2
      .flatMap(f => f.split("\\W+"))
      .map(word => (word, 1))
      .assignTimestampsAndWatermarks(new MyTimeTimestampsAndWatermarks())
    inputMap2.print()

    // First join: stream 1 with stream 2, keyed by word, in a 6-second tumbling window
    val aa = inputMap.join(inputMap1)
      .where(_._1).equalTo(_._1)
      .window(TumblingProcessingTimeWindows.of(Time.seconds(6)))
      .apply { (t1: (String, Int), t2: (String, Int), out: Collector[(String, Int, Int)]) =>
        out.collect((t1._1, t1._2, t2._2))
      }
    aa.print()

    // Second join: the result of the first join with stream 3, using the same window
    val cc = aa.join(inputMap2)
      .where(_._1).equalTo(_._1)
      .window(TumblingProcessingTimeWindows.of(Time.seconds(6)))
      .apply { (t1: (String, Int, Int), t2: (String, Int), out: Collector[(String, Int, Int, Int)]) =>
        out.collect((t1._1, t1._2, t1._3, t2._2))
      }
    cc.print()

    env.execute()
  }
}
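
The two-stage shape of the job above can be sketched without Flink. The following plain-Java example is only an illustration under simplifying assumptions: it treats the contents of one window as in-memory lists, rows are lists whose first column is the key, and the names `ChainedJoinSketch`, `innerJoin`, and `row` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of joining three inputs by chaining two inner joins.
class ChainedJoinSketch {

    // Inner join on column 0. The joined row is the left row followed by
    // the right row's remaining columns.
    static List<List<Object>> innerJoin(List<List<Object>> left, List<List<Object>> right) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> l : left) {
            for (List<Object> r : right) {
                if (l.get(0).equals(r.get(0))) {
                    List<Object> joined = new ArrayList<>(l);
                    joined.addAll(r.subList(1, r.size()));
                    out.add(joined);
                }
            }
        }
        return out;
    }

    // Small helper to build a row from its columns.
    static List<Object> row(Object... cols) {
        return Arrays.asList(cols);
    }

    public static void main(String[] args) {
        // Hypothetical contents of one window for each of the three streams
        List<List<Object>> s1 = Arrays.asList(row("a", 1), row("b", 1));
        List<List<Object>> s2 = Arrays.asList(row("a", 2), row("c", 2));
        List<List<Object>> s3 = Arrays.asList(row("a", 3), row("b", 3));

        List<List<Object>> firstStage = innerJoin(s1, s2);          // stream 1 with stream 2
        List<List<Object>> secondStage = innerJoin(firstStage, s3); // result with stream 3
        System.out.println(secondStage); // [[a, 1, 2, 3]]
    }
}
```

Only the key "a" appears in all three inputs, so it is the only row that survives both stages. One thing worth verifying in a real job: the first join emits its results only when its window fires, so under processing time those results are likely to be assigned to the following window of the second join. With event-time windows, Flink gives joined records the largest timestamp that still lies in the window, which keeps both stages in the same window.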

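As for left join and right join: the official Flink documentation does not spell out how to implement them, and the join operator cannot do it, because it only emits pairs that match on both sides. Use coGroup instead, which hands you all elements from both sides for each key and window, even when one side is empty. You can adapt the following left join example to your needs.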

import util.source.StreamDataSource1;	
import util.source.StreamDataSource;	
import org.apache.flink.api.common.functions.CoGroupFunction;	
import org.apache.flink.api.java.functions.KeySelector;	
import org.apache.flink.api.java.tuple.Tuple3;	
import org.apache.flink.api.java.tuple.Tuple5;	
import org.apache.flink.streaming.api.TimeCharacteristic;	
import org.apache.flink.streaming.api.datastream.DataStream;	
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;	
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;	
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;	
import org.apache.flink.streaming.api.windowing.time.Time;	
import org.apache.flink.util.Collector;	
public class FlinkTumblingWindowsLeftJoinDemo {	
    public static void main(String[] args) throws Exception {	
        int windowSize = 10;	
        long delay = 5100L;	
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();	
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);	
        env.setParallelism(1);	
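        // Set up the data sources (StreamDataSource and StreamDataSource1 are custom sources from the util.source package)
        DataStream<Tuple3<String, String, Long>> leftSource = env.addSource(new StreamDataSource()).name("Demo Source Left");
        DataStream<Tuple3<String, String, Long>> rightSource = env.addSource(new StreamDataSource1()).name("Demo Source Right");
        // Assign timestamps and watermarks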
        DataStream<Tuple3<String, String, Long>> leftStream = leftSource.assignTimestampsAndWatermarks(	
            new BoundedOutOfOrdernessTimestampExtractor<Tuple3<String, String, Long>>(Time.milliseconds(delay)) {	
                @Override	
                public long extractTimestamp(Tuple3<String, String, Long> element) {	
                    return element.f2;	
                }	
            }	
        );	
        DataStream<Tuple3<String, String, Long>> rightStream = rightSource.assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<Tuple3<String, String, Long>>(Time.milliseconds(delay)) {
                @Override
                public long extractTimestamp(Tuple3<String, String, Long> element) {
                    return element.f2;
                }
            }
        );
        // Join operation (coGroup is used so that unmatched left elements can still be emitted)
        leftStream.coGroup(rightStream)
            .where(new LeftSelectKey()).equalTo(new RightSelectKey())	
            .window(TumblingEventTimeWindows.of(Time.seconds(windowSize)))	
            .apply(new LeftJoin())	
            .print();	
        env.execute("TimeWindowDemo");	
    }	
    // Left join: every left element is emitted at least once per window
    public static class LeftJoin implements CoGroupFunction<Tuple3<String, String, Long>, Tuple3<String, String, Long>, Tuple5<String, String, String, Long, Long>> {
        @Override
        public void coGroup(Iterable<Tuple3<String, String, Long>> leftElements, Iterable<Tuple3<String, String, Long>> rightElements, Collector<Tuple5<String, String, String, Long, Long>> out) {
            for (Tuple3<String, String, Long> leftElem : leftElements) {
                boolean hadElements = false;
                // Emit one row per matching right element
                for (Tuple3<String, String, Long> rightElem : rightElements) {
                    out.collect(new Tuple5<>(leftElem.f0, leftElem.f1, rightElem.f1, leftElem.f2, rightElem.f2));
                    hadElements = true;
                }
                // No match on the right side: fill the right columns with placeholders
                if (!hadElements) {
                    out.collect(new Tuple5<>(leftElem.f0, leftElem.f1, "null", leftElem.f2, -1L));
                }
            }
        }
    }
    public static class LeftSelectKey implements KeySelector<Tuple3<String, String, Long>, String> {	
        @Override	
        public String getKey(Tuple3<String, String, Long> w) {	
            return w.f0;	
        }	
    }	
    public static class RightSelectKey implements KeySelector<Tuple3<String, String, Long>, String> {	
        @Override	
        public String getKey(Tuple3<String, String, Long> w) {	
            return w.f0;	
        }	
    }
}
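
For a right join, the same idea is applied with the two sides swapped: loop over the right elements and fall back to a placeholder when the left side is empty. The following plain-Java sketch (no Flink dependency) shows only that logic. The name `RightJoinSketch`, the string-based rows, and the "null" placeholder are assumptions for illustration; in a real job the loop body would live inside a `CoGroupFunction` like `LeftJoin` above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java sketch of right join logic in the shape of a coGroup call.
class RightJoinSketch {

    // All left and right values that share one key and one window arrive together.
    // Every right value is emitted at least once; a missing left side is
    // filled with the placeholder "null".
    static List<String> rightJoin(String key, List<String> leftValues, List<String> rightValues) {
        List<String> out = new ArrayList<>();
        for (String right : rightValues) {
            boolean matched = false;
            for (String left : leftValues) {
                out.add(key + "," + left + "," + right);
                matched = true;
            }
            if (!matched) {
                out.add(key + ",null," + right);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Key k1 has one left value and two right values
        System.out.println(rightJoin("k1", Arrays.asList("L1"), Arrays.asList("R1", "R2")));
        // [k1,L1,R1, k1,L1,R2]

        // Key k2 has no left value, so the right value is kept with a placeholder
        System.out.println(rightJoin("k2", Collections.<String>emptyList(), Arrays.asList("R3")));
        // [k2,null,R3]
    }
}
```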

