Flink left and inner joins in Java: notes on Flink DataSet joins

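1. A DataSet join associates the elements of two datasets by key. Unless stated otherwise, a join is an inner join, like INNER JOIN in SQL: only pairs of elements with equal keys are emitted.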

A key can be specified in any of the following ways:

a key expression

a key-selector function

one or more field position keys (Tuple DataSet only).

case class fields (Scala API only)
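To make the role of the keys concrete, here is a minimal plain-Java sketch of what an inner join by key computes. It is not Flink code and not how Flink executes a join; `InnerJoinSketch`, `Pair` (a stand-in for `Tuple2`) and `innerJoin` are hypothetical names, and the two lambdas play the part of key-selector functions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class InnerJoinSketch {

    // Hypothetical pair type standing in for Flink's Tuple2.
    public static final class Pair<L, R> {
        public final L f0;
        public final R f1;

        public Pair(L f0, R f1) {
            this.f0 = f0;
            this.f1 = f1;
        }
    }

    // Inner join: emit one pair for every (left, right) combination with equal keys.
    public static <L, R, K> List<Pair<L, R>> innerJoin(
            List<L> left, List<R> right,
            Function<L, K> leftKey, Function<R, K> rightKey) {
        // Build phase: index the right input by its key.
        Map<K, List<R>> index = new HashMap<>();
        for (R r : right) {
            index.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
        }
        // Probe phase: left elements without a matching key produce no output.
        List<Pair<L, R>> result = new ArrayList<>();
        for (L l : left) {
            List<R> matches = index.getOrDefault(leftKey.apply(l), Collections.emptyList());
            for (R r : matches) {
                result.add(new Pair<>(l, r));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Each String[] is {name, zip}; the zip at index 1 is the join key.
        List<String[]> users = Arrays.asList(
                new String[]{"alice", "10115"}, new String[]{"bob", "99999"});
        List<String[]> stores = Arrays.asList(
                new String[]{"storeA", "10115"}, new String[]{"storeB", "10115"});
        List<Pair<String[], String[]>> joined =
                innerJoin(users, stores, u -> u[1], s -> s[1]);
        for (Pair<String[], String[]> p : joined) {
            System.out.println(p.f0[0] + " - " + p.f1[0]);
        }
    }
}
```

With this sample data, alice is paired with both stores that share her zip, while bob has no matching store and is dropped, which is the inner-join behaviour described above.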

2. Variants of the inner join

2.1 Default join: each pair of joined elements is emitted as a Tuple2

public static class User { public String name; public int zip; }
public static class Store { public Manager mgr; public int zip; }

DataSet<User> input1 = // [...]
DataSet<Store> input2 = // [...]

// result dataset is typed as Tuple2
DataSet<Tuple2<User, Store>> result = input1.join(input2)
    .where("zip")     // key of the first input (users)
    .equalTo("zip");  // key of the second input (stores)

2.2 User-defined JoinFunction, applied with with()

// some POJO
public class Rating {
    public String name;
    public String category;
    public int points;
}

// Join function that joins a custom POJO with a Tuple
public class PointWeighter
        implements JoinFunction<Rating, Tuple2<String, Double>, Tuple2<String, Double>> {

    @Override
    public Tuple2<String, Double> join(Rating rating, Tuple2<String, Double> weight) {
        // multiply the points and rating and construct a new output tuple
        return new Tuple2<String, Double>(rating.name, rating.points * weight.f1);
    }
}

DataSet<Rating> ratings = // [...]
DataSet<Tuple2<String, Double>> weights = // [...]

DataSet<Tuple2<String, Double>> weightedRatings = ratings.join(weights)
    // key of the first input
    .where("category")
    // key of the second input
    .equalTo("f0")
    // applying the JoinFunction on joining pairs
    .with(new PointWeighter());

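2.3 Using a FlatJoinFunction. JoinFunction relates to FlatJoinFunction the same way MapFunction relates to FlatMapFunction: a JoinFunction returns exactly one element per joined pair, while a FlatJoinFunction emits zero, one, or more elements through a Collector.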
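The difference between the two function shapes can be sketched in plain Java. The interfaces `PairJoin` and `FlatPairJoin` below are hypothetical stand-ins, not Flink's `JoinFunction` and `FlatJoinFunction`, and the inputs are assumed to be already matched by key (`points[i]` belongs with `weights[i]`); a `java.util.function.Consumer` plays the role of the `Collector`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class FlatJoinSketch {

    // Hypothetical stand-in for a JoinFunction: exactly one output per matched pair.
    public interface PairJoin<L, R, O> {
        O join(L left, R right);
    }

    // Hypothetical stand-in for a FlatJoinFunction: outputs are handed to a
    // collector callback, so a pair may produce zero, one, or several results.
    public interface FlatPairJoin<L, R, O> {
        void join(L left, R right, Consumer<O> out);
    }

    // Applies a one-to-one join function to pairs that are already matched by index.
    public static List<Double> joinAll(int[] points, double[] weights,
                                       PairJoin<Integer, Double, Double> fn) {
        List<Double> result = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            result.add(fn.join(points[i], weights[i]));
        }
        return result;
    }

    // Applies a flat join function; the function decides how much to emit.
    public static List<Double> flatJoinAll(int[] points, double[] weights,
                                           FlatPairJoin<Integer, Double, Double> fn) {
        List<Double> result = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            fn.join(points[i], weights[i], result::add);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] points = {10, 20};
        double[] weights = {0.5, 0.05};

        // One result per pair, regardless of the weight.
        List<Double> all = joinAll(points, weights, (p, w) -> p * w);

        // Pairs whose weight is not above 0.1 emit nothing.
        List<Double> filtered = flatJoinAll(points, weights, (p, w, out) -> {
            if (w > 0.1) {
                out.accept(p * w);
            }
        });

        System.out.println(all.size() + " vs " + filtered.size());
    }
}
```

Because the flat variant can skip a pair entirely, it combines joining and filtering in one step; the one-to-one variant would need a separate filter afterwards.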

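public class PointWeighter
        implements FlatJoinFunction<Rating, Tuple2<String, Double>, Tuple2<String, Double>> {

    @Override
    public void join(Rating rating, Tuple2<String, Double> weight,
                     Collector<Tuple2<String, Double>> out) {
        if (weight.f1 > 0.1) {
            out.collect(new Tuple2<String, Double>(rating.name, rating.poi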
