DataSet API in Detail, Part 11
fullOuterJoin
def fullOuterJoin[O](other: DataSet[O], strategy: JoinHint): UnfinishedOuterJoinOperation[T, O]
def fullOuterJoin[O](other: DataSet[O]): UnfinishedOuterJoinOperation[T, O]
Performs a full outer join of this DataSet with other. The variant that takes a JoinHint is the special form for explicitly telling the system which join strategy to use.
fullOuterJoin example 1
Program:
//1. Define a DataSet[(String, String)]
val movies: DataSet[(String, String)] = benv.fromElements(
("moon","ok"),("dog","good"),
("cat","notbad"),("sun","nice"))
//2. Define a DataSet[Rating]
case class Rating(name: String, category: String, points: Int)
val ratings: DataSet[Rating] = benv.fromElements(
Rating("moon","youny1",3),Rating("sun","youny2",4),
Rating("cat","youny3",1),Rating("dog","youny4",5))
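//3. Full outer join of the two DataSets, with a join function that builds each result
val result1 = movies.fullOuterJoin(ratings).where(0).equalTo("name"){
// In a full outer join either side may be null when it has no match
(m, r) => (if (m == null) r.name else m._1, if (r == null) -1 else r.points)
}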
//4. Display the result
result1.collect
Result:
res33: Seq[(String, Int)] = Buffer((moon,3), (sun,4), (cat,1), (dog,5))
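In this example every movie has a matching rating, so the join function never sees a missing side. The following is a minimal sketch in plain Scala collections (no Flink involved) of what a full outer join produces when keys are missing on either side. The object FullOuterJoinSketch and the helper fullOuter are hypothetical names introduced only for illustration, and a missing side is modeled as None here, whereas the DataSet API passes null to the join function.

```scala
// Hypothetical illustration, not part of the Flink API.
object FullOuterJoinSketch {
  case class Rating(name: String, category: String, points: Int)

  // Full outer join over in-memory sequences, keyed by a String.
  // A missing side is represented as None (Flink would pass null instead).
  def fullOuter[L, R](left: Seq[L], right: Seq[R])(
      leftKey: L => String,
      rightKey: R => String): Seq[(Option[L], Option[R])] = {
    val rightByKey = right.groupBy(rightKey)
    val leftKeys = left.map(leftKey).toSet
    // Every left element: paired with each matching right element, or with None.
    val fromLeft = left.flatMap { l =>
      rightByKey.get(leftKey(l)) match {
        case Some(rs) => rs.map(r => (Option(l), Option(r)))
        case None     => Seq((Option(l), Option.empty[R]))
      }
    }
    // Right elements whose key never appears on the left side.
    val rightOnly = right
      .filterNot(r => leftKeys(rightKey(r)))
      .map(r => (Option.empty[L], Option(r)))
    fromLeft ++ rightOnly
  }

  // "dog" and "cat" have no rating; "sun" has a rating but no movie entry.
  val movies = Seq(("moon", "ok"), ("dog", "good"), ("cat", "notbad"))
  val ratings = Seq(Rating("moon", "youny1", 3), Rating("sun", "youny2", 4))

  val joined: Seq[(String, Int)] =
    fullOuter(movies, ratings)(_._1, _.name).map {
      case (Some(m), Some(r)) => (m._1, r.points)   // key present on both sides
      case (Some(m), None)    => (m._1, -1)         // no rating for this movie
      case (None, Some(r))    => (r.name, r.points) // rating without a movie
      case (None, None)       => sys.error("a joined pair always has at least one side")
    }
}
```

With this input, FullOuterJoinSketch.joined contains (moon,3), (dog,-1), (cat,-1) and (sun,4). The order shown by this sketch comes from the in-memory traversal; a distributed join does not guarantee any particular order.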
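fullOuterJoin example 2
Program:
// JoinHint must be imported before it can be passed as a strategy
import org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint
//1. Define a DataSet[(String, String)]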
val movies: DataSet[(String, String)] = benv.fromElements(
("moon","ok"),("dog","good"),
("cat","notbad"),("sun","nice"))
//2. Define a DataSet[Rating]
case class Rating(name: String, category: String, points: Int)
val ratings: DataSet[Rating] = benv.fromElements(
Rating("moon","youny1",3),Rating("sun","youny2",4),
Rating("cat","youny3",1),Rating("dog","youny4",5))
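//3. Full outer join of the two DataSets, with an explicit join strategy hint
val result1 = movies.fullOuterJoin(ratings,JoinHint.REPARTITION_SORT_MERGE).where(0).equalTo("name"){
// In a full outer join either side may be null when it has no match
(m, r) => (if (m == null) r.name else m._1, if (r == null) -1 else r.points)
}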
//4. Display the result
result1.collect
Result:
res41: Seq[(String, Int)] = Buffer((cat,1), (dog,5), (moon,3), (sun,4))
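Notes on join hints:
The hints listed below are the ones the right outer join supports; the full outer join examples above rely only on the default strategy and on REPARTITION_SORT_MERGE: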
JoinHint.OPTIMIZER_CHOOSES
JoinHint.BROADCAST_HASH_FIRST
JoinHint.REPARTITION_HASH_FIRST
JoinHint.REPARTITION_SORT_MERGE