DateSet的API详解十二
cross
def cross[O](other: DataSet[O]): CrossDataSet[T, O]
Creates a new DataSet by forming the cartesian product of this DataSet and the other DataSet.
交叉。拿第一个输入的每一个元素和第二个输入的每一个元素进行交叉操作。
cross实例一:基本tuple
执行程序:
//1.定义两个DataSet
val coords1 = benv.fromElements((1,4,7),(2,5,8),(3,6,9))
val coords2 = benv.fromElements((10,40,70),(20,50,80),(30,60,90))
//2.交叉两个DataSet[Coord]
val result1 = coords1.cross(coords2)
//3.显示结果
result1.collect
执行结果:
res71: Seq[((Int, Int, Int), (Int, Int, Int))] = Buffer(
((1,4,7),(10,40,70)), ((2,5,8),(10,40,70)), ((3,6,9),(10,40,70)),
((1,4,7),(20,50,80)), ((2,5,8),(20,50,80)), ((3,6,9),(20,50,80)),
((1,4,7),(30,60,90)), ((2,5,8),(30,60,90)), ((3,6,9),(30,60,90)))
web ui中的执行效果:
cross实例二:case class
执行程序:
//1.定义 case class
case class Coord(id: Int, x: Int, y: Int)
//2.定义两个DataSet[Coord]
val coords1: DataSet[Coord] = benv.fromElements(Coord(1,4,7),Coord(2,5,8),Coord(3,6,9))
val coords2: DataSet[Coord] = benv.fromElements(Coord(10,40,70),Coord(20,50,80),Coord(30,60,90))
//3.交叉两个DataSet[Coord]
val result1 = coords1.cross(coords2)
//4.显示结果
result1.collect
执行结果:
res69: Seq[(Coord, Coord)] = Buffer(
(Coord(1,4,7),Coord(10,40,70)), (Coord(2,5,8),Coord(10,40,70)), (Coord(3,6,9),Coord(10,40,70)),
(Coord(1,4,7),Coord(20,50,80)), (Coord(2,5,8),Coord(20,50,80)), (Coord(3,6,9),Coord(20,50,80)),
(Coord(1,4,7),Coord(30,60,90)), (Coord(2,5,8),Coord(30,60,90)), (Coord(3,6,9),Coord(30,60,90)))
cross实例三:自定义操作
执行程序:
//1.定义 case class
case class Coord(id: Int, x: Int, y: Int)
//2.定义两个DataSet[Coord]
val coords1: DataSet[Coord] = benv.fromElements(Coord(1,4,7),Coord(2,5,8),Coord(3,6,9))
val coords2: DataSet[Coord] = benv.fromElements(Coord(1,4,7),Coord(2,5,8),Coord(3,6,9))
//3.交叉两个DataSet[Coord],使用自定义方法
val r = coords1.cross(coords2) {
(c1, c2) =>{
val dist =(c1.x + c2.x) +(c1.y + c2.y)
(c1.id, c2.id, dist)
}
}
//4.显示结果
r.collect
执行结果:
res65: Seq[(Int, Int, Int)] = Buffer(
(1,1,22), (2,1,24), (3,1,26),
(1,2,24), (2,2,26), (3,2,28),
(1,3,26), (2,3,28), (3,3,30))