DateSet的API详解十三
crossWithTiny
def crossWithTiny[O](other: DataSet[O]): CrossDataSet[T, O]
Special cross operation for explicitly telling the system that the right side
is assumed to be a lot smaller than the left side of the cartesian product.
暗示第二个输入较小的交叉。
拿第一个输入的每一个元素和第二个输入的每一个元素进行交叉操作。
执行程序:
//1.定义 case class
case class Coord(id: Int, x: Int, y: Int)
//2.定义两个DataSet[Coord]
val coords1: DataSet[Coord] = benv.fromElements(Coord(1,4,7),Coord(2,5,8),Coord(3,6,9))
val coords2: DataSet[Coord] = benv.fromElements(Coord(10,40,70),Coord(20,50,80),Coord(30,60,90))
//3.交叉两个DataSet[Coord],暗示第二个输入较小
val result1 = coords1.crossWithTiny(coords2)
//4.显示结果
result1.collect
执行结果:
res67: Seq[(Coord, Coord)] = Buffer(
(Coord(1,4,7),Coord(10,40,70)), (Coord(1,4,7),Coord(20,50,80)), (Coord(1,4,7),Coord(30,60,90)),
(Coord(2,5,8),Coord(10,40,70)), (Coord(2,5,8),Coord(20,50,80)), (Coord(2,5,8),Coord(30,60,90)),
(Coord(3,6,9),Coord(10,40,70)), (Coord(3,6,9),Coord(20,50,80)), (Coord(3,6,9),Coord(30,60,90)))
web ui中的执行效果:
crossWithHuge
def crossWithHuge[O](other: DataSet[O]): CrossDataSet[T, O]
Special cross operation for explicitly telling the system that the left side
is assumed to be a lot smaller than the right side of the cartesian product.
暗示第二个输入较大的交叉。
拿第一个输入的每一个元素和第二个输入的每一个元素进行交叉操作。
执行程序:
//1.定义 case class
case class Coord(id: Int, x: Int, y: Int)
//2.定义两个DataSet[Coord]
val coords1: DataSet[Coord] = benv.fromElements(Coord(1,4,7),Coord(2,5,8),Coord(3,6,9))
val coords2: DataSet[Coord] = benv.fromElements(Coord(10,40,70),Coord(20,50,80),Coord(30,60,90))
//3.交叉两个DataSet[Coord],暗示第二个输入较大
val result1 = coords1.crossWithHuge(coords2)
//4.显示结果
result1.collect
执行结果:
res68: Seq[(Coord, Coord)] = Buffer(
(Coord(1,4,7),Coord(10,40,70)), (Coord(2,5,8),Coord(10,40,70)), (Coord(3,6,9),Coord(10,40,70)),
(Coord(1,4,7),Coord(20,50,80)), (Coord(2,5,8),Coord(20,50,80)), (Coord(3,6,9),Coord(20,50,80)),
(Coord(1,4,7),Coord(30,60,90)), (Coord(2,5,8),Coord(30,60,90)), (Coord(3,6,9),Coord(30,60,90)))
web ui中的执行效果: