zipWithIndex
Official documentation:
Zips this RDD with its element indices. The ordering is first based on the partition index and then the ordering of items within each partition. So the first item in the first partition gets index 0, and the last item in the last partition receives the largest index. This is similar to Scala's zipWithIndex but it uses Long instead of Int as the index type. This method needs to trigger a spark job when this RDD contains more than one partitions.
Function prototype:
def zipWithIndex(): JavaPairRDD[T, JLong]
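This function pairs each element of the RDD with that element's index within the RDD, producing an RDD of key/value pairs in which the element is the key and its Long index is the value.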
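To make the index assignment concrete, here is a minimal sketch in plain Java that models the rule from the documentation without any Spark dependency. It is not Spark code: the class `ZipWithIndexSketch` and its helper are hypothetical names, and nested lists stand in for the partitions of an RDD. It assumes the indices can be derived by first computing a starting offset for each partition (the total size of all earlier partitions) and then counting upward inside each partition.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an RDD: each inner list plays the role of one partition.
public class ZipWithIndexSketch {

    // Pairs every element with a global Long index, ordered by partition
    // first and by position within the partition second.
    static <T> List<Map.Entry<T, Long>> zipWithIndex(List<List<T>> partitions) {
        // Step 1: the starting offset of a partition is the combined size
        // of all partitions before it.
        long[] startIndices = new long[partitions.size()];
        long running = 0L;
        for (int p = 0; p < partitions.size(); p++) {
            startIndices[p] = running;
            running += partitions.get(p).size();
        }

        // Step 2: inside each partition, count upward from its offset.
        List<Map.Entry<T, Long>> result = new ArrayList<>();
        for (int p = 0; p < partitions.size(); p++) {
            long index = startIndices[p];
            for (T item : partitions.get(p)) {
                result.add(new SimpleEntry<>(item, index++));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Three simulated partitions of sizes 2, 1 and 2.
        List<List<String>> partitions = List.of(
                List.of("a", "b"),
                List.of("c"),
                List.of("d", "e"));

        for (Map.Entry<String, Long> e : zipWithIndex(partitions)) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Running `main` prints `a -> 0`, `b -> 1`, `c -> 2`, `d -> 3`, `e -> 4`: the first item of the first partition gets index 0 and the last item of the last partition gets the largest index. Step 1 needs the size of every partition except the last, which suggests why the documentation says a job has to run when the RDD has more than one partition.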
Source code analysis:
def zipWithIndex(): RDD[(T, Long)] = withSco