Spark map source code
/**
 * Return a new RDD by applying a function to all elements of this RDD.
 */
def map[U: ClassTag](f: T => U): RDD[U] = withScope {
  val cleanF = sc.clean(f)
  new MapPartitionsRDD[U, T](this, (context, pid, iter) => iter.map(cleanF))
}
Scala map source code
/** Creates a new iterator that maps all produced values of this iterator
 *  to new values using a transformation function.
 *
 *  @param f the transformation function
 *  @return a new iterator which transforms every value produced by this
 *          iterator by applying the function `f` to it.
 *  @note Reuse: $consumesAndProducesIterator
 */
def map[B](f: A => B): Iterator[B] = new AbstractIterator[B] {
  def hasNext = self.hasNext
  def next() = f(self.next())
}
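Spark's map applies the passed-in function f to every element of each partition's iterator in the parent RDD. Under the hood it delegates to Scala's Iterator.map: the wrapped iterator's next() calls f on each element as that element is pulled, so elements are transformed one at a time rather than all up front. The result is a new RDD with the same number of partitions as the parent.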
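
The behaviour described above can be reproduced without Spark. The following is a minimal sketch, assuming a plain Seq[Seq[T]] as a stand-in for an RDD's partitions; MapSketch and mapPartitions are hypothetical names used only for illustration and are not Spark APIs. It shows that f runs only when the iterator is consumed and that the partition count does not change.

```scala
object MapSketch {
  // Hypothetical helper (not a Spark API): each inner Seq stands in for one
  // partition. Like MapPartitionsRDD, it wraps each partition's iterator
  // with Iterator.map instead of computing the results eagerly.
  def mapPartitions[T, U](partitions: Seq[Seq[T]], f: T => U): Seq[Iterator[U]] =
    partitions.map(p => p.iterator.map(f))

  def main(args: Array[String]): Unit = {
    var calls = 0 // counts how many times the function is actually invoked
    val double: Int => Int = x => { calls += 1; x * 2 }

    val partitions = Seq(Seq(1, 2), Seq(3, 4, 5))
    val mapped = mapPartitions(partitions, double)

    // Iterator.map is lazy: the iterators exist, but f has not run yet.
    println(s"calls before consuming: $calls")

    // Pulling elements triggers next(), which calls f once per element.
    val result = mapped.map(_.toList)
    println(s"calls after consuming: $calls")
    println(s"partitions: ${result.size}, values: $result")
  }
}
```

Running main should report 0 calls before the iterators are consumed and 5 afterwards, with two partitions both before and after the mapping. In real Spark the consumption step corresponds to an action such as collect or count.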