Spark Stage Test Questions

Glace.♥

Category: spark

Article link: https://blog.csdn.net/LULU_lulu666/article/details/103276038

1. The difference between map and flatMap

map and flatMap in RDD.scala

package com.grace.updateState

import org.apache.spark.{SparkConf, SparkContext}

object MapAndFlatMap {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("map_flatMap_demo").setMaster("local"))
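    val arrayRDD = sc.parallelize(Array("a_b", "c_d", "e_f"))
    arrayRDD.foreach(println) // Output 1
    //a_b
    //c_d
    //e_f

    // map: each string yields one Array[String], so the result is an RDD[Array[String]]
    arrayRDD.map(string => {
      string.split("_")
    }).foreach(x => {
      println(x.mkString(",")) // Output 2
      //a,b
      //c,d
      //e,f
    })

    // flatMap: the arrays are flattened, so the result is an RDD[String]
    arrayRDD.flatMap(string => {
      string.split("_")
    }).foreach(x => {
      println(x) // Output 3 (x is already a String, so mkString is unnecessary)
      //a
      //b
      //c
      //d
      //e
      //f
    })

    sc.stop() // release the local SparkContext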
  }
}

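Comparing output 2 with output 3, the conclusion is straightforward:

After map, the RDD's contents are equivalent to Array(Array("a","b"),Array("c","d"),Array("e","f"))

After flatMap, the RDD's contents are equivalent to Array("a","b","c","d","e","f")

In other words, flatMap(func) applies func to every element of the source and then flattens the arrays it returns into a single collection, whereas map(func) keeps exactly one output element per input element.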
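The same behaviour can be reproduced without a Spark cluster, because plain Scala collections expose map and flatMap with the same semantics. The following is a minimal sketch; the object name MapVsFlatMapLocal is illustrative and not part of the original post, and toList is used only so that the results print and compare by value.

```scala
object MapVsFlatMapLocal {
  // Same sample data as the RDD example above
  val data: List[String] = List("a_b", "c_d", "e_f")

  // map: one output element per input element, so the nesting is preserved
  val mapped: List[List[String]] = data.map(s => s.split("_").toList)

  // flatMap: same function, but the per-element results are concatenated
  val flattened: List[String] = data.flatMap(s => s.split("_").toList)

  def main(args: Array[String]): Unit = {
    println(mapped)    // List(List(a, b), List(c, d), List(e, f))
    println(flattened) // List(a, b, c, d, e, f)

    // flatMap is equivalent to map followed by flatten
    assert(flattened == mapped.flatten)
  }
}
```

The final assertion states the relationship compactly: flatMap(f) gives the same result as map(f) followed by flatten.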

Source code

 /**
   * Return a new RDD by applying a function to all elements of this RDD.
   */
  def map[U: ClassTag](f: T => U): RDD[U] = withScope {
    val cleanF = sc.clean(f)
    new MapPartitionsRDD[U, T](this, (context, pid, iter) => iter.map(cleanF))
  }

  /**
   *  Return a new RDD by first applying a function to all elements of this
   *  RDD, and then flattening the results.
   */
  def flatMap[U: ClassTag](f: T => TraversableOnce[U]): RDD[U] = withScope {
    val cleanF = sc.clean(f)
    new MapPartitionsRDD[U, T](this, (context, pid, iter) => iter.flatMap(cleanF))
  }
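Both methods above build a MapPartitionsRDD and differ only in whether iter.map or iter.flatMap is called on each partition's iterator. The following sketch imitates that per-partition step with plain Scala iterators. It is not Spark code: the nested List standing in for partitions, the helper splitOnUnderscore, and the object name are all assumptions made for illustration.

```scala
object PartitionIteratorSketch {
  // Hypothetical stand-in for an RDD with two partitions
  val partitions: List[List[String]] = List(List("a_b", "c_d"), List("e_f"))

  // The user function; it returns a List so that results compare by value
  def splitOnUnderscore(s: String): List[String] = s.split("_").toList

  // Like RDD.map: call iter.map(f) on each partition's iterator
  val mapped: List[List[List[String]]] =
    partitions.map(p => p.iterator.map(splitOnUnderscore).toList)

  // Like RDD.flatMap: call iter.flatMap(f) on each partition's iterator
  val flatMapped: List[List[String]] =
    partitions.map(p => p.iterator.flatMap(splitOnUnderscore).toList)

  def main(args: Array[String]): Unit = {
    println(mapped)     // List(List(List(a, b), List(c, d)), List(List(e, f)))
    println(flatMapped) // List(List(a, b, c, d), List(e, f))
  }
}
```

In this sketch, flattening happens inside each partition and the partition boundaries themselves are unchanged, which matches the way the function passed to MapPartitionsRDD operates on one partition's iterator at a time.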

map and flatMap in DStream.scala

Source code

  /** Return a new DStream by applying a function to all elements of this DStream. */
  def
