Spark reduceByKey operation

fjr_huoniao

Category: spark  Tags: java, reduceByKey

Article link: https://blog.csdn.net/kimyoungvon/article/details/51417789

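      Running the reduceByKey operator
       // reduceByKey takes a Function2 argument, which has three generic type parameters.
       // The first and second type parameters are the types of the two arguments passed to call();
       // both are the value type of the original RDD.
           // For each key, the reduce conceptually passes the first and second values to call(),
           // then passes that result together with the third value, and so on until one value is left.
       // The third type parameter is the return type of each reduce step; for reduceByKey it is
       // also the value type of the original RDD.
       // The RDD returned by reduceByKey is still a JavaPairRDD<key, value>.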
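
The pairwise folding described above can be sketched without Spark. The following is a minimal plain-Java illustration, not Spark's implementation: the class `ReduceByKeySketch` and its helpers `pair`, `sampleData` and `reduceByKey` are hypothetical names introduced only for this example, and `BinaryOperator<V>` stands in for `Function2<V, V, V>`.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Plain-Java illustration only; this is not how Spark implements reduceByKey.
final class ReduceByKeySketch {

    // Hypothetical helper that builds one (key, value) pair.
    static Map.Entry<String, Integer> pair(String key, Integer value) {
        return new SimpleEntry<>(key, value);
    }

    // The same four pairs used in the article's example.
    static List<Map.Entry<String, Integer>> sampleData() {
        return Arrays.asList(
                pair("c1", 23), pair("c2", 33), pair("c1", 23), pair("c2", 56));
    }

    // Hypothetical helper: folds the values of each key pairwise with func.
    static <K, V> Map<K, V> reduceByKey(List<Map.Entry<K, V>> pairs, BinaryOperator<V> func) {
        Map<K, V> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> p : pairs) {
            // The first value seen for a key is stored as-is; every later value
            // is merged in via func.apply(accumulatedSoFar, nextValue).
            result.merge(p.getKey(), p.getValue(), func);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> sums = reduceByKey(sampleData(), (x, y) -> x + y);
        sums.forEach((k, v) -> System.out.println("key:" + k + ",value:" + v));
    }
}
```

With the sample data, c1 reduces to 23 + 23 = 46 and c2 to 33 + 56 = 89. Both arguments and the result of the function share one type, which mirrors the three identical type parameters of `Function2<Integer, Integer, Integer>`.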

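import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.VoidFunction;

import scala.Tuple2;

public static void myReduceByKey() {

       SparkConf conf = new SparkConf()
               .setMaster("local")
               .setAppName("myReduceByKey");
       JavaSparkContext sc = new JavaSparkContext(conf);
       // Sample (key, value) pairs; the keys c1 and c2 each appear twice
       List<Tuple2<String, Integer>> list = Arrays.asList(
               new Tuple2<String, Integer>("c1", 23), new Tuple2<String, Integer>("c2", 33),
               new Tuple2<String, Integer>("c1", 23), new Tuple2<String, Integer>("c2", 56));
       JavaPairRDD<String, Integer> listRdd = sc.parallelizePairs(list);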
       JavaPairRDD<String, Integer> listReduce=listRdd.reduceByKey(new Function2<Integer,Integer,Integer>(){

           private static final long serialVersionUID = 1L;

           @Override
           public Integer call(Integer x, Integer y) throws Exception {
               // Merge two values that belong to the same key by adding them
               return x+y;
           }

       });
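       // Print each (key, reduced value) pair; with master "local" the output appears on the console
       listReduce.foreach(new VoidFunction<Tuple2<String, Integer>>() {

           private static final long serialVersionUID = 1L;

           @Override
           public void call(Tuple2<String, Integer> tuple) throws Exception {
               System.out.println("key:" + tuple._1() + ",values:" + tuple._2());
           }

       });

       // Release the Spark context once the job has finished
       sc.close();

   }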
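
Spark's documentation describes reduceByKey as merging values locally on each mapper before the results are sent to a reducer, which is why the function is expected to be associative and commutative. The following plain-Java sketch models that idea under simplifying assumptions: partitions are ordinary lists, and the class `PartitionedReduceSketch` with its helpers `pair`, `combine` and `reduceByKey` are hypothetical names for illustration, not Spark APIs.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Plain-Java model of per-partition combining; illustration only, not Spark code.
final class PartitionedReduceSketch {

    // Hypothetical helper that builds one (key, value) pair.
    static Map.Entry<String, Integer> pair(String key, Integer value) {
        return new SimpleEntry<>(key, value);
    }

    // Reduces the pairs of a single partition to one value per key.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> partition,
                                        BinaryOperator<Integer> func) {
        Map<String, Integer> partial = new HashMap<>();
        for (Map.Entry<String, Integer> p : partition) {
            partial.merge(p.getKey(), p.getValue(), func);
        }
        return partial;
    }

    // Combines each partition on its own, then merges the partial results
    // across partitions with the same function.
    static Map<String, Integer> reduceByKey(List<List<Map.Entry<String, Integer>>> partitions,
                                            BinaryOperator<Integer> func) {
        Map<String, Integer> result = new HashMap<>();
        for (List<Map.Entry<String, Integer>> partition : partitions) {
            combine(partition, func).forEach((k, v) -> result.merge(k, v, func));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> data = Arrays.asList(
                pair("c1", 23), pair("c2", 33), pair("c1", 23), pair("c2", 56));

        // Two different ways of splitting the same four pairs.
        List<List<Map.Entry<String, Integer>>> onePartition = Collections.singletonList(data);
        List<List<Map.Entry<String, Integer>>> twoPartitions =
                Arrays.asList(data.subList(0, 2), data.subList(2, 4));

        System.out.println(reduceByKey(onePartition, Integer::sum));
        System.out.println(reduceByKey(twoPartitions, Integer::sum));
    }
}
```

Because addition is associative and commutative, both splits yield the same sums (46 for c1, 89 for c2). A function without those properties, such as subtraction, could give results that depend on how the data happens to be partitioned.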