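The map operation iterates over every element of the RDD, applies a function to each one, and returns a new RDD containing the results. The code is as follows: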
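import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.VoidFunction;

public static void myMap() {
    List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
    SparkConf conf = new SparkConf()
            .setMaster("local")
            .setAppName("myMap");
    JavaSparkContext sc = new JavaSparkContext(conf);
    JavaRDD<Integer> numberRdd = sc.parallelize(numbers);
    // The first type parameter of Function is the input type, the second is the output type
    JavaRDD<Integer> numMapRdd = numberRdd.map(new Function<Integer, Integer>() {
        private static final long serialVersionUID = 1L;

        @Override
        public Integer call(Integer num) throws Exception {
            // Add 2 to each element
            return num + 2;
        }
    });
    // Print every element of the mapped RDD
    numMapRdd.foreach(new VoidFunction<Integer>() {
        private static final long serialVersionUID = 1L;

        @Override
        public void call(Integer num) throws Exception {
            System.out.println("numbers;" + num);
        }
    });
    // Release the Spark context once the job has finished
    sc.close();
}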
Result:
numbers;3
numbers;4
numbers;5
numbers;6
numbers;7
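
To see the one-input, one-output behavior of map without starting Spark, the same "+2" function can be sketched with plain Java streams. This is only a local analogy, not Spark code: the class name MapSketch and the helper addTwo are hypothetical names made up for illustration, and a Java list stands in for the RDD.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class for illustration; it does not use Spark at all.
class MapSketch {

    // Hypothetical helper: applies the same "+2" function as the Spark example
    // to every element and collects the results into a new list.
    static List<Integer> addTwo(List<Integer> numbers) {
        return numbers.stream()
                .map(num -> num + 2)           // one output element per input element
                .collect(Collectors.toList()); // the input list is left unchanged
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        for (Integer num : addTwo(numbers)) {
            System.out.println("numbers;" + num);
        }
    }
}
```

As with the RDD version, the input collection is not modified and the result has exactly as many elements as the input. On Java 8 or later, the anonymous Function class in the Spark code can usually be replaced by the same kind of lambda, for example numberRdd.map(num -> num + 2).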