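The map operation iterates over every element of an RDD, applies a function to each element, and returns the results as a new RDD. The code is as follows: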
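import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.VoidFunction;

public static void myMap() {
    List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
    SparkConf conf = new SparkConf()
            .setMaster("local")
            .setAppName("myMap");
    JavaSparkContext sc = new JavaSparkContext(conf);
    JavaRDD<Integer> numberRdd = sc.parallelize(numbers);
    // In Function<Integer, Integer>, the first type parameter is the input type and the second is the output type
    JavaRDD<Integer> numMapRdd = numberRdd.map(new Function<Integer, Integer>() {
        private static final long serialVersionUID = 1L;
        @Override
        public Integer call(Integer num) throws Exception {
            // Add 2 to each element
            return num + 2;
        }
    });
    // Print every element of the mapped RDD
    numMapRdd.foreach(new VoidFunction<Integer>() {
        @Override
        public void call(Integer num) throws Exception {
            System.out.println("numbers;" + num);
        }
    });
    // Release the Spark context once the job is done
    sc.close();
}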
Output:
numbers;3
numbers;4
numbers;5
numbers;6
numbers;7
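To see the element-wise semantics of map without starting Spark, the same "+2" transformation can be sketched with plain Java streams. This is only an analogy and does not use the Spark API; the class name MapSketch and the helper addTwo are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapSketch {
    // Hypothetical helper: applies the same "+2" function to every element
    // and returns the results as a new list, leaving the input untouched.
    static List<Integer> addTwo(List<Integer> numbers) {
        return numbers.stream()
                .map(num -> num + 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        // Prints numbers;3 through numbers;7, one per line
        for (Integer num : addTwo(numbers)) {
            System.out.println("numbers;" + num);
        }
    }
}
```

As with the RDD version, the input collection is not modified; map produces a new collection of the same size. On Java 8 or later, the anonymous Function class passed to the RDD's map can likewise be written as the lambda num -> num + 2.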