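I know Spark evaluates lazily, but is this the expected behaviour?
With the program below, the last count printed is 20.
But if the print statement

    System.out.println("/// After "+MainRDD.count());

is uncommented, the last count printed is 40.
I am not doing this in my real application; I wrote this program only to demonstrate the behaviour.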
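
    import java.util.ArrayList;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    // The class name is a placeholder; only the body of main matters here.
    public class JavaSparkSQL {
        public static void main(String[] args) {
            SparkConf sparkConf = new SparkConf().setMaster("local").setAppName("JavaSparkSQL");
            JavaSparkContext sc = new JavaSparkContext(sparkConf);
            JavaRDD<Integer> MainRDD;
            ArrayList<Integer> list = new ArrayList<>();
            JavaRDD<Integer> tmp;

            // Fill the list with 0..19 and build the initial RDD from it.
            for (int i = 0; i < 20; i++) {
                list.add(i);
            }
            MainRDD = sc.parallelize(list);
            System.out.println("//First "+MainRDD.count());

            // Reuse the same list object for every later batch.
            list.clear();
            for (int i = 20; i < 25; i++) {
                for (int j = 1; j < 5; j++) {
                    list.add(i*j);
                }
                tmp = sc.parallelize(list);
                // System.out.println("/// Before "+MainRDD.count());
                MainRDD = MainRDD.union(tmp);
                // System.out.println("/// After "+MainRDD.count());
                list.clear();
            }
            System.out.println("/// last "+MainRDD.count());
        }
    }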
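
To narrow down what I think is happening, here is a Spark-free sketch that reproduces the same 20 versus 40 pattern. It rests on an assumption I have not confirmed: that `parallelize` keeps a reference to the list and only copies its contents when the first action runs on that RDD. `LazyData` and `run` are hypothetical names made up for this illustration; they are not part of the Spark API.

```java
import java.util.ArrayList;
import java.util.List;

class LazySnapshotDemo {

    // Hypothetical stand-in for an RDD: a leaf holds a reference to its
    // source list and copies it only the first time count() is called.
    static final class LazyData {
        private final List<Integer> source; // shared reference (null for a union node)
        private final LazyData left;
        private final LazyData right;
        private List<Integer> snapshot;     // filled in by the first count()

        private LazyData(List<Integer> source, LazyData left, LazyData right) {
            this.source = source;
            this.left = left;
            this.right = right;
        }

        static LazyData of(List<Integer> source) {
            return new LazyData(source, null, null);
        }

        LazyData union(LazyData other) {
            return new LazyData(null, this, other);
        }

        int count() {
            if (source == null) {
                return left.count() + right.count();
            }
            if (snapshot == null) {
                snapshot = new ArrayList<>(source); // contents are frozen here
            }
            return snapshot.size();
        }
    }

    // Mirrors the program above; countInsideLoop plays the role of the
    // commented-out "After" print statement.
    static int run(boolean countInsideLoop) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            list.add(i);
        }
        LazyData main = LazyData.of(list);
        main.count(); // first action: freezes the 20 initial elements
        list.clear();
        for (int i = 20; i < 25; i++) {
            for (int j = 1; j < 5; j++) {
                list.add(i * j);
            }
            LazyData tmp = LazyData.of(list);
            main = main.union(tmp);
            if (countInsideLoop) {
                main.count(); // freezes tmp while it still holds 4 elements
            }
            list.clear();
        }
        return main.count();
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints 20
        System.out.println(run(true));  // prints 40
    }
}
```

If that assumption holds for Spark as well, passing a copy such as `new ArrayList<>(list)` to `parallelize` should make the count independent of when the first action runs.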