dependencies
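dependencies returns the list of dependencies that an RDD has on its parent RDDs. The class of each entry tells you whether the dependency is narrow (for example OneToOneDependency) or requires a shuffle (ShuffleDependency).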
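// Assumes sc is an existing SparkContext (for example, the one provided by spark-shell).
val rdd1 = sc.parallelize(List(1, 2, 3, 4, 5, 6, 7, 8, 9))
println("rdd1 dependencies: " + rdd1.dependencies)
// map is a narrow transformation: each output partition reads exactly one parent partition.
val rdd2 = rdd1.map(x => (x, 1))
println("rdd2 dependencies: " + rdd2.dependencies)
// reduceByKey repartitions the data by key, so it introduces a shuffle.
val rdd3 = rdd2.reduceByKey(_ + _)
println("rdd3 dependencies: " + rdd3.dependencies)
// rdd3 is already partitioned by key, so groupByKey can reuse that partitioning.
val rdd4 = rdd3.groupByKey()
println("rdd4 dependencies: " + rdd4.dependencies)
Result
rdd1 dependencies: List()
rdd2 dependencies: List(org.apache.spark.OneToOneDependency@60e949e1)
rdd3 dependencies: List(org.apache.spark.ShuffleDependency@57ce634f)
rdd4 dependencies: List(org.apache.spark.OneToOneDependency@6f3f0fae)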
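The object hashes in the output are hard to read, so it can help to classify each dependency by type instead. The following is a minimal sketch, not part of the original notes: the object name DependencyKinds and the helper describe are hypothetical, and it assumes a local master with default settings. It also hints at why rdd4 reports a OneToOneDependency: the result of reduceByKey already carries a partitioner, which groupByKey can reuse under these settings. With a different spark.default.parallelism the outcome may differ.

```scala
import org.apache.spark.{NarrowDependency, ShuffleDependency, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object DependencyKinds {
  // Hypothetical helper: label each direct dependency of an RDD as narrow or shuffle.
  def describe(rdd: RDD[_]): Seq[String] = rdd.dependencies.map {
    case _: ShuffleDependency[_, _, _] => "shuffle"
    case _: NarrowDependency[_]        => "narrow"
    case other                         => other.getClass.getSimpleName
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a single-threaded local master, as the (1) partition counts in the notes suggest.
    val conf = new SparkConf().setAppName("DependencyKinds").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      val pairs   = sc.parallelize(1 to 9).map(x => (x, 1))
      val reduced = pairs.reduceByKey(_ + _)
      val grouped = reduced.groupByKey()

      println("pairs:   " + describe(pairs))   // map: narrow
      println("reduced: " + describe(reduced)) // reduceByKey: shuffle
      println("grouped: " + describe(grouped)) // groupByKey on a key-partitioned RDD: narrow
      // The partitioner set by reduceByKey is what lets groupByKey skip a second shuffle.
      println("same partitioner: " + (reduced.partitioner == grouped.partitioner))
    } finally {
      sc.stop()
    }
  }
}
```

Matching on NarrowDependency rather than OneToOneDependency also covers other narrow cases such as RangeDependency, which union produces.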
toDebugString
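toDebugString prints the lineage of an RDD: the RDD itself followed by the chain of parent RDDs it was derived from.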
// Assumes sc is an existing SparkContext (for example, the one provided by spark-shell).
val rdd1 = sc.parallelize(List(1, 2, 3, 4, 5, 6, 7, 8, 9))
println("rdd1 lineage: " + rdd1.toDebugString)
println("-------------------------------- divider --------------------------------")
val rdd2 = rdd1.map(x => (x, 1))
println("rdd2 lineage: " + rdd2.toDebugString)
println("-------------------------------- divider --------------------------------")
val rdd3 = rdd2.reduceByKey(_ + _)
println("rdd3 lineage: " + rdd3.toDebugString)
println("-------------------------------- divider --------------------------------")
val rdd4 = rdd3.groupByKey()
println("rdd4 lineage: " + rdd4.toDebugString)
Result
rdd1 lineage: (1) ParallelCollectionRDD[0] at parallelize at CheckPoint.scala:14 []
-------------------------------- divider --------------------------------
rdd2 lineage: (1) MapPartitionsRDD[1] at map at CheckPoint.scala:19 []
 |  ParallelCollectionRDD[0] at parallelize at CheckPoint.scala:14 []
-------------------------------- divider --------------------------------
rdd3 lineage: (1) ShuffledRDD[2] at reduceByKey at CheckPoint.scala:24 []
 +-(1) MapPartitionsRDD[1] at map at CheckPoint.scala:19 []
    |  ParallelCollectionRDD[0] at parallelize at CheckPoint.scala:14 []
-------------------------------- divider --------------------------------
rdd4 lineage: (1) MapPartitionsRDD[3] at groupByKey at CheckPoint.scala:29 []
 |  ShuffledRDD[2] at reduceByKey at CheckPoint.scala:24 []
 +-(1) MapPartitionsRDD[1] at map at CheckPoint.scala:19 []
    |  ParallelCollectionRDD[0] at parallelize at CheckPoint.scala:14 []
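The part highlighted in red in the original output marks the RDD where a shuffle occurred. In plain text, the shuffle shows up as the ShuffledRDD created by reduceByKey, and the indented "+-" line beneath it marks the shuffle boundary. The number in parentheses is the partition count of that RDD.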
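The lineage that toDebugString prints can also be walked in code by following dependencies recursively. The sketch below is an illustration only: the object name LineageWalk and the helper countShuffles are hypothetical, and a local master with default settings is assumed. It counts the ShuffleDependency entries in the lineage, which for a simple linear chain like the one above corresponds to the number of "+-" boundaries in the debug string.

```scala
import org.apache.spark.{ShuffleDependency, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object LineageWalk {
  // Hypothetical helper: count shuffle dependencies by walking the lineage depth-first.
  // Caveat: a parent shared by several branches is visited once per branch,
  // so this is only an exact count for linear lineages.
  def countShuffles(rdd: RDD[_]): Int =
    rdd.dependencies.map { dep =>
      val here = dep match {
        case _: ShuffleDependency[_, _, _] => 1
        case _                             => 0
      }
      here + countShuffles(dep.rdd)
    }.sum

  def main(args: Array[String]): Unit = {
    // Assumption: a single-threaded local master with default configuration.
    val conf = new SparkConf().setAppName("LineageWalk").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      val pairs   = sc.parallelize(1 to 9).map(x => (x, 1))
      val reduced = pairs.reduceByKey(_ + _)
      val grouped = reduced.groupByKey()

      println("shuffles behind pairs:   " + countShuffles(pairs))   // no shuffle yet
      println("shuffles behind reduced: " + countShuffles(reduced)) // one, from reduceByKey
      println("shuffles behind grouped: " + countShuffles(grouped)) // still one under these settings
    } finally {
      sc.stop()
    }
  }
}
```

Each shuffle dependency splits a job into separate stages, so for a linear lineage the number of stages an action triggers is typically this count plus one.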