Reference: http://www.tuicool.com/articles/bQVRryr
/**
* Mark this RDD for checkpointing. It will be saved to a file inside the checkpoint
* directory set with `SparkContext#setCheckpointDir` and all references to its parent
* RDDs will be removed. This function must be called before any job has been
* executed on this RDD. It is strongly recommended that this RDD is persisted in
* memory, otherwise saving it on a file will require recomputation.
*/
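This is the comment on the checkpoint() method of RDD in the Spark source code. It recommends persisting the RDD before calling checkpoint(); otherwise the RDD has to be recomputed when it is saved to the checkpoint file.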
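
The order of calls described in that comment might look like the following minimal sketch. The application name, the `local[*]` master, the checkpoint path `/tmp/spark-checkpoint`, and the sample data are all assumptions made for illustration; they do not come from the Spark source.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master and an arbitrary app name, for illustration only.
    val conf = new SparkConf().setAppName("checkpoint-example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Assumption: a hypothetical local path. On a cluster this would
      // normally point to a shared file system such as HDFS.
      sc.setCheckpointDir("/tmp/spark-checkpoint")

      val rdd = sc.parallelize(1 to 10).map(_ * 2)

      // Persist in memory first, as the comment recommends, so that writing
      // the checkpoint does not have to recompute the RDD.
      rdd.cache()

      // Only marks the RDD for checkpointing. This must be called before
      // any job has been executed on this RDD.
      rdd.checkpoint()

      // The first action runs the job; the checkpoint is written as part of it.
      val n = rdd.count()
      println(s"count = $n, isCheckpointed = ${rdd.isCheckpointed}")
    } finally {
      sc.stop()
    }
  }
}
```

Calling `cache()` before `checkpoint()` is the point of the example: without it, saving the RDD to the checkpoint file requires recomputation, which is what the source comment warns about.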