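 * - A list of partitions
 *   Note: an RDD is divided into partitions.
 * - A function for computing each split
 *   Note: the compute function is applied to each split (partition) independently.
 * - A list of dependencies on other RDDs
 *   Note: RDDs depend on one another; each RDD records the parent RDDs it was derived from.
 * - Optionally, a Partitioner for key-value RDDs (e.g. to say that the RDD is hash-partitioned)
 *   Note: a key-value RDD may carry a partitioner, for example a hash partitioner, that decides which partition each key belongs to.
 * - Optionally, a list of preferred locations to compute each split on (e.g. block locations for
 *   an HDFS file)
 *   Note: the scheduler tries to run each computation at the preferred location, close to the data.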
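The five properties listed above can be illustrated with a small toy model. This is a minimal sketch, not Spark's implementation: `MiniRDD`, `SeqRDD`, `MappedRDD`, and `Split` are hypothetical names invented for illustration, and the host names are made up. In Spark's own `RDD` class the corresponding members are `getPartitions`, `compute`, `getDependencies`, `partitioner`, and `getPreferredLocations`.

```scala
// Toy model only: these are NOT Spark's classes, just a sketch of the five properties.
object MiniRddSketch {
  final case class Split(index: Int)

  abstract class MiniRDD[T] {
    def partitions: Seq[Split]                               // 1. a list of partitions
    def compute(split: Split): Iterator[T]                   // 2. a function for computing each split
    def dependencies: Seq[MiniRDD[_]]                        // 3. dependencies on other RDDs
    def partitioner: Option[Any => Int] = None               // 4. optional partitioner (key-value RDDs only)
    def preferredLocations(split: Split): Seq[String] = Nil  // 5. optional preferred locations

    // A transformation builds a new RDD that remembers its parent.
    def map[U](f: T => U): MiniRDD[U] = new MappedRDD(this, f)

    // An action computes every split and gathers the results.
    def collect(): Seq[T] = partitions.flatMap(p => compute(p))
  }

  // Source RDD: spreads the elements round-robin over numSplits partitions.
  class SeqRDD[T](data: Seq[T], numSplits: Int, hosts: Seq[String]) extends MiniRDD[T] {
    def partitions: Seq[Split] = (0 until numSplits).map(i => Split(i))
    def compute(split: Split): Iterator[T] =
      data.zipWithIndex.collect { case (x, i) if i % numSplits == split.index => x }.iterator
    def dependencies: Seq[MiniRDD[_]] = Nil
    override def preferredLocations(split: Split): Seq[String] =
      if (hosts.isEmpty) Nil else Seq(hosts(split.index % hosts.size))
  }

  // Derived RDD: same partitions as the parent, computed by applying f to the parent's split.
  class MappedRDD[T, U](parent: MiniRDD[T], f: T => U) extends MiniRDD[U] {
    def partitions: Seq[Split] = parent.partitions
    def compute(split: Split): Iterator[U] = parent.compute(split).map(f)
    def dependencies: Seq[MiniRDD[_]] = Seq(parent)
    override def preferredLocations(split: Split): Seq[String] =
      parent.preferredLocations(split)
  }

  def main(args: Array[String]): Unit = {
    val base = new SeqRDD(Seq(1, 2, 3, 4, 5, 6), numSplits = 2, hosts = Seq("host-a", "host-b"))
    val doubled = base.map(_ * 2)

    println(s"partitions:          ${doubled.partitions}")
    println(s"dependencies:        ${doubled.dependencies.size} parent RDD(s)")
    println(s"partitioner:         ${doubled.partitioner}")
    println(s"preferred locations: ${doubled.preferredLocations(Split(0))}")
    println(s"collected:           ${doubled.collect()}")
  }
}
```

In this sketch the mapped RDD stores no data of its own: it only knows its parent and the function to apply, so a lost split can be recomputed from the parent. The partitioner stays `None` because the example contains no key-value RDD.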
The Five Main Properties of an RDD