spark
职场江湖指北
Spark Study Notes: Core Operators (Part 2)
The distinct operator: /** Return a new RDD containing the distinct elements in this RDD. */ def distinct(numPartitions: Int)(implicit ord: Ordering[T] = null): RDD[T] = withScope { def removeDuplicatesInPartition(partition: Iterator[… (excerpt truncated)
Original · 2021-09-22 23:04:51 · 144 views · 0 comments
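The excerpt above stops partway through `removeDuplicatesInPartition`, so the rest of the method is not shown here. As a rough illustration of the idea the signature suggests (route equal elements to the same partition, then drop repeats inside each partition), here is a minimal sketch on plain Scala collections. It does not use Spark, and `DistinctSketch` and the body of both methods are assumptions made for illustration, not the code from the Spark source.

```scala
import scala.collection.mutable

// Hypothetical stand-alone model of "distinct with numPartitions".
// Plain collections only; this is NOT Spark's implementation.
object DistinctSketch {

  // Drop repeated elements within one partition, keeping first occurrences.
  def removeDuplicatesInPartition[T](partition: Iterator[T]): Iterator[T] = {
    val seen = mutable.LinkedHashSet.empty[T]
    partition.foreach(seen += _)
    seen.iterator
  }

  // Send every element to a partition chosen by its hash, so that equal
  // elements always meet in the same partition, then dedupe each partition.
  def distinct[T](data: Seq[T], numPartitions: Int): Seq[T] = {
    val partitions = data.groupBy { x =>
      val raw = x.## % numPartitions            // ## is a null-safe hashCode
      if (raw < 0) raw + numPartitions else raw // keep the index non-negative
    }
    partitions.values.flatMap(p => removeDuplicatesInPartition(p.iterator)).toSeq
  }
}
```

Because equal elements hash to the same partition, deduplicating each partition independently is enough to make the overall result distinct. The order of the returned elements depends on how the partitions are iterated, so callers should not rely on it.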
Spark Study Notes: Core Operators (Part 1)
How HashPartitioner decides the partition. Core method: def getPartition(key: Any): Int = key match { case null => 0 case _ => Utils.nonNegativeMod(key.hashCode, numPartitions) } /* Calculates 'x' modulo 'mod', takes to consideration sign of x, … (excerpt truncated)
Original · 2021-09-06 10:40:29 · 267 views · 0 comments
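The `getPartition` logic quoted above can be tried without a Spark dependency. The sketch below copies the pattern match from the excerpt and supplies a body for `nonNegativeMod`. The excerpt is cut off before that body appears, so the version here is an assumption based on the quoted comment (a modulo that accounts for the sign of x). `HashPartitionSketch` is a hypothetical name, and `numPartitions` is passed as a parameter rather than held as a field.

```scala
// Hypothetical stand-alone sketch of the partition-selection logic.
object HashPartitionSketch {

  // Scala's % keeps the sign of the left operand, so a negative hashCode
  // would give a negative remainder. Shift such results back into [0, mod).
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val rawMod = x % mod
    rawMod + (if (rawMod < 0) mod else 0)
  }

  // Same shape as the quoted method: null keys go to partition 0,
  // every other key goes to hashCode modulo the partition count.
  def getPartition(key: Any, numPartitions: Int): Int = key match {
    case null => 0
    case _    => nonNegativeMod(key.hashCode, numPartitions)
  }
}
```

For example, `HashPartitionSketch.getPartition(-1, 3)` yields 2: the raw remainder `-1 % 3` is -1, and adding the modulus brings it back into the valid range of partition indices.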