While learning Spark, I ran into a problem where partitionBy() could not be called: IDEA offered no completion and reported that the method did not exist. Here is how I tracked it down.
import java.net.URL

// Re-key each (url, count) record by the URL's host: (host, (url, count))
val rdd3 = rdd2.map(t => {
  val url = t._1
  val host = new URL(url).getHost
  (host, (url, t._2))
})
The cause: I had written (host, (url, t._2)) as the flat 3-tuple (host, url, t._2). partitionBy() is a method of PairRDDFunctions, which is added via implicit conversion only to RDDs of (key, value) pairs, i.e. RDD[(K, V)]. An RDD of 3-tuples never picks up that conversion, so the method simply is not there. The relevant Spark source:
/**
* Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
*/
class PairRDDFunctions[K, V](self: RDD[(K, V)])
(implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null)
extends Logging with Serializable {
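To see the type difference without spinning up Spark, here is a minimal plain-Scala sketch of the same map over a Seq (the sample URLs and counts are made up for illustration). The nested-pair version has the (K, V) shape that PairRDDFunctions requires; the 3-tuple version does not, which is why partitionBy() disappears on such an RDD.

```scala
import java.net.URL

object HostKeyDemo {
  // Hypothetical sample data standing in for rdd2's (url, count) elements
  val records: Seq[(String, Int)] = Seq(
    ("http://spark.apache.org/docs", 3),
    ("http://spark.apache.org/examples", 1),
    ("http://kafka.apache.org/intro", 2)
  )

  // Correct shape: nested pair (host, (url, count)) -> an RDD of this type
  // is RDD[(String, (String, Int))], a (K, V) RDD, so partitionBy() is available
  val paired: Seq[(String, (String, Int))] = records.map { case (url, n) =>
    (new URL(url).getHost, (url, n))
  }

  // Wrong shape: flat 3-tuple (host, url, count) -> an RDD of this type
  // is RDD[(String, String, Int)], NOT a (K, V) pair, so the implicit
  // conversion to PairRDDFunctions does not apply
  val tripled: Seq[(String, String, Int)] = records.map { case (url, n) =>
    (new URL(url).getHost, url, n)
  }

  def main(args: Array[String]): Unit = {
    paired.foreach(println)
  }
}
```

With the nested-pair shape, the real RDD call then works as expected, e.g. `rdd3.partitionBy(new HashPartitioner(3))`.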