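Overview
Overall requirement: sort the records in Array("laoduan 30 99", "laozhao 29 9999", "laozhang 28 98", "laoyang 28 99").
Sorting rule: sort by face value (fv, an attractiveness score) in descending order first; when two records have the same face value, sort them by age in ascending order. Several ways of implementing this sort are listed below.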
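Before involving Spark, the rule itself can be checked on an ordinary Scala collection. The following is only a minimal sketch: the object name SortRuleSketch and the helpers parse and sortUsers are hypothetical names introduced here for illustration, and the field order (name, age, fv) is assumed from the sample data.

```scala
// Plain Scala collections only; no Spark involved.
// SortRuleSketch, parse and sortUsers are illustrative names, not part of the original code.
object SortRuleSketch {
  // Parse one "name age fv" record into a (name, age, fv) tuple.
  def parse(line: String): (String, Int, Int) = {
    val fields = line.split(" ")
    (fields(0), fields(1).toInt, fields(2).toInt)
  }

  // Sort by fv descending, then by age ascending.
  // The key is (fv, age); the explicit Ordering reverses only the first component.
  def sortUsers(lines: Seq[String]): Seq[(String, Int, Int)] =
    lines.map(parse).sortBy { case (_, age, fv) => (fv, age) }(
      Ordering.Tuple2(Ordering.Int.reverse, Ordering.Int)
    )

  def main(args: Array[String]): Unit = {
    val users = Seq("laoduan 30 99", "laozhao 29 9999", "laozhang 28 98", "laoyang 28 99")
    // Expected order of names: laozhao, laoyang, laoduan, laozhang
    sortUsers(users).foreach(println)
  }
}
```

With the sample data, laozhao comes first because 9999 is the highest face value; laoyang and laoduan tie at 99, so the younger laoyang (28) precedes laoduan (30); laozhang (98) comes last.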
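Summary of approaches
Method 1
Define a User class that extends Ordered[User], so the type parameter is the User class itself. The class also mixes in Serializable, because User instances are sent over the network during the shuffle. All attributes are passed in through the constructor, and the compare method is overridden to implement the sorting rule.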
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object CustomSort1 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("CustomSort1").setMaster("local[*]")
    val sc = new SparkContext(conf)
    // Each record has three fields: name, age, and face value (fv).
    // Sorting rule: fv descending; if fv is equal, age ascending.
    val users = Array("laoduan 30 99", "laozhao 29 9999", "laozhang 28 98", "laoyang 28 99")
    // Parallelize the driver-side data into an RDD
    val lines: RDD[String] = sc.parallelize(users)
    // Split each line and build a User from its fields
    val userRDD: RDD[User] = lines.map(line => {
      val fields = line.split(" ")
      val name = fields(0)
      val age = fields(1).toInt
      val fv = fields(2).toInt
      new User(name, age, fv)
    })
    // Sort the User objects in the RDD; the ordering comes from User.compare
    val sorted: RDD[User] = userRDD.sortBy(u => u)
    val r = sorted.collect()
    println(r.toBuffer)
    sc.stop()
  }
}
class User(val name: String, val age: Int, val fv: Int) extends Ordered[User] with Serializable {
  override def compare(that: User): Int = {
    if (this.fv == that.fv) {
      this.age - that.age
    } else {
      -(this.