Scala has two traits for sorting: Ordered and Ordering. Ordered extends Java's Comparable interface, and Ordering extends Java's Comparator interface.
trait Ordered[A] extends scala.Any with java.lang.Comparable[A]
trait Ordering[T] extends java.lang.Object with java.util.Comparator[T]
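The difference between the two traits can be sketched in plain Scala, without Spark. The `Version` type and the `OrderedVsOrdering` object below are hypothetical names introduced only for illustration; they do not appear in the original example.

```scala
// Hypothetical example type, not from the article.
final case class Version(major: Int, minor: Int) extends Ordered[Version] {
  // Ordered: the type itself knows how to compare with another instance.
  def compare(that: Version): Int =
    if (major != that.major) Integer.compare(major, that.major)
    else Integer.compare(minor, that.minor)
}

object OrderedVsOrdering {
  // Ordering: a separate comparator object, here sorting the newest version first.
  val newestFirst: Ordering[Version] = new Ordering[Version] {
    def compare(a: Version, b: Version): Int = b.compare(a)
  }

  val versions: List[Version] = List(Version(1, 2), Version(2, 0), Version(1, 0))

  // Uses the natural order that the standard library derives from Ordered.
  val ascending: List[Version] = versions.sorted
  // Uses the external comparator instead of the natural order.
  val descending: List[Version] = versions.sorted(newestFirst)
}
```

Ordered fits when a type has one natural order; Ordering is handy when the same type needs several orders, or when the class cannot be modified.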
Here we use Ordered to implement the sort.
Define an array:
val girl: Array[String] = Array("reba,18,80", "mimi,22,70", "liya,30,80", "jingtian,18,85")
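Sort by age and weight.
There are two sort conditions: the younger one comes first, and when ages are equal, the heavier one comes first.
Sorting with sortBy(func) on a single key does not meet this requirement: sortBy(_._2), for example, sorts by only one condition.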
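A minimal sketch of the problem, using plain Scala collections rather than an RDD (an assumption made so the example runs without Spark). The `SingleKeySort` object and the tuple layout `(name, age, weight)` are hypothetical and only serve the illustration.

```scala
object SingleKeySort {
  // (name, age, weight) tuples built from the same sample rows as above.
  val girls: Array[(String, Int, Int)] = Array(
    ("reba", 18, 80), ("mimi", 22, 70), ("liya", 30, 80), ("jingtian", 18, 85))

  // Sort by age only. The collection sortBy is stable, so ties keep their input order.
  val byAgeOnly: Array[(String, Int, Int)] = girls.sortBy(_._2)

  // Names in sorted order: reba (80) still precedes jingtian (85),
  // although the requirement puts the heavier one first among equal ages.
  val names: List[String] = byAgeOnly.map(_._1).toList
}
```

With a single key, the two 18-year-olds stay in their original order, so the second condition is never applied.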
Define a custom class that carries its own ordering:
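class Girl(val name: String, val age: Int, val weight: Int) extends Ordered[Girl] with Serializable {
  // Override the compare method
  override def compare(that: Girl): Int = {
    // When ages are equal, compare by weight.
    // A negative result places this before that, so negating the
    // difference reverses the order and puts the heavier one first.
    if (this.age == that.age) {
      -(this.weight - that.weight)
    } else {
      // Otherwise the younger one comes first
      this.age - that.age
    }
  }
}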
Because the Girl class defines its own ordering, it can be sorted with sortBy(s => s).
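Why the identity key works can be sketched with plain Scala collections (an assumption, chosen so the example runs without Spark). The class is redefined so the block is self-contained, and `IdentitySortDemo` is a hypothetical name. This sketch swaps the subtraction for `Integer.compare`, which gives the same order for these values and also avoids overflow for extreme Int inputs.

```scala
// Same shape as the class above, redefined here so the sketch is self-contained.
class Girl(val name: String, val age: Int, val weight: Int) extends Ordered[Girl] {
  override def compare(that: Girl): Int =
    if (age == that.age) Integer.compare(that.weight, weight) // equal age: heavier first
    else Integer.compare(age, that.age)                        // otherwise: younger first
}

object IdentitySortDemo {
  val girls: List[Girl] = List(
    new Girl("reba", 18, 80), new Girl("mimi", 22, 70),
    new Girl("liya", 30, 80), new Girl("jingtian", 18, 85))

  // sortBy(s => s) needs an implicit Ordering[Girl]. The standard library
  // derives one for types that implement Comparable, which Ordered does.
  val sortedNames: List[String] = girls.sortBy(s => s).map(_.name)
}
```

Here `sortedNames` comes out as jingtian, reba, mimi, liya: the two 18-year-olds first with the heavier one leading, then by increasing age.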
Complete code:
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}
/**
 * Custom sorting:
 * sort by age; when ages are equal, sort by weight.
 */
object MySort1 {
  def main(args: Array[String]): Unit = {
    // 1. Initialize the SparkContext, the entry point of a Spark program
    val conf: SparkConf = new SparkConf().setAppName("MySort1").setMaster("local[2]")
    val sc: SparkContext = new SparkContext(conf)
    // 2. Create the array
    val girl: Array[String] = Array("reba,18,80", "mimi,22,70", "liya,30,80", "jingtian,18,85")
    // 3. Convert it to an RDD
    val grdd1: RDD[String] = sc.parallelize(girl)
    // 4. Split each line into fields
    val grdd2: RDD[Girl] = grdd1.map(line => {
      val field: Array[String] = line.split(",")
      // Extract each attribute
      val name = field(0)
      val age = field(1).toInt
      val weight = field(2).toInt
      new Girl(name, age, weight)
    })
    // 5. Sort using the ordering defined by Girl itself
    val sorted: RDD[Girl] = grdd2.sortBy(s => s)
    val r = sorted.collect()
    println(r.toBuffer)
    sc.stop()
  }

  class Girl(val name: String, val age: Int, val weight: Int) extends Ordered[Girl] with Serializable {
    override def compare(that: Girl): Int = {
      // When ages are equal, the heavier one comes first
      if (this.age == that.age) {
        // Negating the difference reverses the order (descending by weight)
        -(this.weight - that.weight)
      } else {
        // The younger one comes first
        this.age - that.age
      }
    }

    // Output labels: 名字 = name, 年龄 = age, 体重 = weight
    override def toString: String = s"名字:$name,年龄:$age,体重:$weight"
  }
}
Sorted result:
ArrayBuffer(名字:jingtian,年龄:18,体重:85, 名字:reba,年龄:18,体重:80, 名字:mimi,年龄:22,体重:70, 名字:liya,年龄:30,体重:80)