1. Tuple type
val products = sc.parallelize(List("屏保 20 10","支架 20 1000","酒精棉 5 2000","吸氧机 5000 1000"))
val productData = products.map(x=>{
val splits = x.split(" ")
val name = splits(0)
val price = splits(1).toDouble
val amount = splits(2).toInt
(name,price,amount)
})
/**
* Sort by price, descending
*/
productData.sortBy(_._2,false).collect().foreach(println)
/**
* To sort on multiple fields, have the key function passed to sortBy return a tuple
* Price descending, then stock descending
*/
productData.sortBy(x=>(-x._2,-x._3)).collect().foreach(println)
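Negating the key only works for numeric fields. As a minimal sketch of an alternative, the example below builds a reversed tuple Ordering and passes it explicitly. It uses a plain Scala List instead of an RDD so that it runs without a Spark context; this is an assumption made for illustration, and since RDD.sortBy also takes its Ordering as an implicit parameter, the same Ordering should be usable there. The names TupleSortSketch, descOrdering, byNegatedKey and byReversedOrdering are hypothetical.

```scala
object TupleSortSketch {
  // Same sample data as above, already parsed into (name, price, stock) tuples.
  val productData: List[(String, Double, Int)] = List(
    ("屏保", 20.0, 10),
    ("支架", 20.0, 1000),
    ("酒精棉", 5.0, 2000),
    ("吸氧机", 5000.0, 1000)
  )

  // Approach from the text: negate both numeric fields to get descending order.
  val byNegatedKey: List[(String, Double, Int)] =
    productData.sortBy(x => (-x._2, -x._3))

  // Alternative: keep the key unchanged and reverse each component's Ordering.
  val descOrdering: Ordering[(Double, Int)] =
    Ordering.Tuple2(Ordering[Double].reverse, Ordering[Int].reverse)

  val byReversedOrdering: List[(String, Double, Int)] =
    productData.sortBy(x => (x._2, x._3))(descOrdering)
}
```

Both lists come out in the same order: price descending, with ties on price broken by stock descending.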
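2. Entity class (Bean)
If the entity class does not extend Ordered, compilation fails with the first two errors below; if it does not extend Serializable, the job fails at runtime with the exception shown after the dashed line.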
Error:(20, 23) No implicit Ordering defined for com.bigdata.sort.Products.
productData.sortBy(x=>x).collect().foreach(println)
Error:(20, 23) not enough arguments for method sortBy: (implicit ord: Ordering[com.bigdata.sort.Products], implicit ctag: scala.reflect.ClassTag[com.bigdata.sort.Products])org.apache.spark.rdd.RDD[com.bigdata.sort.Products].
Unspecified value parameters ord, ctag.
productData.sortBy(x=>x).collect().foreach(println)
----------------------------------------------------------
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1.0 in stage 0.0 (TID 1) had a not serializable result: com.bigdata.sort.Products
The entity class is defined as follows:
class Products(val name:String,val price:Double,val amount:Int) extends Ordered[Products] with Serializable {
override def toString: String = name + "," + price + "," + amount
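/**
* Stock descending
* @param that the product to compare against
* @return a negative value if this product has more stock than that, a positive value if it has less, zero if equal
*/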
override def compare(that: Products): Int = {
that.amount - this.amount
}
}
val products = sc.parallelize(List("屏保 20 10","支架 20 1000","酒精棉 5 2000","吸氧机 5000 1000"))
val productData = products.map(x=>{
val splits = x.split(" ")
val name = splits(0)
val price = splits(1).toDouble
val amount = splits(2).toInt
new Products(name,price,amount)
})
productData.sortBy(x=>x).collect().foreach(println)
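The two requirements can be checked separately and without a Spark context. The sketch below is only an illustration under assumptions: it uses java.io.ObjectOutputStream as a stand-in for Java serialization of task results, and a plain List instead of an RDD. EntityCheckSketch, PlainProducts, isJavaSerializable and sortedNames are hypothetical names, not part of the original code.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object EntityCheckSketch {
  // Same shape as the entity class above: ordered by stock, descending.
  class Products(val name: String, val price: Double, val amount: Int)
      extends Ordered[Products] with Serializable {
    override def toString: String = name + "," + price + "," + amount
    override def compare(that: Products): Int = that.amount - this.amount
  }

  // A class that extends neither Ordered nor Serializable, for contrast.
  class PlainProducts(val name: String)

  // Returns true if Java serialization accepts the value.
  def isJavaSerializable(value: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(value); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  // Extending Ordered is enough for the compiler to find an implicit Ordering.
  val ord: Ordering[Products] = implicitly[Ordering[Products]]

  val sortedNames: List[String] = List(
    new Products("屏保", 20, 10),
    new Products("支架", 20, 1000),
    new Products("酒精棉", 5, 2000),
    new Products("吸氧机", 5000, 1000)
  ).sorted.map(_.name)
}
```

Asking for `implicitly[Ordering[PlainProducts]]` in the same object would not compile, which corresponds to the "No implicit Ordering defined" error quoted above.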
3. Case class
case class ProductsCaseClass(name:String,price:Double,amount:Int) extends Ordered[ProductsCaseClass] {
override def toString: String = "case class " + name + "," + price + "," + amount
override def compare(that: ProductsCaseClass): Int = {
that.amount - this.amount
}
}
val products = sc.parallelize(List("屏保 20 10","支架 20 1000","酒精棉 5 2000","吸氧机 5000 1000"))
val productData = products.map(x=>{
val splits = x.split(" ")
val name = splits(0)
val price = splits(1).toDouble
val amount = splits(2).toInt
ProductsCaseClass(name,price,amount)
})
productData.sortBy(x=>x).collect().foreach(println)
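A case class is Serializable by default, which is why only Ordered has to be mixed in here. The sketch below illustrates that, and also shows one possible way to extend compare with a tie-breaker (stock descending, then price descending). The tie-breaker is an addition for illustration and not part of the original rule; CaseClassSortSketch and sortedNames are hypothetical names, and a plain List stands in for the RDD.

```scala
object CaseClassSortSketch {
  case class ProductsCaseClass(name: String, price: Double, amount: Int)
      extends Ordered[ProductsCaseClass] {
    // Stock descending; products with equal stock are ordered by price descending.
    override def compare(that: ProductsCaseClass): Int = {
      val byAmount = java.lang.Integer.compare(that.amount, this.amount)
      if (byAmount != 0) byAmount
      else java.lang.Double.compare(that.price, this.price)
    }
  }

  val products: List[ProductsCaseClass] = List(
    ProductsCaseClass("屏保", 20, 10),
    ProductsCaseClass("支架", 20, 1000),
    ProductsCaseClass("酒精棉", 5, 2000),
    ProductsCaseClass("吸氧机", 5000, 1000)
  )

  val sortedNames: List[String] = products.sorted.map(_.name)

  // Case classes mix in Serializable automatically.
  val serializable: Boolean = products.head.isInstanceOf[java.io.Serializable]
}
```

Using `Integer.compare` and `Double.compare` instead of subtraction also avoids integer overflow when the two values are far apart.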
Implicit conversions
case class ProductsInfo(name:String,price:Double,amount:Int)
val products = sc.parallelize(List("支架 20 10","屏保 20 1000","酒精棉 5 2000","吸氧机 5000 1000"))
val productData = products.map(x=>{
val splits = x.split(" ")
val name = splits(0)
val price = splits(1).toDouble
val amount = splits(2).toInt
ProductsInfo(name,price,amount)
})
/**
* Implicit object: stock ascending
*/
implicit object productsInfo2OrderingObj extends Ordering[ProductsInfo] {
override def compare(x: ProductsInfo, y: ProductsInfo): Int = {
x.amount - y.amount
}
}
/**
* Implicit val: stock descending
*/
implicit val productsInfo2Ordering:Ordering[ProductsInfo] = new Ordering[ProductsInfo]{
override def compare(x: ProductsInfo, y: ProductsInfo): Int = {
y.amount - x.amount
}
}
/**
* Implicit method: stock descending
*/
implicit def productInfo2Ordered(productsInfo:ProductsInfo):Ordered[ProductsInfo] = new Ordered[ProductsInfo] {
override def compare(that: ProductsInfo): Int = {
that.amount - productsInfo.amount
}
}
productData.sortBy(x=>x).collect().foreach(println)
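When all three are in scope, the precedence is: implicit object > implicit val > implicit method, so the sortBy call above uses the implicit object and sorts by stock in ascending order.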
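The precedence between the implicit object and the implicit val can be observed without Spark. The following is a small sketch, assuming a Scala 2 compiler (2.12 or 2.13) and a plain List in place of the RDD. ImplicitPrioritySketch, AscendingByAmount, descendingByAmount and sortedAmounts are hypothetical names. The implicit method is left out to keep the example short.

```scala
object ImplicitPrioritySketch {
  case class ProductsInfo(name: String, price: Double, amount: Int)

  // Implicit object: stock ascending.
  implicit object AscendingByAmount extends Ordering[ProductsInfo] {
    override def compare(x: ProductsInfo, y: ProductsInfo): Int =
      java.lang.Integer.compare(x.amount, y.amount)
  }

  // Implicit val: stock descending.
  implicit val descendingByAmount: Ordering[ProductsInfo] = new Ordering[ProductsInfo] {
    override def compare(x: ProductsInfo, y: ProductsInfo): Int =
      java.lang.Integer.compare(y.amount, x.amount)
  }

  val data: List[ProductsInfo] = List(
    ProductsInfo("支架", 20, 10),
    ProductsInfo("屏保", 20, 1000),
    ProductsInfo("酒精棉", 5, 2000),
    ProductsInfo("吸氧机", 5000, 1000)
  )

  // Both implicits are eligible; the compiler selects the object.
  val sortedAmounts: List[Int] = data.sortBy(x => x).map(_.amount)
}
```

The result is in ascending order, so the object's rule was applied. A likely explanation is that the object's singleton type is a subtype of Ordering[ProductsInfo] and therefore counts as more specific during implicit resolution.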
4. Implicit conversion for Tuple types (important)
/**
* (Double,Int): the type of the sort key produced by the ordering rule
* (String,Double,Int): the type of the incoming data
*/
implicit val ord: Ordering[(String,Double,Int)] = Ordering[(Double,Int)].on[(String,Double,Int)](x=>(-x._2,-x._3))
val products = sc.parallelize(List("支架 20 10","屏保 20 1000","酒精棉 5 2000","吸氧机 5000 1000"))
val productData = products.map(x=>{
val splits = x.split(" ")
val name = splits(0)
val price = splits(1).toDouble
val amount = splits(2).toInt
(name,price,amount)
})
productData.sortBy(x=>x).collect().foreach(println)
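The same rule can also be written with Ordering.by, which reads in the opposite direction: element type first, key type second. The sketch below runs on a plain List rather than an RDD, which is an assumption made so it needs no Spark context. TupleOrderingSketch, Row and sortedNames are hypothetical names.

```scala
object TupleOrderingSketch {
  // (name, price, stock)
  type Row = (String, Double, Int)

  // Equivalent to Ordering[(Double, Int)].on[Row](...): price descending, then stock descending.
  implicit val ord: Ordering[Row] =
    Ordering.by[Row, (Double, Int)](x => (-x._2, -x._3))

  val data: List[Row] = List(
    ("支架", 20.0, 10),
    ("屏保", 20.0, 1000),
    ("酒精棉", 5.0, 2000),
    ("吸氧机", 5000.0, 1000)
  )

  // The locally defined implicit takes precedence over the default tuple Ordering.
  val sortedNames: List[String] = data.sortBy(x => x).map(_._1)
}
```

Without the implicit val, the default tuple Ordering would sort by name first, so defining it in scope is what changes the meaning of `sortBy(x => x)`.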