Spark (9): Sorting

Sorting operators

sortBy and related operators

1. Convert the records to tuples inside the RDD

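    // Sort by price
    val rdd = sc.parallelize(List("iphone5 1000 20", "iphone6 2000 50",
      "iphone7 2000 100", "iphone11 5000 50"))

    val product = rdd.map(x => {
      // Split on spaces
      val split = x.split(" ")
      val name = split(0)
      val price = split(1).toDouble
      val amount = split(2).toInt
      (name, price, amount)
    }).sortBy(x => x._2) // sort key: price, ascending
    // Print the data. printInfo() is not a built-in RDD method; it presumably
    // comes from the ImplicitAspect implicits imported in the later examples.
    product.printInfo()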

2. Use a custom class

Have the class extend Ordered[ProductInfoV1] with Serializable and implement compare().

import com.bigdata.spark.utils.ImplicitAspect._
import org.apache.spark.{SparkConf, SparkContext}

object SortApp02 {

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[2]").setAppName("my-spark")
    val sc = new SparkContext(sparkConf)
    val rdd = sc.parallelize(List("iphone5 1000 20", "iphone6 2000 50", "iphone7 2000 100", "iphone11 5000 50"))

    val product = rdd.map(x => {
      // Split on spaces
      val split = x.split(" ")
      val name = split(0)
      val price = split(1).toDouble
      val amount = split(2).toInt
      new ProductInfoV1(name, price, amount)
    }).sortBy(x => x)
    // Print the results
    product.printInfo()

    sc.stop()

  }
}

class ProductInfoV1(val name: String, val price: Double, val amount: Int)
  extends Ordered[ProductInfoV1] with Serializable {
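  // Override compare. Double.compare is used because (this.price - that.price).toInt
  // truncates any difference smaller than 1 to 0, so close prices would compare as equal.
  override def compare(that: ProductInfoV1): Int = {
    java.lang.Double.compare(this.price, that.price)
  }
  // Override toString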
  override def toString: String = {
    name + "\t" + price + "\t" + amount
  }
}
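
A caveat when implementing compare on a Double field: the subtract-and-truncate form returns 0 whenever two prices differ by less than 1, so those records are treated as equal. The sketch below contrasts it with java.lang.Double.compare in plain Scala, with no Spark needed. The object name CompareSketch and the sample prices are made up for illustration.

```scala
object CompareSketch {
  // Subtract-and-truncate comparison, written as a standalone function
  def truncating(a: Double, b: Double): Int = (a - b).toInt

  // Comparison that depends only on the ordering of the two values
  def safe(a: Double, b: Double): Int = java.lang.Double.compare(a, b)

  def main(args: Array[String]): Unit = {
    // 1000.5 is greater than 1000.0, but the truncated difference is 0
    println(truncating(1000.5, 1000.0)) // 0
    println(safe(1000.5, 1000.0) > 0)   // true
  }
}
```

With the sample data in this article (prices that differ by at least 1000) both forms give the same order, so the difference only shows up with closer prices.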

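3. Refactor to a case class (recommended)

A case class is recommended mainly because:
1. It is serializable automatically
2. toString is generated automatically
3. Instances can be created without new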
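
The three points above can be checked in plain Scala without Spark. This is a minimal sketch: the case class Item and the helper serializedSize are hypothetical names introduced only for this illustration.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Hypothetical case class used only for this illustration
case class Item(name: String, price: Double, amount: Int)

object CaseClassSketch {
  // Writes the value with Java serialization and returns the byte count;
  // throws NotSerializableException if the value is not serializable
  def serializedSize(value: AnyRef): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    bytes.size()
  }

  def main(args: Array[String]): Unit = {
    val item = Item("iphone6", 2000.0, 50)           // no `new` required
    println(item)                                    // Item(iphone6,2000.0,50)
    println(item.isInstanceOf[java.io.Serializable]) // true
    println(serializedSize(item) > 0)                // true
  }
}
```

Note that the generated toString uses the ClassName(field,field,...) form, which differs from the tab-separated toString written by hand in ProductInfoV1.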

import com.bigdata.spark.utils.ImplicitAspect._
import org.apache.spark.{SparkConf, SparkContext}

object SortApp03 {


  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[2]").setAppName("my-spark")
    val sc = new SparkContext(sparkConf)
    val rdd = sc.parallelize(List("iphone5 1000 20", "iphone6 2000 50", 
    "iphone7 2000 100", "iphone11 5000 50"))

    val product = rdd.map(x => {
      // Split on spaces
      val split = x.split(" ")
      val name = split(0)
      val price = split(1).toDouble
      val amount = split(2).toInt
      ProductInfoV2(name, price, amount)
    }).sortBy(x => x)
    // Print the results
    product.printInfo()

    sc.stop()

  }
}

case class ProductInfoV2(name: String, price: Double, amount: Int)
  extends Ordered[ProductInfoV2] {
  // Override compare. Double.compare avoids truncating differences smaller than 1 to 0.
  override def compare(that: ProductInfoV2): Int = {
    java.lang.Double.compare(this.price, that.price)
  }
}

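4. Case class enhanced through an implicit conversion (recommended)

Define the class below and assume it must not be modified:

case class ProductInfoV2(name: String, price: Double, amount: Int) {

}

Then, at the point of use, enhance the class through an implicit conversion: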

object SortApp03 {


  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[2]").setAppName("my-spark")
    val sc = new SparkContext(sparkConf)
    val rdd = sc.parallelize(List("iphone5 1000 20", "iphone6 2000 50", 
    "iphone7 2000 100", "iphone11 5000 50"))
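    // Implicit conversion from ProductInfoV2 to Ordered[ProductInfoV2]
    implicit def product2Ordered(product: ProductInfoV2): Ordered[ProductInfoV2] = new Ordered[ProductInfoV2] {
      override def compare(that: ProductInfoV2): Int = {
        // Double.compare avoids truncating differences smaller than 1 to 0
        java.lang.Double.compare(product.price, that.price)
      }
    }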

    val product = rdd.map(x => {
      // Split on spaces
      val split = x.split(" ")
      val name = split(0)
      val price = split(1).toDouble
      val amount = split(2).toInt
      ProductInfoV2(name, price, amount)
    }).sortBy(x => x)


    // Print the results
    product.printInfo()

    sc.stop()

  }
}

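5. Use an implicit Ordering built with Ordering.on

Requirement: sort by price in ascending order and, when prices are equal, by amount in descending order.

The general form of the implicit Ordering.on definition is:
implicit val ord = Ordering[type of the sort key].on[type of the data](x => sort key)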
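
The same pattern can be tried on an ordinary Scala List before moving to an RDD. This is a minimal sketch under the assumption that the records are already parsed into (name, price, amount) tuples. The names OrderingOnSketch, Row, byPriceThenAmountDesc and sortedNames are hypothetical and exist only for this illustration.

```scala
object OrderingOnSketch {
  type Row = (String, Double, Int) // (name, price, amount)

  // Price ascending, then amount descending (negating the amount flips its order)
  val byPriceThenAmountDesc: Ordering[Row] =
    Ordering[(Double, Int)].on[Row](x => (x._2, -x._3))

  // Sorts with the ordering above and keeps only the product names
  def sortedNames(rows: List[Row]): List[String] =
    rows.sorted(byPriceThenAmountDesc).map(_._1)

  def main(args: Array[String]): Unit = {
    val rows = List(
      ("iphone5", 1000.0, 20), ("iphone6", 2000.0, 50),
      ("iphone7", 2000.0, 100), ("iphone11", 5000.0, 50))
    println(sortedNames(rows)) // List(iphone5, iphone7, iphone6, iphone11)
  }
}
```

Here iphone7 comes before iphone6 because both cost 2000 and iphone7 has the larger amount.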

import com.bigdata.spark.utils.ImplicitAspect._
import org.apache.spark.{SparkConf, SparkContext}

object SortApp01 {

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[2]").setAppName("my-spark")
    val sc = new SparkContext(sparkConf)

    val rdd = sc.parallelize(List("iphone5 1000 20", "iphone6 2000 50", 
    "iphone7 2000 100", "iphone11 5000 50"))

    val product = rdd.map(x => {
      // Split on spaces
      val split = x.split(" ")
      val name = split(0)
      val price = split(1).toDouble
      val amount = split(2).toInt
      (name, price, amount)
    })
    /**
      * x => (x._2, -x._3) is the sort key: price ascending, then amount descending
      * (Double, Int) is the type of the sort key
      * (String, Double, Int) is the type of the data
      */

    implicit val ord = Ordering[(Double, Int)].on[(String, Double, Int)](x => (x._2, -x._3))
    product.sortBy(x => x).printInfo()
    sc.stop()

  }
}
