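This article is part of the series 《Spark大型电商项目实战》 (Spark Large-Scale E-commerce Project in Practice). It covers implementing a secondary sort in Scala.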
Implementation
In Scala IDE, create SortKey.scala in the package com.erik.sparkproject to implement the secondary sort. The code is as follows:
package com.erik.sparkproject

/**
 * Sort key that compares by clickCount, then orderCount, then payCount.
 *
 * @author Erik
 */
class SortKey(val clickCount: Int,
              val orderCount: Int,
              val payCount: Int)
    extends Ordered[SortKey] with Serializable {

  // Integer.compare is used instead of subtraction, because a difference
  // of two Ints can overflow and flip the sign of the result.
  def compare(that: SortKey): Int = {
    if (clickCount != that.clickCount) {
      Integer.compare(clickCount, that.clickCount)
    } else if (orderCount != that.orderCount) {
      Integer.compare(orderCount, that.orderCount)
    } else {
      Integer.compare(payCount, that.payCount)
    }
  }
}
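The field-by-field comparison above is a lexicographic comparison of the triple (clickCount, orderCount, payCount). As a minimal sketch of the same idea, the comparison can be delegated to the standard library's built-in tuple Ordering. The class name SortKeyByTuple is hypothetical and is not part of the original project.

```scala
// Hypothetical variant of SortKey; the name SortKeyByTuple is an
// assumption made for this illustration only.
class SortKeyByTuple(val clickCount: Int,
                     val orderCount: Int,
                     val payCount: Int)
    extends Ordered[SortKeyByTuple] with Serializable {

  // Tuples of Ints have a built-in lexicographic Ordering:
  // compare the first element, then the second, then the third.
  def compare(that: SortKeyByTuple): Int =
    Ordering[(Int, Int, Int)].compare(
      (clickCount, orderCount, payCount),
      (that.clickCount, that.orderCount, that.payCount))
}
```

Comparing the values directly, rather than subtracting them, also stays correct for counts near the limits of Int, where a difference can overflow.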
Testing
To test the secondary sort implementation, create SortKeyTest.scala with the following code:
package com.erik.sparkproject

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

/**
 * @author Erik
 */
object SortKeyTest {

  def main(args: Array[String]): Unit = {
    // Run locally in a single JVM
    val conf = new SparkConf()
      .setAppName("SortKeyTest")
      .setMaster("local")
    val sc = new SparkContext(conf)

    // (key, label) pairs; the label identifies each key in the output
    val arr = Array(
      (new SortKey(30, 35, 40), "1"),
      (new SortKey(35, 30, 40), "2"),
      (new SortKey(30, 38, 30), "3"))
    val rdd = sc.parallelize(arr, 1)

    // ascending = false: sort from the largest key to the smallest
    val sortedRdd = rdd.sortByKey(false)

    for (tuple <- sortedRdd.collect()) {
      println(tuple._2)
    }

    sc.stop()
  }
}
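If the program prints 2, 3, 1 (one label per line), the test passes: sorting in descending order puts the key with the largest clickCount (35) first, and the two keys tied at clickCount 30 are then ordered by orderCount (38 before 35).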
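The expected order can also be checked without starting Spark. The sketch below is not part of the original project: it uses plain tuples as stand-ins for SortKey and a hypothetical object name, SortOrderCheck, and sorts the same three records in descending order with the standard library.

```scala
// Hypothetical local check; SortOrderCheck and the use of plain tuples
// in place of SortKey are assumptions made for this illustration.
object SortOrderCheck {
  // (clickCount, orderCount, payCount) -> label, same data as the test
  val data = Seq(
    ((30, 35, 40), "1"),
    ((35, 30, 40), "2"),
    ((30, 38, 30), "3"))

  // Sort by key in descending order, mirroring sortByKey(false)
  val labels: Seq[String] =
    data.sortBy(_._1)(Ordering[(Int, Int, Int)].reverse).map(_._2)

  def main(args: Array[String]): Unit =
    println(labels.mkString(" ")) // prints: 2 3 1
}
```

This only confirms the ordering logic. It does not exercise Spark's shuffle or the serialization of the key class.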
More articles in this series: Spark大型电商项目实战: http://blog.csdn.net/u012318074/article/category/6744423