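Overview
This article shows how to use Spark Core and Spark SQL to find the three most-clicked ad IDs in each province. The test data is as follows:
province_id ad_id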
1 100
1 100
1 100
1 112
1 101
1 112
1 102
1 102
1 103
1 112
1 112
1 101
1 112
2 100
2 121
2 101
2 121
2 104
2 121
2 111
2 104
2 103
2 111
2 121
2 104
3 121
3 112
3 112
3 121
3 100
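Before turning to Spark, it can help to know what result to expect from this data. The following is a minimal sketch in plain Scala collections (no Spark dependency) of the same group, count, sort, take-3 logic. The names TopThreeSketch and topAdsPerProvince are hypothetical and used only for illustration, and breaking ties by the smaller ad id is an assumption, since the test data does not say how ties should be resolved.

```scala
object TopThreeSketch {
  // (provinceId, adId) pairs copied from the test data above,
  // grouped by province only to keep the listing short.
  val clicks: Seq[(Int, Int)] = Seq(
    1 -> Seq(100, 100, 100, 112, 101, 112, 102, 102, 103, 112, 112, 101, 112),
    2 -> Seq(100, 121, 101, 121, 104, 121, 111, 104, 103, 111, 121, 104),
    3 -> Seq(121, 112, 112, 121, 100)
  ).flatMap { case (province, ads) => ads.map(ad => (province, ad)) }

  // For each province, count clicks per ad and keep the n most-clicked ads.
  // Assumption: ties are broken by the smaller ad id.
  def topAdsPerProvince(clicks: Seq[(Int, Int)], n: Int): Map[Int, Seq[(Int, Int)]] =
    clicks
      .groupBy(_._1)
      .map { case (province, rows) =>
        val counts = rows
          .groupBy(_._2)
          .map { case (ad, hits) => ad -> hits.size }
          .toSeq
        province -> counts.sortBy { case (ad, count) => (-count, ad) }.take(n)
      }

  def main(args: Array[String]): Unit =
    topAdsPerProvince(clicks, 3).toSeq.sortBy(_._1).foreach {
      case (province, ads) => println(s"$province -> ${ads.mkString(", ")}")
    }
}
```

With this tie-breaking rule, province 1 yields ads 112 (5 clicks), 100 (3) and 101 (2); province 2 yields 121 (4), 104 (3) and 111 (2); province 3 yields 112 (2), 121 (2) and 100 (1). A Spark job may order tied ads differently (for example 102 instead of 101 in province 1), because both have 2 clicks.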
Spark Core
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}
import scala.collection.mutable.ArrayBuffer
/**
* Program: fastspark
* Package:
* Description: Created by felahong on 2020/4/15 12:03
* TODO: find the top 3 most-clicked ads in each province
*/
case class AdClick(province: Int, ad: Int)
object ProvinceAdTopThree {
  def main(args: Array[String]): Unit = {
    val conf =