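What agg does
- Normally, once an aggregate operator such as max or count has been applied to a grouped DataFrame, no further aggregate operator can be chained after it, because that call already returns an ordinary DataFrame.
- agg removes this limit: it computes several aggregates over the same groups in a single call.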
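The restriction described above can be sketched as follows. This is a minimal illustration, not code from the original example: the `Person` case class, the `AggContrast` object, and the sample rows are all hypothetical, and a local SparkSession is assumed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{max, min}

// Hypothetical record type, used only for this illustration.
case class Person(name: String, gender: String, age: Int)

object AggContrast {
  // Returns one row per gender holding both the maximum and the minimum age.
  def maxAndMinAge(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val people = Seq(
      Person("a", "F", 20),
      Person("b", "F", 23),
      Person("c", "M", 16)
    ).toDF()

    val grouped = people.groupBy("gender") // a RelationalGroupedDataset

    // grouped.max("age") already returns a DataFrame, so the grouping is gone.
    // grouped.max("age").min("age") does not compile: Dataset has no min method.

    // agg evaluates both aggregates over the same groups in one call.
    grouped.agg(max("age"), min("age"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("agg-contrast").getOrCreate()
    maxAndMinAge(spark).show()
    spark.stop()
  }
}
```

The helper returns the DataFrame instead of printing it, so the result can be inspected or chained further by the caller.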
Example
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

// Student is not defined in the original snippet; this definition is an assumption.
// The fields id, gender and age match the output below; the name "name" is a guess.
case class Student(id: Int, name: String, gender: String, age: Int)

object InnerFunctionDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("function").getOrCreate()
    import spark.implicits._ // enables toDF() on a Seq of case classes

    val stuDF: DataFrame = Seq(
      Student(1001, "zhangsan", "F", 20),
      Student(1002, "lisi", "M", 16),
      Student(1003, "wangwu", "M", 21),
      Student(1004, "zhaoliu", "F", 21),
      Student(1005, "zhouqi", "M", 22),
      Student(1006, "qianba", "M", 19),
      Student(1007, "liuliu", "F", 23)
    ).toDF()

    // Equivalent form using Column functions:
    // stuDF.groupBy("gender").agg(max("age"), min("age"), avg("age"), count("id")).show()
    stuDF.groupBy("gender").agg("age" -> "max", "age" -> "min", "age" -> "avg", "id" -> "count").show()
    /*
    +------+--------+--------+------------------+---------+
    |gender|max(age)|min(age)|          avg(age)|count(id)|
    +------+--------+--------+------------------+---------+
    |     F|      23|      20|21.333333333333332|        3|
    |     M|      22|      16|              19.5|        4|
    +------+--------+--------+------------------+---------+
    */

    spark.stop()
  }
}
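Default column names such as max(age) are awkward to reference in later steps. As a hedged sketch, the Column form of agg can name each result with alias. The `Pupil` case class, the `NamedAggSketch` object, and the output column names are hypothetical and chosen only for illustration. The second helper shows a pitfall of the Map overload of agg: a Scala Map keeps one value per key, so two aggregates on the same column collapse into the last one.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, count, max, min}

// Hypothetical record type, used only for this sketch.
case class Pupil(id: Int, gender: String, age: Int)

object NamedAggSketch {
  // Small sample DataFrame so the sketch is self-contained.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(Pupil(1, "F", 20), Pupil(2, "F", 23), Pupil(3, "M", 16), Pupil(4, "M", 22)).toDF()
  }

  // Column-based agg: every aggregate gets an explicit, reusable name.
  def named(df: DataFrame): DataFrame =
    df.groupBy("gender").agg(
      max("age").alias("max_age"),
      min("age").alias("min_age"),
      avg("age").alias("avg_age"),
      count("id").alias("n")
    )

  // Map-based agg: the duplicate key "age" collapses to its last entry,
  // so only min(age) is computed here.
  def viaMap(df: DataFrame): DataFrame =
    df.groupBy("gender").agg(Map("age" -> "max", "age" -> "min"))
}
```

When the same column needs several aggregates, the pair form used in the example above or the Column form avoids the Map collapse.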