group by 分组汇总

邂逅↪秋

已于 2023-05-15 10:51:19 修改

阅读量307

点赞数

文章标签：数据库 sql

于 2023-03-07 11:26:59 首次发布

本文链接：https://blog.csdn.net/weixin_43752504/article/details/129368704

版权

group by:

grouping sets():

group by ...with rollup:

with cube:

rollup()

group by:

数据分析经常会使用到分组汇总的方法。通过汇总的方法可以更快速地看到数据的特征。

例如查看男女生的比例

spark/presto/hive/mysql:

select
    student_sex,
    count(distinct student_id) as stu_num
    from student_info
group by student_sex

或者

spark/presto/hive/mysql:

select
    student_sex,
    count(distinct student_id) as stu_num
    from student_info 
group by 1

又如想查看男女生不同年龄阶段的学生数

spark/presto/hive/mysql:

select
    student_sex,
    student_age,
    count(distinct student_id) as stu_num
    from student_info 
group by student_sex,student_age

此时计算逻辑是想对学生性别进行分类，再对学生年龄分类

想查看男女生不同年龄阶段的年龄最大和最小分别是多少

spark/presto/hive/mysql:

select
    student_sex,
    max(student_age) as max_age,
    min(student_age) as min_age
    from student_info 
group by student_sex

grouping sets():

grouping sets()分层聚合

如果你不仅想看各维度的分组求和，还想看总的求和，那么久可以用grouping sets()

spark:

select
student_sex,
student_age,
count(distinct student_id) as stu_num
from student_info
group by  
grouping sets((student_sex,student_age))


--相当于
/*
select
student_sex,
student_age,
count(distinct student_id) as stu_num
from student_info 
group by student_sex,student_age

union all

select
null as student_sex,
null as student_age,
count(distinct student_id) as stu_num
from student_info 
*/

group by ...with rollup:

group by ...with rollup 不仅可以对字段汇总，并且还可以聚合

例如我们想看到学生的不同性别的人数以及不同性别不同年龄的

spark:

select
student_sex,
student_age,
count(distinct student_id) as stu_num
from student_info 
group by student_sex,student_age with rollup


/*相当于使用grouping by 
student_sex,
student_age,
count(distinct student_id) as stu_num
from student_info 
grouping by ((student_sex,student_age),(student_sex))
*/


/* 也相当于我们这么写
student_sex,
student_age,
count(distinct student_id) as stu_num
from student_info 
group by student_sex,student_age

union all

select
student_sex,
null as student_age,
count(distinct student_id) as stu_num
from student_info 
group by student_sex

union all

select
null student_sex,
null as student_age,
count(distinct student_id) as stu_num
from student_info 
*/

with cube:

spark:

select
student_sex,
student_age,
count(distinct student_id) as stu_num
from student_info 
group by student_sex,student_age with cube

/*相当于
select
student_sex,
student_age,
count(distinct student_id) as stu_num
from student_info 
group by student_sex,student_age with rollup

union all

select
null as student_sex,
student_age,
count(distinct student_id) as stu_num
from student_info 
group by student_age

with cube 和 with rollup 的区别在于 with cube 比with rollup 在多个字段汇总时数据条数更多；

因为with rollup 是跟接近的字段往后结和汇总的，但是with cube 相当于是排列组合的方式，将所有结果一一汇总出来。

rollup()

presto:

select
day,
count(distinct stu_id) as stu_num
group by
    rollup (day)

邂逅↪秋

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
group by 分组汇总

group by 分组汇总
复制链接

扫一扫

group by 分组汇总

group by:

grouping sets():

group by ...with rollup:

with cube:

rollup()

“相关推荐”对你有帮助么？