Hive Advanced Aggregation: GROUPING SETS / ROLLUP / CUBE / Grouping_ID


遐想者csdn


Tags: HQL, aggregate functions

Article link: https://blog.csdn.net/u010965287/article/details/80409412


1、GROUPING SETS

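This keyword performs multiple GROUP BY operations over the same dataset. In effect, GROUPING SETS is shorthand for several GROUP BY queries combined with UNION ALL, and it completes all of them in a single stage. If the GROUPING SETS clause contains the empty set (), that set stands for an aggregation over the whole dataset.

Example: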

select name, work_space[0] as main_place, count(employee_id) as emp_id_cnt
from employee
group by name, work_space[0]
GROUPING SETS((name,work_space[0]), name, ());
 
-- The statement above is equivalent to the statement below
 
select name, work_space[0] as main_place, count(employee_id) as emp_id_cnt
from employee
group by name, work_space[0]
UNION ALL
-- Columns that are not part of a grouping set come back as NULL
select name, NULL as main_place, count(employee_id) as emp_id_cnt
from employee
group by name
UNION ALL
select NULL as name, NULL as main_place, count(employee_id) as emp_id_cnt
from employee;

2、ROLLUP

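ROLLUP extends GROUPING SETS. For n grouping columns it produces n + 1 levels of aggregation, dropping columns from right to left until only the grand total remains.

Example: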

-- tbl is a placeholder table name (table itself is a reserved word)
select a, b, c from tbl group by a, b, c WITH ROLLUP;
-- Equivalent to the statement below
select a, b, c from tbl group by a, b, c
GROUPING SETS((a,b,c),(a,b),(a),());
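
The schematic query above selects only the grouping columns; in practice ROLLUP is usually paired with an aggregate. A minimal sketch, assuming a hypothetical table sales(region, product, amount) that does not appear in the original article:

```sql
-- Hypothetical table: sales(region STRING, product STRING, amount DOUBLE)
-- Produces three levels: per (region, product), per region, and a grand total.
select region, product, sum(amount) as total_amount
from sales
group by region, product WITH ROLLUP;

-- Rows at the per-region level have product = NULL;
-- the grand-total row has both region and product = NULL.
```

Because ROLLUP drops columns from right to left, the column order in GROUP BY decides which subtotals appear: here there is a per-region subtotal but no per-product subtotal.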

3、CUBE

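CUBE also extends GROUPING SETS. It aggregates over every possible combination of the grouping columns, so n columns yield 2^n grouping sets.

Example: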

-- tbl is a placeholder table name (table itself is a reserved word)
select a, b, c from tbl group by a, b, c WITH CUBE;
-- Equivalent to the statement below
select a, b, c from tbl group by a, b, c
GROUPING SETS((a,b,c),(a,b),(a,c),(b,c),(a),(b),(c),());
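
To make the difference from ROLLUP concrete, here is a minimal sketch with an aggregate, again assuming a hypothetical table sales(region, product, amount) that is not part of the original article:

```sql
-- Hypothetical table: sales(region STRING, product STRING, amount DOUBLE)
-- With two grouping columns, CUBE yields 2^2 = 4 grouping sets:
-- (region, product), (region), (product), and ().
select region, product, sum(amount) as total_amount
from sales
group by region, product WITH CUBE;

-- The same result written out explicitly
select region, product, sum(amount) as total_amount
from sales
group by region, product
GROUPING SETS((region, product), (region), (product), ());
```

Unlike ROLLUP, the result also contains a per-product subtotal, and the order of the columns in GROUP BY does not change which grouping sets are produced.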

4、Aggregate conditions with HAVING

HAVING filters the groups produced by GROUP BY, using conditions on the aggregated values.

select cid, max(price) mx from orders group by cid having mx > 1000;
-- Equivalent to the subquery below
select t.cid, t.mx from (
        select cid, max(price) mx from orders group by cid
    ) t
where t.mx > 1000;
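
The same filter can also be written with the aggregate expression itself instead of the alias, which is the form most SQL dialects accept. The sketch below reuses the orders(cid, price) table from the example above; the second query and its thresholds are hypothetical, added only to contrast WHERE with HAVING:

```sql
-- Same filter as above, using the aggregate expression directly
select cid, max(price) as mx
from orders
group by cid
having max(price) > 1000;

-- WHERE filters rows before grouping; HAVING filters groups afterwards.
-- Hypothetical variant: ignore non-positive prices first,
-- then keep only customers with more than 5 remaining orders.
select cid, count(*) as order_cnt
from orders
where price > 0
group by cid
having count(*) > 5;
```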

5、Grouping_ID

Detailed explanation: https://blog.csdn.net/wen_2/article/details/65446971
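
As a quick orientation before following the link: Hive exposes a virtual column named GROUPING__ID (with two underscores) that identifies which grouping set produced each output row. This helps tell a NULL that marks a subtotal row apart from a NULL that is present in the data. A minimal sketch, assuming a hypothetical table tbl with columns a and b; the numeric values depend on the Hive version (the bit encoding changed in Hive 2.3.0), so no expected output is shown:

```sql
-- Hypothetical table: tbl(a, b)
-- GROUPING__ID is a Hive virtual column; every row produced by the same
-- grouping set carries the same GROUPING__ID value.
select a, b, count(*) as cnt, GROUPING__ID
from tbl
group by a, b WITH ROLLUP;
```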
