GROUP BY in Hive

-- Create the stu table
CREATE TABLE stu(
id int,
name string,
age int,
sex string 
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- Load the data; the file /home/hadoop/stu is tab-separated and contains the rows listed below
load data local inpath '/home/hadoop/stu' into table stu;
1   name1   12  boy
2   name2   12  boy
3   name3   13  girl
4   name4   13  boy
5   name5   14  boy
6   name6   14  boy
7   name7   15  girl
8   name8   15  girl

group by

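group by groups the result set by one or more columns. It is generally only meaningful together with aggregate functions such as count, sum, avg, max and min.
Two rules apply when using group by:
• Every column in the select list must either sit inside an aggregate function or appear in the group by clause. In other words, all non-aggregated columns in the select list must also be listed after group by. The reverse is not required: a group by column does not have to appear in the select list.
• To filter the result, there are two options:
        1. Filter rows with where first, then group by.
        2. Group by first, then filter the groups with having.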
-- where first, then group by
select max(id),max(name),max(age),sex from stu where age=13 group by sex;
-- group by first, then having
select max(id),max(name),age,sex from stu group by age,sex having age=13;
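
As a small illustration of the two rules above, the sketch below runs against the stu table defined earlier. The alias cnt is made up for this example, and the exact wording of the error message may differ between Hive versions.

```sql
-- Valid: sex is a group by key, and id only appears inside an aggregate.
select sex, count(id) as cnt from stu group by sex;

-- Invalid: name is neither aggregated nor a group by key, so Hive rejects
-- the query (the error reads along the lines of "Expression not in GROUP BY key").
-- select name, sex from stu group by sex;

-- where filters rows before grouping; having filters groups after aggregation,
-- so a condition on an aggregate has to go into having.
select sex, count(id) as cnt
from stu
where age <= 14
group by sex
having count(id) > 1;
```

With the eight sample rows, the last query should return a single row (boy, 5): six students pass the where filter, and the one remaining girl forms a group that having then drops.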

grouping sets

grouping sets is a convenient way to express several group by clauses in a single SQL statement.

GROUP BY a, b GROUPING SETS ((a,b))

SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b GROUPING SETS ((a,b))
-- is equivalent to
SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b

GROUP BY a, b GROUPING SETS ((a,b), a)

SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b GROUPING SETS ((a,b), a)
-- is equivalent to
SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b 
UNION ALL
SELECT a, null, SUM(c) FROM tab1 GROUP BY a

GROUP BY a, b GROUPING SETS (a,b)

SELECT a,b, SUM(c) FROM tab1 GROUP BY a, b GROUPING SETS (a,b)
-- is equivalent to
SELECT a, null, SUM(c) FROM tab1 GROUP BY a 
UNION ALL
SELECT null, b, SUM(c) FROM tab1 GROUP BY b

GROUP BY a, b GROUPING SETS ((a, b), a, b, ())

SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b GROUPING SETS ((a, b), a, b, ())
-- is equivalent to
SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b 
UNION ALL
SELECT a, null, SUM(c) FROM tab1 GROUP BY a
UNION ALL
SELECT null, b, SUM(c) FROM tab1 GROUP BY b 
UNION ALL
SELECT null, null, SUM(c) FROM tab1
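
The patterns above use a generic tab1. As a concrete sketch, the same idea applied to the stu table from the beginning of this post could look like the following; the alias cnt is an assumption made for this example.

```sql
-- One statement that returns per-age counts and per-sex counts together.
select age, sex, count(id) as cnt
from stu
group by age, sex grouping sets (age, sex);

-- Rows expected for the eight sample rows (row order is not guaranteed):
-- 12    NULL   2
-- 13    NULL   2
-- 14    NULL   2
-- 15    NULL   2
-- NULL  boy    5
-- NULL  girl   3
```

A NULL in the output marks a column that is not part of the grouping set that produced the row. If the underlying data can itself contain NULLs, the two cases look the same in the result, which is worth keeping in mind when reading it.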

with cube

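with cube aggregates over every combination of the group by keys, that is, every subset of the keys, so n keys produce 2^n grouping sets.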
select age,sex,count(id) from stu group by age,sex with cube;
-- is equivalent to
select age,sex,count(id) from stu group by age,sex grouping sets((age,sex),age,sex,());

-- For example:
group by a,b,c with cube =
grouping sets(
(a,b,c),
(a,b),
(b,c),
(a,c),
a,
b,
c,
()
)
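
To make the expansion concrete, here is a hand-written sketch of what group by age, sex with cube stands for on the stu table. The alias cnt and the explicit casts are choices made for this example: the casts keep the column types identical across the branches, and some older Hive versions may require the union all to be wrapped in a subquery.

```sql
-- Grouping set (age, sex)
select age, sex, count(id) as cnt from stu group by age, sex
union all
-- Grouping set (age)
select age, cast(null as string) as sex, count(id) as cnt from stu group by age
union all
-- Grouping set (sex)
select cast(null as int) as age, sex, count(id) as cnt from stu group by sex
union all
-- Grouping set (): the grand total
select cast(null as int) as age, cast(null as string) as sex, count(id) as cnt from stu;
```

On the sample data this should give 12 rows: 5 for the (age, sex) pairs that actually occur, 4 per-age rows, 2 per-sex rows, and 1 grand total with cnt = 8.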

with rollup

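with rollup builds its grouping sets by dropping keys one at a time, starting from the right.
-- equivalent to running group by repeatedly, removing the rightmost key each time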
select age,sex,count(id) from stu group by age,sex with rollup;
-- is equivalent to
select age,sex,count(id) from stu group by age,sex grouping sets((age,sex),age,());

-- For example:
group by a,b,c with rollup =
grouping sets(
(a,b,c),
(a,b),
(a),
()
)
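
Because keys are dropped from the right, the order of the group by keys matters for rollup. The sketch below shows this on the stu table; the alias cnt is an assumption made for this example.

```sql
-- Grouping sets: (age, sex), (age), ()
-- Gives a subtotal per age plus a grand total.
select age, sex, count(id) as cnt from stu group by age, sex with rollup;

-- Grouping sets: (sex, age), (sex), ()
-- Swapping the keys gives a subtotal per sex instead.
select sex, age, count(id) as cnt from stu group by sex, age with rollup;
```

On the sample data, the first query should return 10 rows (5 detail rows, 4 per-age subtotals, 1 grand total), and the second 8 rows (5 detail rows, the subtotals boy = 5 and girl = 3, and the grand total of 8).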