表结构
CREATE TABLE test (f1 string,
f2 string,
f3 string,
cnt int) ROW FORMAT delimited FIELDS TERMINATED BY '\t' stored AS textfile;
LOAD DATA LOCAL inpath '/data/logs/suiyingli/tmp/test.data' overwrite INTO TABLE test;
原始数据
•
A A B 1
•
B B A 1
•
A A A 2
with cube查询语句
SELECT f1,
f2,
f3,
sum(cnt),
GROUPING__ID,
rpad(reverse(bin(cast(GROUPING__ID AS bigint))),3,'0')
FROM test
GROUP BY f1,
f2,
f3 WITH CUBE;
SELECT f1,
f2,
f3,
sum(cnt),
GROUPING__ID,
rpad(reverse(bin(cast(GROUPING__ID AS bigint))),3,'0')
FROM test
GROUP BY f1,
f2,
f3 WITH CUBE;
with cube结果范例
rollup查询语句
SELECT f1,
f2,
f3,
sum(cnt),
GROUPING__ID,
rpad(reverse(bin(cast(GROUPING__ID AS bigint))),3,'0')
FROM test
GROUP BY f1,
f2,
f3 WITH ROLLUP;
rollup结果范例
grouping sets查询语句
SELECT f1,
f2,
f3,
sum(cnt),
GROUPING__ID,
rpad(reverse(bin(cast(GROUPING__ID AS bigint))),3,'0')
FROM test
GROUP BY f1,
f2,
f3
GROUPING sets((f1),(f1,f2))
grouping sets结果范例
总结
cube的分组组合最全,是各个维度值的笛卡尔(包含null)组合,
rollup的各维度组合应满足,前一维度为null后一位维度必须为null,前一维度取非null时,下一维度随意,
grouping sets则为自定义维度,根据需要分组即可。
ps:通过grouping sets的使用可以简化SQL,比group by单维度进行union性能更好。