【Hive】【内置常用函数】

get_json_object

1)数据 json数组

[{"name":"大郎","sex":"男","age":"25"},{"name":"西门庆","sex":"男","age":"47"}]

2)取出第一个json对象

select 
get_json_object('[{"name":"大郎","sex":"男","age":"25"},{"name":"西门庆","sex":"男","age":"47"}]','$[0]');

结果是:{“name”:“大郎”,“sex”:“男”,“age”:“25”}

3)取出第一个json的age字段的值

hive (gmall)>
SELECT get_json_object('[{"name":"大郎","sex":"男","age":"25"},{"name":"西门庆","sex":"男","age":"47"}]',"$[0].age");

结果是:25

窗口函数

  • LAG(col,n,DEFAULT) 用于统计窗口内往上第 n 行值
  • LEAD(col,n,DEFAULT) 用于统计窗口内往下第 n 行值
  • FIRST_VALUE(col) 取分组内排序后,截止到当前行,第一个值
  • LAST_VALUE 取分组内排序后,截止到当前行,最后一个值

多维分析

1.grouping sets \ grouping__id

SELECT 
	month, 
	day, 
	COUNT(DISTINCT cookieid) AS uv, 
	GROUPING__ID 
FROM test_t5 
GROUP BY month,day 
GROUPING SETS (month,day) 
ORDER BY GROUPING__ID;

grouping_id 表示这一组结果属于哪个分组集合, 根据 grouping sets 中的分组条件 month,day,1 是代表 month,2 是代表 day

SELECT 
	month,
	NULL,
	COUNT(DISTINCT cookieid) AS uv,
	1 AS GROUPING__ID 
FROM test_t5 
GROUP BY month 
UNION ALL 
SELECT 
	NULL as month,
	day,
	COUNT(DISTINCT cookieid) AS uv,
	2 AS GROUPING__ID 
FROM test _t5 
GROUP BY day;

SELECT 
	month,
	day, 
	COUNT(DISTINCT cookieid) AS uv, 
	GROUPING__ID 
FROM test_t5 
GROUP BY month,day 
GROUPING SETS (month,day,(month,day)) 
ORDER BY GROUPING__ID;
SELECT 
	month,
	NULL,
	COUNT(DISTINCT cookieid) AS uv,
	1 AS GROUPING__ID 
FROM test_t5 
GRO UP BY month 
UNION ALL 
SELECT 
	NULL,
	day,
	COUNT(DISTINCT cookieid) AS uv,
	2 AS GROUPING__ID 
FROM test_t5 
GROUP BY day 
UNION ALL 
SELECT 
	month,
	day,
	COUNT(DISTINCT cookieid) AS uv,
	3 AS GROUPING__ID 
FROM test_t5 
GROU P BY month,day;

3.cube

SELECT 
	month, 
	day, 
	COUNT(DISTINCT cookieid) AS uv, 
	GROUPING__ID 
FROM test_t5 
GROUP BY month,day WITH CUBE 
ORDER BY GROUPING__ID;
SELECT 
	NULL,
	NULL,
	COUNT(DISTINCT cookieid) AS uv,
	0 AS GROUPING__ID 
FROM test_t5 
UNION ALL 
SELECT 
	month,
	NULL,
	COUNT(DISTINCT cookieid) AS uv,
	1 AS GROUPING__ID 
FROM test_t5 
GROUP BY month 
UNION ALL 
	SELECT 
	NULL,
	day,
	COUNT(DISTINCT cookieid) AS uv,
	2 AS GROUPING__ID 
FROM test_t5 
GROUP BY day 
UNION ALL 
SELECT 
	month,
	day,
	COUNT(DISTINCT cookieid) AS uv,
	3 AS GROUPING__ID 
FROM test_t5 
GROUP BY month,day;

4.rollup

是 CUBE 的子集,以最左侧的维度为主,从该维度进行层级聚合。

select device_id,os_id,app_id,client_version,from_id,count(user_id) 
from test_xinyan_reg 
group by device_id,os_id,app_id,client_version,from_id with rollup;
select device_id,os_id,app_id,client_version,from_id,count(user_id) 
from test_xinyan_reg 
group by device_id,os_id,app_id,client_version,from_id 
grouping sets (
(device_id,os_id,app_id,client_version,from_id),(device_id,os_id,app_id,client_version),
(device_id,os_id,app_id),
(device_id,os_id),
(device_id),
());
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值