get_json_object
1)数据 json数组
[{"name":"大郎","sex":"男","age":"25"},{"name":"西门庆","sex":"男","age":"47"}]
2)取出第一个json对象
select
get_json_object('[{"name":"大郎","sex":"男","age":"25"},{"name":"西门庆","sex":"男","age":"47"}]','$[0]');
结果是:{“name”:“大郎”,“sex”:“男”,“age”:“25”}
3)取出第一个json的age字段的值
hive (gmall)>
SELECT get_json_object('[{"name":"大郎","sex":"男","age":"25"},{"name":"西门庆","sex":"男","age":"47"}]',"$[0].age");
结果是:25
窗口函数
- LAG(col,n,DEFAULT) 用于统计窗口内往上第 n 行值
- LEAD(col,n,DEFAULT) 用于统计窗口内往下第 n 行值
- FIRST_VALUE(col) 取分组内排序后,截止到当前行,第一个值
- LAST_VALUE 取分组内排序后,截止到当前行,最后一个值
多维分析
1.grouping sets \ grouping__id
SELECT
month,
day,
COUNT(DISTINCT cookieid) AS uv,
GROUPING__ID
FROM test_t5
GROUP BY month,day
GROUPING SETS (month,day)
ORDER BY GROUPING__ID;
grouping_id 表示这一组结果属于哪个分组集合, 根据 grouping sets 中的分组条件 month,day,1 是代表 month,2 是代表 day
SELECT
month,
NULL,
COUNT(DISTINCT cookieid) AS uv,
1 AS GROUPING__ID
FROM test_t5
GROUP BY month
UNION ALL
SELECT
NULL as month,
day,
COUNT(DISTINCT cookieid) AS uv,
2 AS GROUPING__ID
FROM test _t5
GROUP BY day;
SELECT
month,
day,
COUNT(DISTINCT cookieid) AS uv,
GROUPING__ID
FROM test_t5
GROUP BY month,day
GROUPING SETS (month,day,(month,day))
ORDER BY GROUPING__ID;
SELECT
month,
NULL,
COUNT(DISTINCT cookieid) AS uv,
1 AS GROUPING__ID
FROM test_t5
GRO UP BY month
UNION ALL
SELECT
NULL,
day,
COUNT(DISTINCT cookieid) AS uv,
2 AS GROUPING__ID
FROM test_t5
GROUP BY day
UNION ALL
SELECT
month,
day,
COUNT(DISTINCT cookieid) AS uv,
3 AS GROUPING__ID
FROM test_t5
GROU P BY month,day;
3.cube
SELECT
month,
day,
COUNT(DISTINCT cookieid) AS uv,
GROUPING__ID
FROM test_t5
GROUP BY month,day WITH CUBE
ORDER BY GROUPING__ID;
SELECT
NULL,
NULL,
COUNT(DISTINCT cookieid) AS uv,
0 AS GROUPING__ID
FROM test_t5
UNION ALL
SELECT
month,
NULL,
COUNT(DISTINCT cookieid) AS uv,
1 AS GROUPING__ID
FROM test_t5
GROUP BY month
UNION ALL
SELECT
NULL,
day,
COUNT(DISTINCT cookieid) AS uv,
2 AS GROUPING__ID
FROM test_t5
GROUP BY day
UNION ALL
SELECT
month,
day,
COUNT(DISTINCT cookieid) AS uv,
3 AS GROUPING__ID
FROM test_t5
GROUP BY month,day;
4.rollup
是 CUBE 的子集,以最左侧的维度为主,从该维度进行层级聚合。
select device_id,os_id,app_id,client_version,from_id,count(user_id)
from test_xinyan_reg
group by device_id,os_id,app_id,client_version,from_id with rollup;
select device_id,os_id,app_id,client_version,from_id,count(user_id)
from test_xinyan_reg
group by device_id,os_id,app_id,client_version,from_id
grouping sets (
(device_id,os_id,app_id,client_version,from_id),(device_id,os_id,app_id,client_version),
(device_id,os_id,app_id),
(device_id,os_id),
(device_id),
());