[hive] Filtering NULL values out of rollup results

SmellyKitty

Category: hive  Tags: hive, big data

Article link: https://blog.csdn.net/SmellyKitty/article/details/122665589


rollup

rollup is used with group by in Hive to aggregate hierarchically, following the order of the listed columns. For example:

select  mac, stat_date,
        avg(cast(cpuOccupancyRate as int)) cpu_avg
from bidata.t_ods_pp_uplink_status_id
where stat_date>=20220119 and stat_date<=20220119
group by rollup(mac,stat_date)

Output:

mac	stat_date	cpu_avg
NULL	NULL	3.8
30A176EDF550	20220119	5.645833333333333
30A176EDAB70	20220119	1.9148936170212767
30A176EDF550	NULL	5.645833333333333
30A176EDAB70	NULL	1.9148936170212767
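To see where the NULL rows come from, it can help to spell the rollup out as explicit grouping sets. The sketch below reuses the table and columns from the query above and relies on the equivalence described in the Hive documentation, where a rollup over (mac, stat_date) corresponds to the grouping sets (mac, stat_date), (mac) and (). It has not been run against the author's data.

```sql
-- Same aggregation as group by rollup(mac,stat_date), written out explicitly.
select  mac, stat_date,
        avg(cast(cpuOccupancyRate as int)) cpu_avg
from bidata.t_ods_pp_uplink_status_id
where stat_date>=20220119 and stat_date<=20220119
group by mac, stat_date
grouping sets (
    (mac, stat_date),  -- detail rows: both columns filled in
    (mac),             -- per-mac subtotal: stat_date shown as NULL
    ()                 -- grand total: mac and stat_date both shown as NULL
)
```

Read this way, the NULL in the output marks a column that was left out of that row's grouping set. It is not a value stored in the table.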

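The output shows that rollup adds subtotal rows in which the rolled-up columns appear as NULL: one row per mac with stat_date set to NULL, plus a grand-total row with both mac and stat_date set to NULL. Sometimes the rows where mac is NULL are unwanted, and the obvious fix is to filter them with where mac is not null. Trying it out: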

select * from
(
select  mac, stat_date,
        avg(cast(cpuOccupancyRate as int)) cpu_avg
from bidata.t_ods_pp_uplink_status_id
where stat_date>=20220119 and stat_date<=20220119
group by rollup(mac,stat_date)
) as t1 
where mac is not null

Result:

mac	stat_date	cpu_avg
NULL	NULL	3.8
30A176EDF550	20220119	5.645833333333333
30A176EDAB70	20220119	1.9148936170212767
30A176EDF550	NULL	5.645833333333333
30A176EDAB70	NULL	1.9148936170212767

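Why was nothing filtered? Is this NULL not actually a null? Is it really the string 'NULL'? I tested with where mac != 'NULL' and got exactly the same result as above, which left me completely puzzled.
After repeated experiments, I arrived at the following workaround: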

select * from
(
select  NVL(mac,'mac') as mac, NVL(stat_date,'stat') as stat_date,
        avg(cast(cpuOccupancyRate as int)) cpu_avg
from bidata.t_ods_pp_uplink_status_id
where stat_date>=20220119 and stat_date<=20220119
group by rollup(mac,stat_date)
) as t1 
where mac != 'mac'

Result:

mac	stat_date	cpu_avg
30A176EDF550	20220119	5.645833333333333
30A176EDAB70	20220119	1.9148936170212767
30A176EDF550	stat	5.645833333333333
30A176EDAB70	stat	1.9148936170212767
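The NVL workaround replaces NULL with placeholder strings, so a genuine NULL in the source data would be indistinguishable from a rolled-up row. A possible alternative is to mark the rolled-up rows with the grouping() function and filter on that marker. This is a minimal sketch, assuming a Hive version that provides grouping(); the alias mac_rolled_up is hypothetical, and the query has not been verified on the author's cluster.

```sql
-- Sketch only: assumes grouping() is available in the Hive version in use.
-- mac_rolled_up is a hypothetical alias introduced for this example.
select mac, stat_date, cpu_avg from
(
select  mac, stat_date,
        grouping(mac) as mac_rolled_up,  -- 0: mac is part of the grouping set; 1: mac was rolled up
        avg(cast(cpuOccupancyRate as int)) cpu_avg
from bidata.t_ods_pp_uplink_status_id
where stat_date>=20220119 and stat_date<=20220119
group by rollup(mac,stat_date)
) as t1
where mac_rolled_up = 0
```

With this form the per-mac subtotal rows keep stat_date as NULL instead of the placeholder 'stat'.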

Why does this work? I still have not found the cause. If anyone knows, please explain!
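One hypothesis worth testing, which this post does not confirm, is predicate pushdown: if the optimizer moves the outer mac is not null filter below the group by, it is applied to the source rows (where mac is never NULL) rather than to the rows that rollup generates, and the output stays unchanged. The diagnostic sketch below uses explain and the hive.optimize.ppd setting to check this. Whether it changes the result depends on the Hive version in use.

```sql
-- Diagnostic sketch: tests an assumed cause, not a confirmed one.
-- Step 1: inspect the plan and look at where the filter on mac appears,
--         above the group by or down at the table scan.
explain
select * from
(
select  mac, stat_date,
        avg(cast(cpuOccupancyRate as int)) cpu_avg
from bidata.t_ods_pp_uplink_status_id
where stat_date>=20220119 and stat_date<=20220119
group by rollup(mac,stat_date)
) as t1
where mac is not null;

-- Step 2: disable predicate pushdown for this session, then rerun the
--         original filtered query and compare the output.
set hive.optimize.ppd=false;
```

If the NULL rows disappear once pushdown is disabled, that would point to the optimizer rather than to the NULL values themselves.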
