需求:
一张表有uid、起始日期和终止日期,能不能一条sql统计出每日在线总数
样例:
比如 三条数据:
a 20220510 20220512
b 20220510 20220511
c 20220511 20220512
结果:
20220510 2
20220511 3
20220512 2
数据准备:
CREATE TABLE brand as select 'a' as uid, '2022-05-10' as start_date, '2022-05-12' as end_date union all select 'b' as uid, '2022-05-10' as start_date, '2022-05-11' as end_date union all select 'c' as uid, '2022-05-11' as start_date, '2022-05-12' as end_date ;
参考答案:
将时间打散 --对时间区间打散到每一天,在按照日期去重count
select
tt,count(distinct uid)
from
(
select
uid
,start_date,end_date
,indx,val
,date_add(start_date,indx) tt
from brand lateral view posexplode(split(space(datediff(end_date,start_date)),'')) t as indx,val
)tmp
group by tt
;
知识点
1.space()函数
hive>select split(space(10), ''); [" "," "," "," "," "," "," "," "," "," ",""]
2.posexplode()函数
select i, date_add('2020-11-01', pe.i) as dynamic_date ,'2020-11-01' as start_time, '2020-11-30' end_time from ods.test lateral view posexplode(split(space(datediff('2020-11-30', '2020-11-01')),' ')) pe as i, x limit 30 ;