看到一道sql题目,比较感兴趣,就做了下
题目如下:
主要思路1:
- 使用用户变量+笛卡尔积生成时间维度表
- 通过时间维度表和每次停车信息再做笛卡尔积
- 对时间进行分组求和即可
思路2:
- 计算时间间隔,使用space填充,然后split分割后使用posexplode炸开并获得索引,用索引乘以3600即可获得每个小时的维度表
建表如下:
mysql代码如下:
select substring(tt.dim_time,1,13) as hh,count(*)
from (
select id, plate_no, from_unixtime(unix_timestamp(t.inttime)),
from_unixtime(unix_timestamp(if(t.outtime='',current_date(),t.outtime))),from_unixtime(t3.dim_time) dim_time
from t,(
select case when @dim_timestamp then @dim_timestamp:=@dim_timestamp+3600 else 0 end dim_time
from (select @dim_timestamp:=(select min(unix_timestamp(time))
from (select inttime time
from t
union
select if(outtime='',current_date(),outtime) time
from t)T1))T2,
(select 1 union select 2)t4,
(select 1 union select 2 union select 3)t5,
(select 1 union select 2 union select 3 union select 4)t6,
(select 1 union select 2 union select 3 union select 4 union select 5)t7,
(select 1 union select 2 union select 3 union select 4 union select 5 union select 6)t8,
(select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7)t9
)t3
where from_unixtime(t3.dim_time)>from_unixtime(unix_timestamp(t.inttime))
and from_unixtime(t3.dim_time)<from_unixtime(unix_timestamp(if(outtime='',current_date(),outtime)))
)tt
group by tt.dim_time;
输出结果如下
Hive代码如下:
with
t1 as (
select max(outtime) max_time
from parkrecode
),
t2 as (
select p.id,p.plate_no,p.inttime,coalesce(p.outtime,t.max_time) outtime
from parkrecode p
cross join t1 t
where substr(inttime,1,10)!=substr(outtime,1,10)),
t3 as (
select id,plate_no inhour,from_unixtime(unix_timestamp(inttime,'yyyyMMddHHmmss'))
start_date,
from_unixtime(unix_timestamp(outtime,'yyyyMMddHHmmss')-3600) end_date
from t2
),
t4 as (
select t3.id,t3.inhour,from_unixtime(unix_timestamp(start_date)+pos*3600,'yyyyMMddHH')
stop_hour
from t3
lateral view posexplode(split(space(hour(end_date)-hour(start_date)+
(datediff(end_date,start_date))*24),''))t as pos,val)
select stop_hour,count(inhour)
from t4
group by stop_hour;