解决的问题是在自己做小项目是遇到的,如需要可在添加链接描述这篇博客获取所需的项目文件自行练习。
解决思路:上午时间不变 下午的时间+12
hql语句测试
select time,case
when instr(time,'PM') !=0 then from_unixtime(unix_timestamp(concat('1920-1-1 ',split(time,' ')[0],':00'))+43200,'HH:mm')
when instr(time,'AM') !=0 then split(time,' ')[0] else time end from transaction_details limit 20
解释:
hive时间控制常用是转成时间戳再加减
其中unix_timestamp是日期转时间戳,43200=126060(小时换为秒)
此时就可以创建临宏来使用:
hive> create temporary macro time_change(tm string)
> case
> when instr(tm,'PM') !=0 then
> from_unixtime(unix_timestamp(concat('1920-1-1 ',split(tm,' ')[0],':00'))+43200,'HH:mm')
> when instr(tm,'AM') !=0 then
> split(tm,' ')[0]
> else
> tm
> end ;
前面为原始时间 后面为转化之后的24小时制数据
hive> select time,time_change(time) from transaction_details limit 3;
OK
4:07 PM 16:07
1:12 PM 13:12
13:51 13:51
Time taken: 0.091 seconds, Fetched: 3 row(s)
处理完24小时制之后 就可以分时间段了,一般是按照下面的要求分时间段。
--define time bucket
--early morning: (5:00, 8:00]
--morning: (8:00, 11:00]
--noon: (11:00, 13:00]
--afternoon: (13:00, 18:00]
--evening: (18:00, 22:00]
--night: (22:00, 5:00]
在这里可以使用case when函数,用个简单额题目举例
按时段(上午,中午等)计算总收入/平均购买次数:
with
t as(select stored_id,case when instr(time,'PM') !=0 then
from_unixtime(unix_timestamp(concat('1920-1-1 ',split(time,' ')[0],':00'))+43200,'HH:mm')
when instr(time,'AM') !=0 then split(time,' ')[0] else time end as time,price from transaction_details),
t1 as (select stored_id,time,price,case
when cast(regexp_replace(time,":","") as int) between 500 and 800
then 'early morning'
when cast(regexp_replace(time,":","") as int) between 801 and 1100
then 'morning'
when cast(regexp_replace(time,":","") as int) between 1101 and 1300
then 'noon'
when cast(regexp_replace(time,":","") as int) between 1301 and 1800
then 'afternoon'
when cast(regexp_replace(time,":","") as int) between 1801 and 2200
then 'evening'
else
'night'
end as timebucket from t)
select timebucket,round(sum(price),2) as countPrice,count(price) as batch from t1 group by timebucket