上周末做了一封自动邮件,把格式化时间HH写成hh。出现了意想不到的结果,以为是数据问题,其实是自己的问题
我们来看看上周遇到的问题:
12小时内的HH格式化:
spark-sql> select
> from_unixtime(dateline,'HH') as hh,
> count(1)
> from
> (select '1631727073' as dateline
> union all
> select '1631737873' as dateline
> union all
> select '1631739733' as dateline
> union all
> select '1631754132' as dateline
> union all
> select '1631764932' as dateline
> ) t1
> group by from_unixtime(dateline,'HH')
> order by from_unixtime(dateline,'HH');
01 1
04 1
05 1
09 1
12 1
Time taken: 2.671 seconds, Fetched 5 row(s)
12小时内的hh格式化:
spark-sql> select
> from_unixtime(dateline,'hh') as hh,
> count(1)
> from
> (select '1631727073' as dateline
> union all
> select '1631737873' as dateline
> union all
> select '1631739733' as dateline
> union all
> select '1631754132' as dateline
> union all
> select '1631764932' as dateline
> ) t1
> group by from_unixtime(dateline,'hh')
> order by from_unixtime(dateline,'hh');
01 1
04 1
05 1
09 1
12 1
Time taken: 2.671 seconds, Fetched 5 row(s)
从上面结果可以看出对于12小时之内,HH和hh是一样的。
不要简单的就以为两者没有区别,不分大小写,那你就大错特错了。
我们再来看看超过12小时会出现什么问题:
超过12小时的HH格式化:
spark-sql> select
> from_unixtime(dateline,'HH') as hh,
> count(1)
> from
> (select '1631727073' as dateline
> union all
> select '1631739733' as dateline
> union all
> select '1631754132' as dateline
> union all
> select '1631764932' as dateline
> union all
> select '1631768532' as dateline
> ) t1
> group by from_unixtime(dateline,'HH')
> order by from_unixtime(dateline,'HH');
01 1
05 1
09 1
12 1
13 1
Time taken: 1.272 seconds, Fetched 5 row(s)
超过12小时的hh格式化:
spark-sql> select
> from_unixtime(dateline,'hh') as hh,
> count(1)
> from
> (select '1631727073' as dateline
> union all
> select '1631739733' as dateline
> union all
> select '1631754132' as dateline
> union all
> select '1631764932' as dateline
> union all
> select '1631768532' as dateline
> ) t1
> group by from_unixtime(dateline,'hh')
> order by from_unixtime(dateline,'hh');
01 2
05 1
09 1
12 1
Time taken: 4.15 seconds, Fetched 4 row(s)
到这,应该都明白了,HH是24小时制,而hh是12小时制的。对于12小时以内的统计,HH和hh都可以用,对于12小时以外的统计,得要注意一下了。