hive时间戳格式化HH和hh的区别

最新推荐文章于 2024-05-17 15:43:21 发布

不想起的昵称

最新推荐文章于 2024-05-17 15:43:21 发布

阅读量1.4k

点赞数

分类专栏： hive 文章标签：大数据数据仓库 hive sql etl

本文链接：https://blog.csdn.net/weixin_40267121/article/details/120272494

版权

hive 专栏收录该内容

42 篇文章 9 订阅

订阅专栏

上周末做了一封自动邮件，把格式化时间HH写成hh。出现了意想不到的结果，以为是数据问题，其实是自己的问题

我们来看看上周遇到的问题：

12小时内的HH格式化：

spark-sql> select 
         >   from_unixtime(dateline,'HH') as hh,
         >   count(1)
         > from 
         > (select '1631727073' as dateline
         > union all
         > select '1631737873' as dateline
         > union all
         > select '1631739733' as dateline
         > union all
         > select '1631754132' as dateline
         > union all 
         > select '1631764932' as dateline
         > ) t1
         > group by from_unixtime(dateline,'HH')
         > order by from_unixtime(dateline,'HH');
01      1                                                                       
04		1
05		1
09		1
12		1
Time taken: 2.671 seconds, Fetched 5 row(s)

12小时内的hh格式化：

spark-sql> select 
         >   from_unixtime(dateline,'hh') as hh,
         >   count(1)
         > from 
         > (select '1631727073' as dateline
         > union all
         > select '1631737873' as dateline
         > union all
         > select '1631739733' as dateline
         > union all
         > select '1631754132' as dateline
         > union all 
         > select '1631764932' as dateline
         > ) t1
         > group by from_unixtime(dateline,'hh')
         > order by from_unixtime(dateline,'hh');
01      1                                                                       
04		1
05		1
09		1
12		1
Time taken: 2.671 seconds, Fetched 5 row(s)

从上面结果可以看出对于12小时之内，HH和hh是一样的。
不要简单的就以为两者没有区别，不分大小写，那你就大错特错了。

我们再来看看超过12小时会出现什么问题：

超过12小时的HH格式化：

spark-sql> select 
         >   from_unixtime(dateline,'HH') as hh,
         >   count(1)
         > from 
         > (select '1631727073' as dateline
         > union all
         > select '1631739733' as dateline
         > union all
         > select '1631754132' as dateline
         > union all 
         > select '1631764932' as dateline
         > union all
         > select '1631768532' as dateline
         > ) t1
         > group by from_unixtime(dateline,'HH')
         > order by from_unixtime(dateline,'HH');
01      1                                                                       
05		1
09		1
12		1
13		1
Time taken: 1.272 seconds, Fetched 5 row(s)

超过12小时的hh格式化：

spark-sql> select 
         >   from_unixtime(dateline,'hh') as hh,
         >   count(1)
         > from 
         > (select '1631727073' as dateline
         > union all
         > select '1631739733' as dateline
         > union all
         > select '1631754132' as dateline
         > union all 
         > select '1631764932' as dateline
         > union all
         > select '1631768532' as dateline
         > ) t1
         > group by from_unixtime(dateline,'hh')
         > order by from_unixtime(dateline,'hh');
01      2                                                                       
05		1
09		1
12		1
Time taken: 4.15 seconds, Fetched 4 row(s)