不过是在数据处理、脚本、还是编程语言开发中,经常会遇到时间处理的情况,在hive中,如何获取指定日期月份的第一天、年份的第一天、指定日期下周的指定周几、按指定格式返回指定日期增加几个月后的日期呢?
下面我们就利用hive自带的函数一一实现。
获取指定日期月份的第一天、年份的第一天
在hive中有这样一个函数,trunc(string date, string format)
该函数官网是这样说明的:
Returns date truncated to the unit specified by the format (as of Hive 1.2.0). Supported formats: MONTH/MON/MM, YEAR/YYYY/YY. Example: trunc(‘2015-03-17’, ‘MM’) = 2015-03-01
在hive1.2.0版本后执行此函数,此函数返回指定格式的单元,指定格式有:MONTH/MON/MM, YEAR/YYYY/YY
> select trunc('2019-02-24', 'YYYY');
+-------------+--+
| _c0 |
+-------------+--+
| 2019-01-01 |
+-------------+--+
> select trunc('2019-02-24', 'MM');
+-------------+--+
| _c0 |
+-------------+--+
| 2019-02-01 |
+-------------+--+
指定日期下周的指定周几
利用hive中自带的函数:next_day(string start_date, string day_of_week)
官网说明:
Returns the first date which is later than start_date and named as day_of_week (as of Hive 1.2.0). start_date is a string/date/timestamp. day_of_week is 2 letters, 3 letters or full name of the day of the week (e.g. Mo, tue, FRIDAY). The time part of start_date is ignored. Example: next_day(‘2015-01-14’, ‘TU’) = 2015-01-20.
在hive1.2.0版本后才支持此函数
> select next_day('2019-02-24', 'TU');
+-------------+--+
| _c0 |
+-------------+--+
| 2019-02-26 |
+-------------+--+
> select next_day('2019-02-24', 'MONDAY');
+-------------+--+
| _c0 |
+-------------+--+
| 2019-02-25 |
+-------------+--+
按指定格式返回指定日期增加几个月后的日期
利用hive中自带的函数:add_months(string start_date, int num_months, output_date_format)
官网说明:
Returns the date that is num_months after start_date (as of Hive 1.1.0). start_date is a string, date or timestamp. num_months is an integer. If start_date is the last day of the month or if the resulting month has fewer days than the day component of start_date, then the result is the last day of the resulting month. Otherwise, the result has the same day component as start_date. The default output format is ‘yyyy-MM-dd’.
Before Hive 4.0.0, the time part of the date is ignored.
As of Hive 4.0.0, add_months supports an optional argument output_date_format, which accepts a String that represents a valid date format for the output. This allows to retain the time format in the output.
For example :
add_months(‘2009-08-31’, 1) returns ‘2009-09-30’.
add_months(‘2017-12-31 14:15:16’, 2, ‘YYYY-MM-dd HH:mm:ss’) returns ‘2018-02-28 14:15:16’.
此函数在hive1.1.0版本后
> select add_months('2019-02-28', 1);
+-------------+--+
| _c0 |
+-------------+--+
| 2019-03-31 |
+-------------+--+
> select add_months('2019-02-24 21:15:16', 2, 'YYYY-MM-dd HH:mm:ss');
+----------------------+--+
| _c0 |
+----------------------+--+
| 2019-04-24 21:15:16 |
+----------------------+--+
参考链接:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF?tdsourcetag=s_pctim_aiomsg