Link: https://blog.csdn.net/zjh19961213/article/details/107497690
Hive function summary
1. Date and time functions
1.1 Differences between date_sub, datediff, and date_add
https://blog.csdn.net/qq_35958094/article/details/80460644
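As a quick illustration of how the three functions differ, the sketch below calls each one on arbitrary literal dates (chosen for illustration, not taken from the linked post). date_add and date_sub shift a date by a number of days, while datediff returns the number of days between two dates.

```sql
-- date_add and date_sub take (date, days) and return the shifted date.
-- datediff takes (end_date, start_date) and returns end minus start in days.
SELECT
  date_add('2021-04-07', 3) AS three_days_later,
  date_sub('2021-04-07', 3) AS three_days_earlier,
  datediff('2021-04-10', '2021-04-07') AS days_between;
```

The three columns come back as 2021-04-10, 2021-04-04, and 3. Note the argument order of datediff: swapping the two dates flips the sign of the result.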
1.2 Date dimension table
(1) List every date (one row per day) from a start date to an end date
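-- REPEAT/SPLIT builds an array with DATEDIFF + 1 elements; POSEXPLODE numbers them from 0,
-- and each position is added to the start date as a day offset.
WITH dates AS (
SELECT DATE_ADD("${start_date}", a.pos) AS d
FROM (SELECT POSEXPLODE(SPLIT(REPEAT("o", DATEDIFF("${end_date}", "${start_date}")), "o"))) a
)
SELECT * FROM dates;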
Reference: https://blog.csdn.net/fengfengzai0101/article/details/106062265/
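To try the pattern without anything substituting ${start_date} and ${end_date}, the same query can be run with literal dates. The two dates below are illustrative assumptions, not values from the original notes.

```sql
-- DATEDIFF('2021-04-03', '2021-04-01') is 2, so REPEAT yields 'oo' and
-- SPLIT('oo', 'o') yields three empty strings at positions 0, 1 and 2.
SELECT DATE_ADD('2021-04-01', a.pos) AS d
FROM (SELECT POSEXPLODE(SPLIT(REPEAT('o', DATEDIFF('2021-04-03', '2021-04-01')), 'o'))) a;
```

This should return three rows, 2021-04-01 through 2021-04-03, which shows that both the start date and the end date are included.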
(2) Starting from a given date, list a specified number of months, counting backward, with the first and last day of each month
-- pos is the number of months to step back from the month that contains start_date.
-- The length passed to space() controls how many rows (months) are generated.
SELECT
pos AS month_id,
SUBSTR(add_months(FROM_UNIXTIME(unix_timestamp(SUBSTR(start_date,1,7), 'yyyy-MM')), -pos ), 1, 7) AS month,
trunc(add_months(FROM_UNIXTIME(unix_timestamp(SUBSTR(start_date,1,7), 'yyyy-MM')), -pos ),'MM') AS start_date,
last_day(add_months(FROM_UNIXTIME(unix_timestamp(SUBSTR(start_date,1,7), 'yyyy-MM')), -pos )) AS end_date
FROM
(SELECT '2021-04-07' AS start_date) tmp LATERAL VIEW posexplode(split(space(12), '')) t AS pos, val;
Reference: https://blog.csdn.net/zjh19961213/article/details/107497690
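The month query combines three date functions in each output column. As a rough sketch of what a single step back from April 2021 does, they can be called on their own; the literal dates here are chosen only for illustration.

```sql
-- add_months shifts by whole months, trunc(..., 'MM') snaps to the first
-- day of the month, and last_day returns the last day of the month.
SELECT
  add_months('2021-04-01', -1) AS shifted,
  trunc('2021-03-15', 'MM') AS month_start,
  last_day('2021-03-15') AS month_end;
```

The result is 2021-03-01, 2021-03-01, and 2021-03-31, matching the start_date and end_date columns the full query produces for pos = 1.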
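2. Problems encountered
Pitfalls with Hive partitioned tables
1. With dynamic partitioning, INSERT OVERWRITE on a partitioned table replaces only the partitions that appear in the SELECT result. Partitions that are absent from the result are left as they were, so the statement does not overwrite every partition of the table.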
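-- The backquoted regex column list requires hive.support.quoted.identifiers=none;
-- a fully dynamic partition spec requires nonstrict mode.
SET hive.support.quoted.identifiers=none;
SET hive.exec.dynamic.partition.mode=nonstrict;
-- Keep one row per ticket_id and exclude the helper column rn from the output.
INSERT OVERWRITE TABLE dwd.dwd_view_buy_ticket_info_79 PARTITION(import_hive_date)
SELECT `(row_num|rn)?+.+`
FROM (
SELECT *, ROW_NUMBER() OVER(PARTITION BY ticket_id ORDER BY 1) rn
FROM ods.ods_view_buy_ticket_info_79
) a
WHERE a.rn=1;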
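If partitions that no longer appear in the source also need to be cleared, one option is to drop them explicitly before running the insert. The following is a minimal sketch, not part of the original notes: the partition value '2020-07-01' is a hypothetical placeholder for a stale partition.

```sql
-- Hypothetical stale partition value; replace it with the partition to clear.
ALTER TABLE dwd.dwd_view_buy_ticket_info_79
  DROP IF EXISTS PARTITION (import_hive_date = '2020-07-01');
-- List the remaining partitions to confirm before re-running the insert.
SHOW PARTITIONS dwd.dwd_view_buy_ticket_info_79;
```

For an external table, dropping a partition removes only the metadata by default, so the underlying files may need separate cleanup.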