Hive/Impala超全函数整理

2 篇文章 0 订阅

一、常用方式整理

快捷计算方式Hive/Sparkimpala备注
当天日期cast(current_date() as string)
cast(date_format(current_date(),'yyyyMMdd') as string)
cast(from_timestamp(now(),'yyyy-MM-dd') as string)
cast(from_timestamp(now(),'yyyyMMdd') as string)
2023-06-30
20230630
昨天日期cast(date_sub('2023-06-30',1) as string)
cast(date_format(date_sub('2023-06-30',1),'yyyyMMdd') as string)
cast(from_timestamp(date_sub(now(),1),'yyyy-MM-dd') as string)
cast(from_timestamp(date_sub(now(),1),'yyyyMMdd') as string)
2023-06-29
20230629
去年当天cast(date_sub(current_date(),365) as string)
cast(date_format(date_sub(current_date(),365),'yyyyMMdd') as string)
cast(from_timestamp(date_sub(now(),365),'yyyy-MM-dd') as string)
cast(from_timestamp(date_sub(now(),365),'yyyyMMdd') as string)
2022-06-30
20220630
去年当月cast(date_format(date_sub(current_date(),365),'yyyy-MM') as string)
cast(date_format(date_sub(current_date(),365),'yyyyMM') as string)
cast(from_timestamp(date_sub(now(),365),'yyyy-MM') as string)
cast(from_timestamp(date_sub(now(),365),'yyyyMM') as string)
2022-06
202206
当月substr(cast(current_date() as string),1,7)
replace(substr(cast(current_date() as string),1,7),'-','')
substr(cast(now() as string),1,7)
replace(substr(cast(now() as string),1,7),'-','')
2023-06
202306
上月substr(cast(add_months(current_date(),-1) as string),1,7)
replace(substr(cast(add_months(current_date(),-1) as string),1,7),'-','')
substr(cast(add_months(now(),-1) as string),1,7)
replace(substr(cast(add_months(now(),-1) as string),1,7),'-','')
2023-05
202305
下个月substr(cast(add_months(current_date(),1) as string),1,7)
replace(substr(cast(add_months(current_date(),1) as string),1,7),'-','')
substr(cast(add_months(now(),1) as string),1,7)
replace(substr(cast(add_months(now(),1) as string),1,7),'-','')
2023-07
202307
当年substr(cast(current_date() as string),1,4)cast(year(now()) as string)2023
去年cast(cast(date_format(current_date(),'yyyy') - 1 as decimal(4,0)) as string)cast(year(years_add(now(),-1)) as string)2022
明年cast(cast(date_format(current_date(),'yyyy') + 1 as decimal(4,0)) as string)cast(year(years_add(now(),1)) as string)2024

二、常用函数整理

函数分类细分函数作用Hive/Sparkimpala备注



获取当前当前日期current_date()2023-06-30
当前日期时间current_timestamp()current_timestamp()
now()
2023-06-30 13:14:57
转换时间戳(秒)缺省获取当前unix_timestamp()
unix_timestamp('2021-08-14 10:05:20')
unix_timestamp()
unix_timestamp('2021-08-14 10:05:20')
1628911641
格式转换日期时间转日期函数to_date('2023-06-30 13:34:12')to_date('2023-06-30 13:34:12')2023-06-30
转化UNIX时间戳from_unixtime(1323308945,'yyyy-MM-dd')from_unixtime(1323308945,'yyyy-MM-dd')2023-06-30
格式化时间(字符串/时间戳/日期)date_format('2023-06-30','yyyyMMdd')from_timestamp('2023-06-30','yyyyMMdd')20230630
获取年year('2023-06-30 13:34:12')year('2023-06-30 13:34:12')2023
获取季度quarter('2023-06-30 13:34:12')quarter('2023-06-30 13:34:12')2
获取月份month('2023-06-30 13:34:12')month('2023-06-30 13:34:12')6
获取天(从1开始)day('2023-01-01 13:34:12')day('2023-01-01 13:34:12')1
获取小时hour('2023-01-01 13:34:12')hour('2023-01-01 13:34:12')13
获取分钟minute('2023-01-01 13:34:12')minute('2023-01-01 13:34:12')34
获取秒second('2023-01-01 13:34:12')second('2023-01-01 13:34:12')12
获取周weekofyear('2021-08-14 10:05:20')weekofyear('2021-08-14 10:05:20')32
当前周的第几天(注:周日为第1天)dayofweek('2023-06-27 10:05:20')dayofweek('2023-06-27 10:05:20')3
当前月的第几天dayofmonth(now()) 
时间推算日期相减(第一个日期减第二个日期)datediff('2021-08-14','2021-08-08')datediff('2021-08-14','2021-08-08')6
日期减少函数date_sub('2021-08-14',6)date_sub('2021-08-14',6)
date_sub(now(),interval 2 months)
2021-08-08
2023-08-27 17:40:31.94032
日期增加函数date_add('2021-08-14',6)date_add('2021-08-14',6)
date_add(now(),interval 2 hours)
2021-08-20
2023-06-27 19:41:21.906604
两个日期之间包含的月数months_between('2021-10-14','2021-05-04')months_between('2021-10-14','2021-05-04')5.32258065
获取周期最初日期(月初 / 年初)trunc(current_date(),'MM')
trunc(current_date(),'YY')
trunc(current_date(),'month')
trunc(current_date(),'year')
2023-06-01
2023-01-01
日期当月最后一天last_day('2023-01-15')last_day('2023-01-15')2023-01-31
增加年years_add(cast('2023-01-15' as timestamp),1)2024-01-15 00:00:00.0
减少年years_sub(cast('2023-01-15' as timestamp),1)2022-01-15 00:00:00.0
增加月add_months('2023-01-15',1)months_add(cast('2023-01-15' as timestamp),1)2023-02-15 00:00:00.0
减少月add_months('2023-01-15',-1)months_sub(cast('2023-01-15' as timestamp),1)2022-12-15 00:00:00.0
增加周weeks_add(cast('2023-01-15' as timestamp),1)2023-01-22 00:00:00.0
减少周weeks_sub(cast('2023-01-15' as timestamp),1)2023-01-08 00:00:00.0
增加小时hours_add(cast('2023-01-15' as timestamp),1)2023-01-15 01:00:00.0
减少小时hours_sub(cast('2023-01-15' as timestamp),1)2023-01-14 23:00:00.0
增加分钟minutes_add(cast('2023-01-15' as timestamp),1)2023-01-15 00:01:00.0
减少分钟minutes_sub(cast('2023-01-15' as timestamp),1)2023-01-14 23:59:00.0
增加秒seconds_add(cast('2023-01-15' as timestamp),1)2023-01-15 00:00:01.0
减少秒seconds_sub(cast('2023-01-15' as timestamp),1)2023-01-14 23:59:59.0
下个周周一next_day(cast('2023-01-15' as timestamp),'monday')2023-01-16 00:00:00.0
每月的第几天dayofmonth(cast('2023-01-15' as timestamp))15
月的名称monthname(cast('2023-01-15' as timestamp))January
星期几的名称dayname(cast('2023-01-15' as timestamp))Sunday
字符串函数字符串长度length('jiaoxin')length('jiaoxin')7
反转字符串reverse('jiaoxin')reverse('jiaoxin')nixoaij
拼接字符串(任何一个为null返回null)concat('xiaojiao','xiaojiao','ETL')concat('xiaojiao','xiaojiao','ETL')xiaojiaoxiaojiaoETL
带分隔符字符串拼接concat_ws("#",'jiao', 'xin')concat_ws("#",'jiao', 'xin')jiaoxin#xin
字符串截取substr('jiaoxin',1,4)substr('jiaoxin',1,4)jiao
字符串转大写upper('jiaoxin')
ucase('jiaoxin')
upper('jiaoxin')
ucase('jiaoxin')
JIAOXIN
字符串转小写lower('JIAOXIN')
lcase('JIAOXIN')
lower('JIAOXIN')
lcase('JIAOXIN')
jiaoxin
去空格trim('  haha woshi zhangsan ')trim('  haha woshi zhangsan ')haha woshi zhangsan
去左边空格ltrim('  haha woshi zhangsan ')ltrim('  haha woshi zhangsan ')haha woshi zhangsan 
去右边空格rtrim('  haha woshi zhangsan ')rtrim('  haha woshi zhangsan ')  haha woshi zhangsan
正则替换regexp_replace('marry','a|r','')regexp_replace('marry','a|r','')my
正则解析
(0 是表示把整个表达式对应的结果全部返回
1 表示返回正则表达式中第一个()对应的结果
...依次类推)
regexp_extract('marry','m(.*?)(ry)',0)
regexp_extract('marry','m(.*?)(ry)',1)
regexp_extract('marry','m(.*?)(ry)',2)
regexp_extract('marry','m(.*?)(ry)',0)
regexp_extract('marry','m(.*?)(ry)',1)
regexp_extract('marry','m(.*?)(ry)',2)
marry
ar
ry
json解析get_json_object('{"name":"marry","sex":"F"}','$.name')get_json_object('{"name":"marry","sex":"F"}','$.name')marry
返回指定长度的空格字符串space(10)space(10)          
重复字符串repeat('tom',3)repeat("tom",3)tomtomtom
左补足lpad('jiao',11,'ha')lpad("jiao",11,'ha')hahahahjiao
右补足rpad('jiao',10,'6')rpad("jiao",10,'6')jiao666666
分割字符串split('jiao,kong,qu',',')["jiao","kong","qu"]
分割字符串(返回指定位置)split_part('part1-part2-part3','-',2)part2
集合字符串位置查找(未找到返回0)find_in_set('ab','fwqieur,ab')find_in_set('ab','fwqieur,ab')2
字符串位置查找instr('abcdf','df')instr('abcdf','df')4
一行转多行select tmp_col
from xxxxx
LATERAL VIEW explode (split(column_1, '\\.')) tmp as tmp_col
where xxxx = yyyyy
把column_1按照英文符号.拆分
成多行(其他字段会和之前的那
行保持一致)
返回第一个非空(NULL)字符串(两个参数)
注:如果是空字符串'' 则会返回'',
并不会继续往后找后面有内容的值
nvl(NULL,'1')
nvl('','1')
nvl(NULL,'1')
nvl('','1')
1'
''
返回第一个非空字符串(无限多个参数)coalesce(NULL,'','NULL','123')coalesce(NULL,'','NULL','123')''
开窗函数窗口内的行号row_number()over(partition by industry  order by sale_amt desc) as amt_rankrow_number() over(partition by weeks order by uid) as uid_num
窗口内向下第n行的值lead(dates,1) over(distribute by uid sort by dates asc) as dates_endlead(dates,1) over(distribute by uid sort by dates asc) as dates_end按照销额计算产业排名
窗口内向上第n行的值lag(dates,1) over(distribute by uid sort by dates asc) as dates_endlag(dates,1) over(distribute by uid sort by dates asc) as dates_end
窗口内排序的序号(考虑重复,挤占后续位置)rank() over(partition by cookieid order by pv desc) as rank_numrank() over(partition by cookieid order by pv desc) as rank_num1
1
3
4
5
窗口内排序的序号(考虑重复,不挤占后续位置)dense_rank() over(partition by cookieid order by pv desc) as rank_numdense_rank() over(partition by cookieid order by pv desc) as rank_num1
1
2
3
4
窗口内的第一个值first_value(url) over(partition by cookieid order by createtime) as url_first_valuefirst_value(url) over(partition by cookieid order by createtime) as url_first_value
窗口内的最后一个值last_value(url) over(partition by cookieid order by createtime) as url_last_valuelast_value(url) over(partition by cookieid order by createtime) as url_last_value
窗口内求和sum(sale_qty) over(partition by industry order by weeks) as sale_qtysum(sale_qty) over(partition by industry order by weeks) as sale_qty
窗口内去重计数count(distinct uid) over(partition by weeks) as uid_num
窗口内所有值最大值max(uid) over(partition by weeks) as uid_maxmax(uid) over(partition by weeks) as uid_max
窗口内所有值最小值min(uid) over(partition by weeks) as uid_minmin(uid) over(partition by weeks) as uid_min

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值