Hive常用函数及实例演示

一、数学函数

1.round(double a,int b) 返回a保留b位小数位的值,根据四舍五入原则

select round(4.356),round(4.356,2),round(4.354,2)
+------+-------+-------+--+
| _c0  |  _c1  |  _c2  |
+------+-------+-------+--+
| 4.0  | 4.36  | 4.35  |
+------+-------+-------+--+

2.floor(double a) 向下取整
   ceil(double a) 向上取整

select floor(4.6),ceil(3.4)
+------+------+--+
| _c0  | _c1  |
+------+------+--+
| 4    | 4    |
+------+------+--+

3.rand(int seed) 返回随机数,seed是随机因子,可以根据更改随机因子来改变值

select rand(10),rand(20)
+---------------------+---------------------+--+
|         _c0         |         _c1         |
+---------------------+---------------------+--+
| 0.7304302967434272  | 0.7320427537419555  |
+---------------------+---------------------+--+

4.exp(double a) 返回e的a次幂
   power/pow(double a,double p) 返回a的p次幂

 select exp(2),pow(2,3),power(2,3)
+-------------------+------+------+--+
|        _c0        | _c1  | _c2  |
+-------------------+------+------+--+
| 7.38905609893065  | 8.0  | 8.0  |
+-------------------+------+------+--+

5.sqrt(double a) 返回a的平方差
   abs(double a) 返回a的绝对值
   pmod(int a,int b) a对b取模,这里a可以是double类型,不过常用的是int型

select sqrt(4),abs(-9),pmod(8,3);
+------+------+------+--+
| _c0  | _c1  | _c2  |
+------+------+------+--+
| 2.0  | 9    | 2    |
+------+------+------+--+
二、转化函数

1.cast(expr as type) 将expr转换成type类型的值

select cast("123" as int);
+------+--+
| _c0  |
+------+--+
| 123  |
+------+--+
三、日期函数

1.from_unixtime(bigInt unixtime,string format) 将时间戳转成需要的时间格式

select from_unixtime(1594671083,"yyyy-MM-dd"),from_unixtime(1594671083,"yyyy-MM-dd hh:mm:ss");
+-------------+----------------------+--+
|     _c0     |         _c1          |
+-------------+----------------------+--+
| 2020-07-14  | 2020-07-14 04:11:23  |
+-------------+----------------------+--+

2.unix_timestamp(string date) 将日期格式转成时间戳
unix_timestamp(string date,string format) 若format格式没有写全,则当作成0秒

select unix_timestamp(),unix_timestamp("2017-7-13 8:8:8"),unix_timestamp("2017-7-13","yyyy-MM-dd");
+-------------+-------------+-------------+--+
|     _c0     |     _c1     |     _c2     |
+-------------+-------------+-------------+--+
| 1594671454  | 1499904488  | 1499875200  |
+-------------+-------------+-------------+--+

3.to_date(string timestamp) 返回时间格式的日期部分

select to_date("2017-09-12 21:22:23");
+-------------+--+
|     _c0     |
+-------------+--+
| 2017-09-12  |
+-------------+--+

4.quarter(date/string) 返回季度(hive1.3版本后可使用,我这里是1.1版本)
   weekofyear(string/date) 返回一年内的第几周

select weekofyear("2020-07-13");
+------+--+
| _c0  |
+------+--+
| 29   |
+------+--+

5.datediff(string enddate,string startdate) 两日期相差的天数
   date_add(string startdate,int days) 开始时间加上days天数
   date_sub(string startdate,int days)

select datediff("2020-07-13 08:08:08","2020-07-11 23:59:59"),date_add("2020-07-13 08:08:08",2),date_sub("2020-07-13 08:08:08",3);
+------+-------------+-------------+--+
| _c0  |     _c1     |     _c2     |
+------+-------------+-------------+--+
| 2    | 2020-07-15  | 2020-07-10  |
+------+-------------+-------------+--+

6.current_date() 返回当前日期
   current_timestamp 返回当前时间

select current_date(),current_timestamp();
+-------------+--------------------------+--+
|     _c0     |           _c1            |
+-------------+--------------------------+--+
| 2020-07-14  | 2020-07-14 07:29:25.721  |
+-------------+--------------------------+--+

7.last_day(string date) 返回date对应月份的最后一天
   next_day(string datetime,string day of week) 返回datetime到下一个星日几的天数(星期取前两个英文字母)

select last_day("2020-07-10"),next_day("2020-07-12","TH");
+-------------+-------------+--+
|     _c0     |     _c1     |
+-------------+-------------+--+
| 2020-07-31  | 2020-07-16  |
+-------------+-------------+--+

8.add_months(string startdate,int months) 加上几个月

select add_months("2020-07-13",3);
+-------------+--+
|     _c0     |
+-------------+--+
| 2020-10-13  |
+-------------+--+

9.trunc(string datetime,string format) 返回datetime中的年和月,format支持的格式为MONTH/MON/MM,YEAR/YYYY/YY

select trunc("2020-07-13 08:08:08","YYYY"),trunc("2020-07-13","MONTH");
+-------------+-------------+--+
|     _c0     |     _c1     |
+-------------+-------------+--+
| 2020-01-01  | 2020-07-01  |
+-------------+-------------+--+

10.months_between(string enddate,string startdate) 返回两个时间的月份差(结果是double类型,按照时间戳差值/该月的天数)

select months_between("2020-07-13","2020-07-10"),months_between("2020-07-13","2020-06-13");
+-------------+------+--+
|     _c0     | _c1  |
+-------------+------+--+
| 0.09677419  | 1.0  |
+-------------+------+--+

11.date_format(string datetime,string format) 按照时间格式返回

select date_format("2020-07-13 08:08:08","yyyy-MM-dd"),date_format("2020-07-13 08:08:08","yyyy/MM/dd");
+-------------+-------------+--+
|     _c0     |     _c1     |
+-------------+-------------+--+
| 2020-07-13  | 2020/07/13  |
+-------------+-------------+--+
四、条件函数

1.if(boolean condition,valueTrue,valueFalseOrNull) contition为true,则返回valueTrue,否则返回valueFalse或null

select if(3>2,"true","false");
+-------+--+
|  _c0  |
+-------+--+
| true  |
+-------+--+

2.nvl(value,default_value) value为空返回default_value,否则返回value

select nvl(null,20),nvl(1,20);
+------+------+--+
| _c0  | _c1  |
+------+------+--+
| 20   | 1    |
+------+------+--+

3.case-when 类似于if-else

case a when b1 then c1
	   when b2 then c2
	   ...
	   else c
end	   
case when a1 then b1
	 when a2 then b2
	 ...
	 else b
end
select case 20/4 when 1 then "a"
. . . . . . . .> when 3 then "b"
. . . . . . . .> when 5 then "c"
. . . . . . . .> else "d"
. . . . . . . .> end;
+------+--+
| _c0  |
+------+--+
| c    |
+------+--+
select case when 20/4=1 then "a"
. . . . . . > when 20/4=3 then "b"
. . . . . . > when 20/4=5 then "c"
. . . . . . > else "d"
. . . . . . >end;
+------+--+
| _c0  |
+------+--+
| c    |
+------+--+

4.isnull(a) 是否为空
  isnotnull(a) 是否不是空

select isnull(null),isnotnull(3);
+-------+-------+--+
|  _c0  |  _c1  |
+-------+-------+--+
| true  | true  |
+-------+-------+--+
五、字符串函数

1.concat(s1,s2,sn) 多个字符串合并成一个字符串
  concat_ws(x,s1,s2,sn) 同concat,但每个字符间用x连接

select concat("abc",",""223"),concat_ws("-","abc","2342");
+----------+-----------+--+
|   _c0    |    _c1    |
+----------+-----------+--+
| abc,223  | abc-2342  |
+----------+-----------+--+

2.find_in_set(string str,string strlist) 返回str在strlist中的位置(strlist是逗号分隔的,如果查出是’,'或查找失败,则返回0,为空则返回null)

select find_in_set('a',"a,d,f"),find_in_set('d',"a,c,f"),find_in_set(',',"a,d,f,g"),find_in_set(null,'a,d,g');
+------+------+------+-------+--+
| _c0  | _c1  | _c2  |  _c3  |
+------+------+------+-------+--+
| 1    | 0    | 0    | NULL  |
+------+------+------+-------+--+

3.format_number(number x,int d) 将x转换成#,###,###.### ,进行四舍五入保留d位小数

select format_number(12356433.445,2),format_number(23424234.342,0);
+----------------+-------------+--+
|      _c0       |     _c1     |
+----------------+-------------+--+
| 12,356,433.45  | 23,424,234  |
+----------------+-------------+--+

4.get_json_object(string json_string,string s) 返回json格式的字符串中需要的属性名s

select get_json_object('{"userid":"1","username":"zs","age":"20"}',"$.username");
+------+--+
| _c0  |
+------+--+
| zs   |
+------+--+

5.instr(string str,string str1) 返回str1在str中的位置,找不到返回0

select instr("sdfsdf","fs"),instr("dfsdf","dd");
+------+------+--+
| _c0  | _c1  |
+------+------+--+
| 3    | 0    |
+------+------+--+

6.locate(string substr,string str,int pos) 返回从第pos的位置开始substr在str中第一次出现的位置,找不到返回0

select locate("fs","dsfserfsdfsd",2),locate("fs","dsfserfsdfsd",5),locate("fsa","dsfserfsdfsd",1);
+------+------+------+--+
| _c0  | _c1  | _c2  |
+------+------+------+--+
| 3    | 7    | 0    |
+------+------+------+--+

7.lpad(string str,int n,string s) 用s字符串在str前补齐到n位
  rpad(string str,int n,string s) 用s字符串在str后补齐到n位

select lpad("abc",5,"xx"),rpad("abc",5,"xx");
+--------+--------+--+
|  _c0   |  _c1   |
+--------+--------+--+
| xxabc  | abcxx  |
+--------+--------+--+

8.ltrim(string s) 去掉s字符串前面的空格
  rtrim(string s) 去掉s字符串前面的空格

select ltrim("   adg   "),rtrim("   adg   ");
+---------+---------+--+
|   _c0   |   _c1   |
+---------+---------+--+
| adg     |    adg  |
+---------+---------+--+

PS:去除中间空格,用正则表达式

select regexp_replace("hello world",'\\s+','');
+-------------+--+
|     _c0     |
+-------------+--+
| helloworld  |
+-------------+--+

9.根据URL地址,抽取指定部分的内容
parse_url(string urlstring,string partTOExtract) partTOExtract类型有PROTOCOL,HOST PATH QUERY;
select parse_url(‘http://facebook.com/path/p1.php?name=1&pwd=123’,‘HOST’)
PROTOCOL:http
HOST:facebook.com
PATH:path/p1.php
QUERY:name=1&pwd=123
‘QUERY’:‘name’:1

select parse_url('http://facebook.com/path/p1.php?name=1&pwd=123','HOST');
+---------------+--+
|      _c0      |
+---------------+--+
| facebook.com  |
+---------------+--+
select parse_url('http://facebook.com/path/p1.php?name=1&pwd=123','PATH');
+---------------+--+
|      _c0      |
+---------------+--+
| /path/p1.php  |
+---------------+--+

10.regexp_extract(string subject,string pattern,int index) 截取字符串subject中符合正则表达式pattern的第index个部分的子字符串

select regexp_extract("Coming999",'(\\d+)',1);
+------+--+
| _c0  |
+------+--+
| 999  |
+------+--+
select regexp_extract("Coming999",'[a-zA-z]+',0);
+---------+--+
|   _c0   |
+---------+--+
| Coming  |
+---------+--+

11.split(string s,pattern) 按正则表达式进行分割,返回一个字符数组

select split("avd sde|dfe,seg","[ \\|,]");
+----------------------------+--+
|            _c0             |
+----------------------------+--+
| ["avd","sde","dfe","seg"]  |
+----------------------------+--+

12.regexp_replace(string s, string pattern ,string replacement); 按照正则表达式用replacement替换匹配到s的部分

select regexp_replace("123adbd345",'[a-zA-Z]','o');
+-------------+--+
|     _c0     |
+-------------+--+
| 123oooo345  |
+-------------+--+

13.repeat(string s,int n) 重复
  reverse(string s) 反转

 select repeat("dsfs",3),reverse("adf");
 +---------------+------+--+
|      _c0      | _c1  |
+---------------+------+--+
| dsfsdsfsdsfs  | fda  |
+---------------+------+--+

14.substr(string s, int start)
     substring(string s, int start)
    substr(string s, int start,int len)
   substring(string s, int start,int len)

select substr("adf123",3),substring("adf123",3),substr("adf123",3,2),substring("adf123",3,2);
+-------+-------+------+------+--+
|  _c0  |  _c1  | _c2  | _c3  |
+-------+-------+------+------+--+
| f123  | f123  | f1   | f1   |
+-------+-------+------+------+--+

15.translate(string s,string s1,string s2) 用s2替换s中的s1部分

select translate("hello world","world","1");
+-------+--+
|  _c0  |
+-------+--+
| he 1  |
+-------+--+

解析:替换的是字符"1",因此“1”只代表“w”,剩下的"orld"均是null,所以“hello”中的"llo"部分包含于“world”中也是null,“world”中只有“w”是“1”,其余均为null。常用于加密文件。
16.initcap(string s) 将首字母转成大写

select initcap("hello world");
+--------------+--+
|     _c0      |
+--------------+--+
| Hello World  |
+--------------+--+
六、聚合函数

count,avg,sum,min,max这几个太常用,不多做说明

七、表生成函数

explode(array a) 将a中的每一个元素转成行

select explode(split("a,d,f,g,h",","));
+------+--+
| col  |
+------+--+
| a    |
| d    |
| f    |
| g    |
| h    |
+------+--+
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值