Spark SQL Study Notes

Date and time functions

Getting the current time: now / current_timestamp

select now();  --2021-03-18 14:39:47.962
select current_timestamp;  --2021-03-18 14:39:03.262

Extracting fields from a date or timestamp

1. year, month, day/dayofmonth, hour, minute, second
Examples:> SELECT day('2009-07-30');    --30
Examples:> SELECT year(now());     --2021

2. dayofweek (1 = Sunday, 2 = Monday, ..., 7 = Saturday), dayofyear
Examples:> SELECT dayofweek('2009-07-30');   --5
Examples:> SELECT dayofweek(now());   --5
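The Sunday-first numbering is easy to get wrong, so here is a minimal plain-Python sketch (not Spark code) that reproduces it from the standard library. The helper name spark_dayofweek is hypothetical and only illustrates the mapping.

```python
from datetime import date


def spark_dayofweek(d: date) -> int:
    """Map a date to the 1 = Sunday ... 7 = Saturday numbering."""
    # isoweekday() returns 1 = Monday ... 7 = Sunday, so shift by one day.
    return d.isoweekday() % 7 + 1


d = date(2009, 7, 30)                # a Thursday
print(spark_dayofweek(d))            # 5
print(d.timetuple().tm_yday)         # 211, the day-of-year value
```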

3. weekofyear
weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.
Examples:> SELECT weekofyear('2008-02-20');   8
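The rule quoted above (weeks start on Monday, week 1 is the first week with more than 3 days) is the ISO 8601 week rule. Python's date.isocalendar() uses the same rule, so it works as a quick cross-check. This is a plain-Python sketch, and the helper name iso_week is hypothetical.

```python
from datetime import date


def iso_week(d: date) -> int:
    """Return the ISO 8601 week number of a date."""
    return d.isocalendar()[1]


print(iso_week(date(2008, 2, 20)))   # 8
# 2016-01-01 is a Friday, so its week has only 3 days in 2016
# and is counted as week 53 of the previous year.
print(iso_week(date(2016, 1, 1)))    # 53
```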

4. trunc: truncates a date to the given unit; the lower-order parts are reset to 01
The second argument is one of ["year", "yyyy", "yy", "mon", "month", "mm"]

Examples:

> SELECT trunc('2009-02-12', 'MM');
 2009-02-01
> SELECT trunc('2015-10-27', 'YEAR');
 2015-01-01
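A minimal plain-Python sketch of the truncation rule, for illustration only. The helper name trunc_date is hypothetical. It raises an error for an unknown unit, which is a choice made for this sketch and not a statement about Spark's behaviour.

```python
from datetime import date


def trunc_date(d: date, unit: str) -> date:
    """Reset the parts of a date below the given unit to 01."""
    unit = unit.lower()
    if unit in ("year", "yyyy", "yy"):
        return d.replace(month=1, day=1)
    if unit in ("mon", "month", "mm"):
        return d.replace(day=1)
    # Sketch-only behaviour: reject units outside the list above.
    raise ValueError(f"unsupported unit: {unit}")


print(trunc_date(date(2009, 2, 12), "MM"))     # 2009-02-01
print(trunc_date(date(2015, 10, 27), "YEAR"))  # 2015-01-01
```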
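5. date_trunc(fmt, ts): truncates a timestamp to the unit given by fmt, one of ["YEAR", "YYYY", "YY", "MON", "MONTH", "MM", "DAY", "DD", "HOUR", "MINUTE", "SECOND", "WEEK", "QUARTER"]. Note that the unit comes first, the reverse of trunc.
Examples:> SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359');  2015-03-05T09:00:00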

6. date_format: formats a date or timestamp as a string in the given pattern
Examples:> SELECT date_format('2016-04-08', 'y');    2016

Date and time conversion

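1. unix_timestamp: returns the Unix timestamp (in seconds) of the current time, or of the given time string when arguments are supplied
SELECT unix_timestamp();  1476884637
SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd');   1460041200
SELECT unix_timestamp('2016-04-08 00:00:00.000', 'yyyy-MM-dd HH:mm:ss.SSS');   1460044800
Both strings denote midnight on 2016-04-08, yet the results differ by 3600 seconds, because the result depends on the session time zone: 1460041200 is midnight at UTC+9, while 1460044800 is midnight at UTC+8.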
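The two values for midnight on 2016-04-08 correspond to two different UTC offsets. The plain-Python sketch below reproduces both numbers with fixed offsets. The helper name epoch_at_offset is hypothetical, and the offsets of 9 and 8 hours are inferred from the numbers themselves, not stated in the notes.

```python
from datetime import datetime, timedelta, timezone


def epoch_at_offset(year: int, month: int, day: int, utc_offset_hours: int) -> int:
    """Unix timestamp of local midnight at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return int(datetime(year, month, day, tzinfo=tz).timestamp())


print(epoch_at_offset(2016, 4, 8, 9))  # 1460041200
print(epoch_at_offset(2016, 4, 8, 8))  # 1460044800
```

When comparing timestamps across clusters, it is therefore worth checking the session time zone first.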
 
2. from_unixtime: converts a Unix timestamp to a time string in the given format; to_unix_timestamp: converts a time string to a Unix timestamp
Examples:
SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');  1970-01-01 00:00:00
SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd');  1460041200

3. to_date / date: convert a string to a date; to_timestamp: converts a string to a timestamp (Since: 2.2.0)
SELECT to_date('2009-07-30 04:17:52');  2009-07-30
SELECT to_date('2016-12-31', 'yyyy-MM-dd');   2016-12-31
SELECT to_timestamp('2016-12-31 00:12:00');   2016-12-31 00:12:00
 
4. quarter: returns the quarter of the year that a date falls in (range 1 to 4)
Examples:> SELECT quarter('2016-08-31');  3

Date and time arithmetic

1. months_between: number of months between two dates
months_between(timestamp1, timestamp2) - Returns number of months between timestamp1 and timestamp2.

Examples:> SELECT months_between('1997-02-28 10:30:00', '1996-10-30');  3.94959677
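The fractional result follows from a rule that, as far as I understand the Spark documentation, works like this: if both timestamps fall on the same day of the month, or both are the last day of their month, the result is a whole number; otherwise the remainder is computed on a 31-day month and rounded to 8 digits. The plain-Python sketch below implements that assumed rule under the hypothetical name months_between_sketch. It is not Spark's implementation.

```python
from calendar import monthrange
from datetime import datetime


def months_between_sketch(t1: datetime, t2: datetime) -> float:
    """Months from t2 to t1, using the assumed 31-day-month rule."""
    whole = (t1.year - t2.year) * 12 + (t1.month - t2.month)
    last1 = t1.day == monthrange(t1.year, t1.month)[1]
    last2 = t2.day == monthrange(t2.year, t2.month)[1]
    if t1.day == t2.day or (last1 and last2):
        return float(whole)  # time of day is ignored in this case
    secs1 = t1.hour * 3600 + t1.minute * 60 + t1.second
    secs2 = t2.hour * 3600 + t2.minute * 60 + t2.second
    days = (t1.day - t2.day) + (secs1 - secs2) / 86400
    return round(whole + days / 31, 8)


# 4 whole months, minus (2 days - 10.5 hours) / 31
print(months_between_sketch(datetime(1997, 2, 28, 10, 30), datetime(1996, 10, 30)))
# 3.94959677
```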

2. add_months: returns the date n months after the given date
SELECT add_months('2016-08-31', 1);  2016-09-30
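The example lands on 2016-09-30 because September has no 31st, so the day is clamped to the end of the target month. A minimal plain-Python sketch of that clamping follows. The helper name add_months_sketch is hypothetical, and it only models the clamping step; other details of Spark's behaviour may differ between versions.

```python
from calendar import monthrange
from datetime import date


def add_months_sketch(d: date, n: int) -> date:
    """Shift a date by n months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + n   # months since year 0
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, monthrange(year, month)[1])
    return date(year, month, day)


print(add_months_sketch(date(2016, 8, 31), 1))    # 2016-09-30
print(add_months_sketch(date(2016, 1, 31), -2))   # 2015-11-30
```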

3. last_day(date), next_day(start_date, day_of_week)
SELECT last_day('2009-01-12');  2009-01-31
SELECT next_day('2015-01-14', 'TU');  2015-01-20

4. date_add, date_sub
date_add(start_date, num_days) - Returns the date that is num_days after start_date.
SELECT date_add('2016-07-30', 1);  2016-07-31
select date_sub(now(), -3);  2021-03-21  (run on 2021-03-18; subtracting a negative number of days moves the date forward)
 
5. datediff (number of days between two dates)
datediff(endDate, startDate) - Returns the number of days from startDate to endDate.
SELECT datediff('2009-07-31', '2009-07-30'); 1

6. Working with UTC time
to_utc_timestamp
to_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.
SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul');  2016-08-30 15:00:00

from_utc_timestamp
from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.
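The two functions are inverses of each other, which is easy to mix up. The plain-Python sketch below mirrors both directions with fixed UTC offsets instead of named zones. The helper names to_utc and from_utc are hypothetical, and using +9 for Asia/Seoul is an assumption that holds for the 2016 date in the example.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local: datetime, offset_hours: int) -> datetime:
    """Read a naive timestamp as local time at the offset, render it in UTC."""
    zone = timezone(timedelta(hours=offset_hours))
    return local.replace(tzinfo=zone).astimezone(timezone.utc).replace(tzinfo=None)


def from_utc(utc: datetime, offset_hours: int) -> datetime:
    """Read a naive timestamp as UTC, render it at the given offset."""
    zone = timezone(timedelta(hours=offset_hours))
    return utc.replace(tzinfo=timezone.utc).astimezone(zone).replace(tzinfo=None)


ts = datetime(2017, 7, 14, 2, 40)
print(to_utc(ts, 1))                     # 2017-07-14 01:40:00
print(from_utc(ts, 1))                   # 2017-07-14 03:40:00
print(to_utc(datetime(2016, 8, 31), 9))  # 2016-08-30 15:00:00
```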

The cast function

select cast('111' as int);  111
select cast(now() as String);  2021-03-18 14:59:17.489
select cast('2021-03-01 00:15:00.000' as Date);  2021-03-01
select cast('2021-03-01 00:15:00.000' as Timestamp);  2021-03-01 00:15:00.0

String functions

String concatenation: concat

SELECT concat('啊啊啊', 'sss','333');   啊啊啊sss333

String concatenation with a separator: concat_ws

SELECT concat_ws('++', 'kk', 'sdf第三方');    kk++sdf第三方

Extracting substrings: substr / substring

SELECT substr('Spark SQL', 5);      k SQL
SELECT substr('Spark SQL', -3);     SQL
SELECT substr('Spark SQL', 5, 1);    k
SELECT substring('Spark SQL', 5);    k SQL
SELECT substring('Spark SQL', -3);    SQL
SELECT substring('Spark SQL', 5, 1);   k
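The examples above rely on two conventions: positions are 1-based, and a negative position counts from the end of the string. A minimal plain-Python sketch of those two conventions follows. The helper name substr_sketch is hypothetical, and it covers only the cases shown here, not every edge case.

```python
from typing import Optional


def substr_sketch(s: str, pos: int, length: Optional[int] = None) -> str:
    """1-based substring; a negative pos counts from the end of s."""
    if pos > 0:
        start = pos - 1
    elif pos < 0:
        start = max(len(s) + pos, 0)
    else:
        start = 0
    end = len(s) if length is None else start + length
    return s[start:end]


print(substr_sketch("Spark SQL", 5))      # k SQL
print(substr_sketch("Spark SQL", -3))     # SQL
print(substr_sketch("Spark SQL", 5, 1))   # k
```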

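Extracting a substring by delimiter: substring_index

substring_index(str, delim, count) - Returns the substring of str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The function performs a case-sensitive match when searching for delim. Since: 1.5.0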

spark-sql> SELECT substring_index('www.apache.org', '.', 2);
www.apache
Time taken: 0.04 seconds, Fetched 1 row(s)
spark-sql> SELECT substring_index('www.apache.org', '.', 1);
www
Time taken: 0.042 seconds, Fetched 1 row(s)
spark-sql> SELECT substring_index('www.apache.org', '.', 3);
www.apache.org
Time taken: 0.037 seconds, Fetched 1 row(s)
spark-sql> SELECT substring_index('www.apache.org', '.', -1);
org
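The four results above can be reproduced with a split-and-rejoin, which makes the role of count easier to see. This is a plain-Python sketch under the hypothetical name substring_index_sketch. It assumes a non-empty delimiter and returns an empty string when count is 0.

```python
def substring_index_sketch(s: str, delim: str, count: int) -> str:
    """Keep the first count pieces (count > 0) or the last -count pieces."""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    if count < 0:
        return delim.join(parts[count:])
    return ""


for n in (2, 1, 3, -1):
    print(n, substring_index_sketch("www.apache.org", ".", n))
# 2 www.apache
# 1 www
# 3 www.apache.org
# -1 org
```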

Finding the position of a substring: instr / locate

instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str. Since: 1.5.0

SELECT instr('SparkSQL', 'SQL');
6
SELECT locate('bar', 'foobarbar');
 4
SELECT locate('bar', 'foobarbar', 5);
 7
SELECT POSITION('bar' IN 'foobarbar');
 4
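Note that instr and locate take their arguments in opposite order, and that both return 1-based positions. The plain-Python sketch below mirrors this with str.find, which is 0-based. The helper names instr_sketch and locate_sketch are hypothetical.

```python
def instr_sketch(s: str, sub: str) -> int:
    """1-based index of the first occurrence of sub in s, 0 if absent."""
    return s.find(sub) + 1


def locate_sketch(sub: str, s: str, pos: int = 1) -> int:
    """Like instr_sketch, but starts searching at 1-based position pos."""
    return s.find(sub, pos - 1) + 1


print(instr_sketch("SparkSQL", "SQL"))        # 6
print(locate_sketch("bar", "foobarbar"))      # 4
print(locate_sketch("bar", "foobarbar", 5))   # 7
```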

Repeating a string: repeat

repeat(str, n) - Returns the string that repeats the given string value n times. Since: 1.5.0

spark-sql> SELECT repeat('123', 2);
123123
Time taken: 0.052 seconds, Fetched 1 row(s)
spark-sql> SELECT repeat('123', 1);
123
Time taken: 0.034 seconds, Fetched 1 row(s)
spark-sql> SELECT repeat('123', 3);
123123123 

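String length: length / CHAR_LENGTH / CHARACTER_LENGTH

length(expr) - Returns the character length of string data or the number of bytes of binary data. The length of string data includes trailing spaces. The length of binary data includes binary zeros. Since: 1.5.0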

SELECT length('Spark SQL ');  10
SELECT CHAR_LENGTH('Spark SQL ');  10
SELECT CHARACTER_LENGTH('Spark SQL ');  10

Replacement functions

regexp_replace(str, 'abc', 'b')

Replaces every substring of str that matches the regular expression abc with b.

select regexp_replace('Spark SQL',' ','-')
Result: Spark-SQL
select regexp_replace(date_sub(now(),90),'-','')
Result: 20201218

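translate(str, 'abc', 'b')

Replaces each character of str that appears in the second argument with the character at the same position in the third argument; characters with no counterpart are deleted. Here every a in str becomes b, and every b and c is removed.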

select translate("2.10.100.103.3","2.10","")
Result: 33
select translate("2.10.100.103.2","1","2")
Result: 2.20.200.203.2
select translate("2.10.100.103.2","10","4")
Result: 2.4.4.43.2
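The character-by-character mapping is easier to follow in code. The plain-Python sketch below reproduces the three results above under the hypothetical name translate_sketch. Letting the first occurrence win when a character is repeated in the second argument is an assumption of this sketch.

```python
def translate_sketch(s: str, from_chars: str, to_chars: str) -> str:
    """Map from_chars[i] to to_chars[i]; delete chars with no counterpart."""
    table = {}
    for i, ch in enumerate(from_chars):
        if ord(ch) in table:
            continue  # assumption: the first mapping for a character wins
        # None tells str.translate to delete the character
        table[ord(ch)] = to_chars[i] if i < len(to_chars) else None
    return s.translate(table)


print(translate_sketch("2.10.100.103.3", "2.10", ""))   # 33
print(translate_sketch("2.10.100.103.2", "1", "2"))     # 2.20.200.203.2
print(translate_sketch("2.10.100.103.2", "10", "4"))    # 2.4.4.43.2
```

Unlike regexp_replace, nothing here is a pattern: "10" means the two characters 1 and 0, not the substring 10.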

Splitting function: split

split("1.10.200.36","\\.")[2]
Result: 200
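The delimiter passed to split is a regular expression, which is why the dot is escaped. The plain-Python sketch below uses re.split to show what goes wrong without the escape. It illustrates regular-expression behaviour in Python and is not a claim about the exact array Spark would return in the unescaped case.

```python
import re

ip = "1.10.200.36"

# Escaped dot: split on literal "." characters.
escaped = re.split(r"\.", ip)
print(escaped)        # ['1', '10', '200', '36']
print(escaped[2])     # 200 (array indexes are 0-based)

# Unescaped dot: "." matches every character, so only empty pieces remain.
unescaped = re.split(".", ip)
print(unescaped)
```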