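I. Getting the time in Spark with Java classes
In Spark, date and time handling relies on java.util.{Calendar, Date}, with java.text.SimpleDateFormat controlling the output format.
You can try the examples in the Spark shell: spark-shell
First, import the classes: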
import java.text.SimpleDateFormat
import java.util.{Calendar, Date}
Get the current time:
def getNowTime(): String = {
  // Create a Date for the current instant
  val now = new Date()
  // Define the output format
  val format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
  // Apply the format to the date
  format.format(now)
}
Calling this function returns, for example:
2021-07-06 17:44:48
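SimpleDateFormat is documented as not thread-safe, which can matter when a formatter is shared across tasks. As a hedged alternative (not what the post itself uses), the same formatting can be sketched with java.time; the class name NowTimeSketch and its method names are made up for this illustration.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Spark or of the original post.
final class NowTimeSketch {
    // DateTimeFormatter is immutable and thread-safe, so one shared instance is enough.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Formats an instant in an explicit zone so the result does not depend on the machine.
    static String format(Instant instant, ZoneId zone) {
        return FORMAT.format(instant.atZone(zone));
    }

    // Formats the current instant in the JVM default time zone.
    static String nowTime() {
        return format(Instant.now(), ZoneId.systemDefault());
    }

    public static void main(String[] args) {
        System.out.println(nowTime());
    }
}
```

Passing the zone explicitly keeps the output reproducible across driver and executor machines that may be configured differently.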
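To get a date other than today, or to read the year, month, day, or hour individually, use the Calendar class:
val cal = Calendar.getInstance // Create a Calendar set to the current time
Getting yesterday's time, method 1:
// Add -1 to the Calendar.DATE field to move back to yesterday
// An amount of 1 would instead move forward one day, to tomorrow
cal.add(Calendar.DATE, -1)
val time1: Date = cal.getTime // Read the shifted time as a Date
val newtime: String = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(time1) // Format it as a string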
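Getting yesterday's time, method 2:
Take the current time in milliseconds and subtract 24*60*60*1000 (one day). This subtracts exactly 24 hours of elapsed time, which is not always one calendar day in time zones that observe daylight saving time.
val yesterday = new Date(new Date().getTime - 24L * 60 * 60 * 1000)
val matter = new SimpleDateFormat("yyyy-MM-dd")
val time = matter.format(yesterday)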
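The two methods usually agree, but they can differ around a daylight saving transition. The following is a minimal sketch, assuming the America/New_York zone and a fixed sample time of 2021-03-15 00:30 (the day after US daylight saving time began); the class and method names are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical comparison of "Calendar.add" versus "subtract 24 hours in milliseconds".
final class YesterdaySketch {
    static final TimeZone NEW_YORK = TimeZone.getTimeZone("America/New_York");

    static String format(Date date) {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");
        f.setTimeZone(NEW_YORK);
        return f.format(date);
    }

    // 2021-03-15 00:30 local time; the previous local day was only 23 hours long.
    static Calendar sampleTime() {
        Calendar cal = Calendar.getInstance(NEW_YORK);
        cal.clear();
        cal.set(2021, Calendar.MARCH, 15, 0, 30, 0);
        return cal;
    }

    // Method 1: move back one calendar day.
    static String viaCalendar() {
        Calendar cal = sampleTime();
        cal.add(Calendar.DATE, -1);
        return format(cal.getTime());
    }

    // Method 2: move back exactly 24 hours of elapsed time.
    static String viaMillis() {
        long millis = sampleTime().getTimeInMillis();
        return format(new Date(millis - 24L * 60 * 60 * 1000));
    }

    public static void main(String[] args) {
        System.out.println(viaCalendar()); // 2021-03-14
        System.out.println(viaMillis());   // 2021-03-13
    }
}
```

In zones without daylight saving time (for example Asia/Shanghai today) the two methods give the same date, so the difference only shows up when the job runs in, or is configured for, a zone with transitions.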
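To read the year, month, day, hour, and other fields (cal.add mutates the Calendar, so call Calendar.getInstance again if you want the fields of the current time):
val week = cal.get(Calendar.DAY_OF_WEEK) // 1 = Sunday, 2 = Monday, ..., 7 = Saturday
println("Day of week: " + week)
val year = cal.get(Calendar.YEAR)
println("Year: " + year)
val month = cal.get(Calendar.MONTH)
println("Month: " + (month + 1)) // Calendar.MONTH is zero-based (0-11), so add 1
val day = cal.get(Calendar.DAY_OF_MONTH)
println("Day: " + day)
val hour = cal.get(Calendar.HOUR_OF_DAY)
println("Hour: " + hour)
val minute = cal.get(Calendar.MINUTE)
println("Minute: " + minute)
val second = cal.get(Calendar.SECOND)
println("Second: " + second)
val millisecond = cal.get(Calendar.MILLISECOND)
println("Millisecond: " + millisecond)
Output:
2021-07-06 17:44:48
Day of week: 3
Year: 2021
Month: 7
Day: 6
Hour: 17
Minute: 44
Second: 48
Millisecond: 901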
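Two Calendar conventions are easy to get wrong: MONTH is zero-based and DAY_OF_WEEK counts from Sunday = 1. A small sketch with a fixed date (2021-07-06, in UTC so the result is reproducible) illustrates both; the helper names are made up for this example.

```java
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical helpers illustrating Calendar's field conventions.
final class CalendarFieldsSketch {
    // A fixed Calendar for 2021-07-06 17:44:48 UTC.
    static Calendar fixedDate() {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(2021, Calendar.JULY, 6, 17, 44, 48);
        return cal;
    }

    // Calendar.MONTH is 0-11, so add 1 for the usual 1-12 numbering.
    static int humanMonth(Calendar cal) {
        return cal.get(Calendar.MONTH) + 1;
    }

    // Converts Sunday=1..Saturday=7 into Monday=1..Sunday=7.
    static int mondayBasedDayOfWeek(Calendar cal) {
        int dow = cal.get(Calendar.DAY_OF_WEEK);
        return dow == Calendar.SUNDAY ? 7 : dow - 1;
    }

    public static void main(String[] args) {
        Calendar cal = fixedDate();
        System.out.println(cal.get(Calendar.DAY_OF_WEEK)); // 3 (Tuesday)
        System.out.println(humanMonth(cal));               // 7
        System.out.println(mondayBasedDayOfWeek(cal));     // 2
    }
}
```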
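II. Getting the time in Spark SQL
Spark SQL handles dates and times mainly through its built-in date and time functions.
You can try the examples in the Spark SQL shell: spark-sql
2.1 Getting the current time
1. current_date returns the current date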
select current_date;
2021-07-06
2. current_timestamp and now() return the current timestamp
select current_timestamp;
select now();
2021-07-06 18:38:11.781
3. unix_timestamp() with no arguments returns the current Unix timestamp in seconds
select unix_timestamp();
1625569197
2.2 Extracting fields from a date or timestamp
1. year, month, day/dayofmonth, hour, minute, second (extract the year, month, day of month, hour, minute, and second)
Examples:
> SELECT day('2021-07-06 18:38:11.781');
6
2. dayofweek returns the day of the week (1 = Sunday, 2 = Monday, ..., 7 = Saturday); dayofyear returns the day of the year
Examples:
> SELECT dayofweek('2021-07-06'); //3
Since: 2.3.0
3.weekofyear
weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.
Examples:
> SELECT weekofyear('2021-07-06'); //27
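The rule quoted for weekofyear (weeks start on Monday, week 1 is the first week with more than 3 days) reads the same as ISO-8601 week numbering. As a sketch of that rule in plain Java, not Spark's own code, java.time's IsoFields.WEEK_OF_WEEK_BASED_YEAR can be used; the class name is hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

// Hypothetical helper showing ISO-8601 week numbering.
final class WeekOfYearSketch {
    static int isoWeek(String isoDate) {
        return LocalDate.parse(isoDate).get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    public static void main(String[] args) {
        System.out.println(isoWeek("2021-07-06")); // 27
        System.out.println(isoWeek("2021-01-04")); // 1 (first Monday of 2021)
        System.out.println(isoWeek("2021-01-01")); // 53 (still the last week of 2020)
    }
}
```

The last case is the one to watch: the first days of January can belong to the final week of the previous year, so grouping by year and week number together needs care.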
4. trunc truncates a date to the given unit; the finer fields are reset to 01
The second argument is one of ["year", "yyyy", "yy", "mon", "month", "mm"]
Examples:
> SELECT trunc('2021-07-06', 'MM');
2021-07-01
> SELECT trunc('2021-07-06', 'YEAR');
2021-01-01
> SELECT trunc('2021-07-06 18:38:11.781', 'mm');
2021-07-01
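For readers more familiar with Java, the effect of truncating to month or year can be sketched with java.time.LocalDate. This only mirrors the results shown in the examples and is not how Spark implements trunc; the class and method names are made up.

```java
import java.time.LocalDate;

// Hypothetical helpers mirroring trunc(date, 'MM') and trunc(date, 'YEAR').
final class TruncSketch {
    // Keep year and month, reset the day to 01.
    static LocalDate truncToMonth(LocalDate d) {
        return d.withDayOfMonth(1);
    }

    // Keep the year, reset month and day to 01-01.
    static LocalDate truncToYear(LocalDate d) {
        return d.withDayOfYear(1);
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.parse("2021-07-06");
        System.out.println(truncToMonth(d)); // 2021-07-01
        System.out.println(truncToYear(d));  // 2021-01-01
    }
}
```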
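5. date_trunc(fmt, ts) truncates a timestamp to the unit given by fmt, one of ["YEAR", "YYYY", "YY", "MON", "MONTH", "MM", "DAY", "DD", "HOUR", "MINUTE", "SECOND", "WEEK", "QUARTER"]. Note that the unit comes first, the reverse of trunc.
Examples:
> SELECT date_trunc('HOUR', '2021-07-06 18:38:11.781'); //2021-07-06 18:00:00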
Since: 2.3.0
6. date_format converts a date or timestamp to a string in the given format
Examples:
> SELECT date_format('2021-07-06', 'y'); 2021
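2.3 Date and time conversion
1. Converting a date-time string to a Unix timestamp (the results in this section come from a session whose time zone is UTC+8):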
SELECT unix_timestamp('2021-07-06 18:38:11.781');
SELECT unix_timestamp('2021-07-06 18:38:11.781','yyyy-MM-dd HH:mm:ss');
1625567891
2. from_unixtime converts a Unix timestamp (in seconds) to a date-time string; to_unix_timestamp converts a date-time string to a Unix timestamp
Examples:
> SELECT from_unixtime(0);
> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
1970-01-01 08:00:00
> SELECT to_unix_timestamp('2021-07-06', 'yyyy-MM-dd');
1625500800
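The outputs here (1970-01-01 08:00:00 for epoch 0, and 1625500800 for 2021-07-06) depend on the time zone in effect. A minimal Java sketch of the same conversions with an explicit zone makes that dependence visible; the class and method names are hypothetical, and this is plain java.time rather than Spark code.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helpers showing how epoch conversions depend on the time zone.
final class EpochSketch {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Epoch seconds -> wall-clock string in the given zone.
    static String fromEpochSeconds(long seconds, String zoneId) {
        return FORMAT.format(Instant.ofEpochSecond(seconds).atZone(ZoneId.of(zoneId)));
    }

    // Midnight of an ISO date (yyyy-MM-dd) in the given zone -> epoch seconds.
    static long toEpochSeconds(String isoDate, String zoneId) {
        return LocalDate.parse(isoDate).atStartOfDay(ZoneId.of(zoneId)).toEpochSecond();
    }

    public static void main(String[] args) {
        System.out.println(fromEpochSeconds(0, "Asia/Shanghai"));          // 1970-01-01 08:00:00
        System.out.println(fromEpochSeconds(0, "UTC"));                    // 1970-01-01 00:00:00
        System.out.println(toEpochSeconds("2021-07-06", "Asia/Shanghai")); // 1625500800
        System.out.println(toEpochSeconds("2021-07-06", "UTC"));           // 1625529600
    }
}
```

The same SQL statements run in a session with a different time zone would therefore print different values.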
3. to_date/date converts a string to a date; to_timestamp (Since: 2.2.0) converts a string to a timestamp
> SELECT to_date('2021-07-06 18:38:11.781');
2021-07-06
> SELECT to_date('2021-07-06 18:38:11.781', 'yyyy-MM-dd HH:mm:ss');
2021-07-06
> SELECT to_timestamp('2021-07-06 18:38:11.781');
2021-07-06 18:38:11.781
4. quarter returns the quarter of the year for a date (range 1 to 4)
Examples:
> SELECT quarter('2021-07-06'); //3
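2.4 Date and time arithmetic
1. months_between returns the number of months between two dates
months_between(timestamp1, timestamp2) - Returns number of months between timestamp1 and timestamp2. The result is negative when timestamp1 is earlier than timestamp2.
Examples:
> SELECT months_between('2021-06-07', '2021-10-01'); //-3.80645161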
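The fractional part comes from a 31-day month convention. The sketch below follows the rule as described in Spark's function documentation (whole months when both dates share a day of month or are both month ends, otherwise day difference divided by 31, rounded to 8 digits). It handles dates only and is an illustration under that assumption, not Spark's implementation; the names are hypothetical.

```java
import java.time.LocalDate;

// Hypothetical date-only re-creation of the months_between arithmetic.
final class MonthsBetweenSketch {
    static double monthsBetween(LocalDate d1, LocalDate d2) {
        int months = (d1.getYear() - d2.getYear()) * 12
                + (d1.getMonthValue() - d2.getMonthValue());
        boolean bothLastDay = d1.getDayOfMonth() == d1.lengthOfMonth()
                && d2.getDayOfMonth() == d2.lengthOfMonth();
        if (d1.getDayOfMonth() == d2.getDayOfMonth() || bothLastDay) {
            return months; // whole number of months
        }
        double raw = months + (d1.getDayOfMonth() - d2.getDayOfMonth()) / 31.0;
        return Math.round(raw * 1e8) / 1e8; // round to 8 decimal places
    }

    public static void main(String[] args) {
        LocalDate june = LocalDate.parse("2021-06-07");
        LocalDate october = LocalDate.parse("2021-10-01");
        // (6 - 10) + (7 - 1) / 31 = -4 + 0.19354839 = -3.80645161
        System.out.println(monthsBetween(june, october));
        System.out.println(monthsBetween(october, june)); // the same magnitude, positive
    }
}
```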
2. add_months returns the date n months after the given date
Examples:
> SELECT add_months('2021-07-06', 3); //2021-10-06
3.last_day(date),next_day(start_date, day_of_week)
Examples:
> SELECT last_day('2021-07-06'); //2021-07-31
> SELECT next_day('2021-07-06', 'TU'); //2021-07-13
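next_day returns the first date strictly after the start date that falls on the requested weekday, so a start date that is itself a Tuesday maps to the Tuesday one week later. The same behaviour can be sketched with java.time's TemporalAdjusters; the class and method names are made up, and this is an analogy rather than Spark's code.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helpers mirroring last_day and next_day.
final class MonthEdgeSketch {
    static LocalDate lastDay(LocalDate d) {
        return d.with(TemporalAdjusters.lastDayOfMonth());
    }

    // First date strictly after d that falls on the given weekday.
    static LocalDate nextDay(LocalDate d, DayOfWeek dayOfWeek) {
        return d.with(TemporalAdjusters.next(dayOfWeek));
    }

    public static void main(String[] args) {
        System.out.println(lastDay(LocalDate.parse("2021-07-06")));                    // 2021-07-31
        System.out.println(nextDay(LocalDate.parse("2021-07-06"), DayOfWeek.TUESDAY)); // 2021-07-13
        System.out.println(nextDay(LocalDate.parse("2021-06-07"), DayOfWeek.TUESDAY)); // 2021-06-08
    }
}
```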
4. date_add, date_sub (date_sub subtracts days)
date_add(start_date, num_days) - Returns the date that is num_days after start_date.
Examples:
> SELECT date_add('2021-07-06', 1); //2021-07-07
> SELECT date_sub('2021-07-06', 1); //2021-07-05
5. datediff (number of days between two dates)
datediff(endDate, startDate) - Returns the number of days from startDate to endDate.
Examples:
> SELECT datediff('2021-07-06', '2021-07-10'); -4
> SELECT datediff('2021-07-10', '2021-07-06'); 4
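The argument order is easy to reverse: the end date comes first, and the result is negative when it is earlier than the start date. A small Java sketch with ChronoUnit.DAYS shows the same sign convention; the class name and the mirrored argument order are choices made for this illustration.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper that keeps datediff's (endDate, startDate) argument order.
final class DateDiffSketch {
    static long dateDiff(String endDate, String startDate) {
        // ChronoUnit.between takes (start, end), the reverse of datediff's argument order.
        return ChronoUnit.DAYS.between(LocalDate.parse(startDate), LocalDate.parse(endDate));
    }

    public static void main(String[] args) {
        System.out.println(dateDiff("2021-07-06", "2021-07-10")); // -4
        System.out.println(dateDiff("2021-07-10", "2021-07-06")); // 4
    }
}
```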
6. UTC time: to_utc_timestamp
to_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.
Examples:
> SELECT to_utc_timestamp('2021-07-06', 'Asia/Seoul'); //2021-07-05 15:00:00
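> SELECT to_utc_timestamp('2021-07-06', 'Asia/Shanghai'); //2021-07-05 16:00:00
Note: use 'Asia/Shanghai' for China Standard Time. 'Asia/Beijing' is not a valid time zone ID; when this was run with 'Asia/Beijing' the input came back unchanged (2021-07-06 00:00:00), which is consistent with the unknown ID being treated as GMT.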
7.from_utc_timestamp
from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.
Examples:
> SELECT from_utc_timestamp('2021-07-06', 'Asia/Seoul'); //2021-07-06 09:00:00
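Both functions reinterpret a wall-clock value rather than change an instant. A sketch of the two directions in java.time, together with the legacy java.util.TimeZone behaviour of mapping unknown IDs to GMT, is shown here; the class and method names are hypothetical and this is not Spark's implementation.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.TimeZone;

// Hypothetical helpers mirroring to_utc_timestamp and from_utc_timestamp.
final class UtcConversionSketch {
    // Treat the value as local time in `zone` and return the UTC wall-clock value.
    static LocalDateTime toUtc(LocalDateTime local, String zone) {
        return local.atZone(ZoneId.of(zone))
                .withZoneSameInstant(ZoneOffset.UTC)
                .toLocalDateTime();
    }

    // Treat the value as UTC and return the wall-clock value in `zone`.
    static LocalDateTime fromUtc(LocalDateTime utc, String zone) {
        return utc.atZone(ZoneOffset.UTC)
                .withZoneSameInstant(ZoneId.of(zone))
                .toLocalDateTime();
    }

    // The legacy TimeZone API silently returns GMT for IDs it does not recognise.
    static String legacyZoneId(String zone) {
        return TimeZone.getTimeZone(zone).getID();
    }

    public static void main(String[] args) {
        LocalDateTime midnight = LocalDateTime.of(2021, 7, 6, 0, 0);
        System.out.println(toUtc(midnight, "Asia/Seoul"));    // 2021-07-05T15:00
        System.out.println(fromUtc(midnight, "Asia/Seoul"));  // 2021-07-06T09:00
        System.out.println(legacyZoneId("Asia/Beijing"));     // GMT
    }
}
```

Unlike the legacy API, ZoneId.of throws an exception for an unknown ID, which surfaces a mistyped zone name immediately instead of silently shifting nothing.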
III. Practical examples
3.1 Scala implementation:
import java.text.SimpleDateFormat
import java.util.{Calendar, TimeZone}
object Utils {
  /** Returns the dates of yesterday, the day before yesterday, and the same day
    * last week (7 days ago), each formatted as yyyyMMdd.
    */
  def get_related_date(): List[String] = {
    val format = new SimpleDateFormat("yyyyMMdd")
    val calendar = Calendar.getInstance()
    calendar.add(Calendar.DATE, -1)
    val yesterday = format.format(calendar.getTime)
    calendar.add(Calendar.DATE, -1)
    val day_before_yesterday = format.format(calendar.getTime)
    calendar.add(Calendar.DATE, -5) // 2 + 5 = 7 days before today
    val last_week_date = format.format(calendar.getTime)
    List(yesterday, day_before_yesterday, last_week_date)
  }
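  /** Returns the timestamp range (in milliseconds) covered by the input date.
    * @param date the date to analyse, formatted as yyyyMMdd
    * @return List(start, end): 00:00:00 and 23:59:59 of that day in the JVM default time zone
    */
  def get_timestamp_range(date: String): List[Long] = {
    val start = new SimpleDateFormat("yyyyMMdd").parse(date).getTime
    val end = start + 23*60*60*1000 + 59*60*1000 + 59*1000 // start + 23 h 59 min 59 s
    List(start, end)
  }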
  /** Returns the timestamp ranges for yesterday, the day before yesterday,
    * and the same day last week.
    */
  def get_related_date_timestamp_range(): List[List[Long]] = {
    get_related_date().map(get_timestamp_range)
  }
}
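The range computed by start + 23 h 59 min 59 s stops at 23:59:59.000, so millisecond timestamps in the last 999 ms of the day fall outside it, and the arithmetic assumes a 24-hour day. A hedged alternative sketch in Java derives the end from the start of the next day in an explicit zone; the class name, method name, and the choice of passing the zone as a parameter are assumptions made for this example.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical alternative to get_timestamp_range using java.time.
final class DayRangeSketch {
    // Returns {start, end} in epoch milliseconds, covering the whole day inclusively.
    static long[] timestampRange(String yyyymmdd, ZoneId zone) {
        // BASIC_ISO_DATE parses dates written as yyyyMMdd, e.g. 20210706.
        LocalDate day = LocalDate.parse(yyyymmdd, DateTimeFormatter.BASIC_ISO_DATE);
        long start = day.atStartOfDay(zone).toInstant().toEpochMilli();
        long nextStart = day.plusDays(1).atStartOfDay(zone).toInstant().toEpochMilli();
        return new long[] {start, nextStart - 1};
    }

    public static void main(String[] args) {
        long[] range = timestampRange("20210706", ZoneId.of("Asia/Shanghai"));
        System.out.println(range[0]); // 1625500800000
        System.out.println(range[1]); // 1625587199999
    }
}
```

A half-open range (start inclusive, next day's start exclusive) is another common choice and avoids the "minus one" entirely when the consumer supports it.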
3.2 Java implementation:
import java.util.Date;
import java.text.SimpleDateFormat;
import java.util.Calendar;
public class Time {
    public static void main(String[] args) {
        SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); // Set the date format
        System.out.println(df.format(new Date())); // new Date() holds the current system time
        Date now = new Date();
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss"); // The pattern is easy to change
        String time = dateFormat.format(now);
        System.out.println(time);
        long startTs = System.currentTimeMillis(); // Current timestamp in milliseconds
        System.out.println(startTs);
        Calendar c = Calendar.getInstance(); // Each time field can be read or modified individually
        int year = c.get(Calendar.YEAR);
        int month = c.get(Calendar.MONTH) + 1; // Calendar.MONTH is zero-based
        int date = c.get(Calendar.DATE);
        int hour = c.get(Calendar.HOUR_OF_DAY);
        int minute = c.get(Calendar.MINUTE);
        int second = c.get(Calendar.SECOND);
        System.out.println("Year: " + year + "\n" + "Month: " + month + "\n" + "Day: " + date + "\n" + "Hour: " + hour + "\n" + "Minute: " + minute + "\n" + "Second: " + second);
    }
}