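I. Problem scenario
1. When Java's SimpleDateFormat converts a string to a date and the input contains more digits than the pattern specifies, the resulting time differs from the one written in the string. The problem looks like this:
The parsed time no longer matches the actual value.
2. Negative values are accepted.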
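II. Source code analysis
1. Looking at the calling code, the process has two parts: setting the format pattern (a date pattern, not a regular expression), and parsing the time string with parse.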
SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
Date tmpDate = sdf.parse("2023052018303000");
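The two calls above can be wrapped into a small runnable check. This is only a sketch: the class name OverflowDemo and the helper parseAndShow are made up for illustration, and the time zone and locale are pinned to UTC and Locale.ROOT purely so that the printed values do not depend on the machine.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class OverflowDemo {
    // Hypothetical helper: parses text with the given pattern and
    // re-formats the result in a readable form.
    static String parseAndShow(String pattern, String text) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat in = new SimpleDateFormat(pattern, Locale.ROOT);
        in.setTimeZone(utc);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        out.setTimeZone(utc);
        try {
            Date parsed = in.parse(text);
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("unparseable: " + text, e);
        }
    }

    public static void main(String[] args) {
        // 14 digits, exactly what the pattern describes.
        System.out.println(parseAndShow("yyyyMMddHHmmss", "20230520183030"));   // 2023-05-20 18:30:30
        // 16 digits: the trailing "3000" is read as 3000 seconds and carried over.
        System.out.println(parseAndShow("yyyyMMddHHmmss", "2023052018303000")); // 2023-05-20 19:20:00
    }
}
```

No exception is raised for the 16-digit input; the extra digits silently shift the result by 50 minutes.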
2. When a SimpleDateFormat is created through its constructor, the following methods are called:
/**
 * Constructs a SimpleDateFormat using the given pattern and
 * the default date format symbols for the default
 * {@link java.util.Locale.Category#FORMAT FORMAT} locale.
 * Note: This constructor may not support all locales.
 * For full coverage, use the factory methods in the {@link DateFormat}
 * class.
 * This is equivalent to calling
 * {@link #SimpleDateFormat(String, Locale)
 *     SimpleDateFormat(pattern, Locale.getDefault(Locale.Category.FORMAT))}.
 *
 * @see java.util.Locale#getDefault(java.util.Locale.Category)
 * @see java.util.Locale.Category#FORMAT
 * @param pattern the pattern describing the date and time format
 * @exception NullPointerException if the given pattern is null
 * @exception IllegalArgumentException if the given pattern is invalid
 */
public SimpleDateFormat(String pattern)
{
    this(pattern, Locale.getDefault(Locale.Category.FORMAT));
}
The two-argument constructor SimpleDateFormat(String, Locale) that it delegates to ends by calling initialize(locale):
/* Initialize compiledPattern and numberFormat fields */
private void initialize(Locale loc) {
    // Verify and compile the given pattern.
    compiledPattern = compile(pattern);

    /* try the cache first */
    numberFormat = cachedNumberFormatData.get(loc);
    if (numberFormat == null) { /* cache miss */
        numberFormat = NumberFormat.getIntegerInstance(loc);
        numberFormat.setGroupingUsed(false);

        /* update cache */
        cachedNumberFormatData.putIfAbsent(loc, numberFormat);
    }
    numberFormat = (NumberFormat) numberFormat.clone();

    initializeDefaultCentury();
}
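3. The compiled result is stored in the field [transient private char[] compiledPattern;]. A Java char occupies 2 bytes, so each element of this array can hold 16 bits of information.
The storage format deserves a closer look:
Low 8 bits: for an ordinary pattern letter such as y/Y, the number of times the letter is repeated;
for a special (literal) character, the character itself.
High 8 bits: for an ordinary pattern letter, the letter's index (stored as a binary tag) in
static final String patternChars = "GyMdkHmsSEDFwWahKzZYuXL";
In other words, the high byte mainly records the index into that string.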
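The layout described above can be sketched in a few lines. This is not the JDK's compile method: the class CompiledPatternSketch and its helpers are hypothetical, the sketch only handles patterns made of letters (no quoted literals), and it ignores the special count value 255 that the JDK uses as an escape for longer runs.

```java
class CompiledPatternSketch {
    // Copied from the text above.
    static final String PATTERN_CHARS = "GyMdkHmsSEDFwWahKzZYuXL";

    // Packs one run of a pattern letter: index in the high 8 bits,
    // repeat count in the low 8 bits.
    static char encode(char letter, int count) {
        int tag = PATTERN_CHARS.indexOf(letter);
        if (tag < 0 || count < 1 || count > 254) {
            throw new IllegalArgumentException("unsupported run: " + letter + " x " + count);
        }
        return (char) ((tag << 8) | count);
    }

    // Same decoding steps that the parse loop applies to each entry.
    static int tagOf(char entry) {
        return entry >>> 8;
    }

    static int countOf(char entry) {
        return entry & 0xff;
    }

    // Encodes a letters-only pattern, one char per run of identical letters.
    static char[] compileLettersOnly(String pattern) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < pattern.length()) {
            char letter = pattern.charAt(i);
            int j = i;
            while (j < pattern.length() && pattern.charAt(j) == letter) {
                j++;
            }
            out.append(encode(letter, j - i));
            i = j;
        }
        return out.toString().toCharArray();
    }

    public static void main(String[] args) {
        for (char entry : compileLettersOnly("yyyyMMddHHmmss")) {
            System.out.printf("0x%04x tag=%d count=%d%n",
                    (int) entry, tagOf(entry), countOf(entry));
        }
    }
}
```

For yyyyMMddHHmmss this yields six entries; the first is 0x0104 (index 1 for y, count 4) and the last is 0x0702 (index 7 for s, count 2).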
4. The parse call ends up in SimpleDateFormat's override of the parse function declared in its parent class DateFormat:
public Date parse(String text, ParsePosition pos)
{
    checkNegativeNumberExpression();

    int start = pos.index;
    int oldStart = start;
    int textLength = text.length();

    boolean[] ambiguousYear = {false};

    CalendarBuilder calb = new CalendarBuilder();

    for (int i = 0; i < compiledPattern.length; ) {
        int tag = compiledPattern[i] >>> 8;
        int count = compiledPattern[i++] & 0xff;
        if (count == 255) {
            count = compiledPattern[i++] << 16;
            count |= compiledPattern[i++];
        }

        switch (tag) {
        case TAG_QUOTE_ASCII_CHAR:
            if (start >= textLength || text.charAt(start) != (char)count) {
                pos.index = oldStart;
                pos.errorIndex = start;
                return null;
            }
            start++;
            break;

        case TAG_QUOTE_CHARS:
            while (count-- > 0) {
                if (start >= textLength || text.charAt(start) != compiledPattern[i++]) {
                    pos.index = oldStart;
                    pos.errorIndex = start;
                    return null;
                }
                start++;
            }
            break;

        default:
            // Peek the next pattern to determine if we need to
            // obey the number of pattern letters for
            // parsing. It's required when parsing contiguous
            // digit text (e.g., "20010704") with a pattern which
            // has no delimiters between fields, like "yyyyMMdd".
            boolean obeyCount = false;

            // In Arabic, a minus sign for a negative number is put after
            // the number. Even in another locale, a minus sign can be
            // put after a number using DateFormat.setNumberFormat().
            // If both the minus sign and the field-delimiter are '-',
            // subParse() needs to determine whether a '-' after a number
            // in the given text is a delimiter or is a minus sign for the
            // preceding number. We give subParse() a clue based on the
            // information in compiledPattern.
            boolean useFollowingMinusSignAsDelimiter = false;

            if (i < compiledPattern.length) {
                int nextTag = compiledPattern[i] >>> 8;
                if (!(nextTag == TAG_QUOTE_ASCII_CHAR ||
                      nextTag == TAG_QUOTE_CHARS)) {
                    obeyCount = true;
                }

                if (hasFollowingMinusSign &&
                    (nextTag == TAG_QUOTE_ASCII_CHAR ||
                     nextTag == TAG_QUOTE_CHARS)) {
                    int c;
                    if (nextTag == TAG_QUOTE_ASCII_CHAR) {
                        c = compiledPattern[i] & 0xff;
                    } else {
                        c = compiledPattern[i+1];
                    }

                    if (c == minusSign) {
                        useFollowingMinusSignAsDelimiter = true;
                    }
                }
            }
            start = subParse(text, start, tag, count, obeyCount,
                             ambiguousYear, pos,
                             useFollowingMinusSignAsDelimiter, calb);
            if (start < 0) {
                pos.index = oldStart;
                return null;
            }
        }
    }

    // At this point the fields of Calendar have been set. Calendar
    // will fill in default values for missing fields when the time
    // is computed.

    pos.index = start;

    Date parsedDate;
    try {
        parsedDate = calb.establish(calendar).getTime();
        // If the year value is ambiguous,
        // then the two-digit year == the default start year
        if (ambiguousYear[0]) {
            if (parsedDate.before(defaultCenturyStart)) {
                parsedDate = calb.addYear(100).establish(calendar).getTime();
            }
        }
    }
    // An IllegalArgumentException will be thrown by Calendar.getTime()
    // if any fields are out of range, e.g., MONTH == 17.
    catch (IllegalArgumentException e) {
        pos.errorIndex = start;
        pos.index = oldStart;
        return null;
    }

    return parsedDate;
}
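This method parses the string by walking the compiledPattern built during initialization. For each compiledPattern[i] it inspects the high 8 bits and the low 8 bits; if the entry is a delimiter (a quoted literal), a dedicated branch handles it. Our pattern contains none, so every entry goes through the default branch.
5. An important variable here is boolean obeyCount = false;. It indicates whether a field must be read using exactly the number of letters given in the pattern. It is true when no delimiter separates the field from the next one and the field is not the last one in the pattern; in all other cases it is false.
For example:
This variable matters a great deal when parsing a pattern such as yyyyMMddHHmmss. Take 202305261830300: yyyy reads exactly 4 digits, MM exactly 2, and so on, until the final field ss, which is the last one and therefore receives all three remaining digits (300). DateFormat parsing is lenient by default and allows a field value to overflow; the calendar behind calb handles this by carrying the excess into the next larger field. The final conversion therefore recomputes the timestamp and turns it into a date that differs from the input: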
parsedDate = calb.establish(calendar).getTime();
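The field splitting and the carry can be imitated outside the JDK. In the sketch below, splitFields is a hypothetical stand-in for what the obeyCount rule does to the pattern yyyyMMddHHmmss (it is not the JDK's subParse), and establish uses a plain lenient GregorianCalendar in place of CalendarBuilder. The last helper shows one possible mitigation under the assumption that rejecting such input is acceptable: with setLenient(false) the out-of-range seconds value makes parse fail instead of rolling over.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class LenientRolloverSketch {
    // Hypothetical model of obeyCount for "yyyyMMddHHmmss": every field except
    // the last reads exactly its width; the last one takes all remaining digits.
    static int[] splitFields(String text) {
        int[] widths = {4, 2, 2, 2, 2, 2};
        int[] values = new int[widths.length];
        int start = 0;
        for (int i = 0; i < widths.length; i++) {
            boolean obeyCount = i < widths.length - 1;
            int end = obeyCount ? start + widths[i] : text.length();
            values[i] = Integer.parseInt(text.substring(start, end));
            start = end;
        }
        return values;
    }

    // Feeds the fields to a lenient calendar (the default), which carries overflow upward.
    static String establish(int[] f) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        Calendar cal = new GregorianCalendar(utc, Locale.ROOT);
        cal.clear();
        cal.set(f[0], f[1] - 1, f[2], f[3], f[4], f[5]); // Calendar months are 0-based
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        out.setTimeZone(utc);
        return out.format(cal.getTime());
    }

    // Returns whether a non-lenient SimpleDateFormat accepts the text.
    static boolean strictParseSucceeds(String text) {
        SimpleDateFormat strict = new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT);
        strict.setLenient(false);
        try {
            strict.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] fields = splitFields("202305261830300");
        System.out.println(fields[5]);                                  // 300
        System.out.println(establish(fields));                          // 2023-05-26 18:35:00
        System.out.println(strictParseSucceeds("202305261830300"));     // false
    }
}
```

Here 300 seconds become 5 extra minutes, so 18:30 turns into 18:35:00, which matches the drift described above.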
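III. Further study
1. SimpleDateFormat misbehaves when a single instance is used from multiple threads
1) format can return an incorrectly formatted time under multithreading
2) parse can throw exceptions under multithreading
Cause: calendar is a field of the formatter (inherited from DateFormat), so every thread that uses the same SimpleDateFormat instance shares it. Concurrent calls such as calendar.setTime(date) overwrite each other's state and scramble the time. Avoid sharing one instance through a static field, or switch to a thread-safe date API.
Solutions:
1. Use ThreadLocal so that each thread works with its own SimpleDateFormat instance instead of a shared one.
2. Use DateTimeFormatter instead; it is immutable and thread-safe.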
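Both options can be sketched as follows. The class ThreadSafeFormatting and its method names are illustrative assumptions; only ThreadLocal, SimpleDateFormat, DateTimeFormatter and LocalDateTime are real JDK types.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Date;

class ThreadSafeFormatting {
    // Option 1: one SimpleDateFormat per thread, created lazily on first use.
    private static final ThreadLocal<SimpleDateFormat> PER_THREAD =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyyMMddHHmmss"));

    static String formatLegacy(Date date) {
        return PER_THREAD.get().format(date);
    }

    // Option 2: a single DateTimeFormatter can be shared freely between threads.
    private static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    static LocalDateTime parseModern(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    // Returns false instead of throwing when the text does not fit the pattern.
    static boolean isParsable(String text) {
        try {
            parseModern(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(formatLegacy(new Date()));
        System.out.println(parseModern("20230520183030"));   // 2023-05-20T18:30:30
        System.out.println(isParsable("2023052018303000"));  // false
    }
}
```

A side effect of the second option is that the over-long input from section I is rejected with a DateTimeParseException rather than being silently shifted.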