In short: strip() is the "Unicode-aware" evolution of trim().
Problem
String::trim has existed from early days of Java when Unicode
had not fully evolved to the standard we widely use today.
The definition of space used by String::trim is any code point less
than or equal to the space code point (\u0020), commonly referred to
as ASCII or ISO control characters.
Unicode-aware trimming routines should use
Character::isWhitespace(int).
Additionally, developers have not been able to specifically remove
indentation white space or to specifically remove trailing white
space.
Solution
Introduce trimming methods that are Unicode white space aware
and provide additional control of leading only or trailing only.
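A minimal sketch of the difference described above, assuming Java 11 or later (where strip(), stripLeading() and stripTrailing() are available). The class name StripDemo and the sample strings are made up for illustration.

```java
class StripDemo {
    // EM SPACE (U+2003) is Unicode white space, but its code point is above U+0020.
    static final String EM_PADDED = "\u2003hello\u2003";
    // U+0001 is a control character: at or below U+0020, but not Unicode white space.
    static final String CTRL_PADDED = "\u0001hello\u0001";

    public static void main(String[] args) {
        // trim() leaves the EM SPACE characters in place; strip() removes them.
        System.out.println(EM_PADDED.trim().length());    // 7
        System.out.println(EM_PADDED.strip().length());   // 5

        // The reverse holds for the control character U+0001.
        System.out.println(CTRL_PADDED.trim().length());  // 5
        System.out.println(CTRL_PADDED.strip().length()); // 7

        // Leading-only and trailing-only control.
        System.out.println("[" + "  hi  ".stripLeading() + "]");  // [hi  ]
        System.out.println("[" + "  hi  ".stripTrailing() + "]"); // [  hi]
    }
}
```

For plain ASCII spaces, tabs and newlines the two methods give the same result; they only diverge on characters like the ones above.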
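A common characteristic of these new methods is that they use a different (newer) definition of "whitespace" than older methods such as String.trim(). Bug JDK-8200373 puts it this way: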
The current JavaDoc for String::trim does not make it clear which
definition of “space” is being used in the code. With additional
trimming methods coming in the near future that use a different
definition of space, clarification is imperative. String::trim uses
the definition of space as any codepoint that is less than or equal to
the space character codepoint (\u0020.) Newer trimming methods will
use the definition of (white) space as any codepoint that returns true
when passed to the Character::isWhitespace predicate.
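The two definitions of space quoted above can be written as two small predicates. This is only a sketch: the class SpaceDefinitions and the helpers isTrimSpace and isStripSpace are hypothetical names, not JDK API. Only Character.isWhitespace(int) is the real method.

```java
class SpaceDefinitions {
    // Definition used by String::trim: any code point <= U+0020.
    static boolean isTrimSpace(int codePoint) {
        return codePoint <= ' ';
    }

    // Definition used by the newer methods: Character::isWhitespace.
    static boolean isStripSpace(int codePoint) {
        return Character.isWhitespace(codePoint);
    }

    public static void main(String[] args) {
        // NUL, TAB, SPACE, NO-BREAK SPACE, EM SPACE
        int[] samples = {0x0000, 0x0009, 0x0020, 0x00A0, 0x2003};
        for (int cp : samples) {
            System.out.printf("U+%04X trim=%b strip=%b%n",
                    cp, isTrimSpace(cp), isStripSpace(cp));
        }
    }
}
```

Note that NO-BREAK SPACE (U+00A0) satisfies neither definition: it is above U+0020, and Character.isWhitespace excludes non-breaking spaces.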
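The method isWhitespace(char) was added to Character in JDK 1.1, but the method isWhitespace(int) was not introduced to the Character class until JDK 1.5. The latter method (the one that accepts an int parameter) was added to support supplementary characters. The Javadoc comments for the Character class define supplementary characters (typically modeled with an int-based "code point") versus BMP characters (typically modeled with a single char):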
The set of characters from U+0000 to U+FFFF is sometimes referred to
as the Basic Multilingual Plane (BMP). Characters whose code points
are greater than U+FFFF are called supplementary characters. The Java
platform uses the UTF-16 representation in char arrays and in the
String and StringBuffer classes. In this representation, supplementary
characters are represented as a pair of char values … A char value,
therefore, represents Basic Multilingual Plane (BMP) code points,
including the surrogate code points, or code units of the UTF-16
encoding. An int value represents all Unicode code points, including
supplementary code points. … The methods that only accept a char
value cannot support supplementary characters. … The methods that
accept an int value support all Unicode characters, including
supplementary characters.
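To illustrate the char versus int distinction from the Javadoc, here is a small sketch using U+10400 (DESERET CAPITAL LETTER LONG I), a letter outside the BMP. The class name SupplementaryDemo is made up. Character.isLetter stands in for isWhitespace here, because the point is the overload pair rather than white space itself.

```java
class SupplementaryDemo {
    // A supplementary code point (greater than U+FFFF).
    static final int DESERET_LONG_I = 0x10400;
    // In a String it is stored as a surrogate pair, i.e. two char values.
    static final String TEXT = new String(Character.toChars(DESERET_LONG_I));

    public static void main(String[] args) {
        System.out.println(TEXT.length());                          // 2 (UTF-16 code units)
        System.out.println(TEXT.codePointCount(0, TEXT.length()));  // 1 (code point)

        // The int overload sees the whole code point and recognizes a letter.
        System.out.println(Character.isLetter(TEXT.codePointAt(0))); // true

        // The char overload only sees the high surrogate, which is not a letter.
        System.out.println(Character.isLetter(TEXT.charAt(0)));      // false
    }
}
```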