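I recently had a task involving string parsing that used String.split(), so here are some notes on it.
Introduction:
Thinking in Java describes split() as a very useful regular-expression tool on the String class; its job is to "split the string at the places where the regular expression matches".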
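As a quick illustration of "splitting where the regular expression matches", here is a minimal sketch. The class name SplitIntroDemo, the helper splitFields, and the sample input are made up for this example and are not from the book.

```java
import java.util.Arrays;

class SplitIntroDemo {
    // Hypothetical helper: cuts the input at every run of commas,
    // semicolons, or whitespace characters.
    static String[] splitFields(String input) {
        return input.split("[,;\\s]+");
    }

    public static void main(String[] args) {
        // Three different separators, one regular expression.
        System.out.println(Arrays.toString(splitFields("a, b;c   d"))); // prints [a, b, c, d]
    }
}
```

Because the argument is a regular expression rather than a plain delimiter, a single call can handle several separator styles at once.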
Class:
java.lang.String (package java.lang)
Usage in detail:
There are two overloads:
1. String[] java.lang.String.split(String regex, int limit)
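param:
- String regex: the delimiter, given as a literal string or a regular expression
- int limit: an upper bound on the number of pieces; zero or a negative value means no bound
return:
- String[]: the pieces, as a String array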
Code:
import java.util.Arrays;

public class SplitTest {
    public static void main(String[] args) {
        String content = "a,b,c,d";
        // Split the same input with different limit values.
        String[] arr0 = content.split(",", 0);
        String[] arr1 = content.split(",", 1);
        String[] arr2 = content.split(",", 2);
        String[] arr4 = content.split(",", 4);
        String[] arr6 = content.split(",", 6);
        System.out.println(Arrays.toString(arr0));
        System.out.println(Arrays.toString(arr1));
        System.out.println(Arrays.toString(arr2));
        System.out.println(Arrays.toString(arr4));
        System.out.println(Arrays.toString(arr6));
    }
}
Result:
[a, b, c, d]
[a,b,c,d]
[a, b,c,d]
[a, b, c, d]
[a, b, c, d]
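Explanation:
The regex parameter needs little comment; here it is simply ",".
limit controls how many pieces the string is cut into at the regex matches. From the example above:
0: limit places no cap; the string is cut at every ",", and trailing empty strings are dropped. split(String regex) simply calls split(String regex, 0).
1: one piece, which is pointless because nothing is split.
2: two pieces, i.e. one split.
4: four pieces, i.e. three splits.
6: still four pieces; "," can cut this string into at most four pieces, so any limit greater than 4 gives the same result as 4 here.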
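The summary above holds for "a,b,c,d". One difference between limit 0 and a large positive limit only shows up when the input ends with delimiters, so here is a minimal sketch of that case. The class name SplitLimitDemo, the helper splitByComma, and the input "a,b,c,d,," are assumptions chosen for illustration.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Hypothetical helper: splits on "," with the given limit.
    static String[] splitByComma(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String content = "a,b,c,d,,"; // note the two trailing commas

        // limit 0: no cap, and trailing empty strings are discarded.
        String[] zero = splitByComma(content, 0);
        System.out.println(zero.length + " " + Arrays.toString(zero)); // prints 4 [a, b, c, d]

        // limit 2: one split; the rest of the input stays in the last element.
        String[] two = splitByComma(content, 2);
        System.out.println(two.length + " " + Arrays.toString(two)); // prints 2 [a, b,c,d,,]

        // limit 10: more than enough, and trailing empty strings are kept.
        String[] ten = splitByComma(content, 10);
        System.out.println(ten.length + " " + Arrays.toString(ten)); // prints 6 [a, b, c, d, , ]
    }
}
```

So a limit larger than the number of pieces behaves like "no cap", but unlike limit 0 it does not trim empty strings at the end.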
For reference, the JDK documentation for limit:
The limit parameter controls the number of times the pattern is
applied and therefore affects the length of the resulting array. If
the limit n is greater than zero then the pattern will be applied at
most n - 1 times, the array’s length will be no greater than n, and
the array’s last entry will contain all input beyond the last matched
delimiter. If n is non-positive then the pattern will be applied as
many times as possible and the array can have any length. If n is zero
then the pattern will be applied as many times as possible, the array
can have any length, and trailing empty strings will be discarded.
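The three cases in the documentation (positive, negative, zero) can be checked with the input "boo:and:foo", which the String.split Javadoc itself uses as its example. The sketch below is only an illustration; the class name SplitDocDemo and the helper splitSample are made up.

```java
import java.util.Arrays;

class SplitDocDemo {
    private static final String SAMPLE = "boo:and:foo";

    // Hypothetical helper: splits the fixed sample string.
    static String[] splitSample(String regex, int limit) {
        return SAMPLE.split(regex, limit);
    }

    public static void main(String[] args) {
        // Positive limit: the pattern is applied at most limit - 1 times.
        System.out.println(Arrays.toString(splitSample(":", 2)));  // 2 elements: "boo", "and:foo"

        // Negative limit: applied as often as possible, trailing empty strings kept.
        System.out.println(Arrays.toString(splitSample("o", -2))); // 5 elements: "b", "", ":and:f", "", ""

        // Zero: applied as often as possible, trailing empty strings discarded.
        System.out.println(Arrays.toString(splitSample("o", 0)));  // 3 elements: "b", "", ":and:f"
    }
}
```

Note that only trailing empty strings are dropped with limit 0; the empty string between the two leading "o" characters stays in the result.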
JDK source:
public String[] split(String regex, int limit) {
    /* fastpath if the regex is a
     (1)one-char String and this character is not one of the
        RegEx's meta characters ".$|()[{^?*+\\", or
     (2)two-char String and the first char is the backslash and
        the second is not the ascii digit or ascii letter.
     */
    char ch = 0;
    if (((regex.value.length == 1 &&
         ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
         (regex.length() == 2 &&
          regex.charAt(0) == '\\' &&
          (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
          ((ch-'a')|('z'-ch)) < 0 &&
          ((ch-'A')|('Z'-ch)) < 0)) &&
        (ch < Character.MIN_HIGH_SURROGATE ||
         ch > Character.MAX_LOW_SURROGATE))
    {
        int off = 0;
        int next = 0;
        boolean limited = limit > 0;
        ArrayList<String> list = new ArrayList<>();
        while ((next = indexOf(ch, off)) != -1) {
            if (!limited || list.size() < limit - 1) {
                list.add(substring(off, next));
                off = next + 1;
            } else {    // last one
                //assert (list.size() == limit - 1);
                list.add(substring(off, value.length));
                off = value.length;
                break;
            }
        }
        // If no match was found, return this
        if (off == 0)
            return new String[]{this};

        // Add remaining segment
        if (!limited || list.size() < limit)
            list.add(substring(off, value.length));

        // Construct result
        int resultSize = list.size();
        if (limit == 0) {
            while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                resultSize--;
            }
        }
        String[] result = new String[resultSize];
        return list.subList(0, resultSize).toArray(result);
    }
    return Pattern.compile(regex).split(this, limit);
}
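The fastpath comment in the source lists the regex meta characters. A practical consequence is that a one-character delimiter such as "." is not treated as a literal: it goes to Pattern and is interpreted as a regular expression. The following is a minimal sketch of that pitfall; the class name SplitMetaCharDemo and the helper splitOnLiteral are hypothetical, while Pattern.quote is the standard java.util.regex method.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitMetaCharDemo {
    // Hypothetical helper: treats the delimiter as literal text by quoting it.
    static String[] splitOnLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String version = "1.2.3";

        // "." matches every character, so all pieces are empty and,
        // with the implicit limit 0, they are all discarded.
        System.out.println(version.split(".").length); // prints 0

        // Escaping the dot makes it a literal delimiter.
        System.out.println(Arrays.toString(version.split("\\."))); // prints [1, 2, 3]

        // Quoting works for any delimiter without escaping by hand.
        System.out.println(Arrays.toString(splitOnLiteral(version, "."))); // prints [1, 2, 3]
    }
}
```

Quoting is the safer choice when the delimiter comes from user input or configuration, since its characters are not known in advance.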
2. String[] java.lang.String.split(String regex)
There is not much to add here; the JDK source says it all:
public String[] split(String regex) {
    return split(regex, 0);
}
Once the first overload is understood, this one follows directly.
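As a small check of that delegation, the sketch below compares the one-argument call with split(regex, 0) and with Pattern.compile(regex).split(input, 0), the call the source above falls back to when the fastpath does not apply. The class name SplitEquivalenceDemo, the helper sameAsLimitZero, and the sample inputs are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitEquivalenceDemo {
    // Hypothetical helper: true if all three ways of splitting agree.
    static boolean sameAsLimitZero(String input, String regex) {
        String[] oneArg = input.split(regex);
        String[] limitZero = input.split(regex, 0);
        String[] viaPattern = Pattern.compile(regex).split(input, 0);
        return Arrays.equals(oneArg, limitZero)
                && Arrays.equals(limitZero, viaPattern);
    }

    public static void main(String[] args) {
        // Single-character delimiter with trailing empty pieces.
        System.out.println(sameAsLimitZero("a,b,c,d,,", ","));     // prints true
        // Character-class regex.
        System.out.println(sameAsLimitZero("k1=v1;k2=v2", "[=;]")); // prints true
    }
}
```

When the same regex is used many times, keeping the compiled Pattern in a field and calling its split method avoids recompiling it on each call.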