A Better Understanding of String.split

iteye_21298

Tags: regular expressions, F#, C, C++, C#

Article link: https://blog.csdn.net/iteye_21298/article/details/81993503

Today I read through the implementation of String.split to get a better understanding of how it works and where its drawbacks lie.

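1. The argument to split is a regex, i.e. a regular expression.
With regular expressions, escaping special characters is unavoidable: characters such as + and | must be escaped before they can be used as literal delimiters.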
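To make the escaping point concrete, here is a minimal sketch. The class and method names (SplitEscapeDemo, splitOnPlus, splitLiteral) are made up for illustration; the only library calls used are String.split and Pattern.quote, the latter being one way to avoid escaping a delimiter by hand.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitEscapeDemo {
    // Splits on a literal '+' by escaping it manually in the regex.
    static String[] splitOnPlus(String s) {
        return s.split("\\+");
    }

    // Splits on an arbitrary literal delimiter; Pattern.quote does the escaping.
    static String[] splitLiteral(String s, String delimiter) {
        return s.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnPlus("1+2+3")));       // [1, 2, 3]
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|"))); // [a, b, c]
        // Unescaped, "|" is an empty alternation that matches the empty string
        // at every position, so the input is not split on the '|' characters.
        System.out.println(Arrays.toString("a|b|c".split("|")));
    }
}
```

Pattern.quote is probably the safer choice when the delimiter comes from configuration or user input, since its contents are then never interpreted as regex syntax.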

2. Performance
A regular expression has to be compiled, and that costs time.
String.split is implemented as follows:
public String[] split(String regex, int limit) {
    return Pattern.compile(regex).split(this, limit);
}
It calls the split method of the Pattern object returned by Pattern.compile(regex).
Pattern.compile(regex), in turn, simply does new Pattern(regex, 0):
private Pattern(String p, int f) {
    pattern = p;
    flags = f;

    // Reset group index count
    capturingGroupCount = 1;
    localCount = 0;

    if (pattern.length() > 0) {
        compile();
    } else {
        root = new Start(lastAccept);
        matchRoot = lastAccept;
    }
}

Looking further into the compile() method, its last statement is:
compiled = true;

Here compiled is a non-static (per-instance) field, declared as:
/**
 * Boolean indicating this Pattern is compiled; this is necessary in order
 * to lazily compile deserialized Patterns.
 */
private transient volatile boolean compiled = false;

Now look at the split method.
One of its steps obtains a Matcher object:
Matcher m = matcher(input);

public Matcher matcher(CharSequence input) {
    if (!compiled) {
        synchronized(this) {
            if (!compiled)
                compile();
        }
    }
    Matcher m = new Matcher(this, input);
    return m;
}

If compile() has not been run yet on this instance, it is run at this point.

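This is where the problem lies: String.split(regex) creates a new Pattern instance on every call, so compile() runs every time, and that is where the cost comes from. (Later JDK releases add a fast path in String.split for simple single-character delimiters that are not regex metacharacters, which bypasses Pattern altogether; the cost described here still applies to any other regex.)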

So, where possible, it is best to define a Pattern once and reuse it everywhere it is needed.
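A minimal sketch of that advice follows. The class name PrecompiledSplitDemo, the constant LIST_SEPARATOR, and the delimiter (a comma with optional surrounding whitespace) are assumptions chosen for illustration; Pattern.compile and Pattern.split(CharSequence) are the standard java.util.regex calls.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PrecompiledSplitDemo {
    // Compiled once when the class is initialized. Pattern instances are
    // immutable and safe to share between threads, so one constant suffices.
    private static final Pattern LIST_SEPARATOR = Pattern.compile("\\s*,\\s*");

    // Reuses the shared Pattern instead of recompiling the regex per call.
    static String[] splitList(String line) {
        return LIST_SEPARATOR.split(line);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitList("a , b,c"))); // [a, b, c]
    }
}
```

Whether this makes a measurable difference depends on how often the split runs and how complex the regex is; it tends to matter most in tight loops over many lines.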

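Next, a few words on Pattern's split(input, limit) method.
The limit parameter caps how many times the pattern is applied. If it is 0, the pattern is applied as many times as possible and trailing empty strings are discarded; if it is greater than 0, the pattern is applied at most limit - 1 times, so the resulting array has at most limit elements.
An example: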
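import java.util.Arrays;

public class SplitLimitDemo {
    public static void main(String[] args) {
        String s = "a,b,c:e,f,g";
        String[] split = s.split(",");
        System.out.println(Arrays.toString(split));

        // limit 1: the pattern is applied 0 times, so the array has length 1
        String[] split3 = s.split(",", 1);
        System.out.println(split3.length);
        System.out.println(Arrays.toString(split3));

        // limit 3: the pattern is applied at most twice, so the array has length 3
        String[] split4 = s.split(",", 3);
        System.out.println(split4.length);
        System.out.println(Arrays.toString(split4));
    }
}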
Output:
[a, b, c:e, f, g]
1
[a,b,c:e,f,g]
3
[a, b, c:e,f,g]
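The effect of limit is easiest to see on an input with trailing empty fields. The sketch below is an illustration under assumed names (SplitLimitEdgeCases, splitCsv) and also covers a negative limit, which the example above does not: in that case the pattern is applied as many times as possible and trailing empty strings are kept.

```java
public class SplitLimitEdgeCases {
    // Thin wrapper around String.split(regex, limit) with a comma delimiter.
    static String[] splitCsv(String s, int limit) {
        return s.split(",", limit);
    }

    public static void main(String[] args) {
        String s = "a,b,,";
        // limit 0: trailing empty strings are discarded
        System.out.println(splitCsv(s, 0).length);  // 2
        // negative limit: trailing empty strings are kept
        System.out.println(splitCsv(s, -1).length); // 4
        // limit 2: at most one split, the remainder stays in the last element
        System.out.println(splitCsv(s, 2)[1]);      // b,,
    }
}
```

A negative limit is usually what you want when the number of fields matters, for example when parsing fixed-column records with optional trailing values.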
