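When we need to turn a string into an array of strings according to some rule, String.split(String) is the first method that comes to mind.
String.split() is certainly convenient, but it treats its argument as a regular expression, so in performance-critical applications it can cost more than the task requires.
java.util.StringTokenizer can be used in place of String.split(), and it gives a measurable speedup.
The following example compares the cost of the two approaches.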
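Before swapping one for the other, it helps to see that the two do not return the same tokens for every input. The sketch below is only an illustration: the class name SplitVsTokenizerDemo and the helper tokenize are made up for this example and are not part of the original test.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

class SplitVsTokenizerDemo {
    // Hypothetical helper: collects every token StringTokenizer produces into an array.
    static String[] tokenize(String s, String delims) {
        StringTokenizer st = new StringTokenizer(s, delims);
        String[] out = new String[st.countTokens()];
        for (int i = 0; st.hasMoreTokens(); i++) {
            out[i] = st.nextToken();
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "a,,b,";
        // split keeps the empty string between the two commas but drops trailing empty strings
        System.out.println(Arrays.toString(input.split(",")));
        // StringTokenizer never returns empty tokens
        System.out.println(Arrays.toString(tokenize(input, ",")));
        // split takes a regular expression, so "." matches every character and nothing is left
        System.out.println("a.b".split(".").length);
        // StringTokenizer takes a set of literal delimiter characters
        System.out.println(Arrays.toString(tokenize("a.b", ".")));
    }
}
```

If empty fields matter (for example in CSV-like data), StringTokenizer is not a drop-in replacement, whatever the timing says.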
String str = "abc";
StringBuffer buffer = new StringBuffer();
// prepare the string
for (int i = 0; i < 10000; i ++){
buffer.append(str).append(",");
}
str = buffer.toString();
// java.util.StringTokenizer
long curTime = System.currentTimeMillis();
for (int m = 0; m < 1000; m ++){
StringTokenizer token = new StringTokenizer(str, ",");
String[] array2 = new String[token.countTokens()];
int i = 0;
while (token.hasMoreTokens()){
array2[i++] = token.nextToken();
}
}
System.out.println("java.util.StringTokener : " + (System.currentTimeMillis() - curTime));
// String.split()
curTime = System.currentTimeMillis();
for (int m = 0; m < 1000; m ++){
String[] array = str.split(",");
}
System.out.println("String.split : " + (System.currentTimeMillis() - curTime));
// Vector & indexOf
curTime = System.currentTimeMillis();
for (int n = 0; n < 1000; n ++){
Vector<String> vector= new Vector<String>();
int index, offset = 0;
// search from the end of the previous token, so a delimiter at position 0 is not skipped
while ((index = str.indexOf(",", offset)) != -1){
vector.addElement(str.substring(offset, index));
offset = index + 1;
}
String[] array3 = vector.toArray(new String[0]);
}
System.out.println("Vector & indexOf : " + (System.currentTimeMillis() - curTime));
Output (elapsed time in milliseconds):
java.util.StringTokener : 1407
String.split : 2546
Vector & indexOf : 1094
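Clearly, in this test StringTokenizer is nearly twice as fast as String.split() (1407 ms versus 2546 ms).
Scanning step by step with indexOf is faster still, by roughly another 25% (1094 ms). In this test, the closer a method stays to the low-level operations, the better it performs.
That said, this only really matters when performance requirements are strict. For ordinary applications, String.split() is good enough.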
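The indexOf scan from the benchmark can be packaged as a small reusable helper. This is a minimal sketch, not the code that was timed: the names IndexOfSplit and splitByChar are hypothetical, it uses ArrayList instead of Vector, and unlike the benchmark loop it also keeps whatever follows the last delimiter.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfSplit {
    // Hypothetical helper: splits s on a single delimiter character using indexOf only.
    static List<String> splitByChar(String s, char delimiter) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        // each search starts where the previous token ended
        while ((end = s.indexOf(delimiter, start)) != -1) {
            parts.add(s.substring(start, end));
            start = end + 1;
        }
        // remainder after the last delimiter (an empty string if s ends with the delimiter)
        parts.add(s.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitByChar("abc,def,,ghi", ','));
    }
}
```

Because no regular expression is involved, characters such as '.' or '|' need no escaping here.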
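One more note:
When scanning with String.indexOf(), collecting the tokens in an ArrayList or a Vector (the two perform about the same here) is still not the best option.
A faster approach is to count the delimiters first and then store the tokens in a String[] of exactly that size, which is roughly twice as fast as using a Vector.
Going further than that requires a better scanning algorithm.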
public static String[] split(String s, String delimiter){
if (s == null) {
return null;
}
int delimiterLength;
int stringLength = s.length();
if (delimiter == null || (delimiterLength = delimiter.length()) == 0){
return new String[] {s};
}
// a two pass solution is used because a one pass solution would
// require the possible resizing and copying of memory structures
// In the worst case it would have to be resized n times with each
// resize having a O(n) copy leading to an O(n^2) algorithm.
int count;
int start;
int end;
// Scan s and count the tokens.
count = 0;
start = 0;
while((end = s.indexOf(delimiter, start)) != -1){
count++;
start = end + delimiterLength;
}
count++;
// allocate an array to return the tokens,
// we now know how big it should be
String[] result = new String[count];
// Scan s again, but this time pick out the tokens
count = 0;
start = 0;
while((end = s.indexOf(delimiter, start)) != -1){
result[count] = (s.substring(start, end));
count++;
start = end + delimiterLength;
}
end = stringLength;
result[count] = s.substring(start, end);
return (result);
}
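A short usage sketch for the two-pass method above. To keep the example compilable on its own, it carries a condensed copy of that method inside a hypothetical class named TwoPassSplitDemo; the inputs are made up for illustration.

```java
import java.util.Arrays;

class TwoPassSplitDemo {
    // Condensed copy of the two-pass split method above.
    static String[] split(String s, String delimiter) {
        if (s == null) return null;
        if (delimiter == null || delimiter.isEmpty()) return new String[] {s};
        int delimiterLength = delimiter.length();
        int count = 1; // n delimiters always yield n + 1 tokens
        int start = 0;
        int end;
        // first pass: count the tokens
        while ((end = s.indexOf(delimiter, start)) != -1) {
            count++;
            start = end + delimiterLength;
        }
        String[] result = new String[count];
        count = 0;
        start = 0;
        // second pass: cut out the tokens
        while ((end = s.indexOf(delimiter, start)) != -1) {
            result[count++] = s.substring(start, end);
            start = end + delimiterLength;
        }
        result[count] = s.substring(start);
        return result;
    }

    public static void main(String[] args) {
        // empty tokens are kept, including the one after the trailing comma
        System.out.println(Arrays.toString(split("a,,b,", ",")));
        // the delimiter may be longer than one character
        System.out.println(Arrays.toString(split("a::b::c", "::")));
        // the delimiter is a literal string, not a regular expression
        System.out.println(Arrays.toString(split("a.b", ".")));
    }
}
```

Note that this method is not a behavioral twin of String.split(): it keeps trailing empty strings and treats the delimiter literally, so callers that rely on either split() behavior need to adapt.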