A String object stores its characters in an array named value. The source of the substring() method is shown below:
public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > value.length) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    int subLen = endIndex - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
            : new String(value, beginIndex, subLen);
}
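Unless the requested range covers the whole string (in which case this is returned), the method returns a new String object created through a constructor call. The listing above shows the call as new String(value, beginIndex, subLen), which is its form in later JDKs. In JDK 6 the call is new String(offset + beginIndex, endIndex - beginIndex, value), which resolves to the following package-private constructor: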
String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}
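The sharing done by this constructor cannot be observed through the public String API on a current JDK, so here is a minimal sketch that models it instead. SharedSlice, slice and retainedChars are hypothetical names invented for illustration; only the three fields and the constructor shape are taken from the listing above.

```java
import java.util.Arrays;

class SharedSliceDemo {
    // Hypothetical stand-in for the JDK 6 String layout: a view over a shared array.
    static final class SharedSlice {
        final char[] value;  // the whole backing array, never a copy
        final int offset;    // index of the first character of this view
        final int count;     // number of characters in this view

        SharedSlice(int offset, int count, char[] value) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        // Mirrors the JDK 6 style substring: a new view over the same array.
        SharedSlice slice(int beginIndex, int endIndex) {
            return new SharedSlice(offset + beginIndex, endIndex - beginIndex, value);
        }

        // Number of characters kept reachable by this view.
        int retainedChars() {
            return value.length;
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');
        SharedSlice whole = new SharedSlice(0, big.length, big);
        SharedSlice small = whole.slice(10, 15);
        System.out.println(small.count);           // 5
        System.out.println(small.retainedChars()); // 1000000
    }
}
```

In this model a five-character view still keeps the full million-character array reachable.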
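The value array is stored exactly as it is passed in, not just the extracted portion. An object produced by substring therefore keeps a reference to the entire contents of the original string and uses offset and count to select its own characters. No second large array is created, but when the original string is very large, even a short substring keeps the whole array reachable, so the array cannot be garbage collected while the substring is alive. This wastes a lot of memory. We can mitigate the problem as follows: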
String A = new String("..."); // large string
String B = new String(A.substring(start, end));
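The object produced by A.substring(start, end) is passed to the String constructor to create B. In JDK 6 this constructor copies only the characters the substring actually uses whenever the backing array is longer than the substring, so B gets its own small array. The intermediate object produced by A.substring(start, end) has no external reference and is soon reclaimed by the garbage collector. Once A is no longer referenced, the large array can be reclaimed as well, which avoids the memory waste. The above is the JDK 6 source. In JDK 1.8, however, the constructor called by substring looks like this: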
public String(char value[], int offset, int count) {
    if (offset < 0) {
        throw new StringIndexOutOfBoundsException(offset);
    }
    if (count <= 0) {
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        if (offset <= value.length) {
            this.value = "".value;
            return;
        }
    }
    // Note: offset or count might be near -1>>>1.
    if (offset > value.length - count) {
        throw new StringIndexOutOfBoundsException(offset + count);
    }
    this.value = Arrays.copyOfRange(value, offset, offset+count);
}
This constructor calls Arrays.copyOfRange, whose source is:
public static char[] copyOfRange(char[] original, int from, int to) {
    int newLength = to - from;
    if (newLength < 0)
        throw new IllegalArgumentException(from + " > " + to);
    char[] copy = new char[newLength];
    System.arraycopy(original, from, copy, 0,
                     Math.min(original.length - from, newLength));
    return copy;
}
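Arrays.copyOfRange allocates a new array of exactly newLength characters and fills it with System.arraycopy. It does not keep a reference to the original value. A substring in JDK 1.8 therefore holds only its own characters (this copying behaviour was introduced in JDK 7 update 6). This reduces memory waste and largely removes the memory-leak problem of the substring() method.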
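The independence of the copy can be checked directly on plain char arrays. The following is a minimal sketch: CopyOfRangeDemo and copySlice are hypothetical names used only for illustration, and the only JDK calls are Arrays.copyOfRange and Arrays.fill.

```java
import java.util.Arrays;

class CopyOfRangeDemo {
    // Copies the range [beginIndex, endIndex) out of source into a fresh array,
    // the same call the JDK 1.8 constructor above makes.
    static char[] copySlice(char[] source, int beginIndex, int endIndex) {
        return Arrays.copyOfRange(source, beginIndex, endIndex);
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');

        char[] small = copySlice(big, 10, 15);
        System.out.println(small.length);      // 5

        // Changing the source afterwards does not affect the copy,
        // which shows that the two arrays are independent.
        big[10] = 'y';
        System.out.println(new String(small)); // xxxxx
    }
}
```

Because small is a separate five-element array, nothing in it keeps big reachable, so big can be collected as soon as its own references are gone.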