[Java Basics] String, StringBuilder, StringBuffer

Brain_L

Category: Java Basics

Article link: https://blog.csdn.net/weixin_39120845/article/details/83302136

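String, StringBuilder, and StringBuffer are three classes that are frequently compared with one another. This article examines the differences between them.

The JDK used in this article is jdk1.8.0_151.

I. String source code

The String source and its Javadoc already document many usages and the reasons behind them, so this section starts with an excerpt of the source and comments. Common methods such as charAt are not discussed here.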

/**
 * The {@code String} class represents character strings. All
 * string literals in Java programs, such as {@code "abc"}, are
 * implemented as instances of this class.
 * <p>
 * Strings are constant; their values cannot be changed after they
 * are created. String buffers support mutable strings.
 * Because String objects are immutable they can be shared. For example:
 * <blockquote><pre>
 *     String str = "abc";
 * </pre></blockquote><p>
 * is equivalent to:
 * <blockquote><pre>
 *     char data[] = {'a', 'b', 'c'};
 *     String str = new String(data);
 * </pre></blockquote><p>
 * Here are some more examples of how strings can be used:
 * <blockquote><pre>
 *     System.out.println("abc");
 *     String cde = "cde";
 *     System.out.println("abc" + cde);
 *     String c = "abc".substring(2,3);
 *     String d = cde.substring(1, 2);
 * </pre></blockquote>
 * <p>
 *
 * (the rest of the class comment is omitted here)
 */




public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0


    /**
     * Allocates a new {@code String} so that it represents the sequence of
     * characters currently contained in the character array argument. The
     * contents of the character array are copied; subsequent modification of
     * the character array does not affect the newly created string.
     *
     * @param  value
     *         The initial value of the string
     */
    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }


    /**
     * Compares this string to the specified object.  The result is {@code
     * true} if and only if the argument is not {@code null} and is a {@code
     * String} object that represents the same sequence of characters as this
     * object.
     *
     * @param  anObject
     *         The object to compare this {@code String} against
     *
     * @return  {@code true} if the given object represents a {@code String}
     *          equivalent to this string, {@code false} otherwise
     *
     * @see  #compareTo(String)
     * @see  #equalsIgnoreCase(String)
     */
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }



    /**
     * Returns a hash code for this string. The hash code for a
     * {@code String} object is computed as
     * <blockquote><pre>
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     * </pre></blockquote>
     * using {@code int} arithmetic, where {@code s[i]} is the
     * <i>i</i>th character of the string, {@code n} is the length of
     * the string, and {@code ^} indicates exponentiation.
     * (The hash value of the empty string is zero.)
     *
     * @return  a hash code value for this object.
     */
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }

    public String replace(char oldChar, char newChar) {
        if (oldChar != newChar) {
            int len = value.length;
            int i = -1;
            char[] val = value; /* avoid getfield opcode */

            while (++i < len) {
                if (val[i] == oldChar) {
                    break;
                }
            }
            if (i < len) {
                char buf[] = new char[len];
                for (int j = 0; j < i; j++) {
                    buf[j] = val[j];
                }
                while (i < len) {
                    char c = val[i];
                    buf[i] = (c == oldChar) ? newChar : c;
                    i++;
                }
                return new String(buf, true);
            }
        }
        return this;
    }

}

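A few points worth noting from the source:

1. The String class is final, so it cannot be subclassed.

2. The class comment says that String str = "abc" is equivalent to String str = new String(data), where data is {'a', 'b', 'c'}. (The equivalence is about content: the literal comes from the string constant pool, while new always creates a separate object.) That constructor copies the char array into the class's value field, which is private and final. Note that final only prevents value from being reassigned to another array; the contents stay fixed because the array is never exposed and no method of String writes to it after construction. This is why String objects are said to be immutable.

3. Because a String cannot change after construction, methods that return a String do not modify value and return it; they build a new String instead. See the source of replace, concat, and similar methods, which end in a new String(...) call whenever the result differs from the original (when nothing changes, replace simply returns this).

4. String overrides equals to compare the characters stored in the two strings' value arrays one by one, so it tests equal content rather than equal addresses.

5. String overrides hashCode. The hash is computed from every character in value as h = 31 * h + val[i], and the result is cached in the hash field after the first computation.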
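
The points above can be checked with a small program. This is only a minimal sketch: the class name StringImmutabilityDemo and the helper manualHash are made up for illustration and are not part of the JDK. manualHash recomputes the formula from the hashCode comment so it can be compared with the real method.

```java
class StringImmutabilityDemo {

    // Hypothetical helper: recomputes s[0]*31^(n-1) + ... + s[n-1]
    // with int arithmetic, mirroring the loop in String.hashCode().
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        char[] data = {'a', 'b', 'c'};
        String str = new String(data);
        data[0] = 'x';                        // the constructor copied the array,
        System.out.println(str);              // so this still prints abc

        String replaced = str.replace('a', 'z');
        System.out.println(str);              // abc: the original is untouched
        System.out.println(replaced);         // zbc: a new String was returned
        System.out.println(str.replace('q', 'z') == str); // true: nothing to replace

        String copy = new String("abc");
        System.out.println(copy.equals(str)); // true: same characters
        System.out.println(copy == str);      // false: different objects

        System.out.println(manualHash("abc") == "abc".hashCode()); // true
    }
}
```

Modifying data after construction has no effect on str, which is the defensive copy described in point 2.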

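II. The string constant pool

The string constant pool is unavoidable in any discussion of String. Strings are the most common data in everyday development, and the same values tend to recur; allocating new space for every occurrence would hurt performance and waste memory. The JVM addresses this with the string constant pool. In JDK 8 the string constant pool lives in the heap.

A given string constant exists only once in the pool. Consider this example: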

	public String stringTest() {
		String s1 = "hello";
		String s2 = "hello";
		String s3 = new String("hello");

		System.out.println(s1 == s2);
		System.out.println(s1 == s3);

		return s2;
	}

output:
true
false

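Inspecting the three variables in the Eclipse debugger clarifies what happens.

At compile time "hello" is stored in the class file's constant pool, and at run time the JVM loads it into the string constant pool, where only one "hello" exists. s1 refers to that pooled "hello". When s2 is assigned, the constant already exists, so no new "hello" is allocated and s2 refers to the same object as s1; in the debugger both s1 and s2 show id 19. s3 is created with new, which always allocates a separate String object outside the pool, so s3 and s1 cannot refer to the same object; the debugger shows id 24 for s3. Even so, as described earlier, a String ultimately stores its characters in the char array value, and in this JDK the String(String) constructor copies the reference to the argument's value array rather than the array itself. The value fields of s1, s2, and s3 therefore all point to the array of the pooled "hello"; the debugger shows id 25 for all three.

With this analysis the relationship among the three references is clear, and so is the reason the first comparison prints true and the second prints false.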
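
The comparison can be extended slightly to make the distinction between identity and content explicit. This is a sketch only: the class StringPoolDemo and its compare method are illustrative names, and the call to String.intern() goes beyond what the article covers; it is included because it returns the pooled instance for a given content.

```java
class StringPoolDemo {

    // Returns the four comparison results in order so they can be inspected.
    static boolean[] compare() {
        String s1 = "hello";
        String s2 = "hello";
        String s3 = new String("hello");
        return new boolean[] {
            s1 == s2,           // both literals refer to the pooled object
            s1 == s3,           // new always creates a distinct object
            s1.equals(s3),      // the contents are the same
            s1 == s3.intern()   // intern() returns the pooled "hello"
        };
    }

    public static void main(String[] args) {
        for (boolean result : compare()) {
            System.out.println(result);
        }
        // prints: true, false, true, true (one per line)
    }
}
```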

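III. String concatenation

A String cannot change once created, so how is string concatenation (+) implemented behind the scenes? Consider this example: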

	public String stringTest() {
		String s1 = "HelloString";
		String s2 = "Hello" + "String";
		String s3 = "Hello" + getTestString();
		final String s4 = "String";
		String s5 = "Hello" + s4;
		String s6 = s1 + s2;
		String s7 = "He" + "llo" + s2;
		String s8 = s2 + "He" + "llo";
		s2 += "he" + "llo";

		return s2;
	}

Decompiled, it looks like this:

	public String stringTest() {
		String s1 = "HelloString";
		String s2 = "HelloString";
		(new StringBuilder("Hello")).append(this.getTestString()).toString();
		String s4 = "String";
		String s5 = "HelloString";
		(new StringBuilder(String.valueOf(s1))).append(s2).toString();
		(new StringBuilder("Hello")).append(s2).toString();
		String s8 = s2 + "He" + "llo";
		s2 = s2 + "hello";

		return s2;
	}

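Look at s2 first: "Hello" + "String" is folded directly into "HelloString". When two string constants are added, the compiler replaces the expression with the concatenated string.

s3 is a constant plus a method's return value. The return value is unknown at compile time, so nothing can be folded; instead a StringBuilder is created, append builds the mutable character sequence, and toString converts the result to a String. s6 is handled like s3. s5 differs from s3 because s4 is declared final and initialized with a constant, so its value is known at compile time to be "String" and the expression is folded into the concatenated string. For s7, the leading constants are joined first and only then is StringBuilder's append called, which is as much folding as is possible. s8 contains the same operands as s7 in a different order, yet the result is quite different: + is left-associative, so the expression is evaluated as (s2 + "He") + "llo", and once the non-constant s2 appears on the left, the constants that follow are no longer folded. The final s2 += ... is different again: the right-hand side "he" + "llo" is a constant expression of its own, so it is still folded into "hello".

In short, constants that can be determined at compile time are joined by the compiler, and anything that cannot be determined is built with a StringBuilder.

What about repeated concatenation?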
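
One way to observe the folding without a decompiler is to compare references: a folded constant expression refers to the same pooled object as the equivalent literal, while a string built at run time is a new object. The sketch below assumes a stand-in getTestString() that simply returns "String" (the article does not show its body), and the class and variable names are illustrative.

```java
class ConstantFoldingDemo {

    // Assumed stand-in for the article's getTestString(); its real body is not shown there.
    static String getTestString() {
        return "String";
    }

    // Compares each concatenation result with the literal "HelloString" by identity.
    static boolean[] identityChecks() {
        String s1 = "HelloString";
        String s2 = "Hello" + "String";         // constant expression: folded at compile time
        String s3 = "Hello" + getTestString();  // method call: built at run time
        final String s4 = "String";
        String s5 = "Hello" + s4;               // final with constant initializer: folded
        String notFinal = "String";
        String s6 = "Hello" + notFinal;         // not final: built at run time
        return new boolean[] { s1 == s2, s1 == s3, s1 == s5, s1 == s6 };
    }

    public static void main(String[] args) {
        for (boolean same : identityChecks()) {
            System.out.println(same);
        }
        // prints: true, false, true, false (one per line)
    }
}
```

The only difference between s5 and s6 is the final modifier on the variable being appended, which is what decides whether the compiler may treat it as a constant.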

	public String stringTest() {
		String s1 = "Hello";
		for (int i = 1; i < 10; i++) {
			s1 += "String";
		}

		return s1;
	}

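Disassembling the method with javap shows that every loop iteration creates a new StringBuilder, calls append, and then calls toString, which is inefficient. In this situation it is better to use a single StringBuilder explicitly: append in the loop and call toString once at the end, which avoids the repeated allocation and copying.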
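
The two approaches can be written side by side as follows. This is a minimal sketch with illustrative names (LoopConcatDemo, withPlus, withBuilder); both methods use the same loop bounds as the example above and produce the same text, but the second one allocates a single StringBuilder.

```java
class LoopConcatDemo {

    // Concatenation with += : each iteration produces a new intermediate String.
    static String withPlus(int limit) {
        String s = "Hello";
        for (int i = 1; i < limit; i++) {
            s += "String";
        }
        return s;
    }

    // One StringBuilder for the whole loop, converted to a String once at the end.
    static String withBuilder(int limit) {
        StringBuilder sb = new StringBuilder("Hello");
        for (int i = 1; i < limit; i++) {
            sb.append("String");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String a = withPlus(10);
        String b = withBuilder(10);
        System.out.println(a.equals(b));  // true: same content either way
        System.out.println(b.length());   // 59 = 5 + 9 * 6
    }
}
```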

IV. StringBuilder and StringBuffer

Part of the StringBuilder source:

public final class StringBuilder
    extends AbstractStringBuilder
    implements java.io.Serializable, CharSequence
{

    /** use serialVersionUID for interoperability */
    static final long serialVersionUID = 4383685877147921099L;

    /**
     * Constructs a string builder with no characters in it and an
     * initial capacity of 16 characters.
     */
    public StringBuilder() {
        super(16);
    }
    
    public StringBuilder(String str) {
        super(str.length() + 16);
        append(str);
    }

}

Part of the source of its superclass, AbstractStringBuilder:

abstract class AbstractStringBuilder implements Appendable, CharSequence {
    /**
     * The value is used for character storage.
     */
    char[] value;

    /**
     * The count is the number of characters used.
     */
    int count;

    /**
     * This no-arg constructor is necessary for serialization of subclasses.
     */
    AbstractStringBuilder() {
    }

    /**
     * Creates an AbstractStringBuilder of the specified capacity.
     */
    AbstractStringBuilder(int capacity) {
        value = new char[capacity];
    }
}

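StringBuilder is also declared final, so it cannot be subclassed.

The characters are stored in the value array declared in AbstractStringBuilder. The array is not final, so it can be replaced by a larger one as the content grows; the default initial capacity is 16.

StringBuffer is essentially the same as StringBuilder. The difference is that StringBuffer is thread-safe: its public methods are declared synchronized. Its source is not reproduced here.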
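
The initial capacities set by the two constructors shown above can be read back with capacity(). The following sketch uses an illustrative class name (BuilderCapacityDemo) and also shows that append changes the builder in place and returns the same object, in contrast to String.

```java
class BuilderCapacityDemo {

    // Capacity of a builder created with the no-argument constructor.
    static int defaultCapacity() {
        return new StringBuilder().capacity();
    }

    // Capacity of a builder created from a String: str.length() + 16.
    static int capacityFor(String str) {
        return new StringBuilder(str).capacity();
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity());             // 16
        System.out.println(capacityFor("Hello"));          // 21 = 5 + 16
        System.out.println(new StringBuffer().capacity()); // 16, same default

        StringBuilder sb = new StringBuilder("Hello");
        StringBuilder returned = sb.append("String");
        System.out.println(returned == sb);                // true: modified in place
        System.out.println(sb);                            // HelloString
    }
}
```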

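V. Differences among the three

The sections above introduced each class in turn. Their differences and connections can be summarized as follows:

1. All three classes are final and cannot be subclassed.

2. The content of a String object is immutable, whereas StringBuilder and StringBuffer are mutable. That is why concatenation of String objects is actually carried out through a StringBuilder.

3. StringBuilder is not thread-safe, so when one instance is shared among several threads, use the thread-safe StringBuffer. StringBuffer pays for its locking, so in single-threaded code it performs worse than StringBuilder.

4. When string constants are added directly, the compiler folds them at compile time, so plain String is the most efficient choice. With many concatenations of non-constant strings, using String causes repeated conversion between String and StringBuilder, which is slower than calling append on a single StringBuilder. Choose the type according to the scenario.

That is all. If you spot any mistakes, corrections are welcome.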
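
Point 3 can be illustrated with a StringBuffer shared by several threads. This is a sketch under assumptions: the class SharedBufferDemo and the method appendConcurrently are illustrative names, and the thread and iteration counts are arbitrary. Because every append is synchronized, no character is lost and the final length is exactly threads * perThread. Replacing the StringBuffer with a StringBuilder gives no such guarantee: the length may come out smaller, or a worker thread may fail with an exception.

```java
class SharedBufferDemo {

    // Starts `threads` workers that each append `perThread` characters
    // to one shared StringBuffer, then returns the final length.
    static int appendConcurrently(int threads, int perThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.append('x');   // synchronized method call
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();            // wait for every worker to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 10_000)); // 40000
    }
}
```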
