An in-depth look at where Java String objects live in memory, through source code and tests

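To get straight to the point: on HotSpot JDK 8, however a String is created, the object lives on the heap. Within the heap there is the regular area (where objects created with new go) and the string constant pool (which holds string literals). The rest of this post walks through how to arrive at that conclusion.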

The JDK 8 Javadoc

In the JDK 8 source, the class comment on String reads:

The String class represents character strings. All string literals in Java programs, such as "abc", are implemented as instances of this class.
Strings are constant; their values cannot be changed after they are created. String buffers support mutable strings. Because String objects are immutable they can be shared. For example:
       String str = "abc";
   
is equivalent to:
       char data[] = {'a', 'b', 'c'};
       String str = new String(data);
   
Here are some more examples of how strings can be used:
       System.out.println("abc");
       String cde = "cde";
       System.out.println("abc" + cde);
       String c = "abc".substring(2,3);
       String d = cde.substring(1, 2);
   
The class String includes methods for examining individual characters of the sequence, for comparing strings, for searching strings, for extracting substrings, and for creating a copy of a string with all characters translated to uppercase or to lowercase. Case mapping is based on the Unicode Standard version specified by the Character class.
The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuilder(or StringBuffer) class and its append method. String conversions are implemented through the method toString, defined by Object and inherited by all classes in Java. For additional information on string concatenation and conversion, see Gosling, Joy, and Steele, The Java Language Specification.
Unless otherwise noted, passing a null argument to a constructor or method in this class will cause a NullPointerException to be thrown.
A String represents a string in the UTF-16 format in which supplementary characters are represented by surrogate pairs (see the section Unicode Character Representations in the Character class for more information). Index values refer to char code units, so a supplementary character uses two positions in a String.
The String class provides methods for dealing with Unicode code points (i.e., characters), in addition to those for dealing with Unicode code units (i.e., char values).
Since:
JDK1.0
See Also:
Object.toString(), StringBuffer, StringBuilder, Charset
Author:
Lee Boynton, Arthur van Hoff, Martin Buchholz, Ulf Zibis

The points that matter for this post:
Every string literal in a Java program, such as "abc", is an instance of String.
Strings are constant: their values cannot be changed after creation, and because they are immutable they can be shared.
The comment calls String str = "abc" "equivalent" to building a String from char data[] = {'a', 'b', 'c'}. That equivalence is about content: the two strings are equal under equals(), but they are not necessarily the same object, which is what the == tests later in this post explore.
Concatenation with + is implemented through StringBuilder (or StringBuffer) and its append method.
A String is stored as UTF-16, and index values refer to char code units, so a supplementary character occupies two positions.
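To make the "equivalent" wording in the Javadoc concrete, here is a minimal sketch (the class name LiteralVsConstructed is made up for illustration). It compares a literal with a String built from a char array, first by content and then by reference.

```java
class LiteralVsConstructed {
    // Same characters, so equals() reports true.
    static boolean sameContent() {
        char[] data = {'a', 'b', 'c'};
        return "abc".equals(new String(data));
    }

    // new always creates a fresh object, so the references differ.
    static boolean sameObject() {
        char[] data = {'a', 'b', 'c'};
        return "abc" == new String(data);
    }

    // Two identical literals refer to one shared String object.
    static boolean literalsShared() {
        String first = "abc";
        String second = "abc";
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println("same content:    " + sameContent());    // true
        System.out.println("same object:     " + sameObject());     // false
        System.out.println("literals shared: " + literalsShared()); // true
    }
}
```

So the two forms are interchangeable as values, but only the literal form is guaranteed to be shared.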

The String class declaration

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence

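final: cannot be subclassed
java.io.Serializable: can be serialized
Comparable<String>: supports ordering comparison between strings (the compareTo method)
CharSequence: a sequence of characters

A CharSequence is a readable sequence of char values. This interface provides uniform, read-only access to many different kinds of char sequences. A char value represents a character in the Basic Multilingual Plane (BMP) or a surrogate. Refer to Unicode Character Representation for details.
This interface does not refine the general contracts of the equals and hashCode methods. The result of comparing two objects that implement CharSequence is therefore, in general, undefined. Each object may be implemented by a different class, and there is no guarantee that each class will be capable of testing its instances for equality with those of the other. It is therefore inappropriate to use arbitrary CharSequence instances as elements in a set or as keys in a map.
(From the JDK 8 CharSequence source.)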
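The warning about equals on CharSequence is easy to see in practice. The following is a small sketch (the class name CharSequenceEquality is hypothetical) that uses only standard String and StringBuilder methods: equals across different CharSequence implementations is false even for identical characters, while contentEquals compares the characters themselves.

```java
class CharSequenceEquality {
    // String.equals requires the argument to be a String, so this is false.
    static boolean equalsAcrossTypes() {
        CharSequence builder = new StringBuilder("abc");
        return "abc".equals(builder);
    }

    // contentEquals compares the char sequence, whatever the implementing class.
    static boolean sameCharacters() {
        CharSequence builder = new StringBuilder("abc");
        return "abc".contentEquals(builder);
    }

    // compareTo comes from Comparable<String>; negative means "abc" sorts first.
    static int order() {
        return "abc".compareTo("abd");
    }

    public static void main(String[] args) {
        System.out.println(equalsAcrossTypes()); // false
        System.out.println(sameCharacters());    // true
        System.out.println(order() < 0);         // true
    }
}
```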

The char[] array

private final char value[];

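The string that a String object represents is stored in this array. (This is the JDK 8 layout; from JDK 9 onward the field is a byte[] accompanied by a coder field.)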
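Because the backing array is private and final, ordinary code cannot reach it. As a rough illustration (the class name DefensiveCopy is invented here), the sketch below shows that the String(char[]) constructor copies its argument and that toCharArray() hands back a copy, so changing either array leaves the String untouched.

```java
class DefensiveCopy {
    // Mutating the source array after construction does not affect the String.
    static String buildThenMutateSource() {
        char[] data = {'a', 'b', 'c'};
        String s = new String(data);
        data[0] = 'x';
        return s;
    }

    // toCharArray() returns a copy, so writing to it does not affect the String.
    static String mutateExportedArray() {
        String s = "abc";
        char[] copy = s.toCharArray();
        copy[0] = 'x';
        return s;
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutateSource()); // abc
        System.out.println(mutateExportedArray());   // abc
    }
}
```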

The equals() method

/**
     * Compares this string to the specified object.  The result is {@code
     * true} if and only if the argument is not {@code null} and is a {@code
     * String} object that represents the same sequence of characters as this
     * object.
     *
     * @param  anObject
     *         The object to compare this {@code String} against
     *
     * @return  {@code true} if the given object represents a {@code String}
     *          equivalent to this string, {@code false} otherwise
     *
     * @see  #compareTo(String)
     * @see  #equalsIgnoreCase(String)
     */
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

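In short: equals() returns true if and only if the argument is not null and is a String object that represents the same sequence of characters as this one.

The implementation checks reference identity first, then the type, then the lengths, and only then compares the two arrays character by character.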

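Testing ==

a and b are created from literals; c and d are created with new.
The test uses reflection to replace the value field of a and c, then checks the == results again. It is written against JDK 8; newer JDKs restrict reflective access to String internals.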

    // Requires: import java.lang.reflect.Field;
    void test4() {
        String a = "abcd";              // literal
        String b = "abcd";              // same literal
        String c = new String("abcd");  // created with new
        String d = new String("abcd");  // created with new
        System.out.println("a == b is " + (a == b));
        System.out.println("c == d is " + (c == d));
        try {
            // Replace the backing array of a and of c through reflection.
            Field valueField = String.class.getDeclaredField("value");
            valueField.setAccessible(true);
            valueField.set(a, new char[]{'e', 'f', 'g'});
            valueField.set(c, new char[]{'e', 'f', 'g'});
            System.out.println("a is " + a);
            System.out.println("b is " + b);
            System.out.println("c is " + c);
            System.out.println("d is " + d);
            // Reference comparisons are unaffected by the content change.
            System.out.println("a == b is " + (a == b));
            System.out.println("c == d is " + (c == d));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

Output:

a == b is true
c == d is false
a is efg
b is efg
c is efg
d is abcd
a == b is true
c == d is false

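What the results suggest:
A String literal is itself a String object, and identical literals refer to the same object (in the string constant pool on the heap). That is why b also prints efg after value is replaced on a: both variables still point to the same object.
Objects created with new are separate objects in the regular heap area, so c and d are independent of each other. Changing c leaves d untouched, and c == d stays false.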

The intern() method

    /**
     * Returns a canonical representation for the string object.
     * <p>
     * A pool of strings, initially empty, is maintained privately by the
     * class {@code String}.
     * <p>
     * When the intern method is invoked, if the pool already contains a
     * string equal to this {@code String} object as determined by
     * the {@link #equals(Object)} method, then the string from the pool is
     * returned. Otherwise, this {@code String} object is added to the
     * pool and a reference to this {@code String} object is returned.
     * <p>
     * It follows that for any two strings {@code s} and {@code t},
     * {@code s.intern() == t.intern()} is {@code true}
     * if and only if {@code s.equals(t)} is {@code true}.
     * <p>
     * All literal strings and string-valued constant expressions are
     * interned. String literals are defined in section 3.10.5 of the
     * <cite>The Java&trade; Language Specification</cite>.
     *
     * @return  a string that has the same contents as this string, but is
     *          guaranteed to be from a pool of unique strings.
     */
    public native String intern();

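In short: intern() returns the canonical representation of a string.
The String class privately maintains a pool of strings, initially empty. When intern() is called, if the pool already contains a string that equals() this one, the pooled string is returned. Otherwise this String object is added to the pool and a reference to it is returned.
It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
All string literals and string-valued constant expressions are interned.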
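The phrase "string-valued constant expressions" deserves a quick illustration. The sketch below (class and method names are made up) contrasts a concatenation the compiler can fold into a single literal with one that has to be evaluated at run time, which the language specification says produces a newly created String.

```java
class ConstantFolding {
    // "ab" + "cd" is a constant expression: folded at compile time and interned.
    static boolean foldedLiterals() {
        return ("ab" + "cd") == "abcd";
    }

    // A final local initialized with a literal is a constant variable,
    // so the expression is still a compile-time constant.
    static boolean finalVariable() {
        final String ab = "ab";
        return (ab + "cd") == "abcd";
    }

    // A non-final variable forces run-time concatenation: a new object.
    static boolean runtimeConcat() {
        String ab = "ab";
        return (ab + "cd") == "abcd";
    }

    // Interning the run-time result leads back to the pooled literal.
    static boolean runtimeConcatInterned() {
        String ab = "ab";
        return (ab + "cd").intern() == "abcd";
    }

    public static void main(String[] args) {
        System.out.println(foldedLiterals());        // true
        System.out.println(finalVariable());         // true
        System.out.println(runtimeConcat());         // false
        System.out.println(runtimeConcatInterned()); // true
    }
}
```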

Testing intern()

    void test5() {
        String a = "abc";
        String b = "def";
        String c = new String("def");
        System.out.println("a.intern() == a is " + (a.intern() == a));
        System.out.println("c.intern() == b is " + (c.intern() == b));
        System.out.println("c.intern() == c is " + (c.intern() == c));
    }

Output:

a.intern() == a is true
c.intern() == b is true
c.intern() == c is false

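What this suggests: intern() returns a reference to the String object held in the string constant pool. a is a literal, so it already is the pooled object. c was created with new, so c.intern() returns the pooled "def" (the same object as b) rather than c itself.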
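The rule from the Javadoc, that s.intern() == t.intern() exactly when s.equals(t), can be checked directly. This is only a sketch with invented names (InternCanonical and its methods), using nothing beyond new String and intern().

```java
class InternCanonical {
    // Two objects created with new are distinct, even with equal content.
    static boolean distinctBeforeIntern() {
        String s = new String("def");
        String t = new String("def");
        return s == t;
    }

    // Equal content means intern() yields the same pooled object for both.
    static boolean sameAfterIntern() {
        String s = new String("def");
        String t = new String("def");
        return s.intern() == t.intern();
    }

    // Different content means different pooled objects.
    static boolean differentContentShared() {
        return "abc".intern() == "def".intern();
    }

    public static void main(String[] args) {
        System.out.println(distinctBeforeIntern());   // false
        System.out.println(sameAfterIntern());        // true
        System.out.println(differentContentShared()); // false
    }
}
```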

The string constant pool

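HotSpot removed the permanent generation (PermGen) in JDK 8 and moved class metadata into Metaspace, which uses native memory rather than the Java heap. The string constant pool had already been moved into the heap in JDK 7. The runtime constant pool, by contrast, belongs to the method area, which on JDK 8 is implemented by Metaspace. Logically, the string constant pool can still be regarded as part of the method area, since the method area is where constants are specified to live.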

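The method area is a logical concept defined by the JVM specification, which every virtual machine must follow. It is shared by all JVM threads and stores class information, the constant pool, method data, and method code. PermGen was HotSpot's implementation of the method area before JDK 8; Metaspace is its implementation from JDK 8 onward.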
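One way to see which memory areas a running JVM actually exposes is the standard java.lang.management API. The sketch below (the class name MemoryPools is made up) simply lists each memory pool with its type. The pool names are implementation-specific, so treat any particular name, such as a pool called Metaspace on HotSpot JDK 8 and later, as something to observe rather than rely on.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // Returns one line per memory pool: its type (heap or non-heap) and its name.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType() + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

The string constant pool does not appear as a pool of its own in this listing; the output only shows how the JVM splits heap from non-heap memory.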

References: https://blog.csdn.net/weixin_44556968/article/details/109468386
https://cloud.tencent.com/developer/article/1690589
https://blog.csdn.net/qq_38262266/article/details/107208357
