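The constant pool part of the JVM memory model chapter already covered what a constant pool contains. This article looks at the String constant pool in detail.
Most of what follows comes from an answer by R大 on Zhihu: https://www.zhihu.com/question/55994121/answer/147296098
Another recommended article: https://javaranch.com/journal/200409/ScjpTipLine-StringsLiterally.html
When do string literals enter the string constant pool?
Let's start with code, since nothing is more direct than code.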
// JDK7+
public class StringTest {
    private static String s1 = "static";

    public static void main(String[] args) {
        String hello1 = new String("hell") + new String("o");
        String hello2 = new String("he") + new String("llo");
        String hello3 = hello1.intern();
        String hello4 = hello2.intern();
        System.out.println(hello1 == hello3); // true
        System.out.println(hello1 == hello4); // true
    }
}
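Below is the output of running javap -verbose on the compiled class file (the "Constant pool" section is the class file's constant pool). It is placed here for reference and is used later.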
Classfile /E:/workspace/VariousCases/target/classes/cn/onenine/jvm/constantpool/StringTest.class
Last modified 2021-8-3; size 1299 bytes
MD5 checksum 338bd0034155ec3bf8d608540a31761c
Compiled from "StringTest.java"
public class cn.onenine.jvm.constantpool.StringTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Class #2 // cn/onenine/jvm/constantpool/StringTest
#2 = Utf8 cn/onenine/jvm/constantpool/StringTest
#3 = Class #4 // java/lang/Object
#4 = Utf8 java/lang/Object
#5 = Utf8 s1
#6 = Utf8 Ljava/lang/String;
#7 = Utf8 <clinit>
#8 = Utf8 ()V
#9 = Utf8 Code
#10 = String #11 // static
#11 = Utf8 static
#12 = Fieldref #1.#13 // cn/onenine/jvm/constantpool/StringTest.s1:Ljava/lang/String;
#13 = NameAndType #5:#6 // s1:Ljava/lang/String;
#14 = Utf8 LineNumberTable
#15 = Utf8 LocalVariableTable
#16 = Utf8 <init>
#17 = Methodref #3.#18 // java/lang/Object."<init>":()V
#18 = NameAndType #16:#8 // "<init>":()V
#19 = Utf8 this
#20 = Utf8 Lcn/onenine/jvm/constantpool/StringTest;
#21 = Utf8 main
#22 = Utf8 ([Ljava/lang/String;)V
#23 = Class #24 // java/lang/StringBuilder
#24 = Utf8 java/lang/StringBuilder
#25 = Class #26 // java/lang/String
#26 = Utf8 java/lang/String
#27 = String #28 // hell
#28 = Utf8 hell
#29 = Methodref #25.#30 // java/lang/String."<init>":(Ljava/lang/String;)V
#30 = NameAndType #16:#31 // "<init>":(Ljava/lang/String;)V
#31 = Utf8 (Ljava/lang/String;)V
#32 = Methodref #25.#33 // java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
#33 = NameAndType #34:#35 // valueOf:(Ljava/lang/Object;)Ljava/lang/String;
#34 = Utf8 valueOf
#35 = Utf8 (Ljava/lang/Object;)Ljava/lang/String;
#36 = Methodref #23.#30 // java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
#37 = String #38 // o
#38 = Utf8 o
#39 = Methodref #23.#40 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#40 = NameAndType #41:#42 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#41 = Utf8 append
#42 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#43 = Methodref #23.#44 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#44 = NameAndType #45:#46 // toString:()Ljava/lang/String;
#45 = Utf8 toString
#46 = Utf8 ()Ljava/lang/String;
#47 = String #48 // he
#48 = Utf8 he
#49 = String #50 // llo
#50 = Utf8 llo
#51 = Methodref #25.#52 // java/lang/String.intern:()Ljava/lang/String;
#52 = NameAndType #53:#46 // intern:()Ljava/lang/String;
#53 = Utf8 intern
#54 = Fieldref #55.#57 // java/lang/System.out:Ljava/io/PrintStream;
#55 = Class #56 // java/lang/System
#56 = Utf8 java/lang/System
#57 = NameAndType #58:#59 // out:Ljava/io/PrintStream;
#58 = Utf8 out
#59 = Utf8 Ljava/io/PrintStream;
#60 = Methodref #61.#63 // java/io/PrintStream.println:(Z)V
#61 = Class #62 // java/io/PrintStream
#62 = Utf8 java/io/PrintStream
#63 = NameAndType #64:#65 // println:(Z)V
#64 = Utf8 println
#65 = Utf8 (Z)V
#66 = Utf8 args
#67 = Utf8 [Ljava/lang/String;
#68 = Utf8 hello1
#69 = Utf8 hello2
#70 = Utf8 hello3
#71 = Utf8 hello4
#72 = Utf8 StackMapTable
#73 = Class #67 // "[Ljava/lang/String;"
#74 = Utf8 SourceFile
#75 = Utf8 StringTest.java
{
static {};
descriptor: ()V
flags: ACC_STATIC
Code:
stack=1, locals=0, args_size=0
0: ldc #10 // String static
2: putstatic #12 // Field s1:Ljava/lang/String;
5: return
LineNumberTable:
line 11: 0
LocalVariableTable:
Start Length Slot Name Signature
public cn.onenine.jvm.constantpool.StringTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #17 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 10: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcn/onenine/jvm/constantpool/StringTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=5, locals=5, args_size=1
0: new #23 // class java/lang/StringBuilder
3: dup
4: new #25 // class java/lang/String
7: dup
8: ldc #27 // String hell
10: invokespecial #29 // Method java/lang/String."<init>":(Ljava/lang/String;)V
13: invokestatic #32 // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
16: invokespecial #36 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
19: new #25 // class java/lang/String
22: dup
23: ldc #37 // String o
25: invokespecial #29 // Method java/lang/String."<init>":(Ljava/lang/String;)V
28: invokevirtual #39 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
31: invokevirtual #43 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
34: astore_1
35: new #23 // class java/lang/StringBuilder
38: dup
39: new #25 // class java/lang/String
42: dup
43: ldc #47 // String he
45: invokespecial #29 // Method java/lang/String."<init>":(Ljava/lang/String;)V
48: invokestatic #32 // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
51: invokespecial #36 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
54: new #25 // class java/lang/String
57: dup
58: ldc #49 // String llo
60: invokespecial #29 // Method java/lang/String."<init>":(Ljava/lang/String;)V
63: invokevirtual #39 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
66: invokevirtual #43 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
69: astore_2
70: aload_1
71: invokevirtual #51 // Method java/lang/String.intern:()Ljava/lang/String;
74: astore_3
75: aload_2
76: invokevirtual #51 // Method java/lang/String.intern:()Ljava/lang/String;
79: astore 4
81: getstatic #54 // Field java/lang/System.out:Ljava/io/PrintStream;
84: aload_1
85: aload_3
86: if_acmpne 93
89: iconst_1
90: goto 94
93: iconst_0
94: invokevirtual #60 // Method java/io/PrintStream.println:(Z)V
97: getstatic #54 // Field java/lang/System.out:Ljava/io/PrintStream;
100: aload_1
101: aload 4
103: if_acmpne 110
106: iconst_1
107: goto 111
110: iconst_0
111: invokevirtual #60 // Method java/io/PrintStream.println:(Z)V
114: return
LineNumberTable:
line 13: 0
line 14: 35
line 15: 70
line 16: 75
line 17: 81
line 18: 97
line 20: 114
LocalVariableTable:
Start Length Slot Name Signature
0 115 0 args [Ljava/lang/String;
35 80 1 hello1 Ljava/lang/String;
70 45 2 hello2 Ljava/lang/String;
75 40 3 hello3 Ljava/lang/String;
81 34 4 hello4 Ljava/lang/String;
StackMapTable: number_of_entries = 4
frame_type = 255 /* full_frame */
offset_delta = 93
locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String, class java/lang/String, class java/lang/String ]
stack = [ class java/io/PrintStream ]
frame_type = 255 /* full_frame */
offset_delta = 0
locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String, class java/lang/String, class java/lang/String ]
stack = [ class java/io/PrintStream, int ]
frame_type = 79 /* same_locals_1_stack_item */
stack = [ class java/io/PrintStream ]
frame_type = 255 /* full_frame */
offset_delta = 0
locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String, class java/lang/String, class java/lang/String ]
stack = [ class java/io/PrintStream, int ]
}
SourceFile: "StringTest.java"
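During class loading, the JVM does not immediately create String instances on the heap for the strings in the class file's constant pool, nor does it put references to them into the string constant pool. That work happens during resolution, and the JVM specification explicitly allows resolution to be lazy.
Among the constant pool entry types that the JVM specification defines for class files, two matter here:
- CONSTANT_Utf8
- CONSTANT_String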
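The latter is the type used for String constants, but it does not hold the string's content directly. It holds only an index, and the constant pool entry at that index must be a CONSTANT_Utf8 entry, which is where the content actually lives. The javap output above confirms this: entry #10 is the CONSTANT_String for the literal assigned to the static field, and it points to #11, the Utf8 entry holding "static".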
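The indirection from CONSTANT_String to CONSTANT_Utf8 can be sketched with a tiny hand-built constant pool. This is only an illustration: the class `ConstantPoolSketch` and its methods are hypothetical helpers, and the byte layout covers just the two entry types discussed here (tag 1 for Utf8, tag 8 for String), not a whole class file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of the JDK: encodes and decodes a two-entry constant pool.
class ConstantPoolSketch {
    static final int TAG_UTF8 = 1;   // CONSTANT_Utf8
    static final int TAG_STRING = 8; // CONSTANT_String

    // Writes constant_pool_count, then #1 = String #2 and #2 = Utf8 <content>.
    static byte[] encodePool(String content) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeShort(3);          // the count is the number of entries plus one
            out.writeByte(TAG_STRING);
            out.writeShort(2);          // string_index: points at entry #2
            out.writeByte(TAG_UTF8);
            out.writeUTF(content);      // u2 length followed by modified UTF-8 bytes
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Follows the index held by the CONSTANT_String entry to the Utf8 entry with the text.
    static String textOf(byte[] pool, int stringIndex) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(pool));
            int count = in.readUnsignedShort();
            Object[] entries = new Object[count]; // slot 0 is unused, as in a class file
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                if (tag == TAG_STRING) {
                    entries[i] = in.readUnsignedShort(); // only an index, no text
                } else if (tag == TAG_UTF8) {
                    entries[i] = in.readUTF();           // the actual characters
                } else {
                    throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
                }
            }
            int utf8Index = (Integer) entries[stringIndex];
            return (String) entries[utf8Index];
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf(encodePool("static"), 1)); // prints "static"
    }
}
```

The String entry carries two bytes of index and nothing else, which mirrors how #10 refers to #11 in the javap output.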
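In HotSpot VM, the runtime constant pool represents these entries as follows:
- CONSTANT_Utf8 -> Symbol* (a pointer to a C++ object of type Symbol, whose content is a UTF-8 encoded string in the same format as in the class file)
- CONSTANT_String -> java.lang.String (a reference to an actual Java object; the C++ type is oop)
All CONSTANT_Utf8 entries are created during class loading, whereas CONSTANT_String entries are lazily resolved, for example when an ldc instruction that references the entry executes for the first time. Until an entry is resolved, HotSpot VM calls its type JVM_CONSTANT_UnresolvedString.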
The javap output above contains such an ldc instruction at offset 8 of main: 8: ldc #27 // String hell
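**Conclusion:** in HotSpot, when a class is loaded, its string literals enter that class's runtime constant pool but not the global string constant pool (that is, the String table holds no reference to such a string, and no corresponding object has been created on the heap).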
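A small experiment can make the lazy behavior visible. The sketch below assumes a HotSpot JVM on JDK 7 or later, and assumes nothing else in the process has already interned the arbitrary text "lazyLdcDemo"; the class and method names are made up for illustration.

```java
class LazyLdcDemo {
    // Returns { built == interned, interned == literal }.
    static boolean[] run() {
        // Built at run time, so no ldc has pushed "lazyLdcDemo" yet.
        String built = new StringBuilder("lazy").append("LdcDemo").toString();
        // On HotSpot JDK 7+, the first intern() call is expected to record `built` itself.
        String interned = built.intern();
        // The ldc for this literal runs only now and finds the entry already recorded.
        String literal = "lazyLdcDemo";
        return new boolean[] { built == interned, interned == literal };
    }

    public static void main(String[] args) {
        boolean[] result = run();
        // Expected to be true on the first call under the assumptions above; not guaranteed by the specification.
        System.out.println("built == interned: " + result[0]);
        // Always true: a literal and the result of intern() share one canonical instance.
        System.out.println("interned == literal: " + result[1]);
    }
}
```

If the literal had been resolved when the class was loaded, `built` could not have become the String table entry. Calling `run()` a second time changes the first result, because the table entry already exists by then.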
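What is the ldc instruction?
Roughly speaking, it pushes an int, float, or String constant from the constant pool onto the top of the operand stack.
In the javap output above, the instruction at offset 8 of main, 8: ldc #27 // String hell, pushes a reference to "hell" onto the top of the stack.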
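String constants are lazily resolved, as described above, so how does the push work? Something has to trigger creating the object and putting its reference into the string constant pool. **That trigger is the ldc instruction**: executing it is what sets off lazy resolution.
Here the execution semantics of ldc are: look up the entry at the given index in the current class's runtime constant pool (in HotSpot VM, the constant pool plus the constant pool cache); if the entry is not yet resolved, resolve it; then return the resolved value. For a String constant, if resolution finds that the String table (the global string constant pool) already holds a reference to a java.lang.String with matching content, it returns that reference. Otherwise it creates a String object with that content on the Java heap, records the reference in the String table, and returns it.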
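In other words, whether ldc creates a new String instance depends entirely on whether the String table already holds a reference with the same content the first time that ldc executes.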
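The lookup-or-create rule can be modeled in a few lines of Java. This is a toy model under loud assumptions: HotSpot implements the String table and constant pool in C++, and the class `ToyRuntimePool`, its fields, and the `HashMap` standing in for the String table are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: one instance stands for one class's runtime constant pool.
class ToyRuntimePool {
    private final Map<String, String> stringTable; // stand-in for the global String table
    private final String[] utf8;                   // contents of this class's string constants
    private final String[] resolved;               // null means "not resolved yet"

    ToyRuntimePool(Map<String, String> stringTable, String... utf8) {
        this.stringTable = stringTable;
        this.utf8 = utf8;
        this.resolved = new String[utf8.length];
    }

    // Models ldc on a String constant: resolve on first use, then reuse the result.
    String ldc(int index) {
        if (resolved[index] == null) {
            String content = utf8[index];
            String recorded = stringTable.get(content);
            if (recorded == null) {
                recorded = new String(content);      // create the object on the heap
                stringTable.put(content, recorded);  // record its reference in the table
            }
            resolved[index] = recorded;
        }
        return resolved[index];
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        ToyRuntimePool classA = new ToyRuntimePool(table, "hell", "o");
        ToyRuntimePool classB = new ToyRuntimePool(table, "o");
        System.out.println(table.size());                    // prints 0
        System.out.println(classA.ldc(1) == classB.ldc(0));  // prints true
        System.out.println(table.size());                    // prints 1
    }
}
```

Creating the two pools records nothing in the table. Only the first `ldc` for "o" creates an object, and the second pool then receives that same reference.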
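Walking through the execution
Before the walkthrough, we first need to understand what String#intern actually does.
- In JDK 6, intern copies a string instance it encounters for the first time into the string constant pool in the permanent generation and returns a reference to that permanent-generation copy.
- In JDK 7 and later, intern no longer copies the instance into the permanent generation. It first checks whether the pool already contains the string. If it does, intern returns the pool's reference; otherwise it stores a reference to this instance in the string constant pool and returns that reference. Note that what comes back is a reference. The string constant pool is the String table, which the JVM memory model chapter already covered.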
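The part of intern's contract that holds on every JDK version can be shown directly: two strings with equal content always map to one canonical reference. A minimal sketch, where `InternDemo` and `sameCanonical` are illustrative names:

```java
class InternDemo {
    // True when both strings map to the same String table entry.
    static boolean sameCanonical(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String a = new String("intern-demo");
        String b = new String("intern-demo");
        System.out.println(a == b);                    // prints false
        System.out.println(sameCanonical(a, b));       // prints true
        System.out.println(sameCanonical(a, "other")); // prints false
    }
}
```

What differs between JDK 6 and JDK 7+ is which object becomes the canonical one, not this equality guarantee.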
With String#intern understood, we can analyze how the code above runs. For convenience, here it is again:
// JDK7+
public class StringTest {
    private static String s1 = "static"; // (1)

    public static void main(String[] args) {
        String hello1 = new String("hell") + new String("o"); // (2)
        String hello2 = new String("he") + new String("llo"); // (3)
        String hello3 = hello1.intern(); // (4)
        String hello4 = hello2.intern(); // (5)
        System.out.println(hello1 == hello3); // true (6)
        System.out.println(hello1 == hello4); // true (7)
    }
}
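(1): The static variable s1 refers to the "static" String object. The javap output shows that ldc pushes "static" onto the stack (creating the String instance and recording its reference in the String table), and putstatic then assigns it to s1.
(2): The javap output shows that "hell" and "o" are pushed by ldc. For the + operator, the compiler creates a StringBuilder and joins the strings with append, and astore_1 stores the concatenated string in local variable slot 1 (hello1). The key point: no ldc ever pushes the concatenated string "hello", so no reference to "hello" has been put into the String table.
(3): The same happens with "he" and "llo": another "hello" instance is created on the heap and stored in hello2 by astore_2, and again nothing is recorded in the String table for "hello".
(4): hello1 calls intern. hello1 holds the heap reference to a "hello" instance, and because (2) did not put "hello" into the global string constant pool, this intern call records the reference to that instance in the String table and returns it to hello3. hello3 and hello1 therefore hold the same reference (the heap address of that "hello" instance).
(5): hello2 calls intern. hello2 holds a heap reference to a "hello" instance (a different one from hello1's, although both contain "hello"). The intern call finds a "hello" reference in the String table (the one held by hello1), returns it directly, and it is assigned to hello4.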
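Step (2) depends on the difference between concatenation done by the compiler and concatenation done at run time. The sketch below illustrates that difference; `ConcatDemo` and `join` are hypothetical names. The javap output above shows the StringBuilder form, and other compilers or newer javac versions may compile + differently, but either way the run-time result is a new String object.

```java
class ConcatDemo {
    // Operands are not compile-time constants, so the concatenation happens at run time.
    static String join(String left, String right) {
        return left + right;
    }

    public static void main(String[] args) {
        String folded = "hel" + "lo";       // constant expression: the compiler stores "hello" as one literal
        String runtime = join("hel", "lo"); // new String object, never pushed by ldc

        System.out.println(folded == "hello");       // prints true
        System.out.println(runtime == "hello");      // prints false
        System.out.println(runtime.equals("hello")); // prints true
    }
}
```

Only the folded form produces a CONSTANT_String entry for "hello" in the class file, which is why the original example has to call intern to get the concatenated text into the String table.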
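**So hello4, hello3, and hello1 hold the same reference, and hello2 holds a reference different from all of them.** A figure illustrates this:
The figure below shows the relationship among the local variable table, the heap, and the string constant pool.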
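The four references can also be compared in code. This sketch follows the same steps as StringTest but uses different text, assuming a HotSpot JVM on JDK 7 or later and assuming that "walkthrough" has not been interned earlier in the process; `WalkthroughDemo` and `compare` are illustrative names.

```java
class WalkthroughDemo {
    // Returns { h1 == h3, h1 == h4, h3 == h4, h1 == h2 }.
    static boolean[] compare(String left, String right) {
        String h1 = new String(left) + new String(right); // first heap instance
        String h2 = new String(left) + new String(right); // second, distinct heap instance
        String h3 = h1.intern(); // on HotSpot JDK 7+, expected to record and return h1
        String h4 = h2.intern(); // returns whatever the String table already holds
        return new boolean[] { h1 == h3, h1 == h4, h3 == h4, h1 == h2 };
    }

    public static void main(String[] args) {
        boolean[] r = compare("walk", "through");
        // The first two are expected to be true on a first call under the assumptions above.
        System.out.println("h1 == h3: " + r[0]);
        System.out.println("h1 == h4: " + r[1]);
        // These two hold on any JVM: intern results are shared, concatenations are distinct.
        System.out.println("h3 == h4: " + r[2]); // prints "h3 == h4: true"
        System.out.println("h1 == h2: " + r[3]); // prints "h1 == h2: false"
    }
}
```

The last two comparisons correspond to the parts of the figure that do not depend on the JDK version: hello3 and hello4 always point at one String table entry, and hello1 and hello2 are always separate heap objects.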