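String questions come up a lot in interviews, typically asking how many objects a piece of code creates. Today we pick a few typical examples and briefly analyze them using bytecode and the string constant pool.
Question 1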
public class Test1 {
public static void main(String[] args) {
String s = "a"+"b"+"c";
}
}
First, compile the Java file to produce a class file:
javac Test1.java
Then dump the bytecode:
javap -c Test1.class > log1.txt
javap -verbose Test1.class > log2.txt
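Of the two commands above, the first simply prints the bytecode, while the second prints more detailed information. Since we want to see the constant pool, we use the second one. The result I get is as follows: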
Classfile /Users/aaron/work/java/src/com/jvm/Test1.class
Last modified 2018-4-4; size 271 bytes
MD5 checksum 1a7c3e14f23649c4fba890dc20bd1683
Compiled from "Test1.java"
public class Test1
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #4.#13 // java/lang/Object."<init>":()V
#2 = String #14 // abc
#3 = Class #15 // Test1
#4 = Class #16 // java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
#7 = Utf8 Code
#8 = Utf8 LineNumberTable
#9 = Utf8 main
#10 = Utf8 ([Ljava/lang/String;)V
#11 = Utf8 SourceFile
#12 = Utf8 Test1.java
#13 = NameAndType #5:#6 // "<init>":()V
#14 = Utf8 abc
#15 = Utf8 Test1
#16 = Utf8 java/lang/Object
{
public Test1();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 2: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=2, args_size=1
0: ldc #2 // String abc
2: astore_1
3: return
LineNumberTable:
line 5: 0
line 6: 3
}
SourceFile: "Test1.java"
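Here #1 through #16 are the entries of the class file's constant pool. Only one of them is a String constant: abc (entry #2, whose text is stored in the Utf8 entry #14).
Now let's look at the bytecode of the main method. First: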
0: ldc #2 // String abc
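This loads the constant onto the operand stack. Notice that the constant is already abc, so when did "a"+"b"+"c" turn into abc?
First, a brief overview of the javac compilation process:
1. Parsing and populating the symbol table
  1) Lexical and syntax analysis
  2) Populating the symbol table
2. Annotation processing
3. Semantic analysis and bytecode generation
  1) Attribute checking
  2) Data-flow and control-flow analysis
  3) Desugaring
  4) Bytecode generation
During attribute checking there is a key step called constant folding, so "a"+"b"+"c" is folded directly into abc (likewise, 1+2 is folded into 3).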
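Once the constant is on the operand stack, astore_1 stores the reference on top of the stack (the String abc) into local variable 1, which is s, and the method returns. So this code clearly creates only one String object, the one for the literal abc.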
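The effect of constant folding can also be observed without javap. The following is a minimal sketch; the class and method names are made up for illustration and do not come from the original example. It relies on the fact that a folded constant expression refers to the same pooled String as the equivalent literal, while a concatenation involving a non-constant variable is evaluated at run time.

```java
// Illustrative sketch; ConstantFoldingDemo is a hypothetical name.
final class ConstantFoldingDemo {

    // "a" + "b" + "c" is a compile-time constant expression, so javac folds it
    // into the single constant "abc"; both variables refer to the same pooled String.
    static boolean foldedIsSameObject() {
        String folded = "a" + "b" + "c";
        String literal = "abc";
        return folded == literal; // reference comparison on purpose
    }

    // Here `a` is an ordinary (non-final) local variable, so the concatenation
    // is not a constant expression and produces a new String at run time.
    static boolean runtimeConcatIsSameObject() {
        String a = "a";
        String joined = a + "b" + "c";
        return joined == "abc"; // reference comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println(foldedIsSameObject());        // true
        System.out.println(runtimeConcatIsSameObject()); // false
    }
}
```

The == comparisons are deliberate here: they compare references, which is exactly what shows whether one object or two are involved.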
Question 2
package com.jvm;
public class Test2 {
public static void main(String[] args) {
String s = new String("abc");
}
}
The generated bytecode is as follows:
Classfile /Users/aaron/work/java/src/com/jvm/Test2.class
Last modified 2018-4-4; size 342 bytes
MD5 checksum 03699a3397f3795430f6d40608b5f2ad
Compiled from "Test2.java"
public class com.jvm.Test2
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #6.#15 // java/lang/Object."<init>":()V
#2 = Class #16 // java/lang/String
#3 = String #17 // abc
#4 = Methodref #2.#18 // java/lang/String."<init>":(Ljava/lang/String;)V
#5 = Class #19 // com/jvm/Test2
#6 = Class #20 // java/lang/Object
#7 = Utf8 <init>
#8 = Utf8 ()V
#9 = Utf8 Code
#10 = Utf8 LineNumberTable
#11 = Utf8 main
#12 = Utf8 ([Ljava/lang/String;)V
#13 = Utf8 SourceFile
#14 = Utf8 Test2.java
#15 = NameAndType #7:#8 // "<init>":()V
#16 = Utf8 java/lang/String
#17 = Utf8 abc
#18 = NameAndType #7:#21 // "<init>":(Ljava/lang/String;)V
#19 = Utf8 com/jvm/Test2
#20 = Utf8 java/lang/Object
#21 = Utf8 (Ljava/lang/String;)V
{
public com.jvm.Test2();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 3: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=3, locals=2, args_size=1
0: new #2 // class java/lang/String
3: dup
4: ldc #3 // String abc
6: invokespecial #4 // Method java/lang/String."<init>":(Ljava/lang/String;)V
9: astore_1
10: return
LineNumberTable:
line 5: 0
line 6: 10
}
SourceFile: "Test2.java"
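As you can see, #2 is a Class entry, a symbolic reference to java/lang/String that the new instruction uses, and #3 is a String entry whose value is abc. Remember: the first time a string literal is used, a String object is created in the string constant pool; any later use of the same literal refers to that existing object.
With the constant pool covered, let's continue with the bytecode.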
0: new #2 // class java/lang/String
First, new allocates a String object (not yet initialized) and pushes a reference to it onto the stack.
3: dup
Next, dup duplicates that reference, so the stack now holds two references to the same object.
4: ldc #3 // String abc
Then ldc pushes a reference to the pooled string abc onto the stack. This is the argument the String constructor needs.
6: invokespecial #4 // Method java/lang/String."<init>":(Ljava/lang/String;)V
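With the argument ready, invokespecial calls String's <init> method. <init> is the name a constructor has in bytecode; here it is the String(String) constructor.
The call takes two arguments: the first is the String reference, which is the familiar this, and the second is the reference to abc.
Because the two references left by new and dup point to the same object, the reference that remains on the stack refers to a fully initialized object once the constructor returns.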
9: astore_1
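Finally, astore_1 stores the remaining reference into the variable s and the method returns, which ends the analysis. So two objects are created here: the pooled String for the literal abc (assuming it was not already in the pool) and the String allocated by new.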
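The two objects can be told apart by comparing references. The sketch below is illustrative only (the class and method names are hypothetical); it uses String.intern(), which returns the pooled instance for a given value.

```java
// Illustrative sketch; NewStringDemo is a hypothetical name.
final class NewStringDemo {

    // new String("abc") allocates a second object, distinct from the pooled literal.
    static boolean sameObjectAsLiteral() {
        String s = new String("abc");
        return s == "abc"; // reference comparison on purpose
    }

    // The two objects still hold the same characters.
    static boolean sameContents() {
        String s = new String("abc");
        return s.equals("abc");
    }

    // intern() returns the pooled instance, i.e. the object the literal refers to.
    static boolean internReturnsPooled() {
        String s = new String("abc");
        return s.intern() == "abc";
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAsLiteral()); // false
        System.out.println(sameContents());        // true
        System.out.println(internReturnsPooled()); // true
    }
}
```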
Question 3
package com.jvm;
public class Test3 {
public static void main(String[] args) {
String a = "a";
String b="c";
String c="a";
String s = a+b+c;
}
}
The bytecode is as follows:
Classfile /Users/aaron/work/java/src/com/jvm/Test3.class
Last modified 2018-4-4; size 471 bytes
MD5 checksum 3bb78402acc3a2e607a0d50705125671
Compiled from "Test3.java"
public class com.jvm.Test3
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #9.#18 // java/lang/Object."<init>":()V
#2 = String #19 // a
#3 = String #20 // c
#4 = Class #21 // java/lang/StringBuilder
#5 = Methodref #4.#18 // java/lang/StringBuilder."<init>":()V
#6 = Methodref #4.#22 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#7 = Methodref #4.#23 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#8 = Class #24 // com/jvm/Test3
#9 = Class #25 // java/lang/Object
#10 = Utf8 <init>
#11 = Utf8 ()V
#12 = Utf8 Code
#13 = Utf8 LineNumberTable
#14 = Utf8 main
#15 = Utf8 ([Ljava/lang/String;)V
#16 = Utf8 SourceFile
#17 = Utf8 Test3.java
#18 = NameAndType #10:#11 // "<init>":()V
#19 = Utf8 a
#20 = Utf8 c
#21 = Utf8 java/lang/StringBuilder
#22 = NameAndType #26:#27 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#23 = NameAndType #28:#29 // toString:()Ljava/lang/String;
#24 = Utf8 com/jvm/Test3
#25 = Utf8 java/lang/Object
#26 = Utf8 append
#27 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#28 = Utf8 toString
#29 = Utf8 ()Ljava/lang/String;
{
public com.jvm.Test3();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 3: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=5, args_size=1
0: ldc #2 // String a
2: astore_1
3: ldc #3 // String c
5: astore_2
6: ldc #2 // String a
8: astore_3
9: new #4 // class java/lang/StringBuilder
12: dup
13: invokespecial #5 // Method java/lang/StringBuilder."<init>":()V
16: aload_1
17: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
20: aload_2
21: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: aload_3
25: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
28: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
31: astore 4
33: return
LineNumberTable:
line 6: 0
line 7: 3
line 8: 6
line 9: 9
line 10: 33
}
SourceFile: "Test3.java"
As you can see, #2, #3 and #4 are the string a, the string c and the class StringBuilder. Let's keep digesting the bytecode.
0: ldc #2 // String a
2: astore_1
3: ldc #3 // String c
5: astore_2
6: ldc #2 // String a
8: astore_3
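First, the values are stored into their variables. Note that both loads of the string a use the same entry #2, so identical string literals exist only once in the constant pool.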
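That the two literals share one object can be checked directly. This is a small sketch with a hypothetical class name; the variables mirror those in Test3.

```java
// Illustrative sketch; SharedLiteralDemo is a hypothetical name.
final class SharedLiteralDemo {

    // a and c are both initialized from the literal "a", so they refer to
    // the same pooled String; b refers to a different one.
    static boolean sameLiteralSameObject() {
        String a = "a";
        String b = "c";
        String c = "a";
        return a == c && a != b; // reference comparisons on purpose
    }

    public static void main(String[] args) {
        System.out.println(sameLiteralSameObject()); // true
    }
}
```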
9: new #4 // class java/lang/StringBuilder
12: dup
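As before, dup leaves two references: one is consumed by the <init> call in its new stack frame, and the other stays on the stack for the operations that follow.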
13: invokespecial #5 // Method java/lang/StringBuilder."<init>":()V
16: aload_1
17: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
20: aload_2
21: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: aload_3
25: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
28: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
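After initialization, the three variables are loaded onto the stack one by one and joined with append. At this point it is clear that concatenating String variables is translated by the compiler into StringBuilder operations, which avoids allocating intermediate String objects. (This is what javac emits for class file version 52, i.e. Java 8; since JDK 9, javac by default compiles concatenation to an invokedynamic call instead.)
When the appends are done, toString converts the result into a String object.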
31: astore 4
33: return
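Finally, the result is stored into the variable s and the method returns. So the objects created here are the strings a and c, the StringBuilder, and also the new String aca returned by toString: four in total, not counting their internal arrays.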
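Written out by hand, the StringBuilder sequence in the bytecode above corresponds roughly to the following source. This is a sketch under that assumption, with hypothetical class and method names; it also shows that the String returned by toString is a new object rather than a pooled one.

```java
// Illustrative sketch; ConcatDesugarDemo and concat are hypothetical names.
final class ConcatDesugarDemo {

    // Roughly what the Java 8 bytecode for `a + b + c` does:
    // new StringBuilder, three appends, then toString().
    static String concat(String a, String b, String c) {
        return new StringBuilder().append(a).append(b).append(c).toString();
    }

    public static void main(String[] args) {
        String s = concat("a", "c", "a");
        System.out.println(s);                   // aca
        System.out.println(s == "aca");          // false: toString() returned a new String
        System.out.println(s.intern() == "aca"); // true: intern() yields the pooled instance
    }
}
```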
I have not been studying the JVM for long, so this analysis may contain mistakes; corrections are welcome.