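Comparing Java Strings is a pitfall developers run into all the time, and it is a staple interview question as well. This article starts from the bytecode that the Java compiler produces, to get at the root of the problem and build a deeper understanding of String.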
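public class TestContantPool
{
    private static String a = "1234";
    final String x = "34";  // final field with a constant initializer: a compile-time constant
    final String y;         // blank final field, assigned in the constructor: not a compile-time constant

    public TestContantPool()
    {
        y = "34";
    }

    public void test()
    {
        String b = "1234";               // literal, loaded from the constant pool
        String c = "12" + "34";          // constant expression, folded to "1234" by the compiler
        String d = new String("12345");  // explicitly creates a new object
        final String e = "34";           // final local with a constant initializer
        String f = "34";                 // non-final local
        String g = "12" + e;             // folded to "1234" at compile time
        String h = "12" + f;             // concatenated at run time
        String i = "12" + x;             // folded to "1234" at compile time
        String j = "12" + y;             // concatenated at run time
        System.out.println(a == b);
        System.out.println(a == c);
        System.out.println(a != d);
        System.out.println(a == g);
        System.out.println(a != h && a == h.intern());
        System.out.println(a == i && a == i.intern());
        System.out.println(a != j && a == j.intern());
    }

    public static void main(String[] ar)
    {
        new TestContantPool().test();
    }
}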
/**
true
true
true
true
true
true
true
*/
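After compiling, let's look at part of the bytecode, focusing on the constant pool (Constant Pool), the static initializer static{}, the constructor, and the test() method: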
javap -verbose TestContantPool.class
Constant pool:
#5 = Utf8 a
#7 = Utf8 x
#9 = String #10 // 34
#10 = Utf8 34
#11 = Utf8 y
#15 = String #16 // 1234
#16 = Utf8 1234
#31 = Class #32 // java/lang/String
#32 = Utf8 java/lang/String
#33 = String #34 // 12345
#34 = Utf8 12345
#39 = Utf8 java/lang/StringBuilder
#40 = String #41 // 12
#41 = Utf8 12
#66 = Utf8 b
#67 = Utf8 c
#68 = Utf8 d
#69 = Utf8 e
#70 = Utf8 f
#71 = Utf8 g
#72 = Utf8 h
#73 = Utf8 i
#74 = Utf8 j
final java.lang.String x;
descriptor: Ljava/lang/String;
flags: ACC_FINAL
ConstantValue: String 34
final java.lang.String y;
descriptor: Ljava/lang/String;
flags: ACC_FINAL
static {};
descriptor: ()V
flags: ACC_STATIC
Code:
stack=1, locals=0, args_size=0
0: ldc #15 // String 1234
2: putstatic #17 // Field a:Ljava/lang/String;
5: return
LineNumberTable:
line 6: 0
LocalVariableTable:
Start Length Slot Name Signature
public bytecode.TestContantPool();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=2, locals=1, args_size=1
0: aload_0
1: invokespecial #22 // Method java/lang/Object."<init>":()V
4: aload_0
5: ldc #9 // String 34
7: putfield #24 // Field x:Ljava/lang/String;
10: aload_0
11: ldc #9 // String 34
13: putfield #26 // Field y:Ljava/lang/String;
16: return
LocalVariableTable:
Start Length Slot Name Signature
0 17 0 this Lbytecode/TestContantPool;
public void test();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=3, locals=10, args_size=1
0: ldc #15 // String 1234
2: astore_1
3: ldc #15 // String 1234
5: astore_2
6: new #31 // class java/lang/String
9: dup
10: ldc #33 // String 12345
12: invokespecial #35 // Method java/lang/String."<init>":(Ljava/lang/String;)V
15: astore_3
16: ldc #9 // String 34
18: astore 4
20: ldc #9 // String 34
22: astore 5
24: ldc #15 // String 1234
26: astore 6
28: new #38 // class java/lang/StringBuilder
31: dup
32: ldc #40 // String 12
34: invokespecial #42 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
37: aload 5
39: invokevirtual #43 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)
42: invokevirtual #47 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
45: astore 7
47: ldc #15 // String 1234
49: astore 8
51: new #38 // class java/lang/StringBuilder
54: dup
55: ldc #40 // String 12
57: invokespecial #42 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
60: aload_0
61: getfield #26 // Field y:Ljava/lang/String;
64: invokevirtual #43 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)
67: invokevirtual #47 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
70: astore 9
235: return
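1. Start with the constant pool. After compilation it already contains the String literals "12", "34", "1234" and "12345". Every other String created in the code below is built from these literals.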
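2. Next, look at how the class variable a is initialized. From the JVM's class-loading mechanism we know that static variable initializers and static{} blocks run inside the compiler-generated <clinit>() method. The compiler builds <clinit>() by collecting every static variable assignment and every statement in the static{} blocks of the class and merging them. If <clinit>() exists, it runs before the instance constructor <init>().
The two instructions below assign the static variable a, and they show that a takes its value from the constant pool: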
0: ldc #15 // String 1234 (push an int, float, or String constant from the constant pool onto the stack)
2: putstatic #17 // Field a:Ljava/lang/String; (assign a value to a static field of the given class)
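The ordering described in step 2 can be observed directly. The following is a minimal sketch, not part of the original test class: InitOrderDemo, LOG and record are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: records the order in which <clinit>() and <init>() run.
class InitOrderDemo {
    // Declared first so it is already initialized when the lines below run.
    static final List<String> LOG = new ArrayList<>();

    // A static field initializer and a static block: both end up in <clinit>().
    static String a = record("static field a");

    static {
        record("static block");
    }

    InitOrderDemo() {
        // Compiled into <init>(); runs for each new instance, after <clinit>().
        record("constructor");
    }

    static String record(String step) {
        LOG.add(step);
        return step;
    }

    public static void main(String[] args) {
        new InitOrderDemo();
        System.out.println(LOG); // [static field a, static block, constructor]
    }
}
```

The static parts run once, in source order, when the class is initialized; the constructor entry appears only afterwards.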
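3. Finally, analyze the test() method.
b and c are both assigned "1234", and both values are loaded from the constant pool:
0: ldc #15 // String 1234 (push an int, float, or String constant from the constant pool onto the stack)
2: astore_1 (store the reference on top of the stack into the second local variable)
3: ldc #15 // String 1234
5: astore_2 (store the reference on top of the stack into the third local variable)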
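The same behavior can be checked without reading bytecode. Below is a small sketch, assuming a hypothetical class ConstantFoldingDemo: because "12" + "34" is a constant expression, the compiler folds it into the single literal "1234", so both variables end up referring to the same pooled object as the static field.

```java
// Hypothetical demo class: literals and folded constant expressions share one pooled String.
class ConstantFoldingDemo {
    static final String A = "1234";

    static boolean literalIsShared() {
        String b = "1234";      // loaded with ldc from the constant pool
        return A == b;
    }

    static boolean foldedConcatIsShared() {
        String c = "12" + "34"; // folded to "1234" at compile time, also a plain ldc
        return A == c;
    }

    public static void main(String[] args) {
        System.out.println(literalIsShared());      // true
        System.out.println(foldedConcatIsShared()); // true
    }
}
```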
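d is assigned "12345". Here a new String object is explicitly created with new:
6: new #31 // class java/lang/String (create an object and push its reference onto the stack)
9: dup (duplicate the value on top of the stack and push the copy)
10: ldc #33 // String 12345 (push an int, float, or String constant from the constant pool onto the stack)
12: invokespecial #35 // Method java/lang/String."<init>":(Ljava/lang/String;)V (invoke a superclass constructor or an instance initialization method)
15: astore_3 (store the reference on top of the stack into the fourth local variable)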
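A short sketch of what this means for comparisons, using a hypothetical class NewStringDemo: the object created by new is distinct from the pooled literal, although its content is equal and intern() leads back to the pooled instance.

```java
// Hypothetical demo class: new String(...) always yields a fresh object.
class NewStringDemo {
    static boolean[] compare() {
        String literal = "12345";
        String d = new String("12345");
        return new boolean[] {
            literal == d,          // different objects
            literal.equals(d),     // same characters
            literal == d.intern()  // intern() returns the pooled instance
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0]); // false
        System.out.println(r[1]); // true
        System.out.println(r[2]); // true
    }
}
```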
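e, f and g are all loaded from the constant pool. For g, the expression "12" + e is folded to "1234" at compile time, because e is a final local variable initialized with a constant: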
16: ldc #9 // String 34
18: astore 4
20: ldc #9 // String 34
22: astore 5
24: ldc #15 // String 1234
26: astore 6
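h is assigned a new String built with StringBuilder, because f is not final and "12" + f has to be evaluated at run time. This shows that the String "+" operator is implemented with StringBuilder in this compiler output; javac from JDK 9 onward emits an invokedynamic call instead by default, but the result is still a newly created object: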
28: new #38 // class java/lang/StringBuilder
31: dup
32: ldc #40 // String 12
34: invokespecial #42 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
37: aload 5
39: invokevirtual #43 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)
42: invokevirtual #47 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
45: astore 7
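The difference between g and h comes down to the final modifier on the local variable. The sketch below isolates that contrast in a hypothetical class FinalLocalDemo, reusing the variable names from test().

```java
// Hypothetical demo class: a final local with a constant initializer is a
// compile-time constant, a non-final local is not.
class FinalLocalDemo {
    static boolean[] compare() {
        String a = "1234";
        final String e = "34";
        String f = "34";
        String g = "12" + e; // constant expression, folded to "1234"
        String h = "12" + f; // evaluated at run time, creates a new String
        return new boolean[] { a == g, a == h, a == h.intern() };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0]); // true
        System.out.println(r[1]); // false
        System.out.println(r[2]); // true
    }
}
```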
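i is loaded from the constant pool. The field x is final and initialized with a constant (note its ConstantValue attribute above), so "12" + x is folded to "1234" at compile time: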
47: ldc #15 // String 1234
49: astore 8
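j is assigned a new String built with StringBuilder. Although y is final, it is assigned in the constructor, so it is not a compile-time constant and its value has to be read with getfield at run time: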
51: new #38 // class java/lang/StringBuilder
54: dup
55: ldc #40 // String 12
57: invokespecial #42 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
60: aload_0
61: getfield #26 // Field y:Ljava/lang/String;
64: invokevirtual #43 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)
67: invokevirtual #47 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
70: astore 9
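The contrast between i and j can be reproduced in isolation. The following sketch uses a hypothetical class FinalFieldDemo with the same two kinds of final field as the original test class.

```java
// Hypothetical demo class: a final field with a constant initializer versus a
// final field assigned in the constructor.
class FinalFieldDemo {
    final String x = "34"; // constant variable, gets a ConstantValue attribute
    final String y;        // blank final, value only known at run time

    FinalFieldDemo() {
        y = "34";
    }

    boolean[] compare() {
        String a = "1234";
        String i = "12" + x; // folded to "1234" at compile time
        String j = "12" + y; // concatenated at run time, creates a new String
        return new boolean[] { a == i, a == j, a == j.intern() };
    }

    public static void main(String[] args) {
        boolean[] r = new FinalFieldDemo().compare();
        System.out.println(r[0]); // true
        System.out.println(r[1]); // false
        System.out.println(r[2]); // true
    }
}
```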
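To summarize the analysis above: whenever both values are loaded from the constant pool, the "==" comparison is true. Everything else, whether produced by new String() or by new StringBuilder(), is a newly created object on the heap, so the references are not equal. With this logic in mind, String comparison becomes easy to reason about.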
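Since "==" only compares references, code that cares about the characters should compare content instead. The following is a minimal sketch, assuming a hypothetical helper sameContent; java.util.Objects.equals is a standard library method that also tolerates null arguments.

```java
import java.util.Objects;

// Hypothetical demo class: compares Strings by content rather than by reference.
class SafeCompareDemo {
    static boolean sameContent(String left, String right) {
        // Objects.equals calls left.equals(right) and handles null on either side.
        return Objects.equals(left, right);
    }

    public static void main(String[] args) {
        String pooled = "1234";
        String fresh = new String("1234");
        System.out.println(pooled == fresh);            // false
        System.out.println(sameContent(pooled, fresh)); // true
        System.out.println(sameContent(null, pooled));  // false
    }
}
```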