Compiling for the Java Virtual Machine

冰火两重天

Article link: https://blog.csdn.net/sprayabc/article/details/8578187

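This article examines the structure of Java Virtual Machine (JVM) bytecode and how it executes: control transfer, arithmetic, access to run-time constants, method invocation, argument passing, the local variable array, array operations, compilation of switch statements, use of the operand stack, exception handling, and synchronization. It explains how a Java compiler translates high-level source code into compact, platform-independent bytecode.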

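The listings in this article use the following instruction format:

<index> <opcode> [<operand1> [<operand2>...]] [<comment>]

<index> is the index of the instruction's opcode in the code[] array, that is, the code array of the Code attribute that holds the JVM bytecode of the current method. It can also be read as the byte offset from the start of the method. <opcode> is the mnemonic of the instruction's opcode, and each <operandN> is an operand; an instruction can have zero or more operands. The <index> in front of an instruction can be used as the target of a control transfer instruction. Note, however, that the actual operand of a JVM control transfer instruction is an offset relative to the address of that instruction's own opcode; the listings show the resulting target index because it is easier to read.
On each line, an operand that is an index into the run-time constant pool is preceded by "#".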

10 ldc #1 // Push float constant 100.0

Use of Constants, Local Variables, and Control Constructs

void spin() {
    int i;
    for (i = 0; i < 100; i++) {
         ; // Loop body is empty
    }
}

Compiled code:

Method void spin()
0 iconst_0 // Push int constant 0
1 istore_1 // Store into local variable 1 (i=0)
2 goto 8 // First time through don’t increment
5 iinc 1 1 // Increment local variable 1 by 1 (i++)
8 iload_1 // Push local variable 1 (i)
9 bipush 100 // Push int constant 100
11 if_icmplt 5 // Compare and loop if less than (i < 100)
14 return // Return void when done

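The Java Virtual Machine is stack-oriented: most operations take one or more operands from the operand stack of the current frame, or push a result back onto it. Every method invocation creates a new frame, together with the operand stack and local variable array that the method needs. At any point in a thread's execution there are therefore usually several frames, one for each nested method invocation, each with its own operand stack.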

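The JVM often encodes an operand implicitly in the opcode. In iconst_<i>, for example, <i> stands for one of the int constants -1, 0, 1, 2, 3, 4, 5 (the -1 form is written iconst_m1). iconst_0 therefore does not need a separate immediate operand to hold the value it pushes.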

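Because instructions operate on values popped from the operand stack rather than on local variables directly, instructions that transfer values between the local variable array and the operand stack are very common in compiled JVM code. How local variables are used (and reused) is up to the compiler. For load and store instructions in particular, compilers reuse local variable slots wherever possible, which makes the code faster, more compact, and smaller in memory.
Some operations that are performed frequently on local variables get special support. The iinc instruction adds a one-byte signed increment to a local variable without going through the operand stack.
The loop itself is implemented by these instructions: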

5 iinc 1 1 // Increment local 1 by 1 (i++)
8 iload_1 // Push local variable 1 (i)
9 bipush 100 // Push int constant 100
11 if_icmplt 5 // Compare and loop if less than (i < 100)
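
To make the interplay between the operand stack, the local variable array, and the jump targets concrete, the listing for spin() can be traced by a tiny hand-written interpreter. This is only a sketch: the class name SpinInterpreter and the method run() are made up for illustration, the program counter is modeled as the <index> values from the listing, and a real JVM works on raw bytes and relative offsets instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical teaching aid: executes the eight instructions of spin() by hand.
class SpinInterpreter {
    /** Runs the spin() listing and returns the final value of local variable 1 (i). */
    static int run() {
        int[] locals = new int[2];                  // local 0 = this (unused), local 1 = i
        Deque<Integer> stack = new ArrayDeque<>();  // operand stack of the current frame
        int pc = 0;                                 // index of the next instruction
        while (true) {
            switch (pc) {
                case 0:  stack.push(0);            pc = 1;  break;  // iconst_0
                case 1:  locals[1] = stack.pop();  pc = 2;  break;  // istore_1
                case 2:                            pc = 8;  break;  // goto 8
                case 5:  locals[1] += 1;           pc = 8;  break;  // iinc 1 1 (no stack traffic)
                case 8:  stack.push(locals[1]);    pc = 9;  break;  // iload_1
                case 9:  stack.push(100);          pc = 11; break;  // bipush 100
                case 11: {                                          // if_icmplt 5
                    int right = stack.pop();
                    int left = stack.pop();
                    pc = (left < right) ? 5 : 14;
                    break;
                }
                case 14: return locals[1];                          // return
                default: throw new IllegalStateException("unexpected index " + pc);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 100
    }
}
```

Note that iinc is the only instruction here that touches a local variable without pushing or popping anything.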

Arithmetic

The JVM generally performs arithmetic on its operand stack. The exception is the iinc instruction, which increments a local variable directly.

int align2grain(int i, int grain) {
    return ((i + grain-1) & ~(grain-1));
}

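The operands of an arithmetic operation are popped from the operand stack and the result is pushed back onto it, so the result of one subexpression can be used directly as an operand of the enclosing expression. The JVM has no dedicated bitwise NOT instruction, so ~x is compiled as x XOR -1. For example, ~(grain-1) becomes: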

iload_2 // Push grain
iconst_1 // Push int constant 1
isub // Subtract; push result
iconst_m1 // Push int constant −1
ixor // Do XOR; push result
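
As a quick check of the reasoning above, the following sketch runs align2grain from the article and confirms that XOR with -1 gives the same result as the ~ operator. The class name Align and the helper notViaXor are illustrative names only, and the example assumes grain is a power of two, which is what the masking trick requires.

```java
// Illustrative only: Align and notViaXor are not part of the original article.
class Align {
    /** Rounds i up to the next multiple of grain (grain assumed to be a power of two). */
    static int align2grain(int i, int grain) {
        return ((i + grain - 1) & ~(grain - 1));
    }

    /** Mirrors the compiled form of ~x: push x, iconst_m1, ixor. */
    static int notViaXor(int x) {
        return x ^ -1;
    }

    public static void main(String[] args) {
        System.out.println(align2grain(13, 8));       // prints 16
        System.out.println(notViaXor(7) == ~7);       // prints true
    }
}
```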

Accessing the Run-Time Constant Pool

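Many numeric constants, as well as objects, fields, and methods, are accessed through the run-time constant pool of the current class. Values of type int, long, float, and double, and references to String instances, are loaded with the ldc, ldc_w, and ldc2_w instructions.
ldc and ldc_w access values in the run-time constant pool, including String instances, but not values of type double or long. ldc takes a one-byte index, so once the constant pool index no longer fits in one byte (more than 255), ldc_w must be used instead. ldc2_w accesses values of type double and long. Small integral constants of type byte, char, short, and int are compiled into the code itself and pushed with bipush, sipush, or iconst_<i>. Certain small floating-point constants can also be compiled into the code, using fconst_<f> and dconst_<d>.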
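
The selection rule for int constants can be sketched as a small function. This is an illustration of the ranges described above, not the code of any real compiler; the class name ConstantPush, the method instructionFor, and the "#<index>" placeholder text are all made up for this example.

```java
// Illustrative sketch: which instruction a compiler would typically pick for an int constant.
class ConstantPush {
    static String instructionFor(int value) {
        if (value == -1) {
            return "iconst_m1";                      // operand implied by the opcode
        }
        if (value >= 0 && value <= 5) {
            return "iconst_" + value;                // operand implied by the opcode
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "bipush " + value;                // one-byte immediate
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "sipush " + value;                // two-byte immediate
        }
        return "ldc #<index>";                       // constant pool entry (ldc_w for a wide index)
    }

    public static void main(String[] args) {
        System.out.println(instructionFor(0));       // prints iconst_0
        System.out.println(instructionFor(100));     // prints bipush 100
        System.out.println(instructionFor(1000));    // prints sipush 1000
        System.out.println(instructionFor(100000));  // prints ldc #<index>
    }
}
```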

Receiving Arguments

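If n arguments are passed to an instance method, the new frame receives them, by convention, in local variables 1 through n. An instance method is also passed a reference to its own instance in local variable 0. A static method receives no instance reference, so it does not reserve local variable 0 for this and its arguments start at local variable 0 instead.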

int addTwo(int i, int j) {
    return i + j;
}
// Compiled code
Method int addTwo(int,int)
0 iload_1 // Push value of local variable 1 (i)
1 iload_2 // Push value of local variable 2 (j)
2 iadd // Add; leave int result on operand stack
3 ireturn // Return int result
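
The slot numbering can be worked out mechanically. The sketch below assigns a local variable index to each parameter, assuming the standard JVM rule that long and double values occupy two consecutive local variables; the class name LocalSlots and the method slotsFor are hypothetical helpers for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes the local variable index at which each parameter is received.
class LocalSlots {
    static List<Integer> slotsFor(boolean isStatic, Class<?>... paramTypes) {
        List<Integer> slots = new ArrayList<>();
        int next = isStatic ? 0 : 1;   // instance methods keep "this" in local variable 0
        for (Class<?> type : paramTypes) {
            slots.add(next);
            // long and double take two consecutive local variables, everything else takes one
            next += (type == long.class || type == double.class) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(slotsFor(false, int.class, int.class));   // prints [1, 2]  (addTwo)
        System.out.println(slotsFor(true, long.class, int.class));   // prints [0, 2]
    }
}
```

The first call matches the addTwo listing, where i is loaded with iload_1 and j with iload_2.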

Invoking Methods

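An ordinary instance method invocation dispatches on the run-time type of the object and is implemented with the invokevirtual instruction. Each invokevirtual takes an index as its operand. The run-time constant pool entry at that index is a symbolic reference to a method, which supplies the internal binary name of the class type of the object, the method name, and the method descriptor.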

int add12and13() {
    return addTwo(12, 13);
}
// Compiled code
Method int add12and13()
0 aload_0 // Push local variable 0 (this)
1 bipush 12 // Push int constant 12
3 bipush 13 // Push int constant 13
5 invokevirtual #4 // Method Example.addTwo(II)I
8 ireturn // Return int on top of operand stack

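The invocation proceeds as follows:
First, a reference to the current instance (this) is pushed onto the operand stack.
Second, the argument values, the int values 12 and 13, are pushed. When addTwo is invoked, the JVM creates a new frame, and the arguments passed to addTwo become the initial values of the corresponding local variables in that frame.
Third, when addTwo completes and returns, its return value is pushed onto the operand stack of the invoker's frame (the frame of add12and13).
Fourth, the return from add12and13 is handled by the ireturn instruction in add12and13(). ireturn takes the value on top of the current operand stack and pushes it onto the operand stack of the frame that invoked add12and13, and execution then continues with the instruction after the invocation in that method.
The operand of invokevirtual (run-time constant pool index #4) is not the offset of the method within the class instance. The compiler does not need to know the internal layout of a class instance. It only generates a symbolic reference to the method and stores it in the run-time constant pool.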
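
The descriptor part of such a symbolic reference, like (II)I for addTwo, can be produced from Java code with the standard java.lang.invoke.MethodType class. The wrapper class Descriptors and its method descriptor are names invented for this sketch.

```java
import java.lang.invoke.MethodType;

// Sketch: build JVM method descriptors such as "(II)I" from Class objects.
class Descriptors {
    static String descriptor(Class<?> returnType, Class<?>... paramTypes) {
        return MethodType.methodType(returnType, paramTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(int.class, int.class, int.class)); // prints (II)I
        System.out.println(descriptor(void.class));                      // prints ()V
        System.out.println(descriptor(Object.class));                    // prints ()Ljava/lang/Object;
    }
}
```

The first line is the descriptor shown in the comment of the invokevirtual instruction above; ()V is the descriptor of a no-argument void method such as tryItOut.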

Working with Class Instances

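In the JVM, a constructor appears as a method with the compiler-supplied name <init>, the instance initialization method. Once the class instance has been created, all of its instance variables, including those declared in the class itself and in all of its superclasses, are set to their default initial values. The instance initialization method of the new object is then invoked.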

Object create() {
return new Object();
}
Method java.lang.Object create()
0 new #1 // Class java.lang.Object
3 dup
4 invokespecial #4 // Method java.lang.Object.<init>()V
7 areturn

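When passed as arguments or returned from methods, class instances are treated much like numeric values, although the reference type has its own set of instructions. Fields of a class instance are accessed with getfield and putfield. As with the operands of method invocation instructions, the operands of putfield and getfield are not offsets into the class instance. The compiler generates symbolic references to the fields and stores them in the run-time constant pool; those references are resolved at run time to the actual location of the field within the object.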

Arrays

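In the JVM, arrays are objects too, and they are created and manipulated with their own set of instructions. newarray creates an array of a numeric type, anewarray creates a one-dimensional array of object references, and multianewarray creates several dimensions of a multidimensional array at once.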

// Creating an array of a numeric type
7 newarray int // ...and create new array of int of that length
// Creating an array of object references
Method void createThreadArray()
4 anewarray class #1 // Create new array of class Thread
// Multidimensional array
Method int create3DArray()[][][]
3 multianewarray #1 dim #2 // Class [[[I, a three

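The first operand of multianewarray is a run-time constant pool index that identifies the array class type to be created (here [[[I). The second operand is the number of dimensions of that type to actually create. Every array has an associated length, which is read with the arraylength instruction.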
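
On the source side, the three instructions correspond to three kinds of array creation expression. The sketch below assumes a typical javac mapping, noted in the comments; the class ArrayKinds and its method names are invented for illustration. It also shows that a type with three dimensions can be created with only two of them allocated, which is what "dim #2" in the listing expresses.

```java
// Illustrative sketch of the source forms behind newarray, anewarray and multianewarray.
class ArrayKinds {
    static int[] numericArray(int length) {
        return new int[length];        // typically compiled to newarray int
    }

    static Thread[] referenceArray(int length) {
        return new Thread[length];     // typically compiled to anewarray
    }

    static int[][][] partial3D() {
        return new int[10][5][];       // typically multianewarray of type [[[I with 2 dimensions
    }

    public static void main(String[] args) {
        int[][][] grid = partial3D();
        System.out.println(grid.getClass().getName());  // prints [[[I
        System.out.println(grid.length);                // prints 10 (read via arraylength)
        System.out.println(grid[0].length);             // prints 5
        System.out.println(grid[0][0] == null);         // prints true: third dimension not created
    }
}
```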

Compiling switch Statements

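The compiler uses the tableswitch and lookupswitch instructions to compile switch statements. tableswitch is used when the case values can be represented efficiently as indices into a table of target offsets. If the value of the switch expression falls outside the range of valid indices, the default target is taken.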

int chooseNear(int i) {
    switch (i) {
        case 0: return 0;
        case 1: return 1;
        case 2: return 2;
        default: return -1;
    }
}
Method int chooseNear(int)
0 iload_1 // Push local variable 1 (argument i)
1 tableswitch 0 to 2: // Valid indices are 0 through 2
0: 28 // If i is 0, continue at 28
1: 30 // If i is 1, continue at 30
2: 32 // If i is 2, continue at 32
default:34 // Otherwise, continue at 34
28 iconst_0 // i was 0; push int constant 0...
29 ireturn // ...and return it
30 iconst_1 // i was 1; push int constant 1...
31 ireturn // ...and return it
32 iconst_2 // i was 2; push int constant 2...
33 ireturn // ...and return it
34 iconst_m1 // otherwise push int constant –1...
35 ireturn // ...and return it

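tableswitch and lookupswitch operate only on int values. A switch expression of any other numeric type must first be converted to int.
When the case values of a switch are sparse, the table used by tableswitch wastes space, and lookupswitch is used instead. The lookupswitch table pairs each int key (a case value) with the offset of the corresponding target. When lookupswitch executes, the value of the switch expression is compared against the keys in the table. If a key matches, execution continues at the associated target; otherwise it continues at the default target.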

int chooseFar(int i) {
    switch (i) {
        case -100: return -1;
        case 0: return 0;
        case 100: return 1;
        default: return -1;
    }
}
Method int chooseFar(int)
0 iload_1
1 lookupswitch 3:
−100: 36
0: 38
100: 40
default:42
36 iconst_m1
37 ireturn
38 iconst_0
39 ireturn
40 iconst_1
41 ireturn
42 iconst_m1
43 ireturn

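The JVM requires the table of a lookupswitch instruction to be sorted by key, so that an implementation can search it more efficiently than with a linear scan (for example with binary search). To determine the target offset, lookupswitch has to compare the value against the keys, whereas tableswitch only needs one range check on the value and then indexes directly into the table.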
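
The difference between the two lookups can be sketched in a few lines. The class SwitchLookup and its two methods are illustrative names, the targets are the instruction indices from the chooseNear and chooseFar listings, and binary search is just one strategy a JVM implementation might choose for lookupswitch.

```java
import java.util.Arrays;

// Sketch of how a JVM might resolve the branch target for each switch instruction.
class SwitchLookup {
    /** tableswitch: one range check, then direct indexing. */
    static int tableSwitch(int value, int low, int[] targets, int defaultTarget) {
        int high = low + targets.length - 1;
        if (value < low || value > high) {
            return defaultTarget;
        }
        return targets[value - low];
    }

    /** lookupswitch: search the keys, which the JVM requires to be sorted. */
    static int lookupSwitch(int value, int[] sortedKeys, int[] targets, int defaultTarget) {
        int pos = Arrays.binarySearch(sortedKeys, value);
        return pos >= 0 ? targets[pos] : defaultTarget;
    }

    public static void main(String[] args) {
        // chooseNear: cases 0..2 jump to 28, 30, 32; default is 34
        System.out.println(tableSwitch(1, 0, new int[] {28, 30, 32}, 34));   // prints 30
        System.out.println(tableSwitch(7, 0, new int[] {28, 30, 32}, 34));   // prints 34
        // chooseFar: keys -100, 0, 100 jump to 36, 38, 40; default is 42
        int[] keys = {-100, 0, 100};
        System.out.println(lookupSwitch(100, keys, new int[] {36, 38, 40}, 42)); // prints 40
        System.out.println(lookupSwitch(5, keys, new int[] {36, 38, 40}, 42));   // prints 42
    }
}
```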

Operations on the Operand Stack

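To make the operand stack convenient to work with, the JVM provides a large set of instructions that manipulate its contents as untyped values, such as dup and dup2_x1 in the example below.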

public long nextIndex() {
    return index++;
}
private long index = 0;
Method long nextIndex()
0 aload_0 // Push this
1 dup // Make a copy of it
2 getfield #4 // One of the copies of this is consumed
// pushing long field index,above the original this
5 dup2_x1 // The long on top of the operand stack is
// inserted into the operand stack below the original this
6 lconst_1 // Push long constant 1
7 ladd // The index value is incremented...
8 putfield #4 // ...and the result stored back in the field
11 lreturn // The original value of index is left on top of the operand stack, ready to be returned

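The JVM does not allow operand stack manipulation instructions to modify or break up individual values that must stay whole, such as operands of type long or double. In the example above, dup2_x1 copies the long as a single unit, and the long operations (lconst_1, ladd) likewise treat it as one value rather than as two separate halves.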
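
The stack shuffling in nextIndex() is easier to follow when traced step by step. The sketch below replays the listing on a Deque, modeling the long as a single stack entry (a simplification: the real operand stack accounts for a long as two units, which is why the instruction is dup2_x1 and not dup_x1). The class name NextIndexSim is made up for this trace.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical trace of the nextIndex() listing; each long is modeled as one stack entry.
class NextIndexSim {
    long index = 0;

    long nextIndex() {
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(this);                                     // aload_0
        stack.push(stack.peek());                             // dup
        NextIndexSim ref = (NextIndexSim) stack.pop();        // getfield: consume one copy of this...
        stack.push(ref.index);                                // ...and push the long field value
        Object value1 = stack.pop();                          // dup2_x1: take the long,
        Object value2 = stack.pop();                          //          take the original this,
        stack.push(value1);                                   //          insert a copy of the long below this
        stack.push(value2);
        stack.push(value1);
        stack.push(1L);                                       // lconst_1
        long right = (Long) stack.pop();                      // ladd
        long left = (Long) stack.pop();
        stack.push(left + right);
        long newValue = (Long) stack.pop();                   // putfield: pop value and object reference
        NextIndexSim target = (NextIndexSim) stack.pop();
        target.index = newValue;
        return (Long) stack.pop();                            // lreturn: the original value is left on top
    }

    public static void main(String[] args) {
        NextIndexSim sim = new NextIndexSim();
        System.out.println(sim.nextIndex()); // prints 0
        System.out.println(sim.nextIndex()); // prints 1
        System.out.println(sim.index);       // prints 2
    }
}
```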

Throwing and Handling Exceptions

Exceptions are thrown from a program with the throw keyword, which is compiled as follows:

void cantBeZero(int i) throws TestExc {
    if (i == 0) {
        throw new TestExc();
    }
}
Method void cantBeZero(int)
0 iload_1 // Push argument 1 (i)
1 ifne 12 // If i==0, allocate instance and throw
4 new #1 // Create instance of TestExc
7 dup // One reference goes to the constructor
8 invokespecial #7 // Constructor: Method TestExc.<init>()V
11 athrow // Throw the exception
12 return // Never reached if the exception was thrown

The try-catch construct:

void catchOne() {
    try {
        tryItOut();
    } catch (TestExc e) {
        handleExc(e);
    }
}
Method void catchOne()
0 aload_0 // Beginning of try block
1 invokevirtual #6 // Method Example.tryItOut()V
4 return // End of try block; normal return
5 astore_1 // Store thrown value in local variable 1
6 aload_0 // Push this
7 aload_1 // Push thrown value
8 invokevirtual #5 // Invoke handler method:
// Example.handleExc(LTestExc;)V
11 return // Return after handling TestExc
Exception table: // the try block covers indices From (inclusive) to To (exclusive); a TestExc thrown there transfers control to index 5
From To Target Type
0    4  5      Class TestExc
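
How the JVM uses such a table can be sketched as a simple search: walk the entries in order and take the first one whose range contains the index of the faulting instruction and whose type matches the thrown object. The class ExceptionTableLookup, the nested Entry type, and the convention of returning -1 for "no handler in this method" are assumptions made for this illustration, not a real JVM API.

```java
// Illustrative model of an exception table search; not a real JVM data structure.
class ExceptionTableLookup {
    static final class Entry {
        final int from;       // inclusive start of the protected range
        final int to;         // exclusive end of the protected range
        final int target;     // index of the first instruction of the handler
        final Class<?> type;  // exception class to catch, or null for "any"

        Entry(int from, int to, int target, Class<?> type) {
            this.from = from;
            this.to = to;
            this.target = target;
            this.type = type;
        }
    }

    /** Returns the handler index, or -1 if the exception propagates to the invoker. */
    static int findHandler(Entry[] table, int pc, Throwable thrown) {
        for (Entry entry : table) {
            boolean inRange = pc >= entry.from && pc < entry.to;
            boolean typeMatches = entry.type == null || entry.type.isInstance(thrown);
            if (inRange && typeMatches) {
                return entry.target;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same shape as the catchOne table, with IllegalStateException standing in for TestExc
        Entry[] table = { new Entry(0, 4, 5, IllegalStateException.class) };
        System.out.println(findHandler(table, 1, new IllegalStateException())); // prints 5
        System.out.println(findHandler(table, 1, new RuntimeException()));      // prints -1
        System.out.println(findHandler(table, 4, new IllegalStateException())); // prints -1
    }
}
```

Because the search takes the first matching entry, the order of the entries is what encodes the order of the catch clauses.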

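An exception table can contain several entries, for example when a try statement has more than one catch clause: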

void catchTwo() {
    try {
        tryItOut();
    } catch (TestExc1 e) {
        handleExc(e);
    } catch (TestExc2 e) {
        handleExc(e);
    }
}
// Compiled code (excerpt)
Method void catchTwo()
Exception table:
From To Target Type
0    4  5      Class TestExc1
0    4  12     Class TestExc2
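
Nested try-catch statements are compiled the same way. The nesting shows up only in the exception table, where the outer entry covers a wider range:

void nestedCatch() {
    try {
        try {
            tryItOut();
        } catch (TestExc1 e) {
            handleExc1(e);
        }
    } catch (TestExc2 e) {
        handleExc2(e);
    }
}
// Compiled code (excerpt)
Method void nestedCatch()
Exception table:
From To Target Type
0    4  5      Class TestExc1
0    12 12     Class TestExc2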

The finally Block

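A try-finally statement is compiled much like a try-catch statement. Before control leaves the try statement, whether or not an exception was thrown, the code in the finally block must be executed. The listing below compiles the finally block as a subroutine called with jsr and left with ret. This applies to older compilers targeting class file version 50.0 or below; jsr and ret are not permitted in newer class files, and current compilers inline the finally code at each exit instead.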

void tryFinally() {
    try {
        tryItOut();
    } finally {
        wrapItUp();
    }
}
Method void tryFinally()
0 aload_0 // Beginning of try block
1 invokevirtual #6 // Method Example.tryItOut()V
4 jsr 14 // Call finally block
7 return // End of try block
8 astore_1 // Beginning of handler for any throw
9 jsr 14 // Call finally block
12 aload_1 // Push thrown value
13 athrow // ...and rethrow the value to the invoker
14 astore_2 // Beginning of finally block
15 aload_0 // Push this
16 invokevirtual #5 // Method Example.wrapItUp()V
19 ret 2 // Return from finally block
Exception table:
From To Target Type
0    4  8      any

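Control can leave the try statement in four ways:
1) by falling through the end of the block;
2) by executing a return statement;
3) by executing a break or continue statement;
4) by throwing an exception.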
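
The four exit paths can be demonstrated directly in Java. The following sketch counts how many times a finally block runs across the four cases; the class FinallyPaths and all of its members are names invented for this example.

```java
// Demonstrates that finally runs on each of the four ways of leaving a try statement.
class FinallyPaths {
    static int finallyRuns = 0;

    static void fallThrough() {
        try {
            // body completes normally
        } finally {
            finallyRuns++;
        }
    }

    static int viaReturn() {
        try {
            return 1;
        } finally {
            finallyRuns++;
        }
    }

    static void viaBreak() {
        while (true) {
            try {
                break;
            } finally {
                finallyRuns++;
            }
        }
    }

    static void viaThrow() {
        try {
            throw new IllegalStateException("boom");
        } finally {
            finallyRuns++;
        }
    }

    /** Exercises all four exits and returns the number of finally executions. */
    static int runAll() {
        finallyRuns = 0;
        fallThrough();
        viaReturn();
        viaBreak();
        try {
            viaThrow();
        } catch (IllegalStateException expected) {
            // the exception still propagates after the finally block has run
        }
        return finallyRuns;
    }

    public static void main(String[] args) {
        System.out.println(runAll()); // prints 4
    }
}
```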

Synchronization

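Synchronization in the JVM is implemented by entering and exiting the monitor associated with an object. This holds both for explicit synchronization (the monitorenter and monitorexit instructions generated for a synchronized statement) and for implicit synchronization (performed by the method invocation and return instructions for a synchronized method).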

void onlyMe(Foo f) {
   synchronized(f) {
     doSomething();
   }
}
Method void onlyMe(Foo)
0 aload_1 // Push f
1 dup // Duplicate it on the stack
2 astore_2 // Store duplicate in local variable 2
3 monitorenter // Enter the monitor associated with f
4 aload_0 // Holding the monitor, pass this and...
5 invokevirtual #5 // ...call Example.doSomething()V
8 aload_2 // Push local variable 2 (f)
9 monitorexit // Exit the monitor associated with f
10 goto 18 // Complete the method normally
13 astore_3 // In case of any throw, end up here
14 aload_2 // Push local variable 2 (f)
15 monitorexit // Be sure to exit the monitor!
16 aload_3 // Push thrown exception...
17 athrow // ...then rethrow the value to the invoker
18 return // Return in the normal case
Exception table:
From To Target Type
4    10 13     any
13   16 13     any

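The compiler must ensure that, however the method completes, every monitorenter instruction executed during the invocation is matched by a corresponding monitorexit, whether the method completes normally or abruptly. To guarantee that monitorenter and monitorexit stay paired when the method completes abruptly, the compiler automatically generates an exception handler that matches any exception and executes the required monitorexit.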
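
The effect of that generated handler can be observed from Java code with the standard Thread.holdsLock method. The sketch below throws from inside a synchronized block and checks that the monitor is held inside the block and released afterwards; the class MonitorRelease and its members are illustrative names.

```java
// Checks that a monitor is released when a synchronized block is left by an exception.
class MonitorRelease {
    static boolean heldInside;

    static void onlyMe(Object f, boolean fail) {
        synchronized (f) {                       // monitorenter
            heldInside = Thread.holdsLock(f);    // true while inside the block
            if (fail) {
                throw new IllegalStateException("boom");
            }
        }                                        // monitorexit on both the normal and the exceptional path
    }

    static boolean releasedAfterThrow() {
        Object f = new Object();
        heldInside = false;
        try {
            onlyMe(f, true);
        } catch (IllegalStateException expected) {
            // the compiler-generated handler has already executed monitorexit
        }
        return heldInside && !Thread.holdsLock(f);
    }

    public static void main(String[] args) {
        System.out.println(releasedAfterThrow()); // prints true
    }
}
```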