


This chapter describes the class file format of the Java Virtual Machine. Each class file contains the definition of a single class or interface. Although a class or interface need not have an external representation literally contained in a file (for instance, because the class is generated by a class loader), we will colloquially refer to any valid representation of a class or interface as being in the class file format.

A class file consists of a stream of 8-bit bytes. All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively. Multibyte data items are always stored in big-endian order, where the high bytes come first. In the Java SE platform, this format is supported by the interfaces java.io.DataInput and java.io.DataOutput and by classes such as java.io.DataInputStream and java.io.DataOutputStream.

This chapter defines its own set of data types representing class file data: The types u1, u2, and u4 represent an unsigned one-, two-, or four-byte quantity, respectively. In the Java SE platform, these types may be read by methods such as readUnsignedByte, readUnsignedShort, and readInt of the interface java.io.DataInput.
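As a minimal sketch of how the u1, u2, and u4 types can be read with java.io.DataInputStream: the class name UnsignedReads and the helper readU4 are illustrative and not part of the specification. Because readInt returns a signed int, the sketch widens the result to a long when an unsigned u4 value is wanted.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class UnsignedReads {
    /** Reads a u4 as a non-negative long; readInt() alone yields a signed int. */
    static long readU4(DataInputStream in) throws IOException {
        return in.readInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) throws IOException {
        // One u1, one u2, one u4, stored big-endian (high byte first).
        byte[] data = {
            (byte) 0xFF,
            (byte) 0x12, (byte) 0x34,
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE
        };
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int u1 = in.readUnsignedByte();   // 0xFF
            int u2 = in.readUnsignedShort();  // 0x1234
            long u4 = readU4(in);             // 0xCAFEBABE
            System.out.printf("u1=%d u2=0x%04X u4=0x%08X%n", u1, u2, u4);
        }
    }
}
```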

This chapter presents the class file format using pseudostructures written in a C-like structure notation. To avoid confusion with the fields of classes and class instances, etc., the contents of the structures describing the class file format are referred to as items. Successive items are stored in the class file sequentially, without padding or alignment.

Tables, consisting of zero or more variable-sized items, are used in several class file structures. Although we use C-like array syntax to refer to table items, the fact that tables are streams of varying-sized structures means that it is not possible to translate a table index directly to a byte offset into the table.

Where we refer to a data structure as an array, it consists of zero or more contiguous fixed-sized items and can be indexed like an array.

Reference to an ASCII character in this chapter should be interpreted to mean the Unicode code point corresponding to the ASCII character.

4.1. The ClassFile Structure

A class file consists of a single ClassFile structure:

ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}

The items in the ClassFile structure are as follows:


magic

The magic item supplies the magic number identifying the class file format; it has the value 0xCAFEBABE.

minor_version, major_version

The values of the minor_version and major_version items are the minor and major version numbers of this class file.
Together, a major and a minor version number determine the version of the class file format.
If a class file has major version number M and minor version number m, we denote the version of its class file format as M.m. Thus, class file format versions may be ordered lexicographically, for example, 1.5 < 2.0 < 2.1.

A Java Virtual Machine implementation can support a class file format of version v if and only if v lies in some contiguous range M_i.0 ≤ v ≤ M_j.m. The release level of the Java SE platform to which a Java Virtual Machine implementation conforms is responsible for determining the range.

Oracle’s Java Virtual Machine implementation in JDK release 1.0.2 supports class file format versions 45.0 through 45.3 inclusive. JDK releases 1.1.* support class file format versions in the range 45.0 through 45.65535 inclusive. For k ≥ 2, JDK release 1.k supports class file format versions in the range 45.0 through (44+k).0 inclusive.
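The header items and the version rule can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the class ClassFileHeader and its methods are hypothetical names, and supportedByJdk1 encodes only the "k ≥ 2" rule quoted above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class ClassFileHeader {
    final int minor;
    final int major;

    ClassFileHeader(int minor, int major) {
        this.minor = minor;
        this.major = major;
    }

    /** Reads magic, minor_version, and major_version from the start of a class file. */
    static ClassFileHeader read(byte[] classBytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();
            if (magic != 0xCAFEBABE) {
                throw new IOException("bad magic: 0x" + Integer.toHexString(magic));
            }
            int minor = in.readUnsignedShort(); // minor_version comes first in the file
            int major = in.readUnsignedShort();
            return new ClassFileHeader(minor, major);
        }
    }

    /** Lexicographic comparison of versions M.m: major first, then minor. */
    static int compareVersions(int major1, int minor1, int major2, int minor2) {
        return major1 != major2
            ? Integer.compare(major1, major2)
            : Integer.compare(minor1, minor2);
    }

    /** True if 45.0 <= major.minor <= (44+k).0, the range stated for JDK 1.k with k >= 2. */
    static boolean supportedByJdk1(int k, int major, int minor) {
        if (k < 2) {
            throw new IllegalArgumentException("the rule applies only for k >= 2");
        }
        return compareVersions(major, minor, 45, 0) >= 0
            && compareVersions(major, minor, 44 + k, 0) <= 0;
    }

    public static void main(String[] args) throws IOException {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        ClassFileHeader h = read(header);
        System.out.println(h.major + "." + h.minor + " supported by JDK 1.8: "
            + supportedByJdk1(8, h.major, h.minor));
    }
}
```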


constant_pool_count

The value of the constant_pool_count item is equal to the number of entries in the constant_pool table plus one. A constant_pool index is considered valid if it is greater than zero and less than constant_pool_count, with the exception for constants of type long and double noted in §4.4.5.


constant_pool[]

The constant_pool is a table of structures (§4.4) representing various string constants, class and interface names, field names, and other constants that are referred to within the ClassFile structure and its substructures. The format of each constant_pool table entry is indicated by its first “tag” byte.

The constant_pool table is indexed from 1 to constant_pool_count - 1.


access_flags

The value of the access_flags item is a mask of flags used to denote access permissions to and properties of this class or interface. The interpretation of each flag, when set, is specified in Table 4.1-A.

An interface is distinguished by the ACC_INTERFACE flag being set. If the ACC_INTERFACE flag is not set, this class file defines a class, not an interface.

If the ACC_INTERFACE flag is set, the ACC_ABSTRACT flag must also be set, and the ACC_FINAL, ACC_SUPER, and ACC_ENUM flags must not be set.

If the ACC_INTERFACE flag is not set, any of the other flags in Table 4.1-A may be set except ACC_ANNOTATION. However, such a class file must not have both its ACC_FINAL and ACC_ABSTRACT flags set (JLS §

The ACC_SUPER flag indicates which of two alternative semantics is to be expressed by the invokespecial instruction (§invokespecial) if it appears in this class or interface. Compilers to the instruction set of the Java Virtual Machine should set the ACC_SUPER flag. In Java SE 8 and above, the Java Virtual Machine considers the ACC_SUPER flag to be set in every class file, regardless of the actual value of the flag in the class file and the version of the class file.

The ACC_SUPER flag exists for backward compatibility with code compiled by older compilers for the Java programming language. In JDK releases prior to 1.0.2, the compiler generated access_flags in which the flag now representing ACC_SUPER had no assigned meaning, and Oracle’s Java Virtual Machine implementation ignored the flag if it was set.

The ACC_SYNTHETIC flag indicates that this class or interface was generated by a compiler and does not appear in source code.

An annotation type must have its ACC_ANNOTATION flag set. If the ACC_ANNOTATION flag is set, the ACC_INTERFACE flag must also be set.

The ACC_ENUM flag indicates that this class or its superclass is declared as an enumerated type.

All bits of the access_flags item not assigned in Table 4.1-A are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
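The flag constraints above can be checked mechanically. The sketch below is illustrative only: the class ClassAccessFlags and the method isConsistent are hypothetical names, and the numeric flag values are those of Table 4.1-A in the Java SE 8 edition of the specification, which is not reproduced in this text.

```java
final class ClassAccessFlags {
    // Values as given in Table 4.1-A (Java SE 8 edition).
    static final int ACC_PUBLIC     = 0x0001;
    static final int ACC_FINAL      = 0x0010;
    static final int ACC_SUPER      = 0x0020;
    static final int ACC_INTERFACE  = 0x0200;
    static final int ACC_ABSTRACT   = 0x0400;
    static final int ACC_SYNTHETIC  = 0x1000;
    static final int ACC_ANNOTATION = 0x2000;
    static final int ACC_ENUM       = 0x4000;

    static boolean isSet(int flags, int flag) {
        return (flags & flag) != 0;
    }

    /** Checks the combinations of class-level flags that the text rules out. */
    static boolean isConsistent(int flags) {
        if (isSet(flags, ACC_INTERFACE)) {
            // An interface must be abstract and must not be final, super, or enum.
            if (!isSet(flags, ACC_ABSTRACT)) return false;
            if ((flags & (ACC_FINAL | ACC_SUPER | ACC_ENUM)) != 0) return false;
        } else {
            // A class must not be an annotation, and not both final and abstract.
            if (isSet(flags, ACC_ANNOTATION)) return false;
            if (isSet(flags, ACC_FINAL) && isSet(flags, ACC_ABSTRACT)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int publicClass = ACC_PUBLIC | ACC_SUPER;
        int brokenInterface = ACC_PUBLIC | ACC_INTERFACE; // ACC_ABSTRACT missing
        System.out.println(isConsistent(publicClass));
        System.out.println(isConsistent(brokenInterface));
    }
}
```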


this_class

The value of the this_class item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure (§4.4.1) representing the class or interface defined by this class file.


super_class

For a class, the value of the super_class item either must be zero or must be a valid index into the constant_pool table. If the value of the super_class item is nonzero, the constant_pool entry at that index must be a CONSTANT_Class_info structure representing the direct superclass of the class defined by this class file. Neither the direct superclass nor any of its superclasses may have the ACC_FINAL flag set in the access_flags item of its ClassFile structure.

If the value of the super_class item is zero, then this class file must represent the class Object, the only class or interface without a direct superclass.

For an interface, the value of the super_class item must always be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure representing the class Object.


interfaces_count

The value of the interfaces_count item gives the number of direct superinterfaces of this class or interface type.


interfaces[]

Each value in the interfaces array must be a valid index into the constant_pool table. The constant_pool entry at each value of interfaces[i], where 0 ≤ i < interfaces_count, must be a CONSTANT_Class_info structure representing an interface that is a direct superinterface of this class or interface type, in the left-to-right order given in the source for the type.


fields_count

The value of the fields_count item gives the number of field_info structures in the fields table. The field_info structures represent all fields, both class variables and instance variables, declared by this class or interface type.


fields[]

Each value in the fields table must be a field_info structure (§4.5) giving a complete description of a field in this class or interface. The fields table includes only those fields that are declared by this class or interface. It does not include items representing fields that are inherited from superclasses or superinterfaces.


methods_count

The value of the methods_count item gives the number of method_info structures in the methods table.


methods[]

Each value in the methods table must be a method_info structure (§4.6) giving a complete description of a method in this class or interface. If neither the ACC_NATIVE nor the ACC_ABSTRACT flag is set in the access_flags item of a method_info structure, the Java Virtual Machine instructions implementing the method are also supplied.

The method_info structures represent all methods declared by this class or interface type, including instance methods, class methods, instance initialization methods (§2.9), and any class or interface initialization method (§2.9). The methods table does not include items representing methods that are inherited from superclasses or superinterfaces.


attributes_count

The value of the attributes_count item gives the number of attributes in the attributes table of this class.


attributes[]

Each value of the attributes table must be an attribute_info structure (§4.7).

The attributes defined by this specification as appearing in the attributes table of a ClassFile structure are listed in Table 4.7-C.

The rules concerning attributes defined to appear in the attributes table of a ClassFile structure are given in §4.7.

The rules concerning non-predefined attributes in the attributes table of a ClassFile structure are given in §4.7.1.

4.2. The Internal Form of Names

4.2.1. Binary Class and Interface Names

Class and interface names that appear in class file structures are always represented in a fully qualified form known as binary names (JLS §13.1). Such names are always represented as CONSTANT_Utf8_info structures (§4.4.7) and thus may be drawn, where not further constrained, from the entire Unicode codespace. Class and interface names are referenced from those CONSTANT_NameAndType_info structures (§4.4.6) which have such names as part of their descriptor (§4.3), and from all CONSTANT_Class_info structures (§4.4.1).


For historical reasons, the syntax of binary names that appear in class file structures differs from the syntax of binary names documented in JLS §13.1. In this internal form, the ASCII periods (.) that normally separate the identifiers which make up the binary name are replaced by ASCII forward slashes (/). The identifiers themselves must be unqualified names (§4.2.2).

For example, the normal binary name of class Thread is java.lang.Thread. In the internal form used in descriptors in the class file format, a reference to the name of class Thread is implemented using a CONSTANT_Utf8_info structure representing the string java/lang/Thread.
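The conversion between the two forms is a plain character substitution. A minimal sketch, in which the class InternalNames and its two methods are illustrative names only:

```java
final class InternalNames {
    /** Converts a binary name such as java.lang.Thread to internal form. */
    static String toInternalForm(String binaryName) {
        return binaryName.replace('.', '/');
    }

    /** Converts an internal-form name such as java/lang/Thread back to a binary name. */
    static String toBinaryName(String internalName) {
        return internalName.replace('/', '.');
    }

    public static void main(String[] args) {
        System.out.println(toInternalForm(Thread.class.getName())); // java/lang/Thread
    }
}
```

The round trip is lossless because the identifiers that make up the name must be unqualified names, which cannot themselves contain a period or a forward slash.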

4.2.2. Unqualified Names

Names of methods, fields, local variables, and formal parameters are stored as unqualified names. An unqualified name must contain at least one Unicode code point and must not contain any of the ASCII characters . ; [ / (that is, period or semicolon or left square bracket or forward slash).

Method names are further constrained so that, with the exception of the special method names <init> and <clinit> (§2.9), they must not contain the ASCII characters < or > (that is, left angle bracket or right angle bracket).

Note that a field name or interface method name may be <init> or <clinit>, but no method invocation instruction may reference <clinit> and only the invokespecial instruction (§invokespecial) may reference <init>.
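The naming rules of this section can be expressed as two small predicates. This is a sketch, not a complete validator: UnqualifiedNames, isUnqualifiedName, and isValidMethodName are hypothetical names, and the special method names <init> and <clinit> are the ones defined in §2.9.

```java
final class UnqualifiedNames {
    /** At least one code point, and none of the ASCII characters . ; [ / */
    static boolean isUnqualifiedName(String name) {
        if (name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '.' || c == ';' || c == '[' || c == '/') {
                return false;
            }
        }
        return true;
    }

    /** Method names additionally exclude < and >, except for the two special names. */
    static boolean isValidMethodName(String name) {
        if (name.equals("<init>") || name.equals("<clinit>")) {
            return true;
        }
        return isUnqualifiedName(name)
            && name.indexOf('<') < 0
            && name.indexOf('>') < 0;
    }

    public static void main(String[] args) {
        System.out.println(isUnqualifiedName("value"));
        System.out.println(isUnqualifiedName("java/lang"));
        System.out.println(isValidMethodName("<init>"));
        System.out.println(isValidMethodName("<foo>"));
    }
}
```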

4.3. Descriptors

A descriptor is a string representing the type of a field or method. Descriptors are represented in the class file format using modified UTF-8 strings (§4.4.7) and thus may be drawn, where not further constrained, from the entire Unicode codespace.

4.3.1. Grammar Notation

Descriptors are specified using a grammar. The grammar is a set of productions that describe how sequences of characters can form syntactically correct descriptors of various kinds. Terminal symbols of the grammar are shown in fixed width font. Nonterminal symbols are shown in italic type. The definition of a nonterminal is introduced by the name of the nonterminal being defined, followed by a colon. One or more alternative definitions for the nonterminal then follow on succeeding lines.

The syntax {x} on the right-hand side of a production denotes zero or more occurrences of x.

The phrase (one of) on the right-hand side of a production signifies that each of the terminal symbols on the following line or lines is an alternative definition.

4.3.2. Field Descriptors

A field descriptor represents the type of a class, instance, or local variable.

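FieldDescriptor:
	FieldType

FieldType:
	BaseType
	ObjectType
	ArrayType

BaseType:
	(one of)
	B C D F I J S Z

ObjectType:
	L ClassName ;

ArrayType:
	[ ComponentType

ComponentType:
	FieldType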

The characters of BaseType, the L and ; of ObjectType, and the [ of ArrayType are all ASCII characters.

ClassName represents a binary class or interface name encoded in internal form (§4.2.1).

The interpretation of field descriptors as types is shown in Table 4.3-A.

A field descriptor representing an array type is valid only if it represents a type with 255 or fewer dimensions.

The field descriptor of an instance variable of type int is simply I.

The field descriptor of an instance variable of type Object is Ljava/lang/Object;. Note that the internal form of the binary name for class Object is used.

The field descriptor of an instance variable of the multidimensional array type double[][][] is [[[D.
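To illustrate how a field descriptor maps back to a source-level type, here is a small decoder. It is a sketch under assumptions: FieldDescriptors and toTypeName are hypothetical names, and the mapping of base type characters to byte, char, double, float, int, long, short, and boolean follows Table 4.3-A of the specification, which is not reproduced in this text.

```java
final class FieldDescriptors {
    /** Converts a field descriptor such as [[[D into a readable type name such as double[][][]. */
    static String toTypeName(String descriptor) {
        int dims = 0;
        while (dims < descriptor.length() && descriptor.charAt(dims) == '[') {
            dims++;
        }
        if (dims > 255) {
            throw new IllegalArgumentException("more than 255 array dimensions");
        }
        String rest = descriptor.substring(dims);
        String base;
        if (rest.length() == 1) {
            switch (rest.charAt(0)) {
                case 'B': base = "byte"; break;
                case 'C': base = "char"; break;
                case 'D': base = "double"; break;
                case 'F': base = "float"; break;
                case 'I': base = "int"; break;
                case 'J': base = "long"; break;
                case 'S': base = "short"; break;
                case 'Z': base = "boolean"; break;
                default: throw new IllegalArgumentException("unknown base type: " + rest);
            }
        } else if (rest.length() >= 3 && rest.charAt(0) == 'L' && rest.endsWith(";")) {
            String internal = rest.substring(1, rest.length() - 1);
            // The class name is in internal form, so it cannot contain . ; or [
            if (internal.indexOf('.') >= 0 || internal.indexOf(';') >= 0 || internal.indexOf('[') >= 0) {
                throw new IllegalArgumentException("malformed class name: " + internal);
            }
            base = internal.replace('/', '.');
        } else {
            throw new IllegalArgumentException("malformed descriptor: " + descriptor);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toTypeName("I"));
        System.out.println(toTypeName("Ljava/lang/Object;"));
        System.out.println(toTypeName("[[[D"));
    }
}
```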


4.3.3. Method Descriptors

A method descriptor contains zero or more parameter descriptors, representing the types of parameters that the method takes, and a return descriptor, representing the type of the value (if any) that the method returns.

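MethodDescriptor:
	( {ParameterDescriptor} ) ReturnDescriptor

ParameterDescriptor:
	FieldType

ReturnDescriptor:
	FieldType
	VoidDescriptor

VoidDescriptor:
	V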

The character V indicates that the method returns no value (its result is void).

The method descriptor for the method:

Object m(int i, double d, Thread t) {...}

is:

(IDLjava/lang/Thread;)Ljava/lang/Object;

Note that the internal forms of the binary names of Thread and Object are used.

A method descriptor is valid only if it represents method parameters with a total length of 255 or less, where that length includes the contribution for this in the case of instance or interface method invocations. The total length is calculated by summing the contributions of the individual parameters, where a parameter of type long or double contributes two units to the length and a parameter of any other type contributes one unit.
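The length calculation can be sketched by walking the parameter part of a descriptor. MethodDescriptors, parameterUnits, and fitsLimit are hypothetical names. The sketch counts an array of long or double as one unit, since an array parameter is a reference rather than a long or double value.

```java
final class MethodDescriptors {
    /** Sums the parameter contributions of a method descriptor, not counting `this`. */
    static int parameterUnits(String descriptor) {
        if (!descriptor.startsWith("(")) {
            throw new IllegalArgumentException("not a method descriptor: " + descriptor);
        }
        int i = 1;
        int units = 0;
        while (true) {
            if (i >= descriptor.length()) {
                throw new IllegalArgumentException("missing ')' in " + descriptor);
            }
            char c = descriptor.charAt(i);
            if (c == ')') {
                return units;
            }
            boolean isArray = false;
            while (c == '[') {
                isArray = true;
                i++;
                if (i >= descriptor.length()) {
                    throw new IllegalArgumentException("dangling '[' in " + descriptor);
                }
                c = descriptor.charAt(i);
            }
            if (c == 'L') {
                int semicolon = descriptor.indexOf(';', i);
                if (semicolon < 0) {
                    throw new IllegalArgumentException("missing ';' in " + descriptor);
                }
                i = semicolon + 1;
                units += 1;
            } else if ("BCDFIJSZ".indexOf(c) >= 0) {
                // long and double take two units unless they are array element types.
                units += (!isArray && (c == 'J' || c == 'D')) ? 2 : 1;
                i++;
            } else {
                throw new IllegalArgumentException("unexpected '" + c + "' in " + descriptor);
            }
        }
    }

    /** Applies the 255 limit, adding one unit for `this` on instance or interface methods. */
    static boolean fitsLimit(String descriptor, boolean hasThis) {
        return parameterUnits(descriptor) + (hasThis ? 1 : 0) <= 255;
    }

    public static void main(String[] args) {
        // int (1) + double (2) + Thread (1) = 4
        System.out.println(parameterUnits("(IDLjava/lang/Thread;)Ljava/lang/Object;"));
    }
}
```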

A method descriptor is the same whether the method it describes is a class method or an instance method. Although an instance method is passed this, a reference to the object on which the method is being invoked, in addition to its intended arguments, that fact is not reflected in the method descriptor. The reference to this is passed implicitly by the Java Virtual Machine instructions which invoke instance methods (§2.6.1, §4.11).

4.4. The Constant Pool

Java Virtual Machine instructions do not rely on the run-time layout of classes, interfaces, class instances, or arrays. Instead, instructions refer to symbolic information in the constant_pool table.

All constant_pool table entries have the following general format:

cp_info {
    u1 tag;
    u1 info[];
}

Each item in the constant_pool table must begin with a 1-byte tag indicating the kind of cp_info entry. The contents of the info array vary with the value of tag. The valid tags and their values are listed in Table 4.4-A. Each tag byte must be followed by two or more bytes giving information about the specific constant. The format of the additional information varies with the tag value.

4.4.1. The CONSTANT_Class_info Structure

The CONSTANT_Class_info structure is used to represent a class or an interface:

CONSTANT_Class_info {
    u1 tag;
    u2 name_index;
}

The items of the CONSTANT_Class_info structure are as follows:

tag

The tag item has the value CONSTANT_Class (7).

name_index

The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (§4.4.7) representing a valid binary class or interface name encoded in internal form (§4.2.1).

Because arrays are objects, the opcodes anewarray and multianewarray - but not the opcode new - can reference array “classes” via CONSTANT_Class_info structures in the constant_pool table. For such array classes, the name of the class is the descriptor of the array type (§4.3.2).

For example, the class name representing the two-dimensional array type int[][] is [[I, while the class name representing the type Thread[] is [Ljava/lang/Thread;.

An array type descriptor is valid only if it represents 255 or fewer dimensions.

4.4.2. The CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_InterfaceMethodref_info Structures

Fields, methods, and interface methods are represented by similar structures:

CONSTANT_Fieldref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}

CONSTANT_Methodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}

CONSTANT_InterfaceMethodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}

The items of these structures are as follows:


tag

The tag item of a CONSTANT_Fieldref_info structure has the value CONSTANT_Fieldref (9).

The tag item of a CONSTANT_Methodref_info structure has the value CONSTANT_Methodref (10).

The tag item of a CONSTANT_InterfaceMethodref_info structure has the value CONSTANT_InterfaceMethodref (11).


class_index

The value of the class_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure (§4.4.1) representing a class or interface type that has the field or method as a member.

The class_index item of a CONSTANT_Methodref_info structure must be a class type, not an interface type.

The class_index item of a CONSTANT_InterfaceMethodref_info structure must be an interface type.

The class_index item of a CONSTANT_Fieldref_info structure may be either a class type or an interface type.


name_and_type_index

The value of the name_and_type_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_NameAndType_info structure (§4.4.6). This constant_pool entry indicates the name and descriptor of the field or method.

In a CONSTANT_Fieldref_info, the indicated descriptor must be a field descriptor (§4.3.2). Otherwise, the indicated descriptor must be a method descriptor (§4.3.3).

If the name of the method of a CONSTANT_Methodref_info structure begins with a '<' ('\u003c'), then the name must be the special name <init>, representing an instance initialization method (§2.9). The return type of such a method must be void.

4.4.3. The CONSTANT_String_info Structure

The CONSTANT_String_info structure is used to represent constant objects of the type String:

CONSTANT_String_info {
    u1 tag;
    u2 string_index;
}

The items of the CONSTANT_String_info structure are as follows:

tag

The tag item of the CONSTANT_String_info structure has the value CONSTANT_String (8).

string_index

The value of the string_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (§4.4.7) representing the sequence of Unicode code points to which the String object is to be initialized.

4.4.4. The CONSTANT_Integer_info and CONSTANT_Float_info Structures

The CONSTANT_Integer_info and CONSTANT_Float_info structures represent 4-byte numeric (int and float) constants:

CONSTANT_Integer_info {
    u1 tag;
    u4 bytes;
}

CONSTANT_Float_info {
    u1 tag;
    u4 bytes;
}

The items of these structures are as follows:


tag

The tag item of the CONSTANT_Integer_info structure has the value CONSTANT_Integer (3).

The tag item of the CONSTANT_Float_info structure has the value CONSTANT_Float (4).


bytes

The bytes item of the CONSTANT_Integer_info structure represents the value of the int constant. The bytes of the value are stored in big-endian (high byte first) order.

The bytes item of the CONSTANT_Float_info structure represents the value of the float constant in IEEE 754 floating-point single format (§2.3.2). The bytes of the single format representation are stored in big-endian (high byte first) order.

The value represented by the CONSTANT_Float_info structure is determined as follows. The bytes of the value are first converted into an int constant bits. Then:

If bits is 0x7f800000, the float value will be positive infinity.

If bits is 0xff800000, the float value will be negative infinity.

If bits is in the range 0x7f800001 through 0x7fffffff or in the range 0xff800001 through 0xffffffff, the float value will be NaN.

In all other cases, let s, e, and m be three values that might be computed from bits:

int s = ((bits >> 31) == 0) ? 1 : -1;
int e = ((bits >> 23) & 0xff);
int m = (e == 0) ?
          (bits & 0x7fffff) << 1 :
          (bits & 0x7fffff) | 0x800000;

Then the float value equals the result of the mathematical expression s · m · 2^(e-150).
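The decoding rules for float constants can be checked against Float.intBitsToFloat. In the sketch below, FloatBits and decode are illustrative names. The result is computed in double precision, which holds every float value exactly, and Math.scalb performs the multiplication by a power of two.

```java
final class FloatBits {
    /** Decodes the bytes item of a CONSTANT_Float_info, following the case analysis in the text. */
    static double decode(int bits) {
        if (bits == 0x7f800000) return Double.POSITIVE_INFINITY;
        if (bits == 0xff800000) return Double.NEGATIVE_INFINITY;
        // Covers both NaN ranges: any magnitude above the infinity bit pattern.
        if ((bits & 0x7fffffff) > 0x7f800000) return Double.NaN;

        int s = ((bits >> 31) == 0) ? 1 : -1;
        int e = ((bits >> 23) & 0xff);
        int m = (e == 0)
            ? (bits & 0x7fffff) << 1
            : (bits & 0x7fffff) | 0x800000;
        // s * m * 2^(e-150); multiplying by s as a double keeps the sign of zero.
        return s * Math.scalb((double) m, e - 150);
    }

    public static void main(String[] args) {
        int bits = Float.floatToIntBits(3.5f);
        System.out.println(decode(bits) == Float.intBitsToFloat(bits));
    }
}
```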

4.4.5. The CONSTANT_Long_info and CONSTANT_Double_info Structures

The CONSTANT_Long_info and CONSTANT_Double_info structures represent 8-byte numeric (long and double) constants:

CONSTANT_Long_info {
    u1 tag;
    u4 high_bytes;
    u4 low_bytes;
}

CONSTANT_Double_info {
    u1 tag;
    u4 high_bytes;
    u4 low_bytes;
}

All 8-byte constants take up two entries in the constant_pool table of the class file. If a CONSTANT_Long_info or CONSTANT_Double_info structure is the item in the constant_pool table at index n, then the next usable item in the pool is located at index n+2. The constant_pool index n+1 must be valid but is considered unusable.
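The indexing rule can be sketched as a walk over the tags in file order, assigning each entry its constant_pool index. ConstantPoolIndices and its methods are hypothetical names; the tag values 5 and 6 are those of CONSTANT_Long and CONSTANT_Double.

```java
import java.util.Arrays;

final class ConstantPoolIndices {
    static final int CONSTANT_Long = 5;
    static final int CONSTANT_Double = 6;

    /** Number of constant_pool indices consumed by an entry with the given tag. */
    static int slotsFor(int tag) {
        return (tag == CONSTANT_Long || tag == CONSTANT_Double) ? 2 : 1;
    }

    /** Returns the constant_pool index of each entry, given the tags in file order. */
    static int[] indicesFor(int[] tagsInFileOrder) {
        int[] indices = new int[tagsInFileOrder.length];
        int next = 1; // the constant_pool table is indexed from 1
        for (int i = 0; i < tagsInFileOrder.length; i++) {
            indices[i] = next;
            next += slotsFor(tagsInFileOrder[i]);
        }
        return indices;
    }

    /** The constant_pool_count implied by the tags: the first index past the last entry. */
    static int constantPoolCount(int[] tagsInFileOrder) {
        int next = 1;
        for (int tag : tagsInFileOrder) {
            next += slotsFor(tag);
        }
        return next;
    }

    public static void main(String[] args) {
        int[] tags = {7, 5, 1}; // Class, Long, Utf8
        System.out.println(Arrays.toString(indicesFor(tags))); // [1, 2, 4]
        System.out.println(constantPoolCount(tags));           // 5
    }
}
```

In this example index 3 is the unusable slot that follows the long constant at index 2.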

In retrospect, making 8-byte constants take two constant pool entries was a poor choice.

The items of these structures are as follows:


tag

The tag item of the CONSTANT_Long_info structure has the value CONSTANT_Long (5).

The tag item of the CONSTANT_Double_info structure has the value CONSTANT_Double (6).

high_bytes, low_bytes

The unsigned high_bytes and low_bytes items of the CONSTANT_Long_info structure together represent the value of the long constant

((long) high_bytes << 32) + low_bytes

where the bytes of each of high_bytes and low_bytes are stored in big-endian (high byte first) order.
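Because Java has no unsigned 32-bit type, a direct transcription of the expression above needs care when the items are held in int variables. The sketch below assumes both u4 items were read with readInt; LongConstant and combine are illustrative names.

```java
final class LongConstant {
    /** Combines high_bytes and low_bytes, both read as signed ints, into the long value. */
    static long combine(int highBytes, int lowBytes) {
        // Mask low_bytes so a set top bit is not sign-extended into the upper half.
        return ((long) highBytes << 32) + (lowBytes & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        System.out.println(combine(0x7FFFFFFF, 0xFFFFFFFF) == Long.MAX_VALUE);
    }
}
```

Without the mask, a low_bytes value with its top bit set would be sign-extended and subtract from the high half instead of adding to it.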

The high_bytes and low_bytes items of the CONSTANT_Double_info structure together represent the double value in IEEE 754 floating-point double format (§2.3.2). The bytes of each item are stored in big-endian (high byte first) order.

The value represented by the CONSTANT_Double_info structure is determined as follows. The high_bytes and low_bytes items are converted into the long constant bits, which is equal to

((long) high_bytes << 32) + low_bytes

Then:


  • If bits is 0x7ff0000000000000L, the double value will be positive infinity.
  • If bits is 0xfff0000000000000L, the double value will be negative infinity.
  • If bits is in the range 0x7ff0000000000001L through 0x7fffffffffffffffL or in the range 0xfff0000000000001L through 0xffffffffffffffffL, the double value will be NaN.
  • In all other cases, let s, e, and m be three values that might be computed from bits:
    int s = ((bits >> 63) == 0) ? 1 : -1;
    int e = (int)((bits >> 52) & 0x7ffL);
    long m = (e == 0) ?
               (bits & 0xfffffffffffffL) << 1 :
               (bits & 0xfffffffffffffL) | 0x10000000000000L;

Then the floating-point value equals the double value of the mathematical expression s · m · 2^(e-1075).
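The same case analysis for double constants can be checked against Double.longBitsToDouble. DoubleBits and decode are illustrative names. The sketch uses Math.scalb rather than Math.pow because 2^(-1075) is smaller than the smallest positive double and would underflow to zero if computed on its own.

```java
final class DoubleBits {
    /** Decodes the combined high_bytes/low_bytes of a CONSTANT_Double_info, as described in the text. */
    static double decode(long bits) {
        if (bits == 0x7ff0000000000000L) return Double.POSITIVE_INFINITY;
        if (bits == 0xfff0000000000000L) return Double.NEGATIVE_INFINITY;
        // Covers both NaN ranges: any magnitude above the infinity bit pattern.
        if ((bits & 0x7fffffffffffffffL) > 0x7ff0000000000000L) return Double.NaN;

        int s = ((bits >> 63) == 0) ? 1 : -1;
        int e = (int) ((bits >> 52) & 0x7ffL);
        long m = (e == 0)
            ? (bits & 0xfffffffffffffL) << 1
            : (bits & 0xfffffffffffffL) | 0x10000000000000L;
        // s * m * 2^(e-1075); m is below 2^54 and is even whenever it reaches 2^53, so the cast is exact.
        return s * Math.scalb((double) m, e - 1075);
    }

    public static void main(String[] args) {
        long bits = Double.doubleToLongBits(Math.PI);
        System.out.println(decode(bits) == Math.PI);
    }
}
```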

4.4.6. The CONSTANT_NameAndType_info Structure

The CONSTANT_NameAndType_info structure is used to represent a field or method, without indicating which class or interface type it belongs to:

CONSTANT_NameAndType_info {
    u1 tag;
    u2 name_index;
    u2 descriptor_index;
}

The items of the CONSTANT_NameAndType_info structure are as follows:

tag

The tag item of the CONSTANT_NameAndType_info structure has the value CONSTANT_NameAndType (12).

name_index

The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (§4.4.7) representing either the special method name <init> (§2.9) or a valid unqualified name denoting a field or method (§4.2.2).

descriptor_index

The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (§4.4.7) representing a valid field descriptor or method descriptor (§4.3.2, §4.3.3).

4.4.7. The CONSTANT_Utf8_info Structure

4.4.8. The CONSTANT_MethodHandle_info Structure

4.4.9. The CONSTANT_MethodType_info Structure

4.4.10. The CONSTANT_InvokeDynamic_info Structure

4.5. Fields

4.6. Methods

4.7. Attributes

4.7.1. Defining and Naming New Attributes

4.7.2. The ConstantValue Attribute

The ConstantValue attribute is a fixed-length attribute in the attributes table of a field_info structure (§4.5). A ConstantValue attribute represents the value of a constant expression (JLS §15.28), and is used as follows:

  • If the ACC_STATIC flag in the access_flags item of the field_info structure is set, then the field represented by the field_info structure is assigned the value represented by its ConstantValue attribute as part of the initialization of the class or interface declaring the field (§5.5). This occurs prior to the invocation of the class or interface initialization method of that class or interface (§2.9).
  • Otherwise, the Java Virtual Machine must silently ignore the attribute.

There may be at most one ConstantValue attribute in the attributes table of a field_info structure.

The ConstantValue attribute has the following format:

ConstantValue_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 constantvalue_index;
}

The items of the ConstantValue_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (§4.4.7) representing the string “ConstantValue”.

attribute_length

The value of the attribute_length item of a ConstantValue_attribute structure must be two.

constantvalue_index

The value of the constantvalue_index item must be a valid index into the constant_pool table. The constant_pool entry at that index gives the constant value represented by this attribute. The constant_pool entry must be of a type appropriate to the field, as specified in Table 4.7.2-A.

4.7.3. The Code Attribute

The Code attribute is a variable-length attribute in the attributes table of a method_info structure (§4.6). A Code attribute contains the Java Virtual Machine instructions and auxiliary information for a method, including an instance initialization method or a class or interface initialization method (§2.9).
If the method is either native or abstract, its method_info structure must not have a Code attribute in its attributes table. Otherwise, its method_info structure must have exactly one Code attribute in its attributes table.

Code_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 max_stack;
    u2 max_locals;
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u2 handler_pc;
        u2 catch_type;
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

4.7.4. The StackMapTable Attribute

StackMapTable_attribute {
    u2              attribute_name_index;
    u4              attribute_length;
    u2              number_of_entries;
    stack_map_frame entries[number_of_entries];
}

4.7.5. The Exceptions Attribute

Exceptions_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 number_of_exceptions;
    u2 exception_index_table[number_of_exceptions];
}

4.7.6. The InnerClasses Attribute

InnerClasses_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 number_of_classes;
    {   u2 inner_class_info_index;
        u2 outer_class_info_index;
        u2 inner_name_index;
        u2 inner_class_access_flags;
    } classes[number_of_classes];
}

4.7.7. The EnclosingMethod Attribute

EnclosingMethod_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 class_index;
    u2 method_index;
}

4.7.8. The Synthetic Attribute

Synthetic_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
}

4.7.9. The Signature Attribute

Signature_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 signature_index;
}

Signatures

4.7.10. The SourceFile Attribute

SourceFile_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 sourcefile_index;
}

4.7.11. The SourceDebugExtension Attribute

SourceDebugExtension_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u1 debug_extension[attribute_length];
}

4.7.12. The LineNumberTable Attribute

LineNumberTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 line_number_table_length;
    {   u2 start_pc;
        u2 line_number;
    } line_number_table[line_number_table_length];
}

4.7.13. The LocalVariableTable Attribute

LocalVariableTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 local_variable_table_length;
    {   u2 start_pc;
        u2 length;
        u2 name_index;
        u2 descriptor_index;
        u2 index;
    } local_variable_table[local_variable_table_length];
}

4.7.14. The LocalVariableTypeTable Attribute

LocalVariableTypeTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 local_variable_type_table_length;
    {   u2 start_pc;
        u2 length;
        u2 name_index;
        u2 signature_index;
        u2 index;
    } local_variable_type_table[local_variable_type_table_length];
}

4.7.15. The Deprecated Attribute

Deprecated_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
}

4.7.16. The RuntimeVisibleAnnotations Attribute

RuntimeVisibleAnnotations_attribute {
    u2         attribute_name_index;
    u4         attribute_length;
    u2         num_annotations;
    annotation annotations[num_annotations];
}

The element_value structure

element_value {
    u1 tag;
    union {
        u2 const_value_index;

        {   u2 type_name_index;
            u2 const_name_index;
        } enum_const_value;

        u2 class_info_index;

        annotation annotation_value;

        {   u2            num_values;
            element_value values[num_values];
        } array_value;
    } value;
}

4.7.17. The RuntimeInvisibleAnnotations Attribute

RuntimeInvisibleAnnotations_attribute {
    u2         attribute_name_index;
    u4         attribute_length;
    u2         num_annotations;
    annotation annotations[num_annotations];
}

4.7.18. The RuntimeVisibleParameterAnnotations Attribute

RuntimeVisibleParameterAnnotations_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u1 num_parameters;
    {   u2         num_annotations;
        annotation annotations[num_annotations];
    } parameter_annotations[num_parameters];
}

4.7.19. The RuntimeInvisibleParameterAnnotations Attribute

RuntimeInvisibleParameterAnnotations_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u1 num_parameters;
    {   u2         num_annotations;
        annotation annotations[num_annotations];
    } parameter_annotations[num_parameters];
}

4.7.20. The RuntimeVisibleTypeAnnotations Attribute

RuntimeVisibleTypeAnnotations_attribute {
    u2              attribute_name_index;
    u4              attribute_length;
    u2              num_annotations;
    type_annotation annotations[num_annotations];
}

The target_info union

The type_path structure

4.7.21. The RuntimeInvisibleTypeAnnotations Attribute

RuntimeInvisibleTypeAnnotations_attribute {
    u2              attribute_name_index;
    u4              attribute_length;
    u2              num_annotations;
    type_annotation annotations[num_annotations];
}

4.7.22. The AnnotationDefault Attribute

AnnotationDefault_attribute {
    u2            attribute_name_index;
    u4            attribute_length;
    element_value default_value;
}

4.7.23. The BootstrapMethods Attribute

BootstrapMethods_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 num_bootstrap_methods;
    {   u2 bootstrap_method_ref;
        u2 num_bootstrap_arguments;
        u2 bootstrap_arguments[num_bootstrap_arguments];
    } bootstrap_methods[num_bootstrap_methods];
}

4.7.24. The MethodParameters Attribute

MethodParameters_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u1 parameters_count;
    {   u2 name_index;
        u2 access_flags;
    } parameters[parameters_count];
}

4.8. Format Checking

When a prospective class file is loaded by the Java Virtual Machine (§5.3), the Java Virtual Machine first ensures that the file has the basic format of a class file (§4.1). This process is known as format checking. The checks are as follows:

  • The first four bytes must contain the right magic number.
  • All recognized attributes must be of the proper length.
  • The class file must not be truncated or have extra bytes at the end.
  • The constant pool must satisfy the constraints documented throughout §4.4.
    For example, each CONSTANT_Class_info structure in the constant pool must contain in its name_index item a valid constant pool index for a CONSTANT_Utf8_info structure.
  • All field references and method references in the constant pool must have valid names, valid classes, and valid descriptors (§4.3).
    Format checking does not ensure that the given field or method actually exists in the given class, nor that the descriptors given refer to real classes. Format checking ensures only that these items are well formed. More detailed checking is performed when the bytecodes themselves are verified, and during resolution.

These checks for basic class file integrity are necessary for any interpretation of the class file contents. Format checking is distinct from bytecode verification, although historically they have been confused because both are a form of integrity check.
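As a minimal sketch of the first check in the list above, the following code tests whether a byte array begins with the class file magic number 0xCAFEBABE (the value given in §4.1). The class name MagicCheck is hypothetical, and the sketch covers only this one check, not the full format-checking process.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; covers only the magic-number check.
final class MagicCheck {
    static final int MAGIC = 0xCAFEBABE; // class file magic number (§4.1)

    /** Returns true if the first four bytes form the class file magic number. */
    static boolean hasClassFileMagic(byte[] bytes) {
        if (bytes.length < 4) {
            return false; // too short to hold the magic item at all
        }
        // ByteBuffer reads big-endian by default, matching the class file format.
        return ByteBuffer.wrap(bytes).getInt() == MAGIC;
    }
}
```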

4.9. Constraints on Java Virtual Machine Code

The code for a method, instance initialization method, or class or interface initialization method (§2.9) is stored in the code array of the Code attribute of a method_info structure of a class file (§4.7.3). This section describes the constraints associated with the contents of the Code_attribute structure.

4.9.1. Static Constraints

The static constraints on a class file are those defining the well-formedness of the file. These constraints have been given in the previous sections, except for static constraints on the code in the class file. The static constraints on the code in a class file specify how Java Virtual Machine instructions must be laid out in the code array and what the operands of individual instructions must be.

The static constraints on the instructions in the code array are as follows:

  • Only instances of the instructions documented in §6.5 may appear in the code array. Instances of instructions using the reserved opcodes (§6.2) or any opcodes not documented in this specification must not appear in the code array.
  • If the class file version number is 51.0 or above, then neither the jsr opcode nor the jsr_w opcode may appear in the code array.

  • The opcode of the first instruction in the code array begins at index 0.

  • For each instruction in the code array except the last, the index of the opcode of the next instruction equals the index of the opcode of the current instruction plus the length of that instruction, including all its operands.
    The wide instruction is treated like any other instruction for these purposes; the opcode specifying the operation that a wide instruction is to modify is treated as one of the operands of that wide instruction. That opcode must never be directly reachable by the computation.

  • The last byte of the last instruction in the code array must be the byte at index code_length - 1.

The static constraints on the operands of instructions in the code array are as follows:

  • The target of each jump and branch instruction (jsr, jsr_w, goto, goto_w, ifeq, ifne, ifle, iflt, ifge, ifgt, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmple, if_icmplt, if_icmpge, if_icmpgt, if_acmpeq, if_acmpne) must be the opcode of an instruction within this method.
    The target of a jump or branch instruction must never be the opcode used to specify the operation to be modified by a wide instruction; a jump or branch target may be the wide instruction itself.
  • Each target, including the default, of each tableswitch instruction must be the opcode of an instruction within this method.
    Each tableswitch instruction must have a number of entries in its jump table that is consistent with the value of its low and high jump table operands, and its low value must be less than or equal to its high value.
    No target of a tableswitch instruction may be the opcode used to specify the operation to be modified by a wide instruction; a tableswitch target may be a wide instruction itself.
  • Each target, including the default, of each lookupswitch instruction must be the opcode of an instruction within this method.
    Each lookupswitch instruction must have a number of match-offset pairs that is consistent with the value of its npairs operand. The match-offset pairs must be sorted in increasing numerical order by signed match value.
    No target of a lookupswitch instruction may be the opcode used to specify the operation to be modified by a wide instruction; a lookupswitch target may be a wide instruction itself.
  • The operand of each ldc instruction and each ldc_w instruction must be a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type:
    1. CONSTANT_Integer, CONSTANT_Float, or CONSTANT_String if the class file version number is less than 49.0.
    2. CONSTANT_Integer, CONSTANT_Float, CONSTANT_String, or CONSTANT_Class if the class file version number is 49.0 or 50.0.
    3. CONSTANT_Integer, CONSTANT_Float, CONSTANT_String, CONSTANT_Class, CONSTANT_MethodType, or CONSTANT_MethodHandle if the class file version number is 51.0 or above.
  • The operands of each ldc2_w instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Long or CONSTANT_Double.
    The subsequent constant pool index must also be a valid index into the constant pool, and the constant pool entry at that index must not be used.
  • The operands of each getfield, putfield, getstatic, and putstatic instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Fieldref.
  • The indexbyte operands of each invokevirtual instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Methodref.
  • The indexbyte operands of each invokespecial and invokestatic instruction must represent a valid index into the constant_pool table. If the class file version number is less than 52.0, the constant pool entry referenced by that index must be of type CONSTANT_Methodref; if the class file version number is 52.0 or above, the constant pool entry referenced by that index must be of type CONSTANT_Methodref or CONSTANT_InterfaceMethodref.
  • The indexbyte operands of each invokeinterface instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_InterfaceMethodref.
    The value of the count operand of each invokeinterface instruction must reflect the number of local variables necessary to store the arguments to be passed to the interface method, as implied by the descriptor of the CONSTANT_NameAndType_info structure referenced by the CONSTANT_InterfaceMethodref constant pool entry.
    The fourth operand byte of each invokeinterface instruction must have the value zero.
  • The indexbyte operands of each invokedynamic instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_InvokeDynamic.
    The third and fourth operand bytes of each invokedynamic instruction must have the value zero.
  • Only the invokespecial instruction is allowed to invoke an instance initialization method (§2.9).
    No other method whose name begins with the character '<' ('\u003c') may be called by the method invocation instructions. In particular, the class or interface initialization method specially named <clinit> is never called explicitly from Java Virtual Machine instructions, but only implicitly by the Java Virtual Machine itself.
  • The operands of each instanceof, checkcast, new, and anewarray instruction, and the indexbyte operands of each multianewarray instruction, must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Class.
  • No new instruction may reference a constant pool entry of type CONSTANT_Class that represents an array type (§4.3.2). The new instruction cannot be used to create an array.
  • No anewarray instruction may be used to create an array of more than 255 dimensions.
  • A multianewarray instruction must be used only to create an array of a type that has at least as many dimensions as the value of its dimensions operand. That is, while a multianewarray instruction is not required to create all of the dimensions of the array type referenced by its indexbyte operands, it must not attempt to create more dimensions than are in the array type.
    The dimensions operand of each multianewarray instruction must not be zero.
  • The atype operand of each newarray instruction must take one of the values T_BOOLEAN (4), T_CHAR (5), T_FLOAT (6), T_DOUBLE (7), T_BYTE (8), T_SHORT (9), T_INT (10), or T_LONG (11).
  • The index operand of each iload, fload, aload, istore, fstore, astore, iinc, and ret instruction must be a non-negative integer no greater than max_locals - 1.
  • The implicit index of each iload_<n>, fload_<n>, aload_<n>, istore_<n>, fstore_<n>, and astore_<n> instruction must be no greater than max_locals - 1.
  • The index operand of each lload, dload, lstore, and dstore instruction must be no greater than max_locals - 2.
    The implicit index of each lload_<n>, dload_<n>, lstore_<n>, and dstore_<n> instruction must be no greater than max_locals - 2.
  • The indexbyte operands of each wide instruction modifying an iload, fload, aload, istore, fstore, astore, iinc, or ret instruction must represent a non-negative integer no greater than max_locals - 1.
    The indexbyte operands of each wide instruction modifying an lload, dload, lstore, or dstore instruction must represent a non-negative integer no greater than max_locals - 2.
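A few of the operand constraints above can be sketched as simple predicates. The class OperandChecks, its enum, and its method names are hypothetical. The sketch looks only at the major version number, treats "consistent with low and high" as exactly high - low + 1 entries, and rejects duplicate lookupswitch match values; those choices are assumptions of this example rather than statements quoted from the text.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper for illustration; names are not from the specification.
final class OperandChecks {
    enum ConstantKind { INTEGER, FLOAT, STRING, CLASS, METHOD_TYPE, METHOD_HANDLE }

    /** Constant kinds that ldc and ldc_w may reference, by major version only. */
    static Set<ConstantKind> ldcKinds(int majorVersion) {
        EnumSet<ConstantKind> kinds =
                EnumSet.of(ConstantKind.INTEGER, ConstantKind.FLOAT, ConstantKind.STRING);
        if (majorVersion >= 49) {
            kinds.add(ConstantKind.CLASS);
        }
        if (majorVersion >= 51) {
            kinds.add(ConstantKind.METHOD_TYPE);
            kinds.add(ConstantKind.METHOD_HANDLE);
        }
        return kinds;
    }

    /** tableswitch: low <= high, and the jump table has high - low + 1 entries. */
    static boolean validTableswitch(int low, int high, int entryCount) {
        return low <= high && (long) high - (long) low + 1 == entryCount;
    }

    /** lookupswitch: signed match values in increasing order (duplicates rejected here). */
    static boolean validLookupswitch(int[] matches) {
        for (int i = 1; i < matches.length; i++) {
            if (matches[i - 1] >= matches[i]) {
                return false;
            }
        }
        return true;
    }

    /** Local index bound: max_locals - 1 for one-slot types, max_locals - 2 for long and double. */
    static boolean validLocalIndex(int index, int maxLocals, boolean twoSlots) {
        return index >= 0 && index <= maxLocals - (twoSlots ? 2 : 1);
    }
}
```

For example, with max_locals equal to 4, index 3 is acceptable for an iload but not for an lload, because the long value would also need index 4.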

4.9.2. Structural Constraints

4.10. Verification of class Files

Even though a compiler for the Java programming language must only produce class files that satisfy all the static and structural constraints in the previous sections, the Java Virtual Machine has no guarantee that any file it is asked to load was generated by that compiler or is properly formed. Applications such as web browsers do not download source code, which they then compile; these applications download already-compiled class files. The browser needs to determine whether the class file was produced by a trustworthy compiler or by an adversary attempting to exploit the Java Virtual Machine.

An additional problem with compile-time checking is version skew. A user may have successfully compiled a class, say PurchaseStockOptions, to be a subclass of TradingClass. But the definition of TradingClass might have changed since the time the class was compiled in a way that is not compatible with pre-existing binaries. Methods might have been deleted or had their return types or modifiers changed. Fields might have changed types or changed from instance variables to class variables. The access modifiers of a method or variable may have changed from public to private. For a discussion of these issues, see Chapter 13, “Binary Compatibility,” in The Java Language Specification, Java SE 8 Edition.

Because of these potential problems, the Java Virtual Machine needs to verify for itself that the desired constraints are satisfied by the class files it attempts to incorporate. A Java Virtual Machine implementation verifies that each class file satisfies the necessary constraints at linking time (§5.4).
Link-time verification enhances the performance of the run-time interpreter. Expensive checks that would otherwise have to be performed to verify constraints at run time for each interpreted instruction can be eliminated. The Java Virtual Machine can assume that these checks have already been performed. For example, the Java Virtual Machine will already know the following:

  • There are no operand stack overflows or underflows.

  • All local variable uses and stores are valid.

  • The arguments to all the Java Virtual Machine instructions are of valid types.

There are two strategies that Java Virtual Machine implementations may use for verification:

  • Verification by type checking must be used to verify class files whose version number is greater than or equal to 50.0.

  • Verification by type inference must be supported by all Java Virtual Machine implementations, except those conforming to the Java ME CLDC and Java Card profiles, in order to verify class files whose version number is less than 50.0.
    Verification on Java Virtual Machine implementations supporting the Java ME CLDC and Java Card profiles is governed by their respective specifications.

In both strategies, verification is mainly concerned with enforcing the static and structural constraints from §4.9 on the code array of the Code attribute (§4.7.3). However, there are three additional checks outside the Code attribute which must be performed during verification:

  • Ensuring that final classes are not subclassed.
  • Ensuring that final methods are not overridden (§5.4.5).
  • Checking that every class (except Object) has a direct superclass.

4.10.1. Verification by Type Checking

Accessors for Java Virtual Machine Artifacts

Verification Type System

The type checker enforces a type system based upon a hierarchy of verification types, illustrated below.

Verification type hierarchy:

                             top
                 ____________/\____________
                /                          \
               /                            \
            oneWord                       twoWord
           /   |   \                     /       \
          /    |    \                   /         \
        int  float  reference        long        double
                     /     \
                    /       \_____________
                   /                      \
                  /                        \
           uninitialized                    +------------------+
            /         \                     |  Java reference  |
           /           \                    |  type hierarchy  |
uninitializedThis  uninitialized(Offset)    +------------------+
                                                     |
                                                     |
                                                    null

Instruction Representation

Stack Map Frame Representation

Type Checking Abstract and Native Methods

Type Checking Methods with Code

Type Checking Load and Store Instructions

Type Checking for protected Members

Type Checking Instructions

4.10.2. Verification by Type Inference

The Process of Verification by Type Inference

During linking, the verifier checks the code array of the Code attribute for each method of the class file by performing data-flow analysis on each method. The verifier ensures that at any given point in the program, no matter what code path is taken to reach that point, all of the following are true:

  • The operand stack is always the same size and contains the same types of values.
  • No local variable is accessed unless it is known to contain a value of an appropriate type.
  • Methods are invoked with the appropriate arguments.
  • Fields are assigned only using values of appropriate types.
  • All opcodes have appropriately typed arguments on the operand stack and in the local variable array.

For efficiency reasons, certain tests that could in principle be performed by the verifier are delayed until the first time the code for the method is actually invoked. In so doing, the verifier avoids loading class files unless it has to.

For example, if a method invokes another method that returns an instance of class A, and that instance is assigned only to a field of the same type, the verifier does not bother to check if the class A actually exists. However, if it is assigned to a field of the type B, the definitions of both A and B must be loaded in to ensure that A is a subclass of B.

The Bytecode Verifier

The code for each method is verified independently. First, the bytes that make up the code are broken up into a sequence of instructions, and the index into the code array of the start of each instruction is placed in an array. The verifier then goes through the code a second time and parses the instructions. During this pass a data structure is built to hold information about each Java Virtual Machine instruction in the method. The operands, if any, of each instruction are checked to make sure they are valid. For instance:

  • Branches must be within the bounds of the code array for the method.
  • The targets of all control-flow instructions are each the start of an instruction. In the case of a wide instruction, the wide opcode is considered the start of the instruction, and the opcode giving the operation modified by that wide instruction is not considered to start an instruction. Branches into the middle of an instruction are disallowed.
  • No instruction can access or modify a local variable at an index greater than or equal to the number of local variables that its method indicates it allocates.
  • All references to the constant pool must be to an entry of the appropriate type. (For example, the instruction getfield must reference a field.)
  • The code does not end in the middle of an instruction.
  • Execution cannot fall off the end of the code.
  • For each exception handler, the starting and ending point of code protected by the handler must be at the beginning of an instruction or, in the case of the ending point, immediately past the end of the code. The starting point must be before the ending point. The exception handler code must start at a valid instruction, and it must not start at an opcode being modified by the wide instruction.
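The exception handler rule in the last item can be sketched as a predicate over the set of instruction start indices collected in the first pass. The class HandlerRangeCheck and its parameter names are hypothetical. The set is assumed to contain the index of each wide opcode but not the index of the opcode that the wide instruction modifies.

```java
import java.util.Set;

// Hypothetical helper for illustration; not taken from any verifier implementation.
final class HandlerRangeCheck {
    /**
     * instructionStarts holds every index in the code array at which an
     * instruction begins. codeLength is the length of the code array.
     */
    static boolean valid(Set<Integer> instructionStarts, int codeLength,
                         int startPc, int endPc, int handlerPc) {
        boolean startOk = instructionStarts.contains(startPc);
        // The end point may also lie immediately past the end of the code.
        boolean endOk = instructionStarts.contains(endPc) || endPc == codeLength;
        boolean handlerOk = instructionStarts.contains(handlerPc);
        return startOk && endOk && handlerOk && startPc < endPc;
    }
}
```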

Values of Types long and double

Values of the long and double types are treated specially by the verification process.

Whenever a value of type long or double is moved into a local variable at index n, index n+1 is specially marked to indicate that it has been reserved by the value at index n and must not be used as a local variable index. Any value previously at index n+1 becomes unusable.

Whenever a value is moved to a local variable at index n, the index n-1 is examined to see if it is the index of a value of type long or double. If so, the local variable at index n-1 is changed to indicate that it now contains an unusable value. Since the local variable at index n has been overwritten, the local variable at index n-1 cannot represent a value of type long or double.

Dealing with values of types long or double on the operand stack is simpler; the verifier treats them as single values on the stack. For example, the verification code for the dadd opcode (add two double values) checks that the top two items on the stack are both of type double. When calculating operand stack length, values of type long and double have length two.

Untyped instructions that manipulate the operand stack must treat values of type long and double as atomic (indivisible). For example, the verifier reports a failure if the top value on the stack is a double and it encounters an instruction such as pop or dup. The instructions pop2 or dup2 must be used instead.
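The two local variable rules described above can be sketched as a single store operation on an array of tracked types. The class LocalSlots, the VType enum, and the RESERVED and UNUSABLE markers are hypothetical names chosen for this example. The sketch implements only the two rules stated in the text and is not a complete verifier model.

```java
// Hypothetical model for illustration; names are not from the specification.
final class LocalSlots {
    enum VType { INT, FLOAT, REFERENCE, LONG, DOUBLE, RESERVED, UNUSABLE }

    static boolean isTwoSlot(VType t) {
        return t == VType.LONG || t == VType.DOUBLE;
    }

    /** Records that a value of the given type is stored at local variable index n. */
    static void store(VType[] locals, int n, VType type) {
        int last = isTwoSlot(type) ? n + 1 : n;
        if (n < 0 || last >= locals.length) {
            throw new IllegalArgumentException("index out of range: " + n);
        }
        // Overwriting the second half of a long or double invalidates index n-1.
        if (n > 0 && isTwoSlot(locals[n - 1])) {
            locals[n - 1] = VType.UNUSABLE;
        }
        locals[n] = type;
        // A long or double reserves index n+1; whatever was there is lost.
        if (isTwoSlot(type)) {
            locals[n + 1] = VType.RESERVED;
        }
    }
}
```

Storing a long at index 0 and then an int at index 1 leaves index 0 unusable, because the int has overwritten the slot that the long had reserved.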

Instance Initialization Methods and Newly Created Objects

Exceptions and finally

To implement the try-finally construct, a compiler for the Java programming language that generates class files with version number 50.0 or below may use the exception-handling facilities together with two special instructions: jsr (“jump to subroutine”) and ret (“return from subroutine”). The finally clause is compiled as a subroutine within the Java Virtual Machine code for its method, much like the code for an exception handler. When a jsr instruction that invokes the subroutine is executed, it pushes its return address, the address of the instruction after the jsr that is being executed, onto the operand stack as a value of type returnAddress. The code for the subroutine stores the return address in a local variable. At the end of the subroutine, a ret instruction fetches the return address from the local variable and transfers control to the instruction at the return address.

Control can be transferred to the finally clause (the finally subroutine can be invoked) in several different ways. If the try clause completes normally, the finally subroutine is invoked via a jsr instruction before evaluating the next expression. A break or continue inside the try clause that transfers control outside the try clause executes a jsr to the code for the finally clause first. If the try clause executes a return, the compiled code does the following:

  1. Saves the return value (if any) in a local variable.
  2. Executes a jsr to the code for the finally clause.
  3. Upon return from the finally clause, returns the value saved in the local variable.

The compiler sets up a special exception handler, which catches any exception thrown by the try clause. If an exception is thrown in the try clause, this exception handler does the following:

  1. Saves the exception in a local variable.
  2. Executes a jsr to the finally clause.
  3. Upon return from the finally clause, rethrows the exception.

For more information about the implementation of the try-finally construct, see §3.13.
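The two step lists above have effects that are visible at the source level, whichever compilation scheme is used. The following sketch uses a hypothetical class FinallyDemo to show that a return value is saved before the finally code runs, and that an exception is rethrown after the finally code completes.

```java
// Hypothetical demo class for illustration.
final class FinallyDemo {
    /** The returned value is saved before the finally clause executes. */
    static int returnThroughFinally() {
        int x = 1;
        try {
            return x;  // the value 1 is saved for the return
        } finally {
            x = 2;     // runs before the method returns; the saved value is unchanged
        }
    }

    /** The exception propagates only after the finally clause has run. */
    static String rethrowThroughFinally(StringBuilder log) {
        try {
            try {
                throw new IllegalStateException("boom");
            } finally {
                log.append("finally;");
            }
        } catch (IllegalStateException e) {
            log.append("caught ").append(e.getMessage());
        }
        return log.toString();
    }
}
```

Here returnThroughFinally() returns 1, and rethrowThroughFinally(new StringBuilder()) returns "finally;caught boom".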

The code for the finally clause presents a special problem to the verifier. Usually, if a particular instruction can be reached via multiple paths and a particular local variable contains incompatible values through those multiple paths, then the local variable becomes unusable. However, a finally clause might be called from several different places, yielding several different circumstances:

  • The invocation from the exception handler may have a certain local variable that contains an exception.
  • The invocation to implement return may have some local variable that contains the return value.
  • The invocation from the bottom of the try clause may have an indeterminate value in that same local variable.

The code for the finally clause itself might pass verification, but after updating all the successors of the ret instruction, the verifier would note that the local variable that the exception handler expects to hold an exception, or that the return code expects to hold a return value, now contains an indeterminate value.

Verifying code that contains a finally clause is complicated. The basic idea is the following:

  • Each instruction keeps track of the list of jsr targets needed to reach that instruction. For most code, this list is empty. For instructions inside code for the finally clause, it is of length one. For multiply nested finally code (extremely rare!), it may be longer than one.

  • For each instruction and each jsr needed to reach that instruction, a bit vector is maintained of all local variables accessed or modified since the execution of the jsr instruction.

  • When executing the ret instruction, which implements a return from a subroutine, there must be only one possible subroutine from which the instruction can be returning. Two different subroutines cannot “merge” their execution to a single ret instruction.

  • To perform the data-flow analysis on a ret instruction, a special procedure is used. Since the verifier knows the subroutine from which the instruction must be returning, it can find all the jsr instructions that call the subroutine and merge the state of the operand stack and local variable array at the time of the ret instruction into the operand stack and local variable array of the instructions following the jsr. Merging uses a special set of values for local variables:

      • For any local variable that the bit vector (constructed above) indicates has been accessed or modified by the subroutine, use the type of the local variable at the time of the ret.
      • For other local variables, use the type of the local variable before the jsr instruction.
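The merging rule for local variables at a ret instruction can be sketched as follows. The class RetMerge is hypothetical, and verification types are represented as plain strings purely for illustration. The sketch covers only the choice between the two sources of a local variable's type, not the subsequent merge into the successor's existing state.

```java
import java.util.BitSet;

// Hypothetical helper for illustration; types are modelled as plain strings.
final class RetMerge {
    /**
     * localsAtRet: types of the local variables at the ret instruction.
     * localsBeforeJsr: types of the local variables just before the jsr.
     * touched: bit i is set if local i was accessed or modified by the subroutine.
     */
    static String[] merge(String[] localsAtRet, String[] localsBeforeJsr, BitSet touched) {
        if (localsAtRet.length != localsBeforeJsr.length) {
            throw new IllegalArgumentException("local variable arrays differ in length");
        }
        String[] result = new String[localsAtRet.length];
        for (int i = 0; i < result.length; i++) {
            // Touched locals take their type from the ret; the rest keep the pre-jsr type.
            result[i] = touched.get(i) ? localsAtRet[i] : localsBeforeJsr[i];
        }
        return result;
    }
}
```

This is what lets a local variable that holds a saved exception or return value survive the call to the finally subroutine, provided the subroutine never touches it.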

4.11. Limitations of the Java Virtual Machine

The following limitations of the Java Virtual Machine are implicit in the class file format:

  • The per-class or per-interface constant pool is limited to 65535 entries by the 16-bit constant_pool_count field of the ClassFile structure (§4.1). This acts as an internal limit on the total complexity of a single class or interface.

  • The number of fields that may be declared by a class or interface is limited to 65535 by the size of the fields_count item of the ClassFile structure (§4.1).

    Note that the value of the fields_count item of the ClassFile structure does not include fields that are inherited from superclasses or superinterfaces.

  • The number of methods that may be declared by a class or interface is limited to 65535 by the size of the methods_count item of the ClassFile structure (§4.1).

    Note that the value of the methods_count item of the ClassFile structure does not include methods that are inherited from superclasses or superinterfaces.

  • The number of direct superinterfaces of a class or interface is limited to 65535 by the size of the interfaces_count item of the ClassFile structure (§4.1).

  • The greatest number of local variables in the local variables array of a frame created upon invocation of a method (§2.6) is limited to 65535 by the size of the max_locals item of the Code attribute (§4.7.3) giving the code of the method, and by the 16-bit local variable indexing of the Java Virtual Machine instruction set.

    Note that values of type long and double are each considered to reserve two local variables and contribute two units toward the max_locals value, so use of local variables of those types further reduces this limit.

  • The size of an operand stack in a frame (§2.6) is limited to 65535 values by the max_stack field of the Code attribute (§4.7.3).

    Note that values of type long and double are each considered to contribute two units toward the max_stack value, so use of values of these types on the operand stack further reduces this limit.

  • The number of method parameters is limited to 255 by the definition of a method descriptor (§4.3.3), where the limit includes one unit for this in the case of instance or interface method invocations.

    Note that a method descriptor is defined in terms of a notion of method parameter length in which a parameter of type long or double contributes two units to the length, so parameters of these types further reduce the limit.

  • The length of field and method names, field and method descriptors, and other constant string values (including those referenced by ConstantValue (§4.7.2) attributes) is limited to 65535 characters by the 16-bit unsigned length item of the CONSTANT_Utf8_info structure (§4.4.7).

    Note that the limit is on the number of bytes in the encoding and not on the number of encoded characters. UTF-8 encodes some characters using two or three bytes. Thus, strings incorporating multibyte characters are further constrained.

  • The number of dimensions in an array is limited to 255 by the size of the dimensions operand of the multianewarray instruction and by the constraints imposed on the multianewarray, anewarray, and newarray instructions (§4.9.1, §4.9.2).
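The note on string length can be made concrete with a small sketch that counts bytes in the modified UTF-8 encoding used by CONSTANT_Utf8_info (§4.4.7). The class Utf8Limit and its method names are hypothetical. The per-character byte counts follow the modified UTF-8 rules: one byte for '\u0001' to '\u007F', two bytes for '\u0000' and for '\u0080' to '\u07FF', and three bytes otherwise, with each half of a surrogate pair counted separately.

```java
// Hypothetical helper for illustration; not part of any JDK API.
final class Utf8Limit {
    static final int MAX_UTF8_BYTES = 65535; // largest value of the u2 length item

    /** Number of bytes the string occupies in modified UTF-8. */
    static long encodedLength(String s) {
        long n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                n += 1;
            } else if (c <= 0x07FF) {
                n += 2; // includes the null character
            } else {
                n += 3; // includes each surrogate of a pair
            }
        }
        return n;
    }

    /** True if the string can be stored in a single CONSTANT_Utf8_info structure. */
    static boolean fits(String s) {
        return encodedLength(s) <= MAX_UTF8_BYTES;
    }
}
```

A string of 21846 three-byte characters already needs 65538 bytes and so does not fit, even though it is far shorter than 65535 characters.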
