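A jump is a form of program control transfer. Jumps are either conditional or unconditional, and conditional jumps are further divided into simple conditional jumps and compound conditional jumps.
- Simple conditional jumps: >, ==, <, !=, >=, <=, == null, != null
- Compound conditional jumps: switch...case...
- Unconditional jumps: goto
ASM handles simple conditional jumps and unconditional jumps together (both go through visitJumpInsn) and handles compound conditional jumps separately. Let's first look at simple conditional jumps and unconditional jumps.
Simple conditional jumps
First, look at the following source code, taken from org.objectweb.asm.MethodWriter: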
@Override
public void visitJumpInsn(final int opcode, final Label label) {
lastBytecodeOffset = code.length;
// Add the instruction to the bytecode of the method.
// Compute the 'base' opcode, i.e. GOTO or JSR if opcode is GOTO_W or JSR_W, otherwise opcode.
int baseOpcode =
opcode >= Constants.GOTO_W ? opcode - Constants.WIDE_JUMP_OPCODE_DELTA : opcode;
boolean nextInsnIsJumpTarget = false;
if ((label.flags & Label.FLAG_RESOLVED) != 0
&& label.bytecodeOffset - code.length < Short.MIN_VALUE) {
// Case of a backward jump with an offset < -32768. In this case we automatically replace GOTO
// with GOTO_W, JSR with JSR_W and IFxxx <l> with IFNOTxxx <L> GOTO_W <l> L:..., where
// IFNOTxxx is the "opposite" opcode of IFxxx (e.g. IFNE for IFEQ) and where <L> designates
// the instruction just after the GOTO_W.
if (baseOpcode == Opcodes.GOTO) {
code.putByte(Constants.GOTO_W);
} else if (baseOpcode == Opcodes.JSR) {
code.putByte(Constants.JSR_W);
} else {
// Put the "opposite" opcode of baseOpcode. This can be done by flipping the least
// significant bit for IFNULL and IFNONNULL, and similarly for IFEQ ... IF_ACMPEQ (with a
// pre and post offset by 1). The jump offset is 8 bytes (3 for IFNOTxxx, 5 for GOTO_W).
code.putByte(baseOpcode >= Opcodes.IFNULL ? baseOpcode ^ 1 : ((baseOpcode + 1) ^ 1) - 1);
code.putShort(8);
// Here we could put a GOTO_W in theory, but if ASM specific instructions are used in this
// method or another one, and if the class has frames, we will need to insert a frame after
// this GOTO_W during the additional ClassReader -> ClassWriter round trip to remove the ASM
// specific instructions. To not miss this additional frame, we need to use an ASM_GOTO_W
// here, which has the unfortunate effect of forcing this additional round trip (which in
// some case would not have been really necessary, but we can't know this at this point).
code.putByte(Constants.ASM_GOTO_W);
hasAsmInstructions = true;
// The instruction after the GOTO_W becomes the target of the IFNOT instruction.
nextInsnIsJumpTarget = true;
}
label.put(code, code.length - 1, true);
} else if (baseOpcode != opcode) {
// Case of a GOTO_W or JSR_W specified by the user (normally ClassReader when used to remove
// ASM specific instructions). In this case we keep the original instruction.
code.putByte(opcode);
label.put(code, code.length - 1, true);
} else {
// Case of a jump with an offset >= -32768, or of a jump with an unknown offset. In these
// cases we store the offset in 2 bytes (which will be increased via a ClassReader ->
// ClassWriter round trip if it turns out that 2 bytes are not sufficient).
code.putByte(baseOpcode);
label.put(code, code.length - 1, false);
}
// If needed, update the maximum stack size and number of locals, and stack map frames.
if (currentBasicBlock != null) {
Label nextBasicBlock = null;
if (compute == COMPUTE_ALL_FRAMES) {
currentBasicBlock.frame.execute(baseOpcode, 0, null, null);
// Record the fact that 'label' is the target of a jump instruction.
label.getCanonicalInstance().flags |= Label.FLAG_JUMP_TARGET;
// Add 'label' as a successor of the current basic block.
addSuccessorToCurrentBasicBlock(Edge.JUMP, label);
if (baseOpcode != Opcodes.GOTO) {
// The next instruction starts a new basic block (except for GOTO: by default the code
// following a goto is unreachable - unless there is an explicit label for it - and we
// should not compute stack frame types for its instructions).
nextBasicBlock = new Label();
}
} else if (compute == COMPUTE_INSERTED_FRAMES) {
currentBasicBlock.frame.execute(baseOpcode, 0, null, null);
} else if (compute == COMPUTE_MAX_STACK_AND_LOCAL_FROM_FRAMES) {
// No need to update maxRelativeStackSize (the stack size delta is always negative).
relativeStackSize += STACK_SIZE_DELTA[baseOpcode];
} else {
if (baseOpcode == Opcodes.JSR) {
// Record the fact that 'label' designates a subroutine, if not already done.
if ((label.flags & Label.FLAG_SUBROUTINE_START) == 0) {
label.flags |= Label.FLAG_SUBROUTINE_START;
hasSubroutines = true;
}
currentBasicBlock.flags |= Label.FLAG_SUBROUTINE_CALLER;
// Note that, by construction in this method, a block which calls a subroutine has at
// least two successors in the control flow graph: the first one (added below) leads to
// the instruction after the JSR, while the second one (added here) leads to the JSR
// target. Note that the first successor is virtual (it does not correspond to a possible
// execution path): it is only used to compute the successors of the basic blocks ending
// with a ret, in {@link Label#addSubroutineRetSuccessors}.
addSuccessorToCurrentBasicBlock(relativeStackSize + 1, label);
// The instruction after the JSR starts a new basic block.
nextBasicBlock = new Label();
} else {
// No need to update maxRelativeStackSize (the stack size delta is always negative).
relativeStackSize += STACK_SIZE_DELTA[baseOpcode];
addSuccessorToCurrentBasicBlock(relativeStackSize, label);
}
}
// If the next instruction starts a new basic block, call visitLabel to add the label of this
// instruction as a successor of the current block, and to start a new basic block.
if (nextBasicBlock != null) {
if (nextInsnIsJumpTarget) {
nextBasicBlock.flags |= Label.FLAG_JUMP_TARGET;
}
visitLabel(nextBasicBlock);
}
if (baseOpcode == Opcodes.GOTO) {
endCurrentBasicBlockWithNoSuccessor();
}
}
}
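Parameters:
opcode: the opcode of the jump instruction to emit;
label: the Label (block) to jump to.
The core logic can be summarized as follows:
- Compute baseOpcode;
- Write the opcode into code and record a reference to label via label.put (a forward reference when label has not been resolved yet);
- Check whether currentBasicBlock is null. If it is not null, continue; here we only follow the branch where compute is COMPUTE_ALL_FRAMES. That branch simulates the opcode on the current frame, sets FLAG_JUMP_TARGET on label, and adds label as a successor of currentBasicBlock. If the instruction is not the unconditional GOTO, a new Label (nextBasicBlock) is created and visitLabel switches to it. If the instruction is GOTO, endCurrentBasicBlockWithNoSuccessor is called, which also creates a new Label, makes it the last basic block (lastBasicBlock), and sets currentBasicBlock to null.
As shown above, both simple conditional jumps and unconditional jumps create an implicit Label. The difference is that for a simple conditional jump ASM automatically switches to this implicit Label, which represents the fall-through branch executed when the jump condition is not met.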
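The opcode arithmetic in visitJumpInsn can be checked in isolation. The following is a minimal sketch, not ASM code: the class and method names are hypothetical, the opcode values are those defined by the JVM specification, and WIDE_JUMP_OPCODE_DELTA is assumed to equal GOTO_W - GOTO as in ASM's Constants class. The `opposite` method is only meaningful for conditional jump opcodes.

```java
// Standalone sketch, not ASM code: class and method names are hypothetical.
// Opcode values are the ones defined by the JVM specification.
final class JumpOpcodeSketch {
    static final int IFEQ = 153, IFNE = 154, IF_ICMPEQ = 159, IF_ICMPNE = 160;
    static final int IF_ACMPEQ = 165, IF_ACMPNE = 166;
    static final int GOTO = 167, JSR = 168, IFNULL = 198, IFNONNULL = 199;
    static final int GOTO_W = 200, JSR_W = 201;
    // Assumption: same value as Constants.WIDE_JUMP_OPCODE_DELTA (GOTO_W - GOTO = 33).
    static final int WIDE_JUMP_OPCODE_DELTA = GOTO_W - GOTO;

    /** Maps GOTO_W/JSR_W back to GOTO/JSR and leaves every other opcode unchanged. */
    static int baseOpcode(int opcode) {
        return opcode >= GOTO_W ? opcode - WIDE_JUMP_OPCODE_DELTA : opcode;
    }

    /** Same expression as in visitJumpInsn: the conditional jump with the negated test. */
    static int opposite(int conditionalJumpOpcode) {
        return conditionalJumpOpcode >= IFNULL
            ? conditionalJumpOpcode ^ 1
            : ((conditionalJumpOpcode + 1) ^ 1) - 1;
    }

    /** True when a resolved backward jump no longer fits in a signed 16-bit offset. */
    static boolean needsWideJump(int targetOffset, int currentOffset) {
        return targetOffset - currentOffset < Short.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("base of GOTO_W: " + baseOpcode(GOTO_W));
        System.out.println("opposite of IFEQ: " + opposite(IFEQ));
        System.out.println("opposite of IFNULL: " + opposite(IFNULL));
        System.out.println("wide jump for offset -40000: " + needsWideJump(0, 40000));
    }
}
```

The flip works because each pair of opposite tests (IFEQ/IFNE, IFNULL/IFNONNULL, and so on) occupies two adjacent opcode values; the +1/-1 shift is needed for the IFEQ ... IF_ACMPNE range because those pairs start at an odd value.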
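Common instructions
Instruction | Description |
---|---|
IFNULL | Jumps to the given label when the top of the stack is null; the implicit Label covers the != null case |
IFNONNULL | Jumps to the given label when the top of the stack is not null; the implicit Label covers the == null case |
IFEQ | Compares the int on top of the stack with the constant 0 and jumps to the given label when it == 0; the implicit Label covers != 0, i.e. a boolean value of true |
IFNE | Jumps to the given label when the int on top of the stack != 0; the implicit Label covers == 0, i.e. a boolean value of false |
IF_ICMPEQ | Compares the top two int values on the stack and jumps to the given label when they are equal; the implicit Label covers != |
IF_ICMPNE | Compares the top two int values on the stack and jumps to the given label when they are not equal; the implicit Label covers == |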
Usage example
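import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.Label;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

// ClassLoaderUtils is a helper class from the author's project (not part of ASM); its import is not shown in the original.
public class Generate48 implements Opcodes {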
public static void main(String[] args) {
String generateClassName = "ASM$Generate48";
ClassLoaderUtils.outputClass(generate(generateClassName), generateClassName);
}
private static byte[] generate(String generateClassName) {
ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES);
// declare_class
cw.visit(V1_8, ACC_PUBLIC, generateClassName, null, "java/lang/Object", null);
// declare_method
MethodVisitor mv = cw.visitMethod(ACC_PUBLIC, "ageDesc", "(I)Ljava/lang/String;", null, null);
mv.visitVarInsn(ILOAD, 1);
mv.visitIntInsn(BIPUSH, 18);
Label l1 = new Label();
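/**
* IF_ICMPNE compares two int values and pops both of them from the operand stack.
*
* If the two values are not equal it jumps to l1, so the implicit block runs when they are equal (==).
*/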
mv.visitJumpInsn(IF_ICMPNE, l1);
mv.visitLdcInsn("刚满十八岁");
mv.visitInsn(ARETURN);
mv.visitLabel(l1);
mv.visitVarInsn(ILOAD, 1); // load the int parameter var1
mv.visitIntInsn(BIPUSH, 18);
Label l2 = new Label();
mv.visitJumpInsn(IF_ICMPGE, l2); // jump to l2 when var1 >= 18, so the implicit block handles var1 < 18
mv.visitLdcInsn("未成年人");
mv.visitInsn(ARETURN);
mv.visitLabel(l2);
mv.visitLdcInsn("成年人");
mv.visitInsn(ARETURN);
mv.visitMaxs(0, 0);
return cw.toByteArray();
}
}
The generated code (decompiled) is as follows:
public class ASM$Generate48 {
public String ageDesc(int var1) {
if (var1 == 18) {
return "刚满十八岁";
} else {
return var1 < 18 ? "未成年人" : "成年人";
}
}
}
As you can see, the seemingly complex ternary expression is still implemented with simple jump instructions.
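Compound conditional jumps
At the bytecode level, switch...case... is implemented by one of the following two instructions, each with a corresponding method in ASM:
- LOOKUPSWITCH: emitted by MethodWriter#visitLookupSwitchInsn; used when the case values are sparse
- TABLESWITCH: emitted by MethodWriter#visitTableSwitchInsn; used when the case values are dense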
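ASM itself does not decide between the two instructions; the caller picks one by calling the matching visit method. To make "sparse" and "dense" concrete, here is a minimal sketch of a cost comparison modelled on the kind of heuristic a Java compiler can use. The class name, method name, and the exact cost constants are assumptions for illustration, not something taken from ASM.

```java
import java.util.Arrays;

// Hypothetical helper: chooses a switch instruction from the case keys.
// The cost constants are an assumption modelled on a space/time trade-off.
final class SwitchKindSketch {
    static String choose(int[] keys) {
        if (keys.length == 0) {
            return "LOOKUPSWITCH";
        }
        int[] sorted = keys.clone();
        Arrays.sort(sorted);
        long lo = sorted[0];
        long hi = sorted[sorted.length - 1];
        long n = sorted.length;
        // TABLESWITCH stores one jump offset per value in [lo, hi], used or not.
        long tableSpaceCost = 4 + (hi - lo + 1);
        long tableTimeCost = 3;
        // LOOKUPSWITCH stores one (key, offset) pair per case and searches them.
        long lookupSpaceCost = 3 + 2 * n;
        long lookupTimeCost = n;
        return tableSpaceCost + 3 * tableTimeCost <= lookupSpaceCost + 3 * lookupTimeCost
            ? "TABLESWITCH"
            : "LOOKUPSWITCH";
    }

    public static void main(String[] args) {
        System.out.println(choose(new int[] {0, 1, 2, 3}));
        System.out.println(choose(new int[] {73234, 83278, 67922066, 70692629}));
    }
}
```

With contiguous keys the jump table has no wasted slots, so the table form wins; with widely spread keys such as string hash codes the table would be enormous, so the lookup form wins.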
Now let's look at the method MethodWriter#visitLookupSwitchInsn:
@Override
public void visitLookupSwitchInsn(final Label dflt, final int[] keys, final Label[] labels) {
lastBytecodeOffset = code.length;
// Add the instruction to the bytecode of the method.
code.putByte(Opcodes.LOOKUPSWITCH).putByteArray(null, 0, (4 - code.length % 4) % 4);
dflt.put(code, lastBytecodeOffset, true);
code.putInt(labels.length);
for (int i = 0; i < labels.length; ++i) {
code.putInt(keys[i]);
labels[i].put(code, lastBytecodeOffset, true);
}
// If needed, update the maximum stack size and number of locals, and stack map frames.
visitSwitchInsn(dflt, labels);
}
private void visitSwitchInsn(final Label dflt, final Label[] labels) {
if (currentBasicBlock != null) {
if (compute == COMPUTE_ALL_FRAMES) {
currentBasicBlock.frame.execute(Opcodes.LOOKUPSWITCH, 0, null, null);
// Add all the labels as successors of the current basic block.
addSuccessorToCurrentBasicBlock(Edge.JUMP, dflt);
dflt.getCanonicalInstance().flags |= Label.FLAG_JUMP_TARGET;
for (Label label : labels) {
addSuccessorToCurrentBasicBlock(Edge.JUMP, label);
label.getCanonicalInstance().flags |= Label.FLAG_JUMP_TARGET;
}
} else if (compute == COMPUTE_MAX_STACK_AND_LOCAL) {
// No need to update maxRelativeStackSize (the stack size delta is always negative).
--relativeStackSize;
// Add all the labels as successors of the current basic block.
addSuccessorToCurrentBasicBlock(relativeStackSize, dflt);
for (Label label : labels) {
addSuccessorToCurrentBasicBlock(relativeStackSize, label);
}
}
// End the current basic block.
endCurrentBasicBlockWithNoSuccessor();
}
}
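With the experience gained above, this code is fairly straightforward. First, the meaning of each parameter:
dflt: the Label of the default block
keys: the case values as an int array, sorted in increasing order. Because every case value has to be converted to an int, "sparse" and "dense" above simply describe whether the values in this int array are contiguous.
labels: the array of Labels for the case blocks; labels[i] is the target for keys[i]
The core logic: write the opcode into code, record references to dflt and to every Label in labels, and add dflt and each case Label as successors of currentBasicBlock.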
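The byte layout that visitLookupSwitchInsn writes can be reproduced with plain JDK classes. The sketch below is not ASM code: the class and method names are hypothetical, jump offsets are passed in as already-known numbers instead of Labels, and the opcode value 171 comes from the JVM specification. It shows the padding expression `(4 - code.length % 4) % 4`, which is evaluated after the opcode byte has been written so that the operands start at a multiple of 4.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the lookupswitch byte layout; not part of ASM.
final class LookupSwitchLayoutSketch {
    static final int LOOKUPSWITCH = 171; // opcode value from the JVM specification

    /** Zero bytes needed after the opcode so the operands start at a multiple of 4. */
    static int padding(int lengthAfterOpcode) {
        return (4 - lengthAfterOpcode % 4) % 4;
    }

    /**
     * Encodes one lookupswitch whose opcode sits at opcodeOffset in the method code.
     * All jump offsets are relative to the opcode; keys must be sorted ascending.
     */
    static byte[] encode(int opcodeOffset, int defaultOffset, int[] sortedKeys, int[] caseOffsets) {
        int pad = padding(opcodeOffset + 1);
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer buf = ByteBuffer.allocate(1 + pad + 8 + 8 * sortedKeys.length);
        buf.put((byte) LOOKUPSWITCH);
        for (int i = 0; i < pad; i++) {
            buf.put((byte) 0);
        }
        buf.putInt(defaultOffset);      // offset of the default block
        buf.putInt(sortedKeys.length);  // number of (key, offset) pairs
        for (int i = 0; i < sortedKeys.length; i++) {
            buf.putInt(sortedKeys[i]);
            buf.putInt(caseOffsets[i]);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] insn = encode(0, 100, new int[] {1, 2}, new int[] {20, 30});
        System.out.println("encoded length: " + insn.length);
    }
}
```

In ASM the offsets are not known up front, which is why dflt.put and labels[i].put record references relative to lastBytecodeOffset (the opcode position) and patch them once the Labels are resolved.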
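Now consider a question: what happens when the switch condition is a String?
The keys passed in are an int array, so how is a String turned into an int? The answer is to take its hashCode. Because different strings can share a hash code, each case still has to confirm the match with equals.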
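The conversion from String case labels to sorted int keys needs nothing beyond the JDK. The following minimal sketch uses a hypothetical class and method name; the four labels are the device types used in this article's example.

```java
import java.util.Arrays;

// Hypothetical helper showing how String case labels become LOOKUPSWITCH keys.
final class StringSwitchKeysSketch {
    /** Returns the hash codes of the given labels, sorted in increasing order. */
    static int[] sortedHashKeys(String... caseLabels) {
        int[] keys = new int[caseLabels.length];
        for (int i = 0; i < caseLabels.length; i++) {
            keys[i] = caseLabels[i].hashCode();
        }
        Arrays.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedHashKeys("GLYLJ", "JLYLJ", "TPJ", "JBJ")));
        // Two different strings with the same hash code: the reason equals is still required.
        System.out.println("Aa".hashCode() == "BB".hashCode());
    }
}
```

This sketch does not merge duplicate hash codes. If two labels collided, they would have to share a single key whose block tests each candidate string with equals in turn.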
Usage example
First, look at this code:
public class MyMethod7 {
public void print(String deviceType) {
switch (deviceType) {
case "JBJ":
System.out.println("搅拌机");
break;
case "TPJ":
System.out.println("摊铺机");
break;
case "GLYLJ":
System.out.println("钢轮压路机");
break;
case "JLYLJ":
System.out.println("胶轮压路机");
break;
default:
System.out.println("No device of this type");
}
}
}
How would we generate the code above with ASM?
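import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.Label;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

// ClassLoaderUtils is a helper class from the author's project (not part of ASM); its import is not shown in the original.
public class Generate502 implements Opcodes {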
public static void main(String[] args) {
String generateClassName = "ASM$Generate502";
ClassLoaderUtils.outputClass(generate(generateClassName), generateClassName);
}
private static byte[] generate(String generateClassName) {
ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
// declare_class
cw.visit(V1_8, ACC_PUBLIC, generateClassName, null, "java/lang/Object", null);
// declare_method
MethodVisitor mv = cw.visitMethod(ACC_PUBLIC, "print", "(Ljava/lang/String;)V", null, null);
// The case conditions are strings: convert each one to its hashCode, collect the values in an int array, sort it, and pass it as keys to the switch method
Map<String, String> data = new HashMap<>();
data.put("GLYLJ", "钢轮压路机");
data.put("JLYLJ", "胶轮压路机");
data.put("TPJ", "摊铺机");
data.put("JBJ", "搅拌机");
int len = data.size();
Map<Integer, String> maps = new HashMap<>();
int[] switchKeys = new int[len];
AtomicInteger index = new AtomicInteger(0);
data.keySet().forEach(x -> {
int hashCode = x.hashCode();
switchKeys[index.getAndIncrement()] = hashCode;
maps.put(hashCode, x);
});
Arrays.sort(switchKeys);
mv.visitInsn(ICONST_M1); // constant -1
mv.visitVarInsn(ISTORE, 2);
mv.visitVarInsn(ALOAD, 1);
mv.visitMethodInsn(INVOKEVIRTUAL, "java/lang/String", "hashCode", "()I", false);
// Create the Label arrays for the cases: labels for the first switch, labels1 for the second
Label[] labels = new Label[len];
Label[] labels1 = new Label[len];
for (int i = 0; i < len; i ++) {
labels[i] = new Label();
labels1[i] = new Label();
}
Label def_ = new Label();
mv.visitLookupSwitchInsn(def_, switchKeys, labels);
for (int i = 0; i < labels.length; i++) {
mv.visitLabel(labels[i]);
mv.visitVarInsn(ALOAD, 1);
mv.visitLdcInsn(maps.get(switchKeys[i]));
mv.visitMethodInsn(INVOKEVIRTUAL, "java/lang/String", "equals", "(Ljava/lang/Object;)Z", false);
mv.visitJumpInsn(IFEQ, def_); // equals returned false: skip the assignment; the implicit block runs when it returned true
mv.visitIntInsn(BIPUSH, i);
mv.visitVarInsn(ISTORE, 2);
mv.visitJumpInsn(GOTO, def_);
}
mv.visitLabel(def_);
mv.visitVarInsn(ILOAD, 2);
Label switch_default = new Label();
Label switch_out = new Label();
mv.visitTableSwitchInsn(0, len - 1, switch_default, labels1);
for (int i = 0; i < labels1.length; i++) {
mv.visitLabel(labels1[i]);
mv.visitFieldInsn(GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
mv.visitLdcInsn(data.get(maps.get(switchKeys[i])));
mv.visitMethodInsn(INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V", false);
mv.visitJumpInsn(GOTO, switch_out);
}
mv.visitLabel(switch_default);
mv.visitFieldInsn(GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
mv.visitLdcInsn("No device of this type");
mv.visitMethodInsn(INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V", false);
mv.visitLabel(switch_out);
mv.visitInsn(RETURN);
mv.visitMaxs(2, 3);
return cw.toByteArray();
}
}
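First, each String is converted to an int by taking its hashCode; the values are then sorted and passed as the keys argument. A local variable is defined and assigned in whichever case matches, and its value serves as the condition of a second switch...case.... Because the values assigned to that variable are dense, the second switch uses TABLESWITCH. Finally, here are the class and method that this code generates: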
public class ASM$Generate502 {
public void print(String var1) {
byte var2 = -1;
switch(var1.hashCode()) {
case 73234:
if (var1.equals("JBJ")) {
var2 = 0;
}
break;
case 83278:
if (var1.equals("TPJ")) {
var2 = 1;
}
break;
case 67922066:
if (var1.equals("GLYLJ")) {
var2 = 2;
}
break;
case 70692629:
if (var1.equals("JLYLJ")) {
var2 = 3;
}
}
switch(var2) {
case 0:
System.out.println("搅拌机");
break;
case 1:
System.out.println("摊铺机");
break;
case 2:
System.out.println("钢轮压路机");
break;
case 3:
System.out.println("胶轮压路机");
break;
default:
System.out.println("No device of this type");
}
}
}
Incidentally, this is also how the Java compiler (javac) translates a switch...case... on a String into bytecode.