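Series index: 《工作五年,我从零开始学代码》 (Five years into my career, learning to code from scratch)
Previous article: 《最强大、最古老、最本源的代码:指令INSTRUCTION》 (The most powerful, oldest, and most fundamental code: the INSTRUCTION)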
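The previous article introduced the MIC-1 machine and its instruction set, IJVM. This article implements both in Java, in two different ways. The first consists of a compiler (strictly speaking, an assembler) and an IJVM: the assembler translates IJVM assembly language into IJVM machine code, which the IJVM then loads and runs. The second is more involved. It contains (1) a MIC-1 simulator running the complete microprogram that interprets and executes IJVM instructions, (2) MAL, the Micro Assembly Language translator, (3) an assembler, (4) the IJVM, and (5) a GUI.
The TOY-IJVM in the title means a toy IJVM: it exists for demonstration only.
A simple IJVM implementation
Step 1: implementing the IJVM
Let's start with the first implementation. Step one is to write the IJVM itself. It reads and parses an executable file (similar to a .class file, with a layout we define ourselves), fills the memory regions (method area and constant pool), and allocates the stack. It then initializes the PC, LV, and SP pointers; for example, PC points to the first address of the method area. Finally it executes: it fetches instructions from the method area and performs the operation each one calls for. The skeleton of the IJVM is shown below.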
“Talk is cheap. Show me the code.” ― Linus Torvalds
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class IJVM {
    // Constant pool
    protected int[] constantPool;
    // Stack
    protected int[] stack;
    // Method area
    protected byte[] methodArea;
    // Program counter
    protected int pc;
    // Base address of the current stack frame
    protected int lv;
    // Top-of-stack pointer
    protected int sp;

    public static void main(String[] args) throws IOException {
        IJVM machine = new IJVM("path to the executable file");
        machine.run();
    }

    public IJVM(String fname) throws IOException {
        // Allocate the stack
        stack = new int[4096];
        // Read the executable file and fill the method area and constant pool
        try (DataInputStream input = new DataInputStream(new FileInputStream(fname))) {
            readMethods(input);
            readConstantPool(input);
        }
        // Initialize the pointers
        pc = 0;
        lv = -1;
        sp = -1;
    }

    // Fill the method area
    private void readMethods(DataInputStream input) throws IOException {
        // File layout, in order: method area size, method area code,
        // constant pool size, constants
        int size = input.readInt();
        methodArea = new byte[size];
        input.readFully(methodArea);
    }

    // Fill the constant pool
    private void readConstantPool(DataInputStream input) throws IOException {
        int size = input.readInt();
        constantPool = new int[size];
        for (int i = 0; i < size; i++)
            constantPool[i] = input.readInt();
    }

    // Run the program; stop once a HALT instruction has been executed
    public void run() {
        byte opcode;
        do {
            opcode = getOpcode(pc);
            executeOpcode(opcode);
        } while (opcode != 0x04); /* HALT */
    }

    // Fetch the opcode stored at the given address
    protected byte getOpcode(int addr) {
        return methodArea[addr];
    }

    // Execute one instruction.
    // Note: opcode is a sign-extended byte, so opcodes above 0x7F
    // (for example GOTO, 0xA7) arrive here as negative values.
    protected void executeOpcode(int opcode) {
        int temp;
        switch (opcode) {
            case 0x10: /* BIPUSH byte */
                // The operand byte is sign-extended to an int
                stackPush(methodArea[pc + 1]);
                pc = pc + 2;
                break;
            case 0x59: /* DUP */
                temp = stackPop();
                stackPush(temp);
                stackPush(temp);
                pc = pc + 1;
                break;
            ...
            ...
            ...
        }
    }

    // Pop the top value off the stack
    protected int stackPop() {
        return stack[sp--];
    }

    // Push a value onto the stack
    protected void stackPush(int top) {
        stack[++sp] = top;
    }
}
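The skeleton above elides most opcodes, so here is a small stand-alone sketch of the same fetch-decode-execute loop that can actually be run. It is not the article's IJVM class: `MiniIjvm`, its `DEMO` program, and the `printed` list are hypothetical names made up for illustration. Two things are assumptions: branch offsets are taken to be signed 16-bit big-endian values relative to the address of the branch opcode (the convention of the real JVM), and IPRINT is taken to pop the top of the stack, collecting the value in a list here instead of writing it to the console. Opcode values come from the instruction table in this article.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch of an IJVM-style dispatch loop. */
final class MiniIjvm {
    static final int IPRINT = 0x01, HALT = 0x04, BIPUSH = 0x10, DUP = 0x59,
                     IADD = 0x60, ISUB = 0x64, IFLT = 0x9B, GOTO = 0xA7;

    /** Demo: 3 - 5 is negative, so IFLT jumps over "print 1" and only 7 + 7 is recorded. */
    static final byte[] DEMO = {
        BIPUSH, 3,          //  0: push 3
        BIPUSH, 5,          //  2: push 5
        ISUB,               //  4: 3 - 5 = -2
        (byte) IFLT, 0, 6,  //  5: if negative, jump to 5 + 6 = 11
        BIPUSH, 1,          //  8: skipped
        IPRINT,             // 10: skipped
        BIPUSH, 7,          // 11: push 7
        DUP,                // 13: duplicate it
        IADD,               // 14: 7 + 7 = 14
        IPRINT,             // 15: record 14
        HALT                // 16: stop
    };

    private final byte[] code;
    private final int[] stack = new int[64];
    private int pc = 0;
    private int sp = -1;
    private final List<Integer> printed = new ArrayList<>();

    MiniIjvm(byte[] code) { this.code = code; }

    private void push(int value) { stack[++sp] = value; }
    private int pop() { return stack[sp--]; }

    /** Reads a signed 16-bit big-endian operand (assumed branch encoding). */
    private int readShort(int addr) {
        return (short) (((code[addr] & 0xFF) << 8) | (code[addr + 1] & 0xFF));
    }

    /** Runs until HALT and returns everything IPRINT recorded. */
    List<Integer> run() {
        while (true) {
            int opcode = code[pc] & 0xFF; // mask so 0x9B reads as 155, not -101
            switch (opcode) {
                case BIPUSH: push(code[pc + 1]); pc += 2; break; // signed byte operand
                case DUP: { int top = pop(); push(top); push(top); pc += 1; break; }
                case IADD: { int b = pop(); int a = pop(); push(a + b); pc += 1; break; }
                case ISUB: { int b = pop(); int a = pop(); push(a - b); pc += 1; break; }
                case IFLT:
                    if (pop() < 0) pc += readShort(pc + 1); else pc += 3;
                    break;
                case GOTO: pc += readShort(pc + 1); break;
                case IPRINT: printed.add(pop()); pc += 1; break;
                case HALT: return printed;
                default:
                    throw new IllegalStateException("unknown opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new MiniIjvm(DEMO).run()); // prints [14]
    }
}
```

Masking the fetched byte with `0xFF` is one way to keep the `case` labels in plain hex; the alternative is to compare against signed values such as -89, which is how the assembler's instruction table in the next step writes them.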
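Step 2: implementing the assembler
The IJVM can only execute binary files in one specific format, but the code we write is IJVM assembly, so we need a compiler (here, an assembler) to do the conversion. The one implemented here reads a source file, parses it, and writes the executable content to an output file. The skeleton is shown below.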
import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;

// SymbolTable, InstructionTable and AssemblyLine are helper classes
// that are not shown in this skeleton.
public class IJVMAssembler {
    // Constant pool
    int[] constantPool;
    // Method area
    byte[] methodArea;
    // Symbolic references, e.g. references to method names
    SymbolTable symbols;
    // Method area counter (next free byte)
    int ip;
    // Constant pool counter (next free slot)
    int cppTop;
    // Instruction set
    InstructionTable instructions;

    public static void main(String[] args) throws IOException {
        IJVMAssembler as;
        try (BufferedReader input = new BufferedReader(new FileReader("path to the input file"))) {
            as = new IJVMAssembler(input);
        }
        try (DataOutputStream output = new DataOutputStream(new FileOutputStream("path to the output file"))) {
            as.write(output);
        }
    }

    public IJVMAssembler(BufferedReader input) throws IOException {
        constantPool = new int[65536];
        methodArea = new byte[65536];
        symbols = new SymbolTable();
        ip = 0;
        cppTop = 0;
        // Every instruction in the instruction set. Opcodes above 0x7F are
        // written as negative numbers because Java bytes are signed
        // (e.g. GOTO is 0xA7, which is -89 as a byte).
        instructions = new InstructionTable();
        instructions.add("BIPUSH", 16, 1, 1);
        instructions.add("DUP", 89, 0, 0);
        instructions.add("GOTO", -89, 1, 2);
        instructions.add("IADD", 96, 0, 0);
        instructions.add("IAND", 126, 0, 0);
        instructions.add("IFEQ", -103, 1, 2);
        instructions.add("IFLT", -101, 1, 2);
        instructions.add("IF_ICMPEQ", -97, 1, 2);
        instructions.add("IINC", -124, 2, 1);
        instructions.add("ILOAD", 21, 1, 1);
        instructions.add("INVOKEVIRTUAL", -74, 1, 2);
        instructions.add("IOR", -128, 0, 0);
        instructions.add("IRETURN", -84, 0, 0);
        instructions.add("ISTORE", 54, 1, 1);
        instructions.add("ISUB", 100, 0, 0);
        instructions.add("LDC_W", 19, 1, 2);
        instructions.add("NOP", 0, 0, 0);
        instructions.add("POP", 87, 0, 0);
        instructions.add("SWAP", 95, 0, 0);
        instructions.add("IPRINT", 1, 0, 0);
        instructions.add("IREAD", 2, 0, 0);
        instructions.add("SPRINT", 3, 1, 2);
        instructions.add("HALT", 4, 0, 0);
        instructions.add("DUMP", 5, 0, 0);
        instructions.add("IMUL", 6, 0, 0);
        instructions.add("IDIV", 7, 0, 0);
        instructions.add("IREM", 8, 0, 0);
        instructions.add("IEXP", 9, 0, 0);
        assemble(input);
    }

    private void assemble(BufferedReader input) throws IOException {
        String line;
        int lineCount = 1;
        // Read and parse the assembly source one line at a time
        while ((line = input.readLine()) != null) {
            AssemblyLine aline = new AssemblyLine(line);
            // The line carries a label
            if (aline.getLabel().length() > 0) {
            }
            // The line is a method definition
            if (aline.getMethod().length() > 0) {
            }
            // The line is an ordinary instruction
            if (aline.getMnemonic().length() > 0) {
            }
            lineCount++;
        }
    }

    // Write the executable content to the output file in the agreed order
    public void write(DataOutputStream output) throws IOException {
        output.writeInt(ip);
        output.write(methodArea, 0, ip);
        output.writeInt(cppTop);
        for (int i = 0; i < cppTop; i++)
            output.writeInt(constantPool[i]);
    }
}
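The file layout shared by the assembler's write method and the IJVM's loader (method area size, method area bytes, constant pool size, constants) can be checked in isolation. The sketch below writes such an image into memory and reads it back. `IjvmImage`, `toBytes`, and `fromBytes` are hypothetical names that do not appear in the article's source; the only thing taken from the article is the field order. `DataOutputStream` writes each int as four big-endian bytes, which is what `DataInputStream.readInt` expects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical in-memory model of the executable file layout. */
final class IjvmImage {
    final byte[] methodArea;
    final int[] constantPool;

    IjvmImage(byte[] methodArea, int[] constantPool) {
        this.methodArea = methodArea;
        this.constantPool = constantPool;
    }

    /** Serializes in the order: method size, method bytes, pool size, pool entries. */
    byte[] toBytes() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(methodArea.length);
            out.write(methodArea);
            out.writeInt(constantPool.length);
            for (int constant : constantPool) {
                out.writeInt(constant);
            }
        }
        return buffer.toByteArray();
    }

    /** Parses an image produced by toBytes, reading the fields in the same order. */
    static IjvmImage fromBytes(byte[] image) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(image))) {
            byte[] methods = new byte[in.readInt()];
            in.readFully(methods);
            int[] constants = new int[in.readInt()];
            for (int i = 0; i < constants.length; i++) {
                constants[i] = in.readInt();
            }
            return new IjvmImage(methods, constants);
        }
    }

    public static void main(String[] args) throws IOException {
        // BIPUSH 7, DUP, IADD, HALT plus two arbitrary constants
        IjvmImage original = new IjvmImage(new byte[] {0x10, 7, 0x59, 0x60, 0x04}, new int[] {100, 152});
        byte[] image = original.toBytes();
        IjvmImage copy = fromBytes(image);
        boolean same = Arrays.equals(original.methodArea, copy.methodArea)
                && Arrays.equals(original.constantPool, copy.constantPool);
        // 4 (size) + 5 (code) + 4 (size) + 2 * 4 (constants) = 21 bytes
        System.out.println(image.length + " bytes, round trip ok: " + same);
    }
}
```

Because both sides depend on the same field order, a round-trip check like this is a cheap way to catch a mismatch between the assembler and the virtual machine.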
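That covers the idea behind this first approach. The complete source code can be downloaded and run locally to get a feel for it. I almost forgot: you may not know how to write IJVM assembly yet, so here are a few examples for reference.
1. Read a number (your age) and report whether you are old enough to vote and drink. A fairly simple example.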
SPRINT "How old are you?\n"
IREAD
DUP
BIPUSH 18
ISUB
IFLT novote
DUP
BIPUSH 21
ISUB
IFLT nodrink
SPRINT "You are old enough to vote and drink.\n"
:end
HALT
:novote
SPRINT "You can't vote and you can't drink.\n"
GOTO end
:nodrink
SPRINT "You can vote but you can't drink.\n"
GOTO end
2. Read a number and print its square. It contains a single method call, so the stack frame is fairly simple.
BIPUSH 0
SPRINT "Enter a number: \n"
IREAD
INVOKEVIRTUAL square
IPRINT
SPRINT "\n"
HALT
# square takes one parameter (x) and has two local variables (i and tot)
# x is local variable number 1
# i is local variable number 2
# tot is local variable number 3
square 2 2
# i = 0; tot = 0;
BIPUSH 0
DUP
ISTORE 2
ISTORE 3
# if (x < 0)
ILOAD 1
IFLT negative
GOTO loop
# x = -x;
:negative
BIPUSH 0
ILOAD 1
ISUB
ISTORE 1
# while (i != x)
:loop
ILOAD 2
ILOAD 1
IF_ICMPEQ end
# tot = tot + x;
ILOAD 1
ILOAD 3
IADD
ISTORE 3
# i++;
IINC 2 1
GOTO loop
#return tot;
:end
ILOAD 3
IRETURN
3. A small game (NIM). It contains several method calls, so the stack frame structure is more complex.
:input
BIPUSH 0
SPRINT "Welcome to NIM. The object of the game is to take the last\n"
SPRINT "stone. You may take 1, 2, or 3 stones during your turn.\n\n"
SPRINT "How many stones would you like to play with?\n"
IREAD
DUP
IFLT retry
DUP
IFEQ retry
INVOKEVIRTUAL playGame
POP
HALT
:retry
SPRINT "Please enter a positive number of stones.\n"
GOTO input
playGame 2 0
BIPUSH 0
ILOAD 1
INVOKEVIRTUAL mod4
IFEQ userFirst
SPRINT "Computer goes first.\n"
GOTO computerTurn
:userFirst
SPRINT "You go first.\n"
:gameloop
BIPUSH 0
ILOAD 1
INVOKEVIRTUAL printStones
POP
BIPUSH 0
ILOAD 1
INVOKEVIRTUAL userTurn
ILOAD 1
SWAP
ISUB
ISTORE 1
ILOAD 1
IFEQ userWins
:computerTurn
BIPUSH 0
ILOAD 1
INVOKEVIRTUAL printStones
POP
BIPUSH 0
ILOAD 1
INVOKEVIRTUAL computerMove
ILOAD 1
SWAP
ISUB
ISTORE 1
ILOAD 1
IFEQ computerWins
GOTO gameloop
:userWins
SPRINT "You won! I demand a rematch!\n"
BIPUSH 0
IRETURN
:computerWins
SPRINT "I won! I am champion of the world!\n"
BIPUSH 0
IRETURN
printStones 2 1
BIPUSH 0
ISTORE 2
:loop
ILOAD 1
ILOAD 2
IF_ICMPEQ end
SPRINT "o"
IINC 2 1
GOTO loop
:end
SPRINT "\n"
BIPUSH 0
IRETURN
userTurn 2 1
:userInput
SPRINT "How many stones do you take?\n"
IREAD
DUP
BIPUSH 3
SWAP
ISUB
IFLT tooMany
DUP
BIPUSH 1
ISUB
IFLT tooFew
DUP
ILOAD 1
SWAP
ISUB
IFLT tooMany
IRETURN
:tooMany
SPRINT "You can't take that many stones!\n"
POP
GOTO userInput
:tooFew
SPRINT "You must take at least one stone.\n"
POP
GOTO userInput
computerMove 2 1
BIPUSH 0
ILOAD 1
INVOKEVIRTUAL mod4
DUP
IFEQ takeOne
GOTO tellUser
:takeOne
BIPUSH 1
IADD
:tellUser
DUP
DUP
BIPUSH 1
IF_ICMPEQ singular
SPRINT "Computer takes "
IPRINT
SPRINT " stones.\n"
IRETURN
:singular
SPRINT "Computer takes 1 stone.\n"
POP
IRETURN
mod4 2 1
:mod4loop
ILOAD 1
BIPUSH 4
ISUB
IFLT endmod4
ILOAD 1
BIPUSH 4
ISUB
ISTORE 1
GOTO mod4loop
:endmod4
ILOAD 1
IRETURN
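The examples above lean heavily on labels such as :loop and :end, including forward references to labels defined later in the file. One common way for an assembler to handle this is two passes: the first pass records the address of every label, the second emits bytes and fills in branch offsets. The sketch below shows that technique on a tiny made-up program. It is not the article's IJVMAssembler: `LabelResolver`, `DEMO`, and the reduced instruction table are hypothetical, and the offset encoding (signed 16-bit, big-endian, relative to the address of the branch opcode) is an assumption borrowed from the real JVM.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical two-pass label resolver for a tiny subset of IJVM assembly. */
final class LabelResolver {
    // mnemonic -> {opcode, number of operand bytes}
    private static final Map<String, int[]> OPS = Map.of(
            "BIPUSH", new int[] {0x10, 1},
            "DUP", new int[] {0x59, 0},
            "ISUB", new int[] {0x64, 0},
            "IFEQ", new int[] {0x99, 2},
            "GOTO", new int[] {0xA7, 2},
            "HALT", new int[] {0x04, 0});
    private static final Set<String> BRANCHES = Set.of("IFEQ", "GOTO");

    static final List<String> DEMO = List.of(
            "BIPUSH 3", ":loop", "BIPUSH 1", "ISUB", "DUP",
            "IFEQ done", "GOTO loop", ":done", "HALT");

    private static int[] spec(String mnemonic) {
        int[] op = OPS.get(mnemonic);
        if (op == null) throw new IllegalArgumentException("unknown mnemonic: " + mnemonic);
        return op;
    }

    static byte[] assemble(List<String> lines) {
        // Pass 1: walk the program, tracking addresses, and record each label.
        Map<String, Integer> labels = new HashMap<>();
        int address = 0;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) continue;
            if (line.startsWith(":")) {
                labels.put(line.substring(1), address);
            } else {
                address += 1 + spec(line.split("\\s+")[0])[1];
            }
        }
        // Pass 2: emit opcodes and operands; every label is now known.
        byte[] out = new byte[address];
        int ip = 0;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith(":")) continue;
            String[] parts = line.split("\\s+");
            int[] op = spec(parts[0]);
            int start = ip; // address of the opcode itself
            out[ip++] = (byte) op[0];
            if (BRANCHES.contains(parts[0])) {
                Integer target = labels.get(parts[1]);
                if (target == null) throw new IllegalArgumentException("undefined label: " + parts[1]);
                int offset = target - start; // assumed: relative to the opcode address
                out[ip++] = (byte) (offset >> 8);
                out[ip++] = (byte) offset;
            } else if (op[1] == 1) {
                out[ip++] = (byte) Integer.parseInt(parts[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : assemble(DEMO)) hex.append(String.format("%02X ", b));
        System.out.println(hex.toString().trim());
    }
}
```

In the demo, `IFEQ done` sits at address 6 and `done` is at 12, so the forward offset is 6; `GOTO loop` sits at 9 and `loop` is at 2, so the backward offset is -7, stored as FF F9.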
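The more complex IJVM implementation
This approach was outlined at the start of the article. It is considerably more complex, and for application programmers like us it is enough to understand the basic principles; we can dig deeper later if the need arises. So only the usage instructions and the source code are given here.
Enter your assembly code in the Assembly IJVM Program box on the left. Click Translate & Load, and the result of the translation appears in Assembler Output. If everything is OK, click Start to run the program or Step to single-step through it. The purple area in the middle shows the current stack frame. The right side of the window shows the state of the MIC-1, the method area, and the constant pool.
Sample assembly code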
.constant
c1 100
c2 152
.end-constant
.main
.var
var1
var2
.end-var
BIPUSH 10
ISTORE var1
BIPUSH 15 //This is a comment
ISTORE var2
LDCW objref
ILOAD var1
ILOAD var2
INVOKEVIRTUAL met
.end-main
.method met(p1, p2)//This is another comment
ILOAD p1
ILOAD p2
IADD
LDCW c1
IADD
IRETURN
.end-method
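Summary
This article implemented a simple toy IJVM. It is meant only as a way to understand how the machine works at a low level and as a first step toward the JVM. In later articles I will give a complete analysis of the JVM, with an emphasis on GC and the other rote questions that interviewers love to ask. Before that, though, there is still a great deal to learn.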