X86-64 Architecture Guide

mounter625

For the code-generation project, we expect your compiler to produce simple assembly code. This guide introduces the subset of the x86-64 platform that you will need.

Example

Consider the following Decaf program:

class Program {
    int foo(int x) {
        return x + 3;
    }
    void main() {
        int y;

        y = foo(callout("get_int_035"));
        if (y == 15) {
            callout("printf", "Indeed! \'tis 15!\n");
        } else {
            callout("printf", "What! %d\n", y);
        }
    }
}

For the code generation phase of the compiler project, you are encouraged to output simple and inefficient (but correct!) assembly code. This assembly code can assign every variable and temporary to a location on the current stack frame. Every expression value will be loaded from the stack, manipulated using the registers %r10 and %r11, and then written back to the stack, ready to be used in another expression. Compiling the above Decaf code using this simple scheme might look like this:

foo:
    enter   $(8*2), $0
    mov     %rdi, -8(%rbp)

    mov     -8(%rbp), %r10
    add     $3, %r10
    mov     %r10, -16(%rbp)

    mov     -16(%rbp), %rax
    leave
    ret

    .globl main
main:
    enter   $(8 * 6), $0

    call    get_int_035
    mov     %rax, -8(%rbp)

    mov     -8(%rbp), %rdi
    call    foo
    mov     %rax, -16(%rbp)
    mov     -16(%rbp), %r10
    mov     %r10, -24(%rbp)

    mov     -24(%rbp), %r10
    mov     $15, %r11
    cmp     %r10, %r11
    mov     $0, %r11
    mov     $1, %r10
    cmove   %r10, %r11
    mov     %r11, -32(%rbp)

    mov     -32(%rbp), %r10
    mov     $1, %r11
    cmp     %r10, %r11
    je      .fifteen

    mov     $.what, %rdi
    mov     -24(%rbp), %rsi
    mov     $0, %rax
    call    printf
    mov     %rax, -40(%rbp)
    jmp     .fifteen_done

.fifteen:
    mov     $.indeed, %rdi
    mov     $0, %rax
    call    printf
    mov     %rax, -48(%rbp)

.fifteen_done:
    mov     $0, %rax
    leave
    ret

.indeed:
    .string "Indeed! \'tis 15!\n"

.what:
    .string "What! %d\n"

We shall dissect this assembly listing carefully and relate it to the Decaf code. Note that this is not the only possible assembly of the program; it only serves as an illustration of some techniques you can use in this project phase.

foo:
    enter $(8 * 2), $0
    mov %rdi, -8(%rbp)
    ...
    leave
    ret

This is the standard boilerplate code for a function definition. The first line creates a label which names the entry point of the function. The following enter instruction sets up the stack frame. After the function is done with its actual work, the leave instruction restores the stack frame for the caller, and ret passes control back to the caller.
Notice that one of the operands to enter is a static arithmetic expression. Such expressions are evaluated by the assembler and converted into constants in the final output.
The enter instruction first saves the caller's frame (base) pointer (%rbp) onto the stack. It then copies the stack pointer (%rsp) into %rbp, establishing the frame pointer for the current frame. Finally, it allocates N bytes of stack space (where N is the left operand) for the locals and temporaries of the frame by subtracting N from %rsp (remember that the stack grows downward, toward address 0). Because this happens after the caller's %rbp has been pushed, the new space lies below the saved base pointer.
The mov instruction moves the 1st argument (passed in %rdi) to its place on the stack. The argument occupies the first stack position below the base pointer, -8(%rbp) (stack entries are 8 bytes); 0(%rbp) itself holds the previous frame's base pointer.
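The slot assignment described above can be automated. The following is a minimal sketch, written in Python purely for illustration (the handout does not prescribe how your compiler is implemented); the Frame class and its method names are hypothetical.

```python
class Frame:
    """Hands out 8-byte stack slots below %rbp for one function.

    Hypothetical helper: the class and method names are not part of the handout.
    """

    def __init__(self):
        self.slots = {}  # variable or temporary name -> offset from %rbp

    def slot(self, name):
        """Return the operand for `name`, allocating a new slot on first use."""
        if name not in self.slots:
            self.slots[name] = -8 * (len(self.slots) + 1)
        return f"{self.slots[name]}(%rbp)"

    def prologue(self, label):
        """Emit the label and an enter that reserves every slot allocated so far."""
        return [f"{label}:", f"    enter   $(8 * {len(self.slots)}), $0"]


frame = Frame()
x = frame.slot("x")    # first slot: the argument x
t0 = frame.slot("t0")  # second slot: the temporary holding x + 3
print("\n".join(frame.prologue("foo") + [f"    mov     %rdi, {x}"]))
```

One caveat that this handout does not cover: the System V ABI expects %rsp to be 16-byte aligned at each call instruction, so an allocator like this may want to round an odd slot count up to an even one. Both frames in the example (2 and 6 slots) already satisfy this.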

    mov     -8(%rbp), %r10
    add     $3, %r10
    mov     %r10, -16(%rbp)

The purpose of foo(int) is to add 3 to its argument and return the result. The first mov instruction fetches the argument from the stack and places it in the temporary register %r10. The next instruction increments the value in %r10 by the literal (or immediate) value 3. Note that immediate values are always prefixed by a '$'.
The second mov instruction stores the result of the addition back onto the stack at the second position of the frame, -16(%rbp) (counting from the saved %rbp).
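The load, operate, store pattern generalizes to any two-operand arithmetic instruction. Here is a hedged sketch of how a code generator might emit it; emit_binop is a hypothetical name, and unlike the listing above it always loads the right operand into %r11 rather than using an immediate directly.

```python
def emit_binop(op, left, right, dest):
    """Emit `dest = left op right` using only %r10 and %r11 as scratch.

    Assumptions (not from the handout): `left` and `right` are operand
    strings such as "-8(%rbp)" or "$3", `dest` is a stack slot, and `op`
    is a two-operand AT&T mnemonic such as "add", "sub" or "imul".
    """
    return [
        f"    mov     {left}, %r10",
        f"    mov     {right}, %r11",
        f"    {op:<7} %r11, %r10",  # %r10 = %r10 op %r11
        f"    mov     %r10, {dest}",
    ]


print("\n".join(emit_binop("add", "-8(%rbp)", "$3", "-16(%rbp)")))
```

Because the modified operand comes second, sub %r11, %r10 leaves left - right in %r10, so the same template works for non-commutative operations.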

mov -16(%rbp), %rax

According to the calling convention, a function must place its return value in the %rax register, so foo returns x + 3 by moving the value of the x + 3 expression into %rax.

    .globl main
main:
    enter   $(8 * 6), $0
    ...

The .globl main directive makes the symbol main accessible to modules other than this one. This is important because the C run-time library, which we link against, expects to find a main procedure to call at program startup.
The enter instruction allocates space for 6 quadwords on the stack: one for the local variable y and five for temporaries.

call get_int_035
mov %rax, -8(%rbp)

We call the get_int_035 function, which reads an integer from standard input and returns it. The function takes no arguments.
The integer is returned in %rax, and we store the value of the method call expression onto the stack.

    mov     -8(%rbp), %rdi
    call    foo
    mov     %rax, -16(%rbp)
    mov     -16(%rbp), %r10
    mov     %r10, -24(%rbp)

Now we are ready to call foo. We start by loading the temporary that holds the return value of get_int_035 into %rdi. According to the calling convention defined in the Linux ABI (see below), %rdi is used to pass the first argument. Then we call foo.
Once foo returns, we store its return value, held in %rax, onto the stack at location -16(%rbp).
Next, we perform the assignment of foo's return value to y by loading the temporary into %r10 and storing %r10 into the stack location designated for y, -24(%rbp).

    mov     -24(%rbp), %r10
    mov     $15, %r11
    cmp     %r10, %r11
    mov     $0, %r11
    mov     $1, %r10
    cmove   %r10, %r11
    mov     %r11, -32(%rbp)

This sequence demonstrates how a comparison operation might be implemented using only two registers and temporary storage. We begin by loading the values to compare, i.e., y and the literal 15, into registers. Strictly speaking, cmp accepts other operand combinations (though never two memory operands), but always loading both values into registers keeps the scheme uniform.
Then, we perform the actual comparison using the cmp instruction. Its only effect is to update the processor's flags register.
Our aim is to store a boolean value, 1 or 0, in a temporary variable as the result of this operation. To set this up, we place the two possible values, 1 and 0, in registers %r10 and %r11. These mov instructions leave the flags untouched, so the outcome of the comparison is still available.
Then we use the cmove instruction (read c-mov-e, or conditional move if equal) to decide whether our output value should be 0 or 1, based on the flags set by the comparison. If the operands were equal, the 1 in %r10 is copied over the 0 in %r11; either way, the result ends up in %r11.
Finally, we store the boolean value from %r11 to a temporary variable at -32(%rbp).
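The same seven-instruction shape works for every relational operator by swapping the conditional move. A sketch under stated assumptions: emit_compare and the CMOV table are hypothetical names, and the cmp operands are ordered as cmp %r11, %r10 (the reverse of the listing above) so that the flags describe the left operand relative to the right one.

```python
# Decaf relational operator -> conditional move (signed comparisons).
CMOV = {
    "==": "cmove", "!=": "cmovne",
    "<": "cmovl", "<=": "cmovle",
    ">": "cmovg", ">=": "cmovge",
}


def emit_compare(rel, left, right, dest):
    """Emit `dest = (left rel right)` as 1 or 0, using only %r10 and %r11."""
    return [
        f"    mov     {left}, %r10",
        f"    mov     {right}, %r11",
        "    cmp     %r11, %r10",  # flags now describe left relative to right
        "    mov     $0, %r11",    # mov does not change the flags
        "    mov     $1, %r10",
        f"    {CMOV[rel]:<7} %r10, %r11",
        f"    mov     %r11, {dest}",
    ]


print("\n".join(emit_compare("==", "-24(%rbp)", "$15", "-32(%rbp)")))
```

For equality the operand order of cmp makes no difference, which is why the handout's listing can use either order; for the ordered relations it matters.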

    mov     -32(%rbp), %r10
    mov     $1, %r11
    cmp     %r10, %r11
    je      .fifteen
    ...
    jmp     .fifteen_done
.fifteen:
    ...
.fifteen_done:

This is the standard linearized structure of a conditional statement. We compare a boolean variable to 1, and perform a je (jump if equal) instruction which jumps to its target block if the two were equal. Otherwise, je acts as a no-op and execution falls through to the next instruction.
We mark the end of the target block with a label, and jump to it at the end of the fall-through block. Conventionally, such local labels, which do not define functions, are given names starting with a period.
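A program with several if statements needs a fresh pair of labels for each one. The sketch below shows one way to linearize a conditional with generated label names; emit_if_else and the .if_then_N / .if_done_N naming scheme are assumptions made for illustration, not names used by the handout.

```python
import itertools

_label_ids = itertools.count()  # hypothetical global counter for unique labels


def emit_if_else(cond_slot, then_lines, else_lines):
    """Linearize `if (cond) then_lines else else_lines`.

    `cond_slot` holds a boolean (1 or 0); the two bodies are lists of
    already generated assembly lines.
    """
    n = next(_label_ids)
    then_label, done_label = f".if_then_{n}", f".if_done_{n}"
    return (
        [
            f"    mov     {cond_slot}, %r10",
            "    mov     $1, %r11",
            "    cmp     %r10, %r11",
            f"    je      {then_label}",
        ]
        + else_lines  # fall-through block: executed when the condition is 0
        + [f"    jmp     {done_label}", f"{then_label}:"]
        + then_lines
        + [f"{done_label}:"]
    )
```

As in the listing, the else block comes first in the output because it is the fall-through path of je.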

    mov     $.what, %rdi
    mov     -24(%rbp), %rsi
    mov     $0, %rax
    call    printf
    mov     %rax, -40(%rbp)

This block of instructions implements the false (else) branch of the if statement.
We first load the address of the .what string (see below) into %rdi. Next we load the value of y into %rsi, the register designated for the second argument of a function.
The third mov instruction is necessary because printf takes a variable number of arguments. We must assign 0 to %rax to let printf know that we are not using SSE registers to pass any of the arguments.
After the call, the final mov instruction stores the return value of printf onto the stack. Note that this value is never referenced afterwards.

    mov     $0, %rax
    leave
    ret

At the end of the procedure, we set %rax to 0 to indicate that the program has terminated successfully (we do this even though the main method is declared to be of type void).

.indeed:
    .string "Indeed! \'tis 15!\n"

.what:
    .string "What! %d\n"

These labels point to static strings defined in the program. They are used as arguments to callout functions.
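A compiler has to invent a label for each string literal it encounters. One possible approach, sketched here with a hypothetical StringTable class and .str_N label names, collects the literals during code generation and emits them all at the end. It assumes the escape sequences in Decaf string literals can be passed through to the assembler unchanged, as the listing above does.

```python
class StringTable:
    """Collects string literals and assigns each distinct one a label."""

    def __init__(self):
        self.labels = {}  # literal text (escapes kept as written) -> label

    def label_for(self, text):
        """Return the label for `text`, creating one on first use."""
        if text not in self.labels:
            self.labels[text] = f".str_{len(self.labels)}"
        return self.labels[text]

    def emit(self):
        """Emit a label and a .string directive for every collected literal."""
        lines = []
        for text, label in self.labels.items():
            lines += [f"{label}:", f'    .string "{text}"']
        return lines


table = StringTable()
operand = "$" + table.label_for(r"What! %d\n")  # usable as: mov $.str_0, %rdi
print("\n".join(table.emit()))
```

Reusing one label for repeated literals is optional; emitting a separate copy per use would be equally correct under the simple scheme.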

Reference

This handout only mentions a small subset of the rich possibilities provided by the x86-64 instruction set and architecture. For a more complete (but still readable) introduction, consult The AMD64 Architecture Programmer's Manual, Volume 1: Application Programming. Another helpful resource is the UC Davis AT&T Assembly Syntax Guide.

Registers

In the assembly syntax accepted by gcc, register names are always prefixed with %. All of these registers are 64 bits wide.

The register file is as follows:

Register	Purpose	Saved across calls
%rax	temp register; return value	No
%rbx	callee-saved	Yes
%rcx	used to pass 4th argument to functions	No
%rdx	used to pass 3rd argument to functions	No
%rsp	stack pointer	Yes
%rbp	callee-saved; base pointer	Yes
%rsi	used to pass 2nd argument to functions	No
%rdi	used to pass 1st argument to functions	No
%r8	used to pass 5th argument to functions	No
%r9	used to pass 6th argument to functions	No
%r10-r11	temporary	No
%r12-r15	callee-saved registers	Yes

For the code generation phase of the project you will not be performing register allocation. You should use %r10 and %r11 for temporary values that you load from the stack.

Instruction Set

Each mnemonic opcode presented here represents a family of instructions. Within each family, there are variants which take different argument types (registers, immediate values, or memory addresses) and/or argument sizes (byte, word, double-word, or quad-word). The former can be distinguished from the prefixes of the arguments, and the latter by an optional one-letter suffix on the mnemonic.

For example, a mov instruction which sets the value of the 64-bit %rax register to the immediate value 3 can be written as

movq $3, %rax

Immediate operands are always prefixed by $. Un-prefixed operands are treated as memory addresses, and should be avoided since they are confusing.

For instructions which modify one of their operands, the operand which is modified appears second. This differs from the convention used by Microsoft's and Borland's assemblers, which are commonly used on DOS and Windows.
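Small formatting helpers can keep these syntax rules in one place in a code generator. The following sketch uses hypothetical names (imm, reg, mem, instr); the suffix letters b, w, l and q are the AT&T size suffixes for 1, 2, 4 and 8 bytes.

```python
SUFFIX = {1: "b", 2: "w", 4: "l", 8: "q"}  # operand size in bytes -> suffix


def imm(value):
    """Immediate operand: always prefixed by $."""
    return f"${value}"


def reg(name):
    """Register operand: always prefixed by %."""
    return f"%{name}"


def mem(offset, base="rbp"):
    """Memory operand at a constant offset from a base register."""
    return f"{offset}(%{base})"


def instr(mnemonic, src, dest, size=8):
    """Format a two-operand instruction; the modified operand comes second."""
    return f"    {mnemonic + SUFFIX[size]:<7} {src}, {dest}"


print(instr("mov", imm(3), reg("rax")))  # the movq example from the text
```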

Opcode	Description
Copying values
mov src, dest	Copies a value from a register, immediate value or memory address to a register or memory address.
cmove %src, %dest	Copies from register %src to register %dest if the last comparison found its operands equal.
cmovne %src, %dest	Copies from register %src to register %dest if the last comparison found its operands unequal.
cmovg %src, %dest	Copies from register %src to register %dest if the last comparison had the result "greater".
cmovl %src, %dest	Copies from register %src to register %dest if the last comparison had the result "less".
cmovge %src, %dest	Copies from register %src to register %dest if the last comparison had the result "greater or equal".
cmovle %src, %dest	Copies from register %src to register %dest if the last comparison had the result "less or equal".
Stack management
enter $x, $0	Sets up a procedure's stack frame by first pushing the current value of %rbp onto the stack, then storing the current value of %rsp in %rbp, and finally decreasing %rsp by x to make room for x bytes of local variables.
leave	Removes local variables from the stack frame by restoring the old values of %rsp and %rbp.
push src	Decreases %rsp by 8 and places src at the new memory location pointed to by %rsp. Here, src can be a register, immediate value or memory address.
pop dest	Copies the value stored at the location pointed to by %rsp to dest and increases %rsp by 8. Here, dest can be a register or memory location.
Control flow
call target	Pushes the return address (the address of the instruction following the call) onto the stack and jumps unconditionally to target.
ret	Pops the return address off the stack and jumps unconditionally to this address.
jmp target	Jumps unconditionally to target, which is specified as a memory location (for example, a label).
je target	Jumps to target if the last comparison found its operands equal.
jne target	Jumps to target if the last comparison found its operands unequal.
Arithmetic and logic
add src, dest	Adds src to dest.
sub src, dest	Subtracts src from dest.
imul src, dest	Multiplies dest by src.
idiv divisor	Divides %rdx:%rax by divisor. Stores the quotient in %rax and the remainder in %rdx.
shr reg	Shifts reg to the right by the value in %cl (the low 8 bits of %rcx).
shl reg	Shifts reg to the left by the value in %cl (the low 8 bits of %rcx).
ror src, dest	Rotates dest to the right by src bits (rol rotates to the left).
cmp src, dest	Sets flags corresponding to whether dest is less than, equal to, or greater than src. The "greater" and "less" conditions above therefore describe dest relative to src.
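Division is the one arithmetic operation in the table that does not fit the two-operand pattern, because idiv implicitly uses %rdx:%rax. The sketch below shows one way to emit it. It relies on two facts that are not in the table: cqto sign-extends %rax into %rdx:%rax, and idiv does not accept an immediate operand. The name emit_div is hypothetical.

```python
def emit_div(left, right, dest, want_remainder=False):
    """Emit `dest = left / right` (or `left % right`) using signed division."""
    result = "%rdx" if want_remainder else "%rax"
    return [
        f"    mov     {left}, %rax",
        "    cqto",                    # sign-extend %rax into %rdx:%rax
        f"    mov     {right}, %r10",  # idiv needs a register or memory operand
        "    idiv    %r10",            # quotient -> %rax, remainder -> %rdx
        f"    mov     {result}, {dest}",
    ]


print("\n".join(emit_div("-8(%rbp)", "$3", "-16(%rbp)")))
```

Note that this sequence overwrites %rdx, which also carries the 3rd argument of the current function; under the simple scheme that is harmless only if arguments are copied to the stack on entry, as foo does.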

Stack Organization

Local variables and temporaries are stored on the stack, a region of memory that is typically addressed by offsets from the registers %rbp and %rsp. Each procedure call results in the creation of a stack frame where the procedure can store local variables and temporary intermediate values for that invocation. The stack is organized as follows:

Position	Contents	Frame
8(n-7)+16(%rbp)	argument n (for n > 7)	Previous
...	...	Previous
16(%rbp)	argument 7	Previous
8(%rbp)	return address	Current
0(%rbp)	previous %rbp value	Current
-8(%rbp)	locals and temps	Current
...	...	Current
0(%rsp)	last local or temp	Current
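From the callee's point of view, the layout above determines where each incoming parameter can be found. A small sketch follows; param_operand is a hypothetical helper, the register order is taken from the Calling Convention section of this handout, and parameters are numbered from 1.

```python
ARG_REGS = ["%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"]


def param_operand(n):
    """Return where the callee finds its n-th parameter (n counts from 1)."""
    if n < 1:
        raise ValueError("parameters are numbered from 1")
    if n <= 6:
        return ARG_REGS[n - 1]
    # Parameter 7 sits just above the return address, at 16(%rbp);
    # each later parameter is 8 bytes higher.
    return f"{16 + 8 * (n - 7)}(%rbp)"


for n in (1, 6, 7, 8):
    print(n, param_operand(n))
```

Under the simple scheme, a function prologue could copy the first six results into fresh stack slots and leave the remaining parameters where they are.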

Calling Convention

We will use the standard Linux function calling convention, which is defined in detail in the System V Application Binary Interface: AMD64 Architecture Processor Supplement. Here we summarize the convention as it applies to Decaf.

The caller uses registers to pass the first 6 arguments to the callee. Given the arguments in left-to-right order, the order of registers used is: %rdi, %rsi, %rdx, %rcx, %r8, and %r9. Any remaining arguments are passed on the stack in reverse order (the last argument is pushed first), so that they can be popped off the stack in order.
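On the caller's side, these rules translate into a short, mechanical sequence. The sketch below is one possible rendering; emit_call and its parameters are hypothetical. It also removes the stack-passed arguments after the call, which the System V convention leaves to the caller, and it can zero %rax for variadic functions such as printf.

```python
ARG_REGS = ["%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"]


def emit_call(target, args, result_slot, variadic=False):
    """Emit a call with `args` (operand strings, left to right)."""
    lines = []
    stack_args = args[6:]
    for operand in reversed(stack_args):      # last argument is pushed first
        lines.append(f"    pushq   {operand}")
    for register, operand in zip(ARG_REGS, args):  # first six use registers
        lines.append(f"    mov     {operand}, {register}")
    if variadic:
        lines.append("    mov     $0, %rax")  # no SSE registers carry arguments
    lines.append(f"    call    {target}")
    if stack_args:                            # caller pops its pushed arguments
        lines.append(f"    add     $(8 * {len(stack_args)}), %rsp")
    lines.append(f"    mov     %rax, {result_slot}")
    return lines


print("\n".join(emit_call("printf", ["$.what", "-24(%rbp)"], "-40(%rbp)", variadic=True)))
```

This sketch ignores stack alignment: pushing an odd number of arguments leaves %rsp misaligned with respect to the ABI's 16-byte requirement, so a fuller version would insert 8 bytes of padding before the pushes in that case.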

The callee is responsible for preserving the values of registers %rbp, %rbx, and %r12-%r15, as these registers are owned by the caller. The remaining registers are owned by the callee.

The callee places its return value in %rax and is responsible for cleaning up its local variables as well as for removing the return address from the stack.

The call, enter, leave and ret instructions make it easy to follow this calling convention.

Since we follow the standard Linux ABI, we can call C functions and library functions using our callout structure. For the purposes of the project we are only going to call printf and get_int_035. When calling printf, we must set the value of register %rax to 0 before issuing the call instruction. This is because printf takes a variable number of arguments, and %rax specifies how many SSE registers are used for the arguments. For our purposes the value will always be 0. Since callouts can only return a single integer value, we have provided a function get_int_035(), which reads a single integer from the terminal and returns its value. This function is included in the 6035 static library. We cannot use scanf for this purpose because its return value is the number of items read, not the value read.