Assembly Primer: Partial Translation & Sample Analysis

First published on Jianshu: www.jianshu.com/p/66ca218d2…

Preface: this post only attempts to translate some passages of the original article and to give a brief analysis of the sample in it.

Original article

kakaroto.homelinux.net/2017/11/int…

Translated passages

1. Stack

Data stored on the stack

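The figure above illustrates the LIFO (last in, first out) nature of the stack. In RAM the stack grows downward, toward lower addresses, which is the opposite of the figure, where it grows upward. The stack stores local variables and function return addresses (a return address is the address of the instruction that follows the call in the calling function).

A stack holds multiple stack frames. The current frame contains all of the current function's variables plus the return address into the function that called it. Above it sits the frame of the previous function, which likewise contains its own variables and the return address into its caller, and so on. The frame of main is at the top of the figure, that is, at the highest addresses.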
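As a rough illustration of the LIFO behaviour and downward growth described above, here is a minimal C sketch. The `Stack` type, the `stack_*` helpers, and the 16-slot size are hypothetical names chosen for this example and do not come from the original article. An array index stands in for a real memory address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16          /* hypothetical size, chosen for illustration */

typedef struct {
    uint32_t mem[STACK_WORDS];  /* simulated stack memory, one 32-bit slot per entry */
    int sp;                     /* index of the top slot; STACK_WORDS means "empty" */
} Stack;

void stack_init(Stack *s) {
    s->sp = STACK_WORDS;        /* an empty stack starts at the highest position */
}

/* push: move the stack pointer DOWN first, then store the value */
void stack_push(Stack *s, uint32_t value) {
    assert(s->sp > 0);          /* guard against overflow */
    s->sp -= 1;
    s->mem[s->sp] = value;
}

/* pop: read the top value, then move the stack pointer back UP */
uint32_t stack_pop(Stack *s) {
    assert(s->sp < STACK_WORDS); /* guard against popping an empty stack */
    uint32_t value = s->mem[s->sp];
    s->sp += 1;
    return value;
}
```

Pushing 3 and then 2 lowers `sp` twice, and the first `stack_pop` returns 2, the value pushed last.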

2. Registers

The second thing I want you to understand is that the processor has multiple “registers”. 
You can think of a register as a variable, but there are only 9 total registers on x86, with only 7 of them usable. 
So, on the x86 processor, the various registers are: EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, EIP.

There are two registers in there that are special:

The EIP (Instruction Pointer) contains the address of the current instruction being executed.
The ESP (Stack Pointer) contains the address of the stack.

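About registers:

You can think of a register as a variable. According to the original, the x86 processor has only 9 registers in total, of which only 7 are freely usable. The various registers on x86 are: EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, EIP.

Two of them are special. EIP, the instruction pointer, holds the address of the instruction currently being executed. ESP, the stack pointer, holds the address of the current top of the stack.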

3. Instructions

“MOV”: move data from one operand into another
“ADD/SUB/MUL/DIV”: Add, Subtract, Multiply, Divide one operand with another and store the result in a register
“AND/OR/XOR/NOT/NEG”: Perform logical and/or/xor/not/negate operations on the operand
“SHL/SHR”: Shift Left/Shift Right the bits in the operand
“CMP/TEST”: Compare one register with an operand
“JMP/JZ/JNZ/JB/JS/etc.”: Jump to another instruction (Jump unconditionally, Jump if Zero, Jump if Not Zero, Jump if Below, Jump if Sign, etc.)
“PUSH/POP”: Push an operand into the stack, or pop a value from the stack into a register
“CALL”: Call a function. This is the equivalent of doing a “PUSH %EIP+4” + “JMP”. I’ll get into calling conventions later..
“RET”: Return from a function. This is the equivalent of doing a “POP %EIP”

The one to focus on is the CALL instruction. Calling a function is equivalent to two steps:

PUSH %EIP+4
JMP
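To make the "PUSH + JMP" reading of CALL and the "POP %EIP" reading of RET concrete, here is a small C sketch of a toy machine. The `Cpu` type, the `cpu_*` helpers, the 64-byte stack, and the `insn_len` parameter are all hypothetical and exist only for this illustration. The sketch assumes the return address is the address of the CALL plus the length of the CALL instruction.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BYTES 64            /* hypothetical stack size in bytes */

typedef struct {
    uint32_t eip;               /* address of the instruction being executed */
    uint32_t esp;               /* byte offset of the top of the stack inside mem */
    uint32_t mem[MEM_BYTES / 4];
} Cpu;

/* PUSH: lower ESP by 4, then store the value at the new top */
void cpu_push(Cpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

/* POP: read the value at the top, then raise ESP by 4 */
uint32_t cpu_pop(Cpu *c) {
    assert(c->esp + 4 <= MEM_BYTES);
    uint32_t value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}

/* CALL target: PUSH the address of the next instruction, then JMP to target */
void cpu_call(Cpu *c, uint32_t target, uint32_t insn_len) {
    cpu_push(c, c->eip + insn_len);
    c->eip = target;
}

/* RET: POP the saved return address into EIP */
void cpu_ret(Cpu *c) {
    c->eip = cpu_pop(c);
}
```

After `cpu_call`, EIP holds the target and the stack holds the return address. After `cpu_ret`, EIP is back at the instruction following the call and ESP is where it was before the call.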

Sample

code

int add_a_and_b(int a, int b);   /* declare before use so that main compiles */

int main() {
    return add_a_and_b(2, 3);
}
int add_a_and_b(int a, int b) {
    return a + b;
}

Compiled assembly

_main:
   push   3                ; Push the second argument '3' into the stack
   push   2                ; Push the first argument '2' into the stack
   call   _add_a_and_b     ; Call the _add_a_and_b function. This will put the address of the next
                           ; instruction (add) into the stack, then it will jump into the _add_a_and_b
                           ; function by putting the address of the first instruction in the _add_a_and_b
                           ; label (push %ebx) into the EIP register
   add    %esp, 8          ; Add 8 to the esp, which effectively pops out the two values we just pushed into it
   ret                     ; Return to the parent function.... 

_add_a_and_b:
   push   %ebx             ; We're going to modify %ebx, so we need to push it to the stack
                           ; so we can restore its value when we're done
   mov    %eax, [%esp+8]   ; Move the first argument (8 bytes above the stack pointer) into EAX
   mov    %ebx, [%esp+12]  ; Move the second argument (12 bytes above the stack pointer) into EBX
   add    %eax, %ebx       ; Add EAX and EBX and store the result into EAX
   pop    %ebx             ; Pop EBX to restore its previous value
   ret                     ; Return back into the main. This will pop the value on the stack (which was
                           ; the address of the next instruction in the main function that was pushed into
                           ; the stack when the 'call' instruction was executed) into the EIP register

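Explanation of the instructions in main

Push the argument 3 onto the stack;
Push the argument 2 onto the stack;
Call _add_a_and_b. This pushes the address of the next instruction (add) onto the stack, then jumps into _add_a_and_b by writing the address of its first instruction (push %ebx) into the EIP register;
Add 8 bytes to ESP, which releases the space taken by the two pushed values, effectively popping them;
Return to the parent function.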

Stack state

For the purposes of this exercise, 
we’re going to assume that the _main function is located in memory at the address 0xFFFF0000, 
and that each instruction is 4 bytes long 
(the size of each instruction can vary depending on the instruction and on its operands).

So you can see, we first pushed 3 into the stack, %esp was lowered, 
then we pushed 2 into the stack, %esp was lowered, 

then we did a ‘call _add_a_and_b’, which stored the address of the next instruction 
(4 instructions into the main, so ‘_main+16’) into the stack and esp was lowered, then we pushed %ebx, 
which I assumed here contained a value of 0, and the %esp was lowered again. 

If we now wanted to access the first argument to the function (2), we need to access %esp+8, 
which will let us skip the saved %ebx and the ‘Return address’ that are in the stack 
(since we’re working with 32 bits, each value is 4 bytes). 

And in order to access the second argument (3), we need to access %esp+12.

Explanation of the stack state

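Assume that main starts at address 0xFFFF0000 in memory and that every instruction takes 4 bytes (in reality, instruction size varies with the instruction and its operands).

After push 3, ESP is lowered (by 4). Then push 2 lowers ESP again (by 4).

Next comes call _add_a_and_b. It stores on the stack the address of the instruction that follows the call in main, and ESP is lowered by 4 once more. The instruction after the call is "add %esp, 8", so it is the address of "add %esp, 8" that gets saved. That is the fourth instruction in main, which the original counts as the address of main + 16 bytes (at 4 bytes per instruction).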

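This point is open to question: is the address of a function the same as the address of its first instruction? A function label does mark its first instruction, so with 4-byte instructions push 3 sits at main+0, push 2 at main+4, call at main+8, and "add %esp, 8" at main+12. By that count main+16 would be the address of ret, and the "_main+16" in the original looks like an off-by-one.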
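The address arithmetic in question can be checked with a few lines of C. This is only a sketch under the article's own assumptions (main at 0xFFFF0000, 4 bytes per instruction). The macro names and the `main_insn_address` helper are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAIN_BASE       0xFFFF0000u  /* assumed start of _main, as in the original text */
#define INSN_SIZE       4u           /* assumed fixed instruction size, as in the original text */
#define MAIN_INSN_COUNT 5u           /* push, push, call, add, ret */

/* Address of the instruction at zero-based position `index` inside _main. */
uint32_t main_insn_address(uint32_t index) {
    assert(index < MAIN_INSN_COUNT);
    return MAIN_BASE + index * INSN_SIZE;
}
```

With zero-based positions, "add %esp, 8" is at index 3, so `main_insn_address(3)` gives 0xFFFF000C, which is main+12. Index 4 (ret) is the one that lands on main+16.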

After jumping into the add function, the original value of EBX is pushed onto the stack first (assumed here to be 0), so ESP is lowered yet again (by 4).

Step-by-step diagram

Now look at these two instructions in the add function:

mov %eax, [%esp+8]
mov %ebx, [%esp+12]

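The two values pushed earlier are read at offsets from ESP. ESP itself does not move.

ESP points at the slot holding the saved original value of EBX (the lowest position).
Two slots up (+8 bytes), past the saved EBX and the return address, is the value 2.
One more slot up (+4 more, +12 in total) is the value 3.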
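The whole sequence can be replayed in C to confirm the +8 and +12 offsets. This is a sketch, not the article's code: the `TraceStack` type, the `trace_*` helpers, and the 32-byte stack are hypothetical. The return address value (main+12 under the 4-byte assumption) is also an assumption, and its exact value does not affect the offsets.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_BYTES 32                  /* hypothetical stack size in bytes */

typedef struct {
    uint32_t esp;                       /* byte offset of the top of the stack */
    uint32_t slots[TRACE_BYTES / 4];    /* simulated stack memory */
} TraceStack;

/* push: lower esp by 4, then store the value at the new top */
void trace_push(TraceStack *t, uint32_t value) {
    assert(t->esp >= 4);
    t->esp -= 4;
    t->slots[t->esp / 4] = value;
}

/* Read the 32-bit value at [esp + offset] without moving esp. */
uint32_t trace_peek(const TraceStack *t, uint32_t offset) {
    assert(offset % 4 == 0 && t->esp + offset + 4 <= TRACE_BYTES);
    return t->slots[(t->esp + offset) / 4];
}

/* Replays the pushes up to the two mov instructions and returns a + b. */
uint32_t trace_add_a_and_b(TraceStack *t) {
    t->esp = TRACE_BYTES;               /* start with an empty stack */
    trace_push(t, 3);                   /* push 3 */
    trace_push(t, 2);                   /* push 2 */
    trace_push(t, 0xFFFF000Cu);         /* call: return address, assumed to be _main+12 */
    trace_push(t, 0);                   /* push %ebx, assumed to hold 0 */
    uint32_t eax = trace_peek(t, 8);    /* mov %eax, [%esp+8]  */
    uint32_t ebx = trace_peek(t, 12);   /* mov %ebx, [%esp+12] */
    return eax + ebx;                   /* add %eax, %ebx */
}
```

After the four pushes ESP has dropped by 16 bytes. Offset 0 holds the saved EBX, offset 4 the return address, offset 8 the value 2, and offset 12 the value 3, so the function returns 5.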

That is all I have translated for now.

If anything here is debatable or inaccurate, please point it out. Thanks!