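First published on Jianshu: www.jianshu.com/p/66ca218d2…
Preface: this post only attempts to translate selected passages of the original article and to give a brief analysis of its sample.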
Original article
kakaroto.homelinux.net/2017/11/int…
Translated passages
1. Stack
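- Data stored on the stack
The figure above shows the LIFO (last in, first out) nature of the stack. In RAM the stack grows downward, toward lower addresses (the opposite of the upward growth drawn in the figure). The stack stores local variables and function return addresses (the address of the instruction that follows the call in the calling function).
A stack holds multiple stack frames. The current frame contains all of its variables plus the return address into the function that called it. Above it is the caller's frame, which likewise contains its own variables and the return address into its own caller, and so on, with the frame of main at the very top (the highest addresses, i.e. the oldest frame).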
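The downward growth described above can be sketched in C. This is only a toy model, not how a real CPU or compiler implements the stack: the names `ToyStack`, `toy_init`, `toy_push`, `toy_pop` and the size `STACK_WORDS` are made up for illustration, and `esp` is an array index rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16  /* hypothetical size, chosen only for illustration */

/* A toy stack: memory[] stands in for RAM and esp is an index into it.
 * A higher index plays the role of a higher address. */
typedef struct {
    uint32_t memory[STACK_WORDS];
    int esp;  /* index of the most recently pushed word */
} ToyStack;

static void toy_init(ToyStack *s) {
    s->esp = STACK_WORDS;  /* empty: esp sits just past the highest slot */
}

/* PUSH: move esp down one slot, then store the value there. */
static void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp > 0);  /* guard against overflowing the toy stack */
    s->esp -= 1;
    s->memory[s->esp] = value;
}

/* POP: read the value at esp, then move esp back up one slot. */
static uint32_t toy_pop(ToyStack *s) {
    assert(s->esp < STACK_WORDS);  /* guard against popping an empty stack */
    uint32_t value = s->memory[s->esp];
    s->esp += 1;
    return value;
}
```

Pushing 3 and then 2 leaves `esp` two slots lower than where it started, and the values come back in reverse order (2 first, then 3), which is the LIFO behaviour.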
2. Registers
The second thing I want you to understand is that the processor has multiple “registers”.
You can think of a register as a variable, but there are only 9 total registers on x86, with only 7 of them usable.
So, on the x86 processor, the various registers are: EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, EIP.
There are two registers in there that are special:
The EIP (Instruction Pointer) contains the address of the current instruction being executed.
The ESP (Stack Pointer) contains the address of the stack.
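About registers:
You can think of a register as a variable, but x86 has only 9 registers in total, of which only 7 are usable. The registers on x86 are: EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, EIP.
Two of them are special. EIP: the instruction pointer, which holds the address of the instruction currently being executed. ESP: the stack pointer, which holds the address of the current top of the stack.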
3. Instructions
"MOV": move data from one operand into another
"ADD/SUB/MUL/DIV": Add, Subtract, Multiply, Divide one operand with another and store the result in a register
"AND/OR/XOR/NOT/NEG": Perform logical and/or/xor/not/negate operations on the operand
"SHL/SHR": Shift Left/Shift Right the bits in the operand
"CMP/TEST": Compare one register with an operand
"JMP/JZ/JNZ/JB/JS/etc.": Jump to another instruction (Jump unconditionally, Jump if Zero, Jump if Not Zero, Jump if Below, Jump if Sign, etc.)
"PUSH/POP": Push an operand into the stack, or pop a value from the stack into a register
"CALL": Call a function. This is the equivalent of doing a "PUSH %EIP+4" + "JMP". I’ll get into calling conventions later.
"RET": Return from a function. This is the equivalent of doing a "POP %EIP"
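The one to focus on is the CALL instruction. Calling a function is equivalent to two steps: push the return address (written here as %EIP+4, i.e. the address of the instruction that follows the call), then jump to the function: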
PUSH %EIP+4
JMP
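The PUSH + JMP decomposition of CALL, and RET as POP into EIP, can be sketched as follows. This is a toy model under the article's simplifying assumption that every instruction is 4 bytes long (real x86 instructions vary in length). The names `ToyCpu`, `toy_call`, `toy_ret` and any addresses passed in are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define INSTR_SIZE 4u  /* assumption from the article: each instruction is 4 bytes */

typedef struct {
    uint32_t eip;       /* address of the instruction being executed */
    uint32_t stack[8];  /* toy stack, filled from the high end downward */
    int sp;             /* index of the top entry; 8 means empty */
} ToyCpu;

/* CALL target: push the address of the next instruction, then jump. */
static void toy_call(ToyCpu *cpu, uint32_t target) {
    assert(cpu->sp > 0);
    cpu->sp -= 1;
    cpu->stack[cpu->sp] = cpu->eip + INSTR_SIZE;  /* the PUSH part: return address */
    cpu->eip = target;                            /* the JMP part */
}

/* RET: pop the saved return address back into EIP. */
static void toy_ret(ToyCpu *cpu) {
    assert(cpu->sp < 8);
    cpu->eip = cpu->stack[cpu->sp];
    cpu->sp += 1;
}
```

If `eip` is 0xFFFF0008 when `toy_call` runs, the value saved on the stack is 0xFFFF000C, and a later `toy_ret` puts exactly that value back into `eip`, so execution resumes right after the call.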
Sample
code
int add_a_and_b(int a, int b);  /* forward declaration so main can call it */

int main(void) {
    return add_a_and_b(2, 3);
}

int add_a_and_b(int a, int b) {
    return a + b;
}
Assembly after compilation
_main:
push 3 ; Push the second argument '3' into the stack
push 2 ; Push the first argument '2' into the stack
call _add_a_and_b ; Call the _add_a_and_b function. This will put the address of the next
; instruction (add) into the stack, then it will jump into the _add_a_and_b
; function by putting the address of the first instruction in the _add_a_and_b
; label (push %ebx) into the EIP register
add %esp, 8 ; Add 8 to the esp, which effectively pops out the two values we just pushed into it
ret ; Return to the parent function....
_add_a_and_b:
push %ebx ; We're going to modify %ebx, so we need to push it to the stack
; so we can restore its value when we're done
mov %eax, [%esp+8] ; Move the first argument (8 bytes above the stack pointer) into EAX
mov %ebx, [%esp+12] ; Move the second argument (12 bytes above the stack pointer) into EBX
add %eax, %ebx ; Add EAX and EBX and store the result into EAX
pop %ebx ; Pop EBX to restore its previous value
ret ; Return back into the main. This will pop the value on the stack (which was
; the address of the next instruction in the main function that was pushed into
; the stack when the 'call' instruction was executed) into the EIP register
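Explanation of the instructions in main
- Push the argument 3 onto the stack;
- Push the argument 2 onto the stack;
- Call _add_a_and_b: this pushes the address of the next instruction (add) onto the stack, then jumps into _add_a_and_b by writing the address of the first instruction under the _add_a_and_b label (push %ebx) into the EIP register;
- Add 8 bytes to esp, which releases the space taken by the two pushed values (effectively popping them);
- Return to the parent function.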
Stack contents
For the purposes of this exercise,
we’re going to assume that the _main function is located in memory at the address 0xFFFF0000,
and that each instruction is 4 bytes long
(the size of each instruction can vary depending on the instruction and on its operands).
So you can see, we first pushed 3 into the stack, %esp was lowered,
then we pushed 2 into the stack, %esp was lowered,
then we did a ‘call _add_a_and_b’, which stored the address of the next instruction
(4 instructions into the main, so ‘_main+16’) into the stack and esp was lowered, then we pushed %ebx,
which I assumed here contained a value of 0, and the %esp was lowered again.
If we now wanted to access the first argument to the function (2), we need to access %esp+8,
which will let us skip the saved %ebx and the ‘Return address’ that are in the stack
(since we’re working with 32 bits, each value is 4 bytes).
And in order to access the second argument (3), we need to access %esp+12.
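Explanation of the stack state
Assume main starts at address 0xFFFF0000 in memory and that every instruction occupies 4 bytes (in reality the size varies with the instruction and its operands).
After push 3, ESP is lowered (by 4); after push 2, ESP is lowered again (by 4).
Next comes call _add_a_and_b. It stores the address of the instruction that follows the call in main onto the stack, lowering ESP by another 4. That instruction is "add %esp, 8", the fourth instruction in main, and the original text gives its address as main + 16 bytes (at 4 bytes per instruction).
This point is questionable. A function's address is the address of its first instruction, so under the 4-bytes-per-instruction assumption the fourth instruction sits at offset 3 × 4 = 12, and the address of "add %esp, 8" should be main+12 (0xFFFF000C) rather than main+16.
After jumping into the add function, the original value of ebx is pushed first (assumed here to be 0), and ESP is lowered once more (by 4).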
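The return-address arithmetic can be checked with a few lines of C. This is only a worked calculation under the assumptions stated in the quoted text (main at 0xFFFF0000, 4 bytes per instruction); the helper name `main_instr_addr` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAIN_BASE  0xFFFF0000u  /* assumed start address of _main */
#define INSTR_SIZE 4u           /* assumed size of every instruction */

/* Address of the n-th instruction of _main, counting from 1.
 * The first instruction shares the function's own address. */
static uint32_t main_instr_addr(unsigned n) {
    return MAIN_BASE + (n - 1u) * INSTR_SIZE;
}
```

With these assumptions, `main_instr_addr(4)` is 0xFFFF000C, which is 12 bytes past the start of main.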
Step-by-step diagram
Now look again at these two instructions in the add function:
mov %eax, [%esp+8]
mov %ebx, [%esp+12]
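The two values pushed earlier are read at fixed offsets from ESP; ESP itself does not move.
- ESP points at the slot holding the saved ebx value (the lowest position);
- the first read goes two slots up (+8 bytes) to fetch the value 2;
- the second read goes one slot further up (+4 more, +12 in total) to fetch the value 3.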
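The whole sequence can be replayed in a small C sketch to confirm the offsets. It is a toy model, not real x86 behaviour: `Stack32`, `push32`, `peek32`, `replay_sample` and the size `STACK_BYTES` are hypothetical names, `esp` is a byte offset into an array rather than a real address, and the return address is taken as _main+12 under the 4-bytes-per-instruction assumption (the quoted text says _main+16).

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64u  /* hypothetical toy stack size */

typedef struct {
    uint32_t words[STACK_BYTES / 4];
    uint32_t esp;  /* byte offset into the toy stack; grows downward */
} Stack32;

/* PUSH: lower esp by 4 bytes, then store the 32-bit value there. */
static void push32(Stack32 *s, uint32_t v) {
    assert(s->esp >= 4);
    s->esp -= 4;
    s->words[s->esp / 4] = v;
}

/* Read the 32-bit value at [esp + offset] without changing esp. */
static uint32_t peek32(const Stack32 *s, uint32_t offset) {
    assert(s->esp + offset < STACK_BYTES);
    return s->words[(s->esp + offset) / 4];
}

/* Replays the pushes from the sample up to the two MOV instructions
 * and returns the sum the callee would compute. */
static uint32_t replay_sample(Stack32 *s) {
    s->esp = STACK_BYTES;          /* empty stack */
    push32(s, 3);                  /* push 3: second argument */
    push32(s, 2);                  /* push 2: first argument */
    push32(s, 0xFFFF000Cu);        /* call: return address (assumed _main+12) */
    push32(s, 0);                  /* push %ebx: saved value, assumed to be 0 */
    uint32_t eax = peek32(s, 8);   /* mov %eax, [%esp+8]  reads 2 */
    uint32_t ebx = peek32(s, 12);  /* mov %ebx, [%esp+12] reads 3 */
    return eax + ebx;              /* add %eax, %ebx */
}
```

After the four pushes `esp` is 16 bytes below its starting point. Offsets 0 and 4 hold the saved ebx and the return address, so the arguments are found at +8 and +12, and the function returns 5.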
That is all I have translated for now.
If anything here is disputable or inaccurate, please point it out. Thank you!