x86-64 Assembly Basics

Article link: https://blog.csdn.net/qq_20892953/article/details/122023099

Analyzing a simple piece of assembly code

Some common registers

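Register	16-bit	32-bit	64-bit
Accumulator register	AX	EAX	RAX
Base register	BX	EBX	RBX
Counter register	CX	ECX	RCX
Data register	DX	EDX	RDX
Stack base pointer	BP	EBP	RBP
Source index register	SI	ESI	RSI
Stack pointer	SP	ESP	RSP
Instruction pointer	IP	EIP	RIP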

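An x86-64 CPU contains a set of 16 general-purpose registers, each holding a 64-bit value.
These registers are used to store integer data and pointers.

The original 8086 had eight 16-bit registers, ax through sp.
When the architecture was extended to IA32, these registers were widened to 32 bits, giving eax through esp.
With the extension to x86-64, the original eight registers were widened to 64 bits, giving rax through rsp, and eight new registers, r8 through r15, were added.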
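To make the widening concrete, here is a minimal C sketch of how the 16-bit, 32-bit and 64-bit names relate. It assumes the usual layout in which the shorter names refer to the low bits of the same register. The type reg64 and the helpers view32, view16 and check_register_views are hypothetical names made up for this illustration; it is a model in plain integers, not a way to read real CPU registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one 64-bit register held in a plain integer. */
typedef struct {
    uint64_t value;
} reg64;

/* The 32-bit name (e.g. eax) refers to the low 32 bits of the register. */
uint32_t view32(reg64 r) {
    return (uint32_t)(r.value & 0xFFFFFFFFu);
}

/* The 16-bit name (e.g. ax) refers to the low 16 bits of the register. */
uint16_t view16(reg64 r) {
    return (uint16_t)(r.value & 0xFFFFu);
}

/* Example: the three views of one sample value. */
void check_register_views(void) {
    reg64 rax = { 0x1122334455667788ULL };
    assert(rax.value == 0x1122334455667788ULL); /* rax: all 64 bits */
    assert(view32(rax) == 0x55667788u);         /* eax: low 32 bits */
    assert(view16(rax) == 0x7788u);             /* ax:  low 16 bits */
}
```

Calling check_register_views() from a test or from main exercises all three views on one sample value.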

8086: one of the first-generation single-chip 16-bit microprocessors.
IA32: Intel's 32-bit architecture (Intel Architecture 32-bit).
Intel64: the 64-bit extension of IA32, also known as x86-64.

Environment

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)

C code

int add_a_and_b(int a, int b) {
   return a + b;
}

int main() {
   return add_a_and_b(8, 5);
}

Assembly code

Running gcc -S -fno-asynchronous-unwind-tables test_asm.c produces the assembly code.
The -fno-asynchronous-unwind-tables option is used to suppress the CFI directives.

As for what CFI directives are for, one explanation reads: On some architectures, exception handling must be managed with Call Frame Information directives. These directives are used in the assembly to direct exception handling. These directives are available on Linux on POWER, if, for any reason (portability of the code base, for example), the GCC generated exception handling information is not sufficient.
The assembly below is in AT&T syntax, which is also the default for tools such as GCC and objdump. Microsoft's tools and Intel's documentation use Intel syntax. The two differ in several ways; for example, AT&T syntax writes movq where Intel syntax writes mov. GCC can also emit Intel-syntax assembly when given the -masm=intel option.

	.file	"test_asm.c"
	.text
	.globl	add_a_and_b
	.type	add_a_and_b, @function
add_a_and_b:
	pushq	%rbp			# (6)
	movq	%rsp, %rbp		# (7)
	movl	%edi, -4(%rbp)	# (8)
	movl	%esi, -8(%rbp)	# (9)
	movl	-8(%rbp), %eax	# (10)
	movl	-4(%rbp), %edx	# (11)
	addl	%edx, %eax		# (12)
	popq	%rbp			# (13)
	ret						# (14)
	.size	add_a_and_b, .-add_a_and_b
	.globl	main
	.type	main, @function
main:
	pushq	%rbp			# (1)
	movq	%rsp, %rbp		# (2)
	movl	$5, %esi		# (3)
	movl	$8, %edi		# (4)
	call	add_a_and_b		# (5)
	popq	%rbp			# (15)
	ret						# (16)
	.size	main, .-main
	.ident	"GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-39)"
	.section	.note.GNU-stack,"",@progbits

(1) pushq %rbp

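The rbp register is the 64-bit extension of ebp. It is the base pointer (frame pointer) register, which by convention holds the base address of the current function's stack frame.
The value of rbp is pushed onto the stack at (1) and popped at (15).

The purpose is to restore the contents of rbp, which the function itself uses, to the state they were in before the call.
On entry to a function we cannot know what value rbp holds, yet the function body also uses rbp. So the value is first saved on the stack, and once the function has finished its work it is restored from the stack into rbp.

Saving rbp on the stack at function entry and popping it at function exit is a convention that C compilers follow (in the System V AMD64 ABI, rbp is a callee-saved register).
It guarantees that the value of rbp is the same after the call as it was before.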
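The save-and-restore behaviour described here can be checked with a small C model that replays steps (1) to (16) of the listing on a simulated stack. This is only a sketch under stated assumptions: cpu_model, push64, pop64, store32, load32, run_trace and RETURN_TOKEN are hypothetical names invented for the illustration, offsets into a byte array stand in for real addresses, and the jump performed by call and ret is reduced to pushing and popping a made-up return token.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 256 };        /* size of the simulated stack */
#define RETURN_TOKEN 0x5005ULL     /* made-up stand-in for a return address */

typedef struct {
    uint8_t  mem[STACK_BYTES];     /* simulated stack memory */
    uint64_t rsp, rbp;             /* offsets into mem, not real addresses */
    uint32_t eax, edx, edi, esi;
} cpu_model;

void cpu_init(cpu_model *c, uint64_t caller_rbp) {
    memset(c, 0, sizeof *c);
    c->rsp = STACK_BYTES;          /* empty stack: rsp just past the top */
    c->rbp = caller_rbp;           /* whatever the caller left in rbp */
}

void push64(cpu_model *c, uint64_t v) {
    assert(c->rsp >= 8);
    c->rsp -= 8;                   /* the stack grows toward lower addresses */
    memcpy(&c->mem[c->rsp], &v, sizeof v);
}

uint64_t pop64(cpu_model *c) {
    uint64_t v;
    assert(c->rsp + 8 <= STACK_BYTES);
    memcpy(&v, &c->mem[c->rsp], sizeof v);
    c->rsp += 8;
    return v;
}

void store32(cpu_model *c, uint64_t addr, uint32_t v) {
    assert(addr <= STACK_BYTES - 4);
    memcpy(&c->mem[addr], &v, sizeof v);
}

uint32_t load32(const cpu_model *c, uint64_t addr) {
    uint32_t v;
    assert(addr <= STACK_BYTES - 4);
    memcpy(&v, &c->mem[addr], sizeof v);
    return v;
}

/* Replays the numbered instructions and returns the final eax. */
uint32_t run_trace(cpu_model *c) {
    uint64_t ret;
    push64(c, c->rbp);                  /* (1)  pushq %rbp */
    c->rbp = c->rsp;                    /* (2)  movq %rsp, %rbp */
    c->esi = 5;                         /* (3)  movl $5, %esi */
    c->edi = 8;                         /* (4)  movl $8, %edi */
    push64(c, RETURN_TOKEN);            /* (5)  call: push return address */
    push64(c, c->rbp);                  /* (6)  pushq %rbp */
    c->rbp = c->rsp;                    /* (7)  movq %rsp, %rbp */
    store32(c, c->rbp - 4, c->edi);     /* (8)  movl %edi, -4(%rbp) */
    store32(c, c->rbp - 8, c->esi);     /* (9)  movl %esi, -8(%rbp) */
    c->eax = load32(c, c->rbp - 8);     /* (10) movl -8(%rbp), %eax */
    c->edx = load32(c, c->rbp - 4);     /* (11) movl -4(%rbp), %edx */
    c->eax += c->edx;                   /* (12) addl %edx, %eax */
    c->rbp = pop64(c);                  /* (13) popq %rbp */
    ret = pop64(c);                     /* (14) ret: pop return address */
    assert(ret == RETURN_TOKEN);
    c->rbp = pop64(c);                  /* (15) popq %rbp */
    return c->eax;                      /* (16) ret to main's caller, not modeled */
}
```

After cpu_init(&c, 0x1234) and run_trace(&c), the model's rbp is again 0x1234 and rsp is back at its starting offset, which is the property the push/pop pair at (1) and (15) is meant to guarantee.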

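The push and pop instructions take only one operand. We do not need to say where a value is pushed to, or where a value is popped from.
That is because the memory address used for reading and writing the stack is managed by rsp, the stack pointer register.

After a push or pop instruction executes, the stack pointer stored in rsp is updated automatically.
The stack grows from high addresses toward low addresses.
push adds an element to the stack, so after a push the stack pointer decreases by 4 in 32-bit code, or by 8 on a 64-bit machine, as with pushq here.
pop removes an element from the stack, so after a pop the stack pointer increases by 4 in 32-bit code, or by 8 on a 64-bit machine, as with popq here.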
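The pointer arithmetic can be sketched in C as follows. This is a hypothetical model, not how the hardware is implemented: stack_model, stack_init, stack_push and stack_pop are names invented here, sp is an offset into a small byte array rather than a real address, and the word size (4 or 8 bytes) is passed in to mirror the 32-bit and 64-bit cases. Bytes are stored least significant first, as on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MODEL_BYTES = 64 };         /* size of the simulated stack */

typedef struct {
    uint8_t mem[MODEL_BYTES];
    size_t  sp;                    /* offset of the current top of stack */
    size_t  word;                  /* bytes per push/pop: 4 or 8 */
} stack_model;

void stack_init(stack_model *s, size_t word_bytes) {
    assert(word_bytes == 4 || word_bytes == 8);
    s->sp = MODEL_BYTES;           /* empty stack starts at the high end */
    s->word = word_bytes;
}

/* push: move sp down by one word, then store the value there. */
void stack_push(stack_model *s, uint64_t v) {
    assert(s->sp >= s->word);
    s->sp -= s->word;
    for (size_t i = 0; i < s->word; i++)
        s->mem[s->sp + i] = (uint8_t)(v >> (8 * i));
}

/* pop: read the value at sp, then move sp up by one word. */
uint64_t stack_pop(stack_model *s) {
    uint64_t v = 0;
    assert(s->sp + s->word <= MODEL_BYTES);
    for (size_t i = 0; i < s->word; i++)
        v |= (uint64_t)s->mem[s->sp + i] << (8 * i);
    s->sp += s->word;
    return v;
}
```

With stack_init(&s, 8), each stack_push lowers s.sp by 8 and each stack_pop raises it by 8; with stack_init(&s, 4) the step is 4. Values come back in last-in, first-out order.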

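We can think of push and pop as the instructions that move data between registers and the stack (main memory).
push saves the value of a register to main memory.
pop restores a value saved in main memory back into a register.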