Everyone who learns C starts by writing a hello world program:
#include <stdio.h>
int main()
{
printf("hello world\n");
return 0;
}
$ gcc hello.c
$ ./a.out
hello world
In fact, the process above consists of four steps:
- Preprocessing
- Compilation
- Assembly
- Linking
The compilation process
(Figure 1)
Step by step
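- Preprocessing
$ gcc -E hello.c -o hello.i
or
$ cpp hello.c > hello.i
Preprocessing mainly handles the directives in the source code that begin with "#":
(1) Expand the macros defined with #define
(2) Process the conditional compilation directives #if, #ifdef, #elif, #else and #endif
(3) Insert the contents of the files named by #include
(4) Delete comments, both // and /* */
(5) Add line number and file name markers, which are used for debugging
(6) Keep #pragma directives, which are instructions for the compiler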
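- Compilation
$ gcc -S hello.i -o hello.s
This produces assembly code. Current versions of GCC merge preprocessing and compilation into a single step, carried out by the cc1 program:
$ /usr/lib/gcc/x86_64-linux-gnu/4.4/cc1 hello.c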
main
Analyzing compilation unit
Performing interprocedural optimizations
<visibility> <early_local_cleanups> <summary generate> <inline>Assembling functions:
main
Execution times (seconds)
parser : 0.02 (100%) usr 0.00 ( 0%) sys 0.01 (100%) wall 318 kB (19%) ggc
TOTAL : 0.02 0.00 0.01 1668 kB
This gives us the output file hello.s:
$ vi hello.s
.file "hello.c"
.section .rodata
.LC0:
.string "hello world"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
movl $.LC0, %edi
call puts
movl $0, %eax
leave
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3"
.section .note.GNU-stack,"",@progbits
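- Assembly
The assembler translates assembly code into instructions the machine can execute.
$ as hello.s -o hello.o
or
$ gcc -c hello.s -o hello.o
Open hello.o and you will see that it is now an entirely binary file.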
- Linking
If we ask gcc to print the whole process while it compiles:
$ gcc hello.c -o hello -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.3-4ubuntu5.1' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1)
COLLECT_GCC_OPTIONS='-o' 'hello' '-v' '-mtune=generic'
/usr/lib/gcc/x86_64-linux-gnu/4.4.3/cc1 -quiet -v hello.c -D_FORTIFY_SOURCE=2 -quiet -dumpbase hello.c -mtune=generic -auxbase hello -version -fstack-protector -o /tmp/ccnewg3s.s
GNU C (Ubuntu 4.4.3-4ubuntu5.1) version 4.4.3 (x86_64-linux-gnu)
compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version 2.4.2-p1.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../x86_64-linux-gnu/include"
ignoring nonexistent directory "/usr/include/x86_64-linux-gnu"
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/usr/lib/gcc/x86_64-linux-gnu/4.4.3/include
/usr/lib/gcc/x86_64-linux-gnu/4.4.3/include-fixed
/usr/include
End of search list.
GNU C (Ubuntu 4.4.3-4ubuntu5.1) version 4.4.3 (x86_64-linux-gnu)
compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version 2.4.2-p1.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 462394bb0ac77cba16b6fb6b32589358
COLLECT_GCC_OPTIONS='-o' 'hello' '-v' '-mtune=generic'
as -V -Qy -o /tmp/ccgJalgD.o /tmp/ccnewg3s.s
GNU assembler version 2.20.1 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.20.1-system.20100303
COMPILER_PATH=/usr/lib/gcc/x86_64-linux-gnu/4.4.3/:/usr/lib/gcc/x86_64-linux-gnu/4.4.3/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.4.3/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.4.3/:/usr/lib/gcc/x86_64-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/x86_64-linux-gnu/4.4.3/:/usr/lib/gcc/x86_64-linux-gnu/4.4.3/:/usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/:/lib/../lib/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../:/lib/:/usr/lib/:/usr/lib/x86_64-linux-gnu/
COLLECT_GCC_OPTIONS='-o' 'hello' '-v' '-mtune=generic'
/usr/lib/gcc/x86_64-linux-gnu/4.4.3/collect2 --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=both -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o hello -z relro /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/crt1.o /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/crti.o /usr/lib/gcc/x86_64-linux-gnu/4.4.3/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/4.4.3 -L/usr/lib/gcc/x86_64-linux-gnu/4.4.3 -L/usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib -L/lib/../lib -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../.. -L/usr/lib/x86_64-linux-gnu /tmp/ccgJalgD.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/4.4.3/crtend.o /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/crtn.o
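The process is roughly as follows:
(1) GCC starts
(2) It calls /usr/lib/gcc/x86_64-linux-gnu/4.4.3/cc1 to preprocess and compile, producing the temporary file /tmp/ccnewg3s.s
(3) It calls the assembler as to produce the object file /tmp/ccgJalgD.o
(4) Finally, collect2 calls ld to produce the executable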
The final link step is much clearer once the directory paths are stripped from the file names:
collect2 --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=both -dynamic-linker ld-linux-x86-64.so.2 -o hello -z relro crt1.o crti.o crtbegin.o -L/usr/lib/ -L/lib -L/gcc ccgJalgD.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed crtend.o crtn.o
A simple example of what the linker does
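Suppose we have two files, main.c and func.c, and main.c calls foo(), which is defined in func.c. To call foo, the code in main.c has to know foo's address. But each file is compiled separately, so when main.c is compiled the address of foo is not yet known. The compiler therefore leaves the target address of the call instruction unresolved and waits for the link step to fix it up. Without a linker we would have to fill in foo's address by hand, and since that address can change every time func.c is recompiled, adjusting it manually each time is not practical. The linker automates this correction of call addresses, a process known as relocation. The linker does a good deal more than this, of course.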