程序员的自我修养 ch4: Static Linking

Reference: 《程序员的自我修养》, ch4.

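1. Space and address allocation
Space allocation here refers only to the allocation of virtual address space.
Almost all modern linkers allocate space by merging sections of the same type, and linkers that do so generally use an approach called Two-pass Linking. The whole process has two steps:
Step 1: space and address allocation;
Step 2: symbol resolution and relocation. This step is the core of linking, relocation in particular.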

>> ld a.o b.o -e main -o ab
>> objdump -h a.o

a.o:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000034  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000068  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000068  2**2
                  ALLOC
  3 .comment      00000024  00000000  00000000  00000068  2**0
                  CONTENTS, READONLY
  4 .note.GNU-stack 00000000  00000000  00000000  0000008c  2**0
                  CONTENTS, READONLY
>> objdump -h b.o

b.o:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000003e  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000004  00000000  00000000  00000074  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000078  2**2
                  ALLOC
  3 .comment      00000024  00000000  00000000  00000078  2**0
                  CONTENTS, READONLY
  4 .note.GNU-stack 00000000  00000000  00000000  0000009c  2**0
                  CONTENTS, READONLY

>> objdump -h ab

ab:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000072  08048094  08048094  00000094  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000004  08049108  08049108  00000108  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .comment      00000048  00000000  00000000  0000010c  2**0
                  CONTENTS, READONLY

VMA: Virtual Memory Address
LMA: Load Memory Address

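After linking, the addresses used in the output file are already the virtual addresses the program will have in its process; for example, .text starts at 08048094. On 32-bit x86 Linux, an executable's address space is allocated starting at 08048000 by default.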

2. Symbol resolution and relocation

>> objdump -d a.o

a.o:     file format elf32-i386


Disassembly of section .text:

00000000 <main>:
   0:   8d 4c 24 04             lea    0x4(%esp),%ecx
   4:   83 e4 f0                and    $0xfffffff0,%esp
   7:   ff 71 fc                pushl  -0x4(%ecx)
   a:   55                      push   %ebp
   b:   89 e5                   mov    %esp,%ebp
   d:   51                      push   %ecx
   e:   83 ec 24                sub    $0x24,%esp
  11:   c7 45 f8 64 00 00 00    movl   $0x64,-0x8(%ebp)
  18:   c7 44 24 04 00 00 00    movl   $0x0,0x4(%esp)
  1f:   00
  20:   8d 45 f8                lea    -0x8(%ebp),%eax
  23:   89 04 24                mov    %eax,(%esp)
  26:   e8 fc ff ff ff          call   27 <main+0x27>
  2b:   83 c4 24                add    $0x24,%esp
  2e:   59                      pop    %ecx
  2f:   5d                      pop    %ebp
  30:   8d 61 fc                lea    -0x4(%ecx),%esp
  33:   c3                      ret

After relocation:

>> objdump -d ab

ab:     file format elf32-i386


Disassembly of section .text:

08048094 <main>:
 8048094:       8d 4c 24 04             lea    0x4(%esp),%ecx
 8048098:       83 e4 f0                and    $0xfffffff0,%esp
 804809b:       ff 71 fc                pushl  -0x4(%ecx)
 804809e:       55                      push   %ebp
 804809f:       89 e5                   mov    %esp,%ebp
 80480a1:       51                      push   %ecx
 80480a2:       83 ec 24                sub    $0x24,%esp
 80480a5:       c7 45 f8 64 00 00 00    movl   $0x64,-0x8(%ebp)
 80480ac:       c7 44 24 04 08 91 04    movl   $0x8049108,0x4(%esp)
 80480b3:       08
 80480b4:       8d 45 f8                lea    -0x8(%ebp),%eax
 80480b7:       89 04 24                mov    %eax,(%esp)
 80480ba:       e8 09 00 00 00          call   80480c8 <swap>
 80480bf:       83 c4 24                add    $0x24,%esp
 80480c2:       59                      pop    %ecx
 80480c3:       5d                      pop    %ebp
 80480c4:       8d 61 fc                lea    -0x4(%ecx),%esp
 80480c7:       c3                      ret

2.1 Relocation Table
Every ELF section that needs relocation has a corresponding relocation section (for example, .rel.text for .text).

>> objdump -r a.o

a.o:     file format elf32-i386

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE
0000001c R_386_32          s
00000027 R_386_PC32        swap

Here 1c and 27 are the offsets within .text of the locations that need to be relocated.

/* Relocation table entry without addend (in section of type SHT_REL).  */

typedef struct
{
  Elf32_Addr    r_offset;               /* Address */
  Elf32_Word    r_info;                 /* Relocation type and symbol index */
} Elf32_Rel;

2.2 Symbol resolution

>> ld a.o
ld: warning: cannot find entry symbol _start; defaulting to 0000000008048074
a.o: In function `main':
a.c:(.text+0x1c): undefined reference to `s'
a.c:(.text+0x27): undefined reference to `swap'
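The linker looks up each undefined symbol in the global symbol table built from the symbol tables of all input object files; once it finds the symbol, it performs the relocation.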

2.3 Instruction fix-up
see: http://stackoverflow.com/questions/12412064/meaning-of-r-386-32-r-386-pc32-in-rel-text-section-of-elf 


R_386_32 is a relocation that places the absolute 32-bit address of the symbol into the specified memory location. R_386_PC32 is a relocation that places the PC-relative 32-bit address of the symbol into the specified memory location. R_386_32 is useful for static data, as shown here, since the compiler just loads the relocated symbol address into some register and then treats it as a pointer. R_386_PC32 is useful for function references since it can be used as an immediate argument to call. See elf_machdep.c for an example of how the relocations are processed.

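3. COMMON blocks
The linker itself has no notion of symbol types: variable types are invisible to it. It knows a symbol's name but cannot tell whether the types agree. So several weak definitions of the same symbol may have inconsistent types, and the linker has to deal with that.

Compilers and linkers today support a mechanism called the Common Block for this. When several weak symbols with the same name differ in type, the one that occupies the most space determines the final size. In an object file, symbols marked "SHN_COMMON" are handled by this mechanism.

As an earlier chapter showed, the compiler marks uninitialized global variables as SHN_COMMON. Why doesn't the compiler treat an uninitialized global variable like an uninitialized local static variable and allocate space for it in the BSS section, rather than marking it as a COMMON symbol?

Because before linking, the final size of a weak symbol is unknown: another object file may declare the same weak symbol with a larger size than this one does, so the compiler cannot allocate space for it in BSS. Once the linker has read all the input object files, the final size of every weak symbol is known, so the linker can allocate the space in the BSS section of the output file.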

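GCC's "-fno-common" option stops it from treating any uninitialized global variable as a COMMON block; the "__attribute__((nocommon))" extension does the same for a single variable. (Since GCC 10, "-fno-common" is the default.)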

>> cat common.c
int g __attribute__((nocommon));

>> cat common1.c
int g = 0;

int main()
{
}

>> gcc common.c  common1.c
/tmp/ccMtuul2.o:(.bss+0x0): multiple definition of `g'
/tmp/ccsjyUP2.o:(.bss+0x0): first defined here
collect2: ld returned 1 exit status

4. C++-related issues
4.1 Duplicate code elimination

>> cat temp.C
#include <iostream>
using namespace std;

template <class T> T fun(T t)
{
        return t * 2;
}

int main()
{
        int a = fun(2);
        double b = fun(2.0);
        cout << a << "," << b << endl;
}

>> readelf -S temp.o
There are 18 section headers, starting at offset 0x550:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00000040
       00000000000001f9  0000000000000000  AX       0     0     4
  [ 2] .rela.text        RELA             0000000000000000  00000ef0
       00000000000001f8  0000000000000018          16     1     8
  [ 3] .data             PROGBITS         0000000000000000  0000023c
       0000000000000000  0000000000000000  WA       0     0     4
  [ 4] .bss              NOBITS           0000000000000000  0000023c
       0000000000000001  0000000000000000  WA       0     0     4
  [ 5] .rodata           PROGBITS         0000000000000000  0000023c
       0000000000000002  0000000000000000   A       0     0     1
  [ 6] .gnu.linkonce.t._ PROGBITS         0000000000000000  0000023e
       0000000000000034  0000000000000000  AX       0     0     2
  [ 7] .gnu.linkonce.t._ PROGBITS         0000000000000000  00000272
       000000000000000e  0000000000000000  AX       0     0     2
  [ 8] .gnu.linkonce.t._ PROGBITS         0000000000000000  00000280
       0000000000000026  0000000000000000  AX       0     0     2
  [ 9] .ctors            PROGBITS         0000000000000000  000002a8
       0000000000000008  0000000000000000  WA       0     0     8
  [10] .rela.ctors       RELA             0000000000000000  000010e8
       0000000000000018  0000000000000018          16     9     8
  [11] .eh_frame         PROGBITS         0000000000000000  000002b0
       00000000000001a0  0000000000000000   A       0     0     8
  [12] .rela.eh_frame    RELA             0000000000000000  00001100
       00000000000000d8  0000000000000018          16    11     8
  [13] .note.GNU-stack   PROGBITS         0000000000000000  00000450
       0000000000000000  0000000000000000           0     0     1
  [14] .comment          PROGBITS         0000000000000000  00000450
       000000000000002a  0000000000000000           0     0     1
  [15] .shstrtab         STRTAB           0000000000000000  0000047a
       00000000000000d1  0000000000000000           0     0     1
  [16] .symtab           SYMTAB           0000000000000000  000009d0
       0000000000000348  0000000000000018          17    18     8
  [17] .strtab           STRTAB           0000000000000000  00000d18
       00000000000001d1  0000000000000000           0     0     1
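
GCC puts each template instantiation in its own section named ".gnu.linkonce.name", where name is the mangled name of the instantiated function. When the linker finds sections with the same name in several object files, it keeps only one copy. The three .gnu.linkonce.t._ entries above (names truncated by readelf) are such sections.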

4.2 Global construction and destruction
The .init and .fini sections

Reference: http://l4u-00.jinr.ru/usoft/WWW/www_debian.org/Documentation/elf/node3.html

.fini
 This section holds executable instructions that contribute to the process termination code. That is, when a program exits normally, the system arranges to execute the code in this section.
.init
 This section holds executable instructions that contribute to the process initialization code. That is, when a program starts to run the system arranges to execute the code in this section before the main program entry point (called main in C programs).

4.3 ABI
ABI: Application Binary Interface

5. Static library linking

>> pwd
/usr/lib
>> ar -t libc.a |wc
   1429    1429   16645
>> objdump -t libc.a |grep -w printf
reg-printf.o:     file format elf64-x86-64
printf-prs.o:     file format elf64-x86-64
printf.o:     file format elf64-x86-64
0000000000000000 g     F .text  000000000000009d printf
printf-parsemb.o:     file format elf64-x86-64
printf-parsewc.o:     file format elf64-x86-64

>> ar -x /usr/lib/libc.a
>> ld hello.o libc/printf.o -o a
ld: warning: cannot find entry symbol _start; defaulting to 00000000004000b0
libc/printf.o: In function `_IO_printf':
(.text+0x6b): undefined reference to `stdout'
libc/printf.o: In function `_IO_printf':
(.text+0x91): undefined reference to `vfprintf'

>> gcc -static --verbose -fno-builtin hello.c
Reading specs from /usr/lib/gcc/x86_64-linux-gnu/3.4.6/specs
Configured with: ../src/configure -v --enable-languages=c,c++,f77,pascal --prefix=/usr --libexecdir=/usr/lib --with-gxx-include-dir=/usr/include/c++/3.4 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --program-suffix=-3.4 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug x86_64-linux-gnu
Thread model: posix
gcc version 3.4.6 (Ubuntu 3.4.6-6ubuntu5)
 /usr/lib/gcc/x86_64-linux-gnu/3.4.6/cc1 -quiet -v hello.c -quiet -dumpbase hello.c -mtune=k8 -auxbase hello -fno-builtin -version -o /tmp/ccKPJUxj.s
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/include/x86_64-linux-gnu"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/3.4.6/include
 /usr/include
End of search list.
GNU C version 3.4.6 (Ubuntu 3.4.6-6ubuntu5) (x86_64-linux-gnu)
        compiled by GNU C version 3.4.6 (Ubuntu 3.4.6-6ubuntu5).
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
 as --traditional-format -V -Qy --64 -o /tmp/ccMTOoIt.o /tmp/ccKPJUxj.s
GNU assembler version 2.18.0 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.18.0.20080103
 /usr/lib/gcc/x86_64-linux-gnu/3.4.6/collect2 -m elf_x86_64 -static /usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crt1.o /usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crti.o /usr/lib/gcc/x86_64-linux-gnu/3.4.6/crtbeginT.o -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6 -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6 -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../.. -L/lib/../lib -L/usr/lib/../lib /tmp/ccMTOoIt.o --start-group -lgcc -lgcc_eh -lc --end-group /usr/lib/gcc/x86_64-linux-gnu/3.4.6/crtend.o /usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crtn.o


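6. Controlling the link process
Linking is controlled by a linker script. When the user does not specify one, ld uses its default linker script, which can be viewed with "ld -verbose".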
/usr/lib/ldscripts/elf_i386.x  => for normal executables
/usr/lib/ldscripts/elf_i386.xs => link shared library

The -T option specifies the linker script to use:
http://blog.csdn.net/joker0910/article/details/7678056

A classic article introducing linker scripts:
http://blogimg.chinaunix.net/blog/upfile2/090619175409.pdf
