The source file SimpleSection.c is:
int printf(const char* format, ...);

int global_init_var = 84;
int global_uninit_var;

void func1(int i) {
    printf("%d\n", i);
}

int main(void) {
    static int static_var = 85;
    static int static_var2;
    int a = 1;
    int b;
    func1(static_var + static_var2 + a + b);
    return a;
}
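Running $ gcc -c SimpleSection.c performs preprocessing -> compilation (producing assembly code) -> assembly (producing the object file); linking has not happened yet.
This yields SimpleSection.o. Everything below is an analysis of SimpleSection.o: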
1. The ELF file as a whole
Inspect the file in the order the ELF format lays it out (ELF file header, program headers, sections, section headers):
[hadoop@sam1 test]$ readelf -h SimpleSection.o ==> show the "File Header" of the ELF object file
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 272 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 11
Section header string table index: 8
[hadoop@sam1 test]$ readelf -S SimpleSection.o ==> show the "Section Header Table" of the ELF object file
There are 11 section headers, starting at offset 0x110:
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000050 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 000420 000028 08      9   1  4
==> .rel.text: every code or data section that needs relocation has a corresponding relocation table, because when the linker processes the object file it must patch each location that refers to an address not yet known (for example, an absolute address reference).
  [ 3] .data             PROGBITS        00000000 000084 000008 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 00008c 000004 00  WA  0   0  4
  [ 5] .rodata           PROGBITS        00000000 00008c 000004 00   A  0   0  1
  [ 6] .comment          PROGBITS        00000000 000090 00002d 01  MS  0   0  1
  [ 7] .note.GNU-stack   PROGBITS        00000000 0000bd 000000 00      0   0  1
  [ 8] .shstrtab         STRTAB          00000000 0000bd 000051 00      0   0  1
==> .shstrtab, the section header string table: stores the strings used by the section header table, such as section names.
  [ 9] .symtab           SYMTAB          00000000 0002c8 0000f0 10     10  10  4
==> .symtab, the symbol table: the functions and variables in an object file are collectively called symbols.
  [10] .strtab           STRTAB          00000000 0003b8 000066 00      0   0  1
==> .strtab, the string table: stores ordinary strings, such as the symbol names referenced by .symtab.
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
You can also disassemble just the .text section of SimpleSection.o, or all of its sections:
objdump -d SimpleSection.o ==> --disassemble: Display the contents of executable sections (i.e. [.text])
objdump -D SimpleSection.o ==> --disassemble-all: Display the contents of all sections
2. The ELF file in detail
[hadoop@sam1 test]$ readelf -s SimpleSection.o ==> show the symbol table of the ELF object file (the symbol table is itself a section of the object file, namely .symtab)
Symbol table '.symtab' contains 15 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS SimpleSection.c
     2: 00000000     0 SECTION LOCAL  DEFAULT    1
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 SECTION LOCAL  DEFAULT    4
     5: 00000000     0 SECTION LOCAL  DEFAULT    5
     6: 00000004     4 OBJECT  LOCAL  DEFAULT    3 static_var.1222
     7: 00000000     4 OBJECT  LOCAL  DEFAULT    4 static_var2.1223
     8: 00000000     0 SECTION LOCAL  DEFAULT    7
     9: 00000000     0 SECTION LOCAL  DEFAULT    6
    10: 00000000     4 OBJECT  GLOBAL DEFAULT    3 global_init_var
    11: 00000004     4 OBJECT  GLOBAL DEFAULT  COM global_uninit_var
    12: 00000000    27 FUNC    GLOBAL DEFAULT    1 func1
    13: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
    14: 0000001b    53 FUNC    GLOBAL DEFAULT    1 main
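Understanding the Ndx field:
static_var.1222 has Ndx 3: it lives in section 3 [.data];
static_var2.1223 has Ndx 4: it lives in section 4 [.bss];
global_init_var has Ndx 3: it lives in section 3 [.data];
global_uninit_var has Ndx COM: this global variable has no explicit initialiser (for a COM symbol the Value column, 4 here, is the required alignment rather than an offset); details below;
func1 has Ndx 1: it lives in section 1 [.text];
printf has Ndx UND: this function is defined in an external module; details below;
main has Ndx 1: it lives in section 1 [.text].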
What Ndx = COM means:
gcc treats uninitialised globals which are not explicitly declared extern as "common" symbols (hence "COM"). (This is the -fcommon behaviour, the default before GCC 10. GCC 10 and later default to -fno-common, which places such variables in .bss and makes duplicate definitions a link error.)
Multiple definitions of the same common symbol (across multiple object files) are merged together by the linker when creating the final executable, so that they all refer to the same storage. One of the object files may initialise it to a particular value (in which case it will end up in the data section); if no object files initialise it, it will end up in the BSS; if more than one object initialises it, you'll get a linker error.
In summary, if you have, say, two definitions of int a:
int a; in one object and int a; in another object is OK: both refer to the same a, initialised to 0
int a; in one object and int a = 42; in another object is OK: both refer to the same a, initialised to 42
int a = 23; in one object and int a= 42; in another object will give a link error.
Do note that the use of multiple definitions of the same symbol across two objects is not technically allowed by standard C; but it is supported by many compilers, including gcc, as an extension. (It's listed under "Common extensions" - no pun intended - in the C99 spec.)
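What Ndx = UND means:
# readelf -a SimpleSection.o | grep UND
You will see familiar library functions (e.g. printf) whose addresses have not been determined yet, which is why they are marked UND. When the program is run, the loader maps the application into memory and resolves the addresses of these dynamically linked functions; it finds the shared libraries the application depends on through entries recorded in the application itself (the NEEDED entries of its dynamic section). For example:
In another file a.c, explicitly define int global_uninit_var=7; and compile it with gcc -c a.c to produce a.o
Then link the two objects: gcc SimpleSection.o a.o -o b
# readelf -d b (displays the dynamic section) shows something like: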
Dynamic section at offset 0x4c4 contains 20 entries:
Tag Type Name/Value
0x00000001 (NEEDED) Shared library: [libc.so.6]
...
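So b depends on libc.so.6. But where does the loader look for this file? It searches the directories listed in the LD_LIBRARY_PATH environment variable (if set), any RPATH/RUNPATH recorded in the executable, the cache built by ldconfig (/etc/ld.so.cache), and the default library directories such as /lib and /usr/lib. Once the library is found, the loader loads it into memory (if this is the first use of the library) and fills the in-memory addresses of the library functions back into the application, after which the application can run.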