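On a 32-bit system (with the default 3 GB/1 GB split on x86), 0x00000000-0xBFFFFFFF is the 3 GB allocated to user space, and 0xC0000000-0xFFFFFFFF is the 1 GB allocated to kernel space.
On a 64-bit system (x86-64 with 48-bit virtual addresses), 0x0000000000000000~0x00007fffffffffff is the user-space range and 0xffff800000000000~0xffffffffffffffff is the kernel-space range; the addresses in between are non-canonical and cannot be used. Note that 0xffffffff80000000 is where the kernel image itself is mapped, not where kernel space begins.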
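As a rough illustration of the split, the sketch below classifies an address as user or kernel. The function names and constants are assumptions made for this example: the 32-bit case assumes the default x86 3 GB/1 GB split, and the 64-bit case assumes x86-64 with 48-bit virtual addresses, where user space ends at 0x00007fffffffffff and kernel space starts at 0xffff800000000000.

```python
# Boundaries assumed for illustration:
# default x86 3G/1G split, and x86-64 with 48-bit virtual addresses.
USER_END_32 = 0xBFFFFFFF
USER_END_64 = 0x00007FFFFFFFFFFF
KERNEL_START_64 = 0xFFFF800000000000


def classify_32(addr):
    """Return 'user' or 'kernel' for a 32-bit virtual address."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return "user" if addr <= USER_END_32 else "kernel"


def classify_64(addr):
    """Return 'user', 'kernel', or 'non-canonical' for a 64-bit virtual address."""
    if not 0 <= addr <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("not a 64-bit address")
    if addr <= USER_END_64:
        return "user"
    if addr >= KERNEL_START_64:
        return "kernel"
    return "non-canonical"


if __name__ == "__main__":
    print(classify_32(0xC0000000))
    print(classify_64(0x4003E0))
    print(classify_64(0xFFFFFFFF80000000))
```

Other configurations (a different 32-bit split, or 5-level paging on x86-64) would need different constants.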
First, let's look at the sections that make up a program:
administrator@lab-bar-generic:~/Test/Test$ size a.out
   text    data     bss     dec     hex filename
   1068     512      16    1596     63c a.out
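The text section holds the code, the data section holds initialized global and static variables, and the bss section holds uninitialized global and static variables.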
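The dec and hex columns are simply the sum of the other three columns, which can be checked in a few lines of Python. The variable names are made up for this example; the numbers are the ones size printed for a.out.

```python
# Values copied from the `size a.out` output.
text, data, bss = 1068, 512, 16

total = text + data + bss
print(total, format(total, "x"))  # decimal and hexadecimal totals
```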
administrator@lab-bar-generic:~/Test/Test$ readelf -h a.out
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x4003e0
Start of program headers: 64 (bytes into file)
Start of section headers: 4416 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 31
Section header string table index: 28
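The entry point address is where execution of the program begins; it is also the address of _start. 0x4003e0 is a linear address, that is, a virtual address. Because this file is of type EXEC (not position-independent), the address stays the same no matter how many times the program is run or copied; what changes is the physical memory address that this virtual address maps to.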
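The entry point is stored in the e_entry field of the ELF header. Here is a minimal sketch of reading it with Python's struct module, assuming a 64-bit little-endian ELF file like this one. To stay self-contained, the example rebuilds a 64-byte header from the values readelf printed rather than opening a.out, and the helper name parse_elf64_header is an invention for this example.

```python
import struct

# Layout of Elf64_Ehdr for a little-endian file: e_ident[16], e_type,
# e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize,
# e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx.
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(raw):
    """Decode the first 64 bytes of a 64-bit little-endian ELF file."""
    fields = ELF64_EHDR.unpack(raw[:ELF64_EHDR.size])
    ident = fields[0]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("only ELF64 little-endian is handled here")
    return {
        "type": fields[1],
        "machine": fields[2],
        "entry": fields[4],
        "phoff": fields[5],
        "shoff": fields[6],
        "phnum": fields[10],
        "shnum": fields[12],
    }


# Header rebuilt from the readelf -h values (not read from a real file).
ident = bytes.fromhex("7f454c46020101000000000000000000")
header = ELF64_EHDR.pack(ident, 2, 62, 1, 0x4003E0, 64, 4416, 0,
                         64, 56, 9, 64, 31, 28)

info = parse_elf64_header(header)
print(hex(info["entry"]))
```

To try it on a real binary, pass the first 64 bytes of the file instead, for example open("a.out", "rb").read(64).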
The program's symbol table, as shown by objdump:
administrator@lab-bar-generic:~/Test/Test$ objdump -t a.out
a.out: file format elf64-x86-64
SYMBOL TABLE:
0000000000400238 l d .interp 0000000000000000 .interp
0000000000400254 l d .note.ABI-tag 0000000000000000 .note.ABI-tag
0000000000400274 l d .note.gnu.build-id 0000000000000000 .note.gnu.build-id
0000000000400298 l d .hash 0000000000000000 .hash
00000000004002b0 l d .gnu.hash 0000000000000000 .gnu.hash
00000000004002d0 l d .dynsym 0000000000000000 .dynsym
0000000000400318 l d .dynstr 0000000000000000 .dynstr
0000000000400350 l d .gnu.version 0000000000000000 .gnu.version
0000000000400358 l d .gnu.version_r 0000000000000000 .gnu.version_r
0000000000400378 l d .rela.dyn 0000000000000000 .rela.dyn
0000000000400390 l d .rela.plt 0000000000000000 .rela.plt
00000000004003a8 l d .init 0000000000000000 .init
00000000004003c0 l d .plt 0000000000000000 .plt
00000000004003e0 l d .text 0000000000000000 .text
00000000004005b8 l d .fini 0000000000000000 .fini
00000000004005c8 l d .rodata 0000000000000000 .rodata
00000000004005cc l d .eh_frame_hdr 0000000000000000 .eh_frame_hdr
00000000004005f0 l d .eh_frame 0000000000000000 .eh_frame
0000000000600e18 l d .ctors 0000000000000000 .ctors
0000000000600e28 l d .dtors 0000000000000000 .dtors
0000000000600e38 l d .jcr 0000000000000000 .jcr
0000000000600e40 l d .dynamic 0000000000000000 .dynamic
0000000000600fe0 l d .got 0000000000000000 .got
0000000000600fe8 l d .got.plt 0000000000000000 .got.plt
0000000000601008 l d .data 0000000000000000 .data
0000000000601018 l d .bss 0000000000000000 .bss
0000000000000000 l d .comment 0000000000000000 .comment
000000000040040c l F .text 0000000000000000 call_gmon_start
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000600e18 l O .ctors 0000000000000000 __CTOR_LIST__
0000000000600e28 l O .dtors 0000000000000000 __DTOR_LIST__
0000000000600e38 l O .jcr 0000000000000000 __JCR_LIST__
0000000000400430 l F .text 0000000000000000 __do_global_dtors_aux
0000000000601018 l O .bss 0000000000000001 completed.7382
0000000000601020 l O .bss 0000000000000008 dtor_idx.7384
00000000004004a0 l F .text 0000000000000000 frame_dummy
0000000000000000 l df *ABS* 0000000000000000 crtstuff.c
0000000000600e20 l O .ctors 0000000000000000 __CTOR_END__
0000000000400668 l O .eh_frame 0000000000000000 __FRAME_END__
0000000000600e38 l O .jcr 0000000000000000 __JCR_END__
0000000000400580 l F .text 0000000000000000 __do_global_ctors_aux
0000000000000000 l df *ABS* 0000000000000000 main.c
0000000000600fe8 l O .got.plt 0000000000000000 .hidden _GLOBAL_OFFSET_TABLE_
0000000000600e14 l .ctors 0000000000000000 .hidden __init_array_end
0000000000600e14 l .ctors 0000000000000000 .hidden __init_array_start
0000000000600e40 l O .dynamic 0000000000000000 .hidden _DYNAMIC
0000000000601008 w .data 0000000000000000 data_start
00000000004004e0 g F .text 0000000000000002 __libc_csu_fini
00000000004003e0 g F .text 0000000000000000 _start
0000000000000000 w *UND* 0000000000000000 __gmon_start__
0000000000000000 w *UND* 0000000000000000 _Jv_RegisterClasses
00000000004005b8 g F .fini 0000000000000000 _fini
0000000000000000 F *UND* 0000000000000000 __libc_start_main@@GLIBC_2.2.5
00000000004005c8 g O .rodata 0000000000000004 _IO_stdin_used
0000000000601008 g .data 0000000000000000 __data_start
0000000000601010 g O .data 0000000000000000 .hidden __dso_handle
0000000000600e30 g O .dtors 0000000000000000 .hidden __DTOR_END__
00000000004004f0 g F .text 0000000000000089 __libc_csu_init
0000000000601018 g *ABS* 0000000000000000 __bss_start
0000000000601028 g *ABS* 0000000000000000 _end
0000000000601018 g *ABS* 0000000000000000 _edata
00000000004004c4 g F .text 0000000000000012 main
00000000004003a8 g F .init 0000000000000000 _init
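How, then, are virtual addresses and physical addresses connected in a Linux program?
In Linux, each process is described by a task_struct structure, and each process's address space is described by an mm_struct. Each region of the address space is defined by a virtual memory area (VMA), which is described by a vm_area_struct. They are related as follows: the mm field of task_struct points to the process's mm_struct, and the mm_struct in turn keeps track of all of the process's vm_area_struct objects, each of which covers the address range from vm_start up to (but not including) vm_end.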
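This relationship can be pictured with a toy model. The sketch below is not kernel code: the Python classes only mirror the names task_struct, mm_struct and vm_area_struct, the lookup imitates what the kernel's find_vma() does (return the first area whose end lies above the address), and the two areas are hypothetical page-aligned ranges chosen to resemble a small non-PIE executable.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VmAreaStruct:
    vm_start: int   # first address of the area (inclusive)
    vm_end: int     # first address after the area (exclusive)
    flags: str      # simplified permissions, e.g. "r-x"


@dataclass
class MmStruct:
    # Kept sorted by address; the real kernel uses a tree for fast lookup.
    vmas: List[VmAreaStruct] = field(default_factory=list)

    def find_vma(self, addr: int) -> Optional[VmAreaStruct]:
        """Return the first area with vm_end > addr, or None."""
        for vma in self.vmas:
            if addr < vma.vm_end:
                return vma
        return None


@dataclass
class TaskStruct:
    pid: int
    mm: MmStruct


# Hypothetical address space with one code area and one data area.
task = TaskStruct(pid=1234, mm=MmStruct([
    VmAreaStruct(0x400000, 0x401000, "r-x"),
    VmAreaStruct(0x600000, 0x602000, "rw-"),
]))

vma = task.mm.find_vma(0x4003E0)
print(hex(vma.vm_start), hex(vma.vm_end), vma.flags)
```

As with the kernel function it imitates, the returned area may start above the address (a lookup in the gap between the two areas returns the second one), so a caller still has to check vm_start <= addr.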
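A Linux program has many sections. If every section were represented by its own VMA, a lot of physical memory would be wasted, because each mapping is page-aligned and even a tiny section would occupy at least one whole page. ELF therefore also defines loadable segments. Unlike sections, which are used mainly for linking, segments are used mainly for loading the program into memory. A single segment contains many sections, as shown below.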
administrator@lab-bar-generic:~/Test/Test$ readelf -l a.out
Elf file type is EXEC (Executable file)
Entry point 0x4003e0
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R E 8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x000000000000066c 0x000000000000066c R E 200000
LOAD 0x0000000000000e18 0x0000000000600e18 0x0000000000600e18
0x0000000000000200 0x0000000000000210 RW 200000
DYNAMIC 0x0000000000000e40 0x0000000000600e40 0x0000000000600e40
0x00000000000001a0 0x00000000000001a0 RW 8
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x00000000000005cc 0x00000000004005cc 0x00000000004005cc
0x0000000000000024 0x0000000000000024 R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 8
GNU_RELRO 0x0000000000000e18 0x0000000000600e18 0x0000000000600e18
0x00000000000001e8 0x00000000000001e8 R 1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .ctors .dtors .jcr .dynamic .got
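To see how the two LOAD entries turn into mappings, the sketch below rounds each segment out to page boundaries and computes how many bytes exist only in memory (MemSiz minus FileSiz). It assumes a 4 KiB page size; the tuple layout and the helper name page_range are inventions for this example, and the numbers are copied from the readelf -l output.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages

# (offset, vaddr, filesz, memsz, flags) copied from the two LOAD entries.
LOAD_SEGMENTS = [
    (0x000000, 0x400000, 0x66C, 0x66C, "R E"),
    (0x000E18, 0x600E18, 0x200, 0x210, "RW"),
]


def page_range(vaddr, memsz):
    """Return the page-aligned [start, end) range that covers the segment."""
    start = vaddr & ~(PAGE_SIZE - 1)
    end = (vaddr + memsz + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)
    return start, end


for offset, vaddr, filesz, memsz, flags in LOAD_SEGMENTS:
    start, end = page_range(vaddr, memsz)
    zero_fill = memsz - filesz  # bytes not backed by the file (.bss)
    print(flags, hex(start), hex(end), zero_fill)
```

The second segment has 16 bytes that are not backed by the file, which matches the bss value of 16 reported by size.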
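In fact, even after many sections have been merged into a few segments, each segment can still cause considerable internal fragmentation within its pages. How is this problem solved? Unix cleverly solves it by letting adjacent segments share a single physical page at the point where they meet.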
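For this file the sharing can be seen from the program headers alone. Assuming 4 KiB pages, the sketch below computes which pages of the file hold the bytes of each LOAD segment and where the two overlap; the function name file_pages is made up for the example.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


def file_pages(offset, filesz):
    """Return the set of file page indexes that a segment's bytes occupy."""
    first = offset // PAGE_SIZE
    last = (offset + filesz - 1) // PAGE_SIZE
    return set(range(first, last + 1))


# Offsets and file sizes of the two LOAD segments from readelf -l.
code_pages = file_pages(0x000, 0x66C)
data_pages = file_pages(0xE18, 0x200)

shared = code_pages & data_pages
print(sorted(shared))

# Offset and virtual address agree modulo the page size, so the same file
# page can be mapped at 0x400000 and again at 0x600000.
print(0xE18 % PAGE_SIZE == 0x600E18 % PAGE_SIZE)
```

File page 0 holds both the end of the code segment and the start of the data segment, so it appears in both mappings. As far as I understand it, the two mappings can be backed by the same physical page until the writable one is modified, at which point copy-on-write gives it a private copy.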