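1. Preface
This article draws mainly on a set of study notes on the ELF format, which describe the basic layout of an ELF file and explain each of its parts in detail. A companion piece, <<详解ELF重定位原理>>, gives a brief account of how ELF relocation works. This article uses one sample program to illustrate the basic structure of an ELF file and the principle of relocation.
The diagram of the overall ELF layout that accompanied this section is taken from 《ELF格式解析》学习笔记(一).
2. Sample code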
//main.c
#include <stdio.h>
#include "part.h"
extern int g_int1, g_int2;
extern char *g_str1, *g_str2;
int main()
{
    int i = 0;
    char *str = "abc";
    func_1(i);
    func_2(str);
    g_int1 = 9;
    g_str1 = "defg";
    printf("Global integer is %d and %d, global string is %s and %s.\n", g_int1, g_int2,
           g_str1, g_str2);
    return i;
}
//part.c
#include <stdio.h>
#include "part.h"
int g_int1, g_int2 = 5;
char *g_str1, *g_str2 = "xyz";
void func_1(int i)
{
    int j;
    j = i;
    printf("func_1 : j = %d\n", j);
}
void func_2(char *str)
{
    printf("func_2 : str = %s\n", str);
}
//part.h
#ifndef _PART_H_
#define _PART_H_
void func_1(int i);
void func_2(char *str);
#endif
//Makefile
all: main_s main_d
main_s: main.o part.o
	gcc main.o part.o -o main_s
main_d: main.o libpart.so
	gcc main.o -L. -lpart -o main_d
libpart.so: part.o
	gcc --shared part.o -o libpart.so
main.o: main.c
	gcc -c main.c -o main.o
part.o: part.c
	gcc -c part.c -o part.o
clean:
	rm -f *.o *.so main_s main_d
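3. How relocation works
The relocation of the call from main to func_1 is used as the example for explaining how relocation works.
Building the project produces the following files: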
ubuntu@VM-0-9-ubuntu:~/test/elf_test$ ls
libpart.so main.c main_d main.o main_s Makefile part.c part.h part.o
Inspect main.o:
ubuntu@VM-0-9-ubuntu:~/test/elf_test$ readelf -a main.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 1304 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 13
Section header string table index: 12
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00000040
000000000000007d 0000000000000000 AX 0 0 1
[ 2] .rela.text RELA 0000000000000000 00000378
0000000000000120 0000000000000018 I 10 1 8
[ 3] .data PROGBITS 0000000000000000 000000bd
0000000000000000 0000000000000000 WA 0 0 1
[ 4] .bss NOBITS 0000000000000000 000000bd
0000000000000000 0000000000000000 WA 0 0 1
[ 5] .rodata PROGBITS 0000000000000000 000000c0
000000000000004a 0000000000000000 A 0 0 8
[ 6] .comment PROGBITS 0000000000000000 0000010a
000000000000002a 0000000000000001 MS 0 0 1
[ 7] .note.GNU-stack PROGBITS 0000000000000000 00000134
0000000000000000 0000000000000000 0 0 1
[ 8] .eh_frame PROGBITS 0000000000000000 00000138
0000000000000038 0000000000000000 A 0 0 8
[ 9] .rela.eh_frame RELA 0000000000000000 00000498
0000000000000018 0000000000000018 I 10 8 8
[10] .symtab SYMTAB 0000000000000000 00000170
00000000000001b0 0000000000000018 11 9 8
[11] .strtab STRTAB 0000000000000000 00000320
0000000000000054 0000000000000000 0 0 1
[12] .shstrtab STRTAB 0000000000000000 000004b0
0000000000000061 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
There are no section groups in this file.
There are no program headers in this file.
There is no dynamic section in this file.
Relocation section '.rela.text' at offset 0x378 contains 12 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000012 000500000002 R_X86_64_PC32 0000000000000000 .rodata - 4
000000000020 000b00000004 R_X86_64_PLT32 0000000000000000 func_1 - 4
00000000002c 000c00000004 R_X86_64_PLT32 0000000000000000 func_2 - 4
000000000032 000d00000002 R_X86_64_PC32 0000000000000000 g_int1 - 8
00000000003d 000500000002 R_X86_64_PC32 0000000000000000 .rodata + 0
000000000044 000e00000002 R_X86_64_PC32 0000000000000000 g_str1 - 4
00000000004b 000f00000002 R_X86_64_PC32 0000000000000000 g_str2 - 4
000000000052 000e00000002 R_X86_64_PC32 0000000000000000 g_str1 - 4
000000000058 001000000002 R_X86_64_PC32 0000000000000000 g_int2 - 4
00000000005e 000d00000002 R_X86_64_PC32 0000000000000000 g_int1 - 4
00000000006a 000500000002 R_X86_64_PC32 0000000000000000 .rodata + c
000000000074 001100000004 R_X86_64_PLT32 0000000000000000 printf - 4
Relocation section '.rela.eh_frame' at offset 0x498 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
000000000020 000200000002 R_X86_64_PC32 0000000000000000 .text + 0
The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
Symbol table '.symtab' contains 18 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS main.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 0 SECTION LOCAL DEFAULT 5
6: 0000000000000000 0 SECTION LOCAL DEFAULT 7
7: 0000000000000000 0 SECTION LOCAL DEFAULT 8
8: 0000000000000000 0 SECTION LOCAL DEFAULT 6
9: 0000000000000000 125 FUNC GLOBAL DEFAULT 1 main
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_
11: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_1
12: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_2
13: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND g_int1
14: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND g_str1
15: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND g_str2
16: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND g_int2
17: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND printf
No version information found in this file.
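As the output above shows, a section that needs relocation has a companion relocation section: .text is paired with .rela.text, and .eh_frame with .rela.eh_frame. The following entry in .rela.text says that the 4 bytes at offset 0x20 in .text must be patched at link time so that they refer to func_1: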
000000000020 000b00000004 R_X86_64_PLT32 0000000000000000 func_1 - 4
Disassemble main.o with objdump -dS main.o to find what sits at offset 0x20:
ubuntu@VM-0-9-ubuntu:~/test/elf_test$ objdump -dS main.o
main.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 10 sub $0x10,%rsp
8: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp)
f: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 16 <main+0x16>
16: 48 89 45 f8 mov %rax,-0x8(%rbp)
1a: 8b 45 f4 mov -0xc(%rbp),%eax
1d: 89 c7 mov %eax,%edi
1f: e8 00 00 00 00 callq 24 <main+0x24>
24: 48 8b 45 f8 mov -0x8(%rbp),%rax
28: 48 89 c7 mov %rax,%rdi
2b: e8 00 00 00 00 callq 30 <main+0x30>
30: c7 05 00 00 00 00 09 movl $0x9,0x0(%rip) # 3a <main+0x3a>
37: 00 00 00
3a: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 41 <main+0x41>
41: 48 89 05 00 00 00 00 mov %rax,0x0(%rip) # 48 <main+0x48>
48: 48 8b 35 00 00 00 00 mov 0x0(%rip),%rsi # 4f <main+0x4f>
4f: 48 8b 0d 00 00 00 00 mov 0x0(%rip),%rcx # 56 <main+0x56>
56: 8b 15 00 00 00 00 mov 0x0(%rip),%edx # 5c <main+0x5c>
5c: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 62 <main+0x62>
62: 49 89 f0 mov %rsi,%r8
65: 89 c6 mov %eax,%esi
67: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 6e <main+0x6e>
6e: b8 00 00 00 00 mov $0x0,%eax
73: e8 00 00 00 00 callq 78 <main+0x78>
78: 8b 45 f4 mov -0xc(%rbp),%eax
7b: c9 leaveq
7c: c3 retq
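The four bytes at offset 0x20 are 00 00 00 00; they are the operand of the call instruction that starts at 0x1f. At link time they are overwritten with a displacement that leads to the real location of the func_1 symbol: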
1f: e8 00 00 00 00 callq 24 <main+0x24>
Next, disassemble the linked main_s:
000000000000065a <main>:
65a: 55 push %rbp
65b: 48 89 e5 mov %rsp,%rbp
65e: 48 83 ec 10 sub $0x10,%rsp
662: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp)
669: 48 8d 05 48 01 00 00 lea 0x148(%rip),%rax # 7b8 <_IO_stdin_used+0x8>
670: 48 89 45 f8 mov %rax,-0x8(%rbp)
674: 8b 45 f4 mov -0xc(%rbp),%eax
677: 89 c7 mov %eax,%edi
679: e8 59 00 00 00 callq 6d7 <func_1>
67e: 48 8b 45 f8 mov -0x8(%rbp),%rax
682: 48 89 c7 mov %rax,%rdi
685: e8 77 00 00 00 callq 701 <func_2>
68a: c7 05 94 09 20 00 09 movl $0x9,0x200994(%rip) # 201028 <g_int1>
691: 00 00 00
694: 48 8d 05 21 01 00 00 lea 0x121(%rip),%rax # 7bc <_IO_stdin_used+0xc>
69b: 48 89 05 8e 09 20 00 mov %rax,0x20098e(%rip) # 201030 <g_str1>
6a2: 48 8b 35 6f 09 20 00 mov 0x20096f(%rip),%rsi # 201018 <g_str2>
6a9: 48 8b 0d 80 09 20 00 mov 0x200980(%rip),%rcx # 201030 <g_str1>
6b0: 8b 15 5a 09 20 00 mov 0x20095a(%rip),%edx # 201010 <g_int2>
6b6: 8b 05 6c 09 20 00 mov 0x20096c(%rip),%eax # 201028 <g_int1>
6bc: 49 89 f0 mov %rsi,%r8
6bf: 89 c6 mov %eax,%esi
6c1: 48 8d 3d 00 01 00 00 lea 0x100(%rip),%rdi # 7c8 <_IO_stdin_used+0x18>
6c8: b8 00 00 00 00 mov $0x0,%eax
6cd: e8 5e fe ff ff callq 530 <printf@plt>
6d2: 8b 45 f4 mov -0xc(%rbp),%eax
6d5: c9 leaveq
6d6: c3 retq
00000000000006d7 <func_1>:
6d7: 55 push %rbp
6d8: 48 89 e5 mov %rsp,%rbp
6db: 48 83 ec 20 sub $0x20,%rsp
6df: 89 7d ec mov %edi,-0x14(%rbp)
6e2: 8b 45 ec mov -0x14(%rbp),%eax
6e5: 89 45 fc mov %eax,-0x4(%rbp)
6e8: 8b 45 fc mov -0x4(%rbp),%eax
6eb: 89 c6 mov %eax,%esi
6ed: 48 8d 3d 12 01 00 00 lea 0x112(%rip),%rdi # 806 <_IO_stdin_used+0x56>
6f4: b8 00 00 00 00 mov $0x0,%eax
6f9: e8 32 fe ff ff callq 530 <printf@plt>
6fe: 90 nop
6ff: c9 leaveq
700: c3 retq
From this we can see that
1f: e8 00 00 00 00 callq 24 <main+0x24>
has been replaced with:
679: e8 59 00 00 00 callq 6d7 <func_1>
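So how is this replacement carried out?
It follows this formula:
value written = address of the called function - address of the patched field - 4
where:
The value written is 0x59 (stored little-endian, so 59 is the lowest byte). The address of the called function func_1 is 0x6d7. The patched field is at 0x67a: it is the 4-byte operand that follows the call opcode at 0x679, and equals the address of main, 0x65a, plus the relocation offset 0x20. The 4 is the addend -4 recorded in .rela.text; it is needed because the CPU measures the displacement from the end of the 4-byte field, that is, from the next instruction. Hence: 0x59 = 0x6d7 - 0x67a - 0x4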