ELF File Format Analysis

1. Preface

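This article draws mainly on the study notes 《ELF格式解析》 (ELF Format Analysis), which describe the basic layout of an ELF file in detail and explain each of its parts. 《详解ELF重定位原理》 (A Detailed Explanation of ELF Relocation) adds a brief account of how ELF relocation works. Here, a small example program is used to illustrate the basic components of an ELF file and the relocation mechanism.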

[Figure: overall layout of an ELF file, taken from 《ELF格式解析》学习笔记(一)]
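
To make the layout concrete, the following is a minimal sketch of reading a few fields of an ELF64 file header from a byte buffer. The byte offsets follow the ELF64 header layout. The names elf64_header_info, read_le and parse_elf64_header are hypothetical and chosen for illustration only. On Linux the same layout is available as Elf64_Ehdr in <elf.h>; a hand-written subset is used here to keep the sketch self-contained, and only little-endian ELF64 input is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of the ELF64 file header (field names follow the spec). */
struct elf64_header_info {
    uint16_t type;       /* e_type: 1 = REL, 2 = EXEC, 3 = DYN */
    uint16_t machine;    /* e_machine: 62 = x86-64 */
    uint64_t shoff;      /* e_shoff: file offset of the section header table */
    uint16_t shentsize;  /* e_shentsize: size of one section header */
    uint16_t shnum;      /* e_shnum: number of section headers */
    uint16_t shstrndx;   /* e_shstrndx: index of the section name string table */
};

/* Read an n-byte little-endian unsigned integer starting at p (n <= 8). */
static uint64_t read_le(const unsigned char *p, size_t n)
{
    uint64_t v = 0;
    for (size_t i = n; i-- > 0; )
        v = (v << 8) | p[i];
    return v;
}

/* Returns 0 on success, -1 if buf does not start with a little-endian
   ELF64 header. */
int parse_elf64_header(const unsigned char *buf, size_t len,
                       struct elf64_header_info *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 64)                               /* ELF64 header is 64 bytes */
        return -1;
    if (memcmp(buf, "\x7f" "ELF", 4) != 0)      /* magic: 7f 45 4c 46 */
        return -1;
    if (buf[4] != 2 || buf[5] != 1)             /* ELFCLASS64, little endian */
        return -1;
    out->type      = (uint16_t)read_le(buf + 16, 2);
    out->machine   = (uint16_t)read_le(buf + 18, 2);
    out->shoff     = read_le(buf + 40, 8);
    out->shentsize = (uint16_t)read_le(buf + 58, 2);
    out->shnum     = (uint16_t)read_le(buf + 60, 2);
    out->shstrndx  = (uint16_t)read_le(buf + 62, 2);
    return 0;
}
```

Passing the first 64 bytes of an object file to parse_elf64_header should yield the same values that readelf -h prints as "Type", "Start of section headers", "Number of section headers" and so on.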

2. Example Code

//main.c
#include "part.h"
extern int g_int1, g_int2;
extern char *g_str1, *g_str2;
int main() 
{
        int i = 0;
        char *str = "abc";
        func_1(i);
        func_2(str);
        g_int1 = 9;
        g_str1 = "defg";
        printf("Global integer is %d and %d, global string is %s and %s.\n", g_int1, g_int2,
        g_str1, g_str2);
        return i;
}
//part.c
#include "part.h"
int g_int1, g_int2 = 5;
char *g_str1, *g_str2 = "xyz";
void func_1(int i)  
{
        int j;
        j = i;
        printf("func_1 : j = %d\n", j); 
}
void func_2(char *str) 
{
        printf("func_2 : str = %s\n", str);
}
//part.h
#ifndef _PART_H_
#define _PART_H_

#include <stdio.h>   /* declares printf, which main.c and part.c both call */

void func_1(int i); 
void func_2(char *str);
#endif
//Makefile
all: main_s main_d
main_s: main.o part.o
        gcc main.o part.o -o main_s
main_d: main.o libpart.so
        gcc main.o -L. -lpart -o main_d
libpart.so: part.o
        gcc --shared part.o -o libpart.so
main.o: main.c
        gcc -c main.c -o main.o
part.o: part.c
        gcc -c part.c -o part.o
clean:
        rm -f *.o *.so main_s main_d

3. How Relocation Works

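This section uses the call from main to func_1 as an example to explain how relocation works.
Building the project produces the following files: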

ubuntu@VM-0-9-ubuntu:~/test/elf_test$ ls 
libpart.so  main.c  main_d  main.o  main_s  Makefile  part.c  part.h  part.o

Inspect main.o:

ubuntu@VM-0-9-ubuntu:~/test/elf_test$ readelf -a main.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          1304 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         13
  Section header string table index: 12

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00000040
       000000000000007d  0000000000000000  AX       0     0     1
  [ 2] .rela.text        RELA             0000000000000000  00000378
       0000000000000120  0000000000000018   I      10     1     8
  [ 3] .data             PROGBITS         0000000000000000  000000bd
       0000000000000000  0000000000000000  WA       0     0     1
  [ 4] .bss              NOBITS           0000000000000000  000000bd
       0000000000000000  0000000000000000  WA       0     0     1
  [ 5] .rodata           PROGBITS         0000000000000000  000000c0
       000000000000004a  0000000000000000   A       0     0     8
  [ 6] .comment          PROGBITS         0000000000000000  0000010a
       000000000000002a  0000000000000001  MS       0     0     1
  [ 7] .note.GNU-stack   PROGBITS         0000000000000000  00000134
       0000000000000000  0000000000000000           0     0     1
  [ 8] .eh_frame         PROGBITS         0000000000000000  00000138
       0000000000000038  0000000000000000   A       0     0     8
  [ 9] .rela.eh_frame    RELA             0000000000000000  00000498
       0000000000000018  0000000000000018   I      10     8     8
  [10] .symtab           SYMTAB           0000000000000000  00000170
       00000000000001b0  0000000000000018          11     9     8
  [11] .strtab           STRTAB           0000000000000000  00000320
       0000000000000054  0000000000000000           0     0     1
  [12] .shstrtab         STRTAB           0000000000000000  000004b0
       0000000000000061  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

There are no section groups in this file.

There are no program headers in this file.

There is no dynamic section in this file.

Relocation section '.rela.text' at offset 0x378 contains 12 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000012  000500000002 R_X86_64_PC32     0000000000000000 .rodata - 4
000000000020  000b00000004 R_X86_64_PLT32    0000000000000000 func_1 - 4
00000000002c  000c00000004 R_X86_64_PLT32    0000000000000000 func_2 - 4
000000000032  000d00000002 R_X86_64_PC32     0000000000000000 g_int1 - 8
00000000003d  000500000002 R_X86_64_PC32     0000000000000000 .rodata + 0
000000000044  000e00000002 R_X86_64_PC32     0000000000000000 g_str1 - 4
00000000004b  000f00000002 R_X86_64_PC32     0000000000000000 g_str2 - 4
000000000052  000e00000002 R_X86_64_PC32     0000000000000000 g_str1 - 4
000000000058  001000000002 R_X86_64_PC32     0000000000000000 g_int2 - 4
00000000005e  000d00000002 R_X86_64_PC32     0000000000000000 g_int1 - 4
00000000006a  000500000002 R_X86_64_PC32     0000000000000000 .rodata + c
000000000074  001100000004 R_X86_64_PLT32    0000000000000000 printf - 4

Relocation section '.rela.eh_frame' at offset 0x498 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Symbol table '.symtab' contains 18 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS main.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
     9: 0000000000000000   125 FUNC    GLOBAL DEFAULT    1 main
    10: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _GLOBAL_OFFSET_TABLE_
    11: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND func_1
    12: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND func_2
    13: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND g_int1
    14: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND g_str1
    15: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND g_str2
    16: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND g_int2
    17: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf

No version information found in this file.

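As the output shows, a section whose contents need fixing up has a matching relocation section: .text is paired with .rela.text, and .eh_frame with .rela.eh_frame. The following entry in .rela.text says that the 4 bytes at offset 0x20 of .text must be relocated to refer to func_1: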

000000000020  000b00000004 R_X86_64_PLT32    0000000000000000 func_1 - 4
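
The Info column of this entry packs two values. As a small sketch, assuming the ELF64 encoding of r_info (symbol index in the high 32 bits, relocation type in the low 32 bits), it can be split as follows. The function name decode_r_info and the enum names are hypothetical. The numeric type values 2 and 4 are the ones visible in the Info column above for R_X86_64_PC32 and R_X86_64_PLT32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Relocation types that appear in this article's .rela.text. */
enum { RELOC_X86_64_PC32 = 2, RELOC_X86_64_PLT32 = 4 };

/* Split an ELF64 r_info value into its two halves:
   high 32 bits = index into .symtab, low 32 bits = relocation type. */
void decode_r_info(uint64_t r_info, uint32_t *sym, uint32_t *type)
{
    assert(sym != NULL && type != NULL);
    *sym  = (uint32_t)(r_info >> 32);
    *type = (uint32_t)(r_info & 0xffffffffu);
}
```

For the entry above, 0x000b00000004 splits into symbol index 11 and type 4. Entry 11 of .symtab is func_1, and type 4 is R_X86_64_PLT32, which matches what readelf prints.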

Disassemble main.o with objdump -dS to see what lies at offset 0x20:

ubuntu@VM-0-9-ubuntu:~/test/elf_test$ objdump -dS main.o

main.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <main>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   c7 45 f4 00 00 00 00    movl   $0x0,-0xc(%rbp)
   f:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax        # 16 <main+0x16>
  16:   48 89 45 f8             mov    %rax,-0x8(%rbp)
  1a:   8b 45 f4                mov    -0xc(%rbp),%eax
  1d:   89 c7                   mov    %eax,%edi
  1f:   e8 00 00 00 00          callq  24 <main+0x24>
  24:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  28:   48 89 c7                mov    %rax,%rdi
  2b:   e8 00 00 00 00          callq  30 <main+0x30>
  30:   c7 05 00 00 00 00 09    movl   $0x9,0x0(%rip)        # 3a <main+0x3a>
  37:   00 00 00 
  3a:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax        # 41 <main+0x41>
  41:   48 89 05 00 00 00 00    mov    %rax,0x0(%rip)        # 48 <main+0x48>
  48:   48 8b 35 00 00 00 00    mov    0x0(%rip),%rsi        # 4f <main+0x4f>
  4f:   48 8b 0d 00 00 00 00    mov    0x0(%rip),%rcx        # 56 <main+0x56>
  56:   8b 15 00 00 00 00       mov    0x0(%rip),%edx        # 5c <main+0x5c>
  5c:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 62 <main+0x62>
  62:   49 89 f0                mov    %rsi,%r8
  65:   89 c6                   mov    %eax,%esi
  67:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 6e <main+0x6e>
  6e:   b8 00 00 00 00          mov    $0x0,%eax
  73:   e8 00 00 00 00          callq  78 <main+0x78>
  78:   8b 45 f4                mov    -0xc(%rbp),%eax
  7b:   c9                      leaveq 
  7c:   c3                      retq   

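The four bytes at offset 0x20 are 00 00 00 00. They are the operand of the call instruction that starts at 0x1f (the e8 opcode occupies 0x1f itself). At link time these four bytes are replaced with the displacement that reaches the real location of the func_1 symbol: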

 1f:   e8 00 00 00 00          callq  24 <main+0x24>
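
The disassembler prints "callq 24 <main+0x24>" because a call with opcode e8 jumps to the address of the next instruction plus a signed 32-bit displacement, and the displacement is still zero here. A minimal sketch of that decoding, with the hypothetical helper name call_rel32_target:

```c
#include <assert.h>
#include <stdint.h>

/* Given the address of an x86-64 "call rel32" instruction (opcode 0xe8) and
   its 5 encoded bytes, return the absolute address the call transfers to. */
uint64_t call_rel32_target(uint64_t insn_addr, const unsigned char insn[5])
{
    assert(insn[0] == 0xe8);
    /* The operand is stored little-endian in bytes 1..4. */
    uint32_t raw = (uint32_t)insn[1]
                 | (uint32_t)insn[2] << 8
                 | (uint32_t)insn[3] << 16
                 | (uint32_t)insn[4] << 24;
    /* Interpret the operand as a signed 32-bit displacement. */
    int64_t rel = (raw & 0x80000000u) ? (int64_t)raw - 0x100000000LL
                                      : (int64_t)raw;
    /* The displacement is relative to the end of the 5-byte instruction. */
    return insn_addr + 5 + (uint64_t)rel;
}
```

With insn_addr = 0x1f and the bytes e8 00 00 00 00, the result is 0x24, the address of the instruction right after the call. The placeholder therefore carries no information about func_1 until the linker fills it in.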

Next, disassemble the linked executable main_s:

000000000000065a <main>:
 65a:   55                      push   %rbp
 65b:   48 89 e5                mov    %rsp,%rbp
 65e:   48 83 ec 10             sub    $0x10,%rsp
 662:   c7 45 f4 00 00 00 00    movl   $0x0,-0xc(%rbp)
 669:   48 8d 05 48 01 00 00    lea    0x148(%rip),%rax        # 7b8 <_IO_stdin_used+0x8>
 670:   48 89 45 f8             mov    %rax,-0x8(%rbp)
 674:   8b 45 f4                mov    -0xc(%rbp),%eax
 677:   89 c7                   mov    %eax,%edi
 679:   e8 59 00 00 00          callq  6d7 <func_1>
 67e:   48 8b 45 f8             mov    -0x8(%rbp),%rax
 682:   48 89 c7                mov    %rax,%rdi
 685:   e8 77 00 00 00          callq  701 <func_2>
 68a:   c7 05 94 09 20 00 09    movl   $0x9,0x200994(%rip)        # 201028 <g_int1>
 691:   00 00 00 
 694:   48 8d 05 21 01 00 00    lea    0x121(%rip),%rax        # 7bc <_IO_stdin_used+0xc>
 69b:   48 89 05 8e 09 20 00    mov    %rax,0x20098e(%rip)        # 201030 <g_str1>
 6a2:   48 8b 35 6f 09 20 00    mov    0x20096f(%rip),%rsi        # 201018 <g_str2>
 6a9:   48 8b 0d 80 09 20 00    mov    0x200980(%rip),%rcx        # 201030 <g_str1>
 6b0:   8b 15 5a 09 20 00       mov    0x20095a(%rip),%edx        # 201010 <g_int2>
 6b6:   8b 05 6c 09 20 00       mov    0x20096c(%rip),%eax        # 201028 <g_int1>
 6bc:   49 89 f0                mov    %rsi,%r8
 6bf:   89 c6                   mov    %eax,%esi
 6c1:   48 8d 3d 00 01 00 00    lea    0x100(%rip),%rdi        # 7c8 <_IO_stdin_used+0x18>
 6c8:   b8 00 00 00 00          mov    $0x0,%eax
 6cd:   e8 5e fe ff ff          callq  530 <printf@plt>
 6d2:   8b 45 f4                mov    -0xc(%rbp),%eax
 6d5:   c9                      leaveq 
 6d6:   c3                      retq   

00000000000006d7 <func_1>:
 6d7:   55                      push   %rbp
 6d8:   48 89 e5                mov    %rsp,%rbp
 6db:   48 83 ec 20             sub    $0x20,%rsp
 6df:   89 7d ec                mov    %edi,-0x14(%rbp)
 6e2:   8b 45 ec                mov    -0x14(%rbp),%eax
 6e5:   89 45 fc                mov    %eax,-0x4(%rbp)
 6e8:   8b 45 fc                mov    -0x4(%rbp),%eax
 6eb:   89 c6                   mov    %eax,%esi
 6ed:   48 8d 3d 12 01 00 00    lea    0x112(%rip),%rdi        # 806 <_IO_stdin_used+0x56>
 6f4:   b8 00 00 00 00          mov    $0x0,%eax
 6f9:   e8 32 fe ff ff          callq  530 <printf@plt>
 6fe:   90                      nop
 6ff:   c9                      leaveq 
 700:   c3                      retq  

Comparing the two listings shows that

 1f:   e8 00 00 00 00          callq  24 <main+0x24>

has been replaced with:

 679:   e8 59 00 00 00          callq  6d7 <func_1>

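How is this replacement computed?
It follows the formula:
value written at the relocation site = address of the called function + addend - address of the relocation site
where:
the value written is 0x59 (stored little-endian as 59 00 00 00), the address of the called function func_1 is 0x6d7, the relocation site is 0x67a (the call opcode is at 0x679 and its 4-byte operand starts one byte later), and the addend is -4 according to the func_1 entry in .rela.text. Therefore: 0x59 = 0x6d7 - 0x4 - 0x67a. The addend of -4 is there because the CPU measures the displacement from the end of the 4-byte operand, that is, from the next instruction at 0x67e: 0x6d7 - 0x67e = 0x59.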
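
The arithmetic above can be sketched in C. This is an illustration of the calculation only, not how any particular linker is implemented. The names pc_relative_value and patch_le32 are hypothetical, and the sketch assumes a PC-relative 32-bit relocation whose symbol address (or PLT entry address) is already known.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value stored at a PC-relative 32-bit relocation site:
   symbol address + addend - address of the site, truncated to 32 bits. */
uint32_t pc_relative_value(uint64_t sym_addr, int64_t addend,
                           uint64_t site_addr)
{
    return (uint32_t)(sym_addr + (uint64_t)addend - site_addr);
}

/* Write value into the 4-byte field at offset, in little-endian byte order. */
void patch_le32(unsigned char *section, size_t size, size_t offset,
                uint32_t value)
{
    assert(section != NULL && offset + 4 <= size);
    for (int i = 0; i < 4; i++)
        section[offset + i] = (unsigned char)(value >> (8 * i));
}
```

For the call to func_1, pc_relative_value(0x6d7, -4, 0x67a) gives 0x59, and patching that into the operand yields the bytes 59 00 00 00 seen in main_s. The same calculation covers the data references: the first g_int1 entry has addend -8 and its site is at 0x68c in main_s, so 0x201028 - 8 - 0x68c = 0x200994, which matches the bytes 94 09 20 00 in the movl instruction at 0x68a.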

References

  1. 《ELF格式解析》学习笔记(一)
  2. 《ELF格式解析》学习笔记(二)
  3. 《ELF格式解析》学习笔记(三)
  4. 《详解ELF重定位原理》