PLT and GOT - the key to code sharing and dynamic libraries
by Ian Wienand (Tue 10 May 2011)
(this post was going to be about something else, but after getting this far, I think it stands on its own as an introduction to dynamic linking)
The shared library is an integral part of a modern system, but often the mechanisms behind the implementation are less well understood. There are, of course, many guides to this sort of thing. Hopefully this adds another perspective that resonates with someone.
Let's start at the beginning: relocations are entries in binaries that are left to be filled in later, either at link time by the toolchain linker or at runtime by the dynamic linker. A relocation in a binary is a descriptor which essentially says "determine the value of X, and put that value into the binary at offset Y". Each relocation has a specific type, defined in the ABI documentation, which describes exactly how "determine the value of" is actually determined.
Here's the simplest example:
    $ cat a.c
    extern int foo;

    int function(void) {
        return foo;
    }
    $ gcc -c a.c
    $ readelf --relocs ./a.o

    Relocation section '.rel.text' at offset 0x2dc contains 1 entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    00000004  00000801 R_386_32          00000000   foo
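As an aside on reading that output: in the 32-bit ELF format the Info column packs two fields into one word, with the symbol-table index in the upper 24 bits and the relocation type in the low 8 bits. The following is a minimal sketch of that decoding. The struct and helper names (`rel32`, `rel32_sym`, `rel32_type`, `RELOC_386_32`) are made up for illustration and deliberately do not come from `<elf.h>`; only the bit layout and the numeric value 1 for R_386_32 are taken from the ABI.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit ELF "Rel" record (not from <elf.h>). */
struct rel32 {
    uint32_t r_offset; /* offset within the section to patch */
    uint32_t r_info;   /* symbol index and relocation type, packed */
};

/* Numeric value of R_386_32 in the i386 ABI. */
enum { RELOC_386_32 = 1 };

/* Upper 24 bits: index into the symbol table. */
static uint32_t rel32_sym(uint32_t info)
{
    return info >> 8;
}

/* Low 8 bits: relocation type. */
static uint32_t rel32_type(uint32_t info)
{
    return info & 0xff;
}
```

Applied to the entry above (Offset 0x4, Info 0x801), `rel32_sym` gives 8 and `rel32_type` gives 1, which is how readelf arrives at "R_386_32" against symbol-table entry 8 (foo).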
The value of foo is not known at the time you make a.o, so the compiler leaves behind a relocation (of type R_386_32) which says "in the final binary, patch the value at offset 0x4 in this object file with the address of symbol foo". If you take a look at the output, you can see that at offset 0x4 there are 4 bytes of zeros just waiting for a real address:
    $ objdump --disassemble ./a.o

    ./a.o:     file format elf32-i386

    Disassembly of section .text:

    00000000 <function>:
       0:   55                      push   %ebp
       1:   89 e5                   mov    %esp,%ebp
       3:   a1 00 00 00 00          mov    0x0,%eax
       8:   5d                      pop    %ebp
       9:   c3                      ret
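To make the patching step concrete, here is a rough sketch of what resolving an R_386_32 relocation amounts to. The ABI defines it as S + A: the symbol's address plus an addend, where for this "Rel" style of relocation the addend is whatever is already stored at the patch site (the zeros above). The function names are hypothetical, and a real linker does a great deal more bookkeeping around this.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word byte by byte, so the sketch does not
   depend on host endianness or alignment. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit little-endian word byte by byte. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Apply an absolute 32-bit relocation (S + A). The addend A is the value
   already present at the patch site; sym_addr is S.
   Returns 0 on success, -1 if the patch would run past the section. */
static int apply_abs32(uint8_t *section, size_t size,
                       uint32_t offset, uint32_t sym_addr)
{
    if (size < 4 || offset > size - 4)
        return -1;
    store_le32(section + offset, sym_addr + load_le32(section + offset));
    return 0;
}
```

If the ten bytes of `function` are copied into a buffer and foo is assumed to end up at 0x0804a010 (an address invented purely for this example), then `apply_abs32(text, 10, 4, 0x0804a010)` turns the instruction at offset 3 from `a1 00 00 00 00` into `a1 10 a0 04 08`, i.e. a load from foo's final address.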
That's link time; if you build another object file that defines foo and link the two into a final executable, the relocation can go away. But there is a whole bunch of stuff in a fully linked executable or shared library that just can't be resolved until runtime. The major reason, as I shall try to explain, is position-independent code (PIC). When you look at an executable file, you'll notice it has a fixed load address:
    $ readelf --headers /bin/ls
    [...]
    ELF Header:
    [...]
      Entry point address:               0x8049bb0

    Program Headers:
      Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
    [...]
      LOAD           0x000000 0x08048000 0x08048000 0x16f88 0x16f88 R E 0x1000
      LOAD           0x016f88 0x0805ff88 0x0805ff88 0x01543 0x01543 RW  0x1000
This is not position-independent. The code section (with permissions R E; i.e. read and execute) must be loaded at virtual a