9.7. Relocation and Position Independent Code (PIC)

In this section we’ll investigate the difference between position independent code (known as “PIC” from here on) and non-position-independent code and how both affect relocation.


9.7.1. PIC vs. non-PIC

Much of the complexity in the ELF standard is due to the need to load shared libraries at different locations in a process’ address space. Objects built to be position-independent are specifically meant to be loaded anywhere in the address space. As discussed earlier, the code in ELF files contains relative references to data and relies on the PLT and GOT to resolve the symbols at run time. Let’s take a look at position independent code in more detail, though, because it is an important concept for ELF.

Consider the following source code:

#include <stdio.h>

extern "C" int otherFunction( int val )
{
    return 23 ;
}

int myGlobInt = 12;

int buzz( void )
{
   int intVal ;

   intVal = myGlobInt + otherFunction( 5 ) ;

   return intVal ;
}

int main( )
{
   printf( "buzz: %d\n", buzz() ) ;

   return 0 ;
}

Note that the source code shows a direct call to “buzz” as part of the call to the printf function. How the function is called is not important; what matters is that it is called from within the main function.


This little code snippet contains a few functions and a global variable. Let’s see how the resulting ELF object file differs when it is compiled as PIC or non-PIC. The -fPIC switch tells the g++ compiler to build with position-independent code.

penguin> g++ -c pic.C -o pic_nopic.o

penguin> g++ -fPIC -c pic.C -o pic.o


The first thing worth noting is that the resulting file sizes are different:


penguin> ls -l pic*.o

-rw-r--r--    1 wilding  build       1016 Dec 28 15:14 pic.o

-rw-r--r--    1 wilding  build        924 Dec 28 15:09 pic_nopic.o


The position-independent code is larger by 92 bytes. But what is different about the actual contents of the files? To find out, we need to look deeper. The first tool we’ll use is nm:

penguin> nm -S pic.o


0000000a 00000036 T _Z4buzzv

00000040 00000045 T main

00000000 00000004 D myGlobInt

00000000 0000000a T otherFunction

         U printf


penguin> nm -S pic_nopic.o

0000000a 00000021 T _Z4buzzv

0000002c 00000033 T main

00000000 00000004 D myGlobInt

00000000 0000000a T otherFunction

         U printf


The PIC version includes the global offset table as a required symbol, whereas the non-PIC version does not. We know that the GOT is used to support relocation, so this makes sense. The other important difference is that the functions have different sizes. Let’s take a look at the assembly instructions for function main using the objdump tool (only the output for the function main is shown here):

penguin> objdump -d pic.o


00000040 <main>:

  40:   55                     push   %ebp

  41:   89 e5                  mov    %esp,%ebp

  43:   53                     push   %ebx

  44:   83 ec 04               sub    $0x4,%esp

  47:   e8 00 00 00 00         call   4c <main+0xc>

  4c:   5b                     pop    %ebx

  4d:   81 c3 03 00 00 00      add    $0x3,%ebx

  53:   83 e4 f0               and    $0xfffffff0,%esp

  56:   b8 00 00 00 00         mov    $0x0,%eax

  5b:   29 c4                  sub    %eax,%esp

  5d:   83 ec 08               sub    $0x8,%esp

  60:   83 ec 08               sub    $0x8,%esp

  63:   e8 fc ff ff ff         call   64 <main+0x24>

  68:   83 c4 08               add    $0x8,%esp

  6b:   50                     push   %eax

  6c:   8d 83 00 00 00 00      lea    0x0(%ebx),%eax

  72:   50                     push   %eax

  73:   e8 fc ff ff ff         call   74 <main+0x34>

  78:   83 c4 10               add    $0x10,%esp

  7b:   b8 00 00 00 00         mov    $0x0,%eax

  80:   8b 5d fc               mov    0xfffffffc(%ebp),%ebx

  83:   c9                     leave

  84:   c3                     ret


penguin> objdump -d pic_nopic.o


0000002c <main>:

  2c:   55                     push   %ebp

  2d:   89 e5                  mov    %esp,%ebp

  2f:   83 ec 08               sub    $0x8,%esp

  32:   83 e4 f0               and    $0xfffffff0,%esp

  35:   b8 00 00 00 00         mov    $0x0,%eax

  3a:   29 c4                  sub    %eax,%esp

  3c:   83 ec 08               sub    $0x8,%esp

  3f:   83 ec 08               sub    $0x8,%esp

  42:   e8 fc ff ff ff         call   43 <main+0x17>

  47:   83 c4 08               add    $0x8,%esp

  4a:   50                     push   %eax

  4b:   68 00 00 00 00         push   $0x0

  50:   e8 fc ff ff ff         call   51 <main+0x25>

  55:   83 c4 10               add    $0x10,%esp

  58:   b8 00 00 00 00         mov    $0x0,%eax

  5d:   c9                     leave

  5e:   c3                     ret


The non-PIC version is certainly smaller, and there is a good reason for this. It is also interesting that neither version makes a direct call to the function buzz(). For that matter, there is no direct call to printf either. The secret here is relocation and how it works with PIC and non-PIC code.

The PIC code first needs to find the global offset table, which the procedure linkage table relies on, before it can make a call to the function buzz(). This is because buzz() could be anywhere in the address space. The non-PIC code, on the other hand, can assume that buzz() will eventually be at a predictable offset from any code that needs it. Well, sort of. There is an exception listed later under “Relocation and Linking.” In any case, let’s see how relocation is affected by position-independent code.
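
To make that first step concrete, here is a minimal sketch of the arithmetic behind the call/pop/add sequence at offsets 0x47 through 0x4d of the PIC version of main. The function names and every address used with them are hypothetical and chosen only for illustration. The sketch assumes the usual i386 behavior: the call pushes the address of the pop that follows it, and the linker replaces the constant in the add instruction with the distance from that pop to the GOT. It also assumes 4 KB pages.

```cpp
#include <cassert>
#include <cstdint>

// Link-time step: the constant patched into the `add` instruction is the
// distance from the `pop` instruction to the GOT. It does not depend on
// where the object is eventually loaded.
std::uint32_t link_time_delta(std::uint32_t got_offset, std::uint32_t pop_offset)
{
    return got_offset - pop_offset;
}

// Run-time step: `call` pushed the address of the `pop`, `pop` moved it
// into %ebx, and `add` applies the delta to reach the GOT.
std::uint32_t got_address_in_ebx(std::uint32_t load_base,
                                 std::uint32_t pop_offset,
                                 std::uint32_t delta)
{
    // Assumption: objects are mapped at page-aligned addresses (4 KB pages).
    assert(load_base % 0x1000 == 0);
    std::uint32_t ebx = load_base + pop_offset; // value popped into %ebx
    return ebx + delta;                         // %ebx now points at the GOT
}
```

With a GOT at the made-up offset 0x15c0 and the pop at offset 0x4c, link_time_delta() gives 0x1574. A load base of 0x40014000 then yields 0x400155c0 in %ebx, and any other load base works with the same delta, which is the point of the technique.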

9.7.2. Relocation and Position Independent Code

As discussed before, relocation is a mechanism used to change values in a shared library or executable when it is loaded into a process’ address space. Resolving calls to either printf() or buzz() at compile time would be premature because the compiler doesn’t know where these functions will be located at run time.

For simplicity, let’s look at the relocation information for the non-PIC version first:


penguin> readelf -r pic_nopic.o


Relocation section '.rel.text' at offset 0x378 contains 5 entries:

 Offset     Info    Type            Sym.Value  Sym. Name

00000016  00000702 R_386_PC32        00000000   otherFunction

0000001f  00000801 R_386_32          00000000   myGlobInt

00000043  00000902 R_386_PC32        0000000a   _Z4buzzv

0000004c  00000501 R_386_32          00000000   .rodata

00000051  00000b02 R_386_PC32        00000000   printf


The relocations solve the mystery of the missing calls to buzz() and printf() in the previous section on PIC vs. non-PIC. The relocation for buzz() instructs the linker to change the 32-bit value at offset 0x43 in the .text section so that it refers to the eventual location of the function buzz(). A quick look at the assembly language at 0x42 makes the purpose of this relocation even more clear:

42:    e8 fc ff ff ff          call    43 <main+0x17>


Until the relocation is applied, the call instruction holds the placeholder displacement 0xfffffffc (-4), which is why the disassembler shows a meaningless target of 0x43. After the relocation, the 32-bit value at 0x43 will hold the displacement from the following instruction to buzz(), so the call instruction at 0x42 will be correct. The same mechanism is used for printf at offset 0x51.
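
The arithmetic behind this relocation can be sketched in a few lines. The i386 ABI defines R_386_PC32 as S + A - P, where S is the address of the symbol, A is the addend already stored in the field, and P is the address of the field being patched. The function names below are hypothetical, and the sketch treats the offsets inside pic_nopic.o as if they were final addresses, which is an assumption made only to keep the numbers recognizable.

```cpp
#include <cassert>
#include <cstdint>

// R_386_PC32 computes S + A - P. Unsigned arithmetic wraps modulo 2^32,
// which matches how the 32-bit field behaves.
std::uint32_t relocate_pc32(std::uint32_t symbol, std::uint32_t addend,
                            std::uint32_t place)
{
    return symbol + addend - place;
}

// Target of a 5-byte `call rel32` (opcode 0xe8) located at `insn_addr`:
// the displacement is relative to the address of the next instruction.
std::uint32_t call_target(std::uint32_t insn_addr, std::uint8_t opcode,
                          std::uint32_t displacement)
{
    assert(opcode == 0xe8);
    return insn_addr + 5 + displacement;
}
```

Before relocation, call_target(0x42, 0xe8, 0xfffffffc) is 0x43, which is the odd-looking target printed by objdump. Applying relocate_pc32(0x0a, 0xfffffffc, 0x43) gives 0xffffffc3, and with that displacement the same call lands on 0x0a, the address of _Z4buzzv in the nm output.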


Looking at the relocation information for the PIC object reveals some interesting differences:


penguin> readelf -r pic.o


Relocation section '.rel.text' at offset 0x3c8 contains 7 entries:

 Offset     Info    Type        Sym.Value  Sym. Name

0000001a  00000a0a R_386_GOTPC  00000000   _GLOBAL_OFFSET_TABLE_

00000020  00000803 R_386_GOT32  00000000   myGlobInt

0000002a  00000704 R_386_PLT32  00000000   otherFunction

0000004f  00000a0a R_386_GOTPC  00000000   _GLOBAL_OFFSET_TABLE_

00000064  00000904 R_386_PLT32  0000000a   _Z4buzzv

0000006e  00000509 R_386_GOTOFF 00000000   .rodata

00000074  00000c04 R_386_PLT32  00000000   printf


Notice that the relocation entries for the PIC and non-PIC object files have different types for the functions and variables. In the PIC version, the relocation types are PLT32 and for the non-PIC version, the relocation types are PC32. The PLT32 is a type of relocation used with the procedure linkage table. A relocation of PC32 is a more primitive form of relocation.
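
The Type column printed by readelf is not stored separately in the file; it is decoded from the Info column. For 32-bit ELF the low 8 bits of Info hold the relocation type and the remaining 24 bits hold the symbol table index. The helper names below are hypothetical stand-ins for the ELF32_R_SYM, ELF32_R_TYPE and ELF32_R_INFO macros, and the name table covers only the types that appear in this section.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Symbol table index: the upper 24 bits of the 32-bit Info value.
std::uint32_t reloc_symbol_index(std::uint32_t info) { return info >> 8; }

// Relocation type: the low 8 bits of the Info value.
std::uint32_t reloc_type(std::uint32_t info) { return info & 0xff; }

// Inverse of the two accessors above, mirroring how the fields are packed.
std::uint32_t make_reloc_info(std::uint32_t symbol_index, std::uint32_t type)
{
    assert(symbol_index <= 0xffffff && type <= 0xff);
    return (symbol_index << 8) | type;
}

// Names, as printed by readelf, for the i386 types seen in this section.
std::string reloc_type_name(std::uint32_t type)
{
    switch (type) {
    case 1:  return "R_386_32";
    case 2:  return "R_386_PC32";
    case 3:  return "R_386_GOT32";
    case 4:  return "R_386_PLT32";
    case 6:  return "R_386_GLOB_DAT";
    case 7:  return "R_386_JUMP_SLOT";
    case 8:  return "R_386_RELATIVE";
    case 9:  return "R_386_GOTOFF";
    case 10: return "R_386_GOTPC";
    default: return "unknown";
    }
}
```

For example, the Info value 0x00000902 from the non-PIC listing decodes to symbol index 9 and type 2 (R_386_PC32), while 0x00000904 from the PIC listing refers to the same symbol index with type 4 (R_386_PLT32).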

There is an obvious performance impact when using position-independent code. A few years ago, a benchmark measured the impact at about 2 to 3%, although the actual percentage will depend on many factors (average size of functions, and so on). Regardless of the performance implications, position-independent code is required and effective and is used widely on Linux.

9.7.3. Relocation and Linking

As discussed earlier in the chapter, linking is the process of matching or binding undefined symbols to defined symbols of the same type and name. Linking can be done when a shared library or executable is actually created or at run time, although the mechanisms are very different. The relocation entries for object files are processed during the link phase, and relocation entries in executables and shared libraries are processed at run time.

When creating an executable or shared library, the linker (usually called “ld”) will try to resolve undefined function symbols using the defined function symbols found in the constituent object files. This is where the main symbol table is used. Static functions are referenced through relative addressing, as are global functions. The main difference is that static functions will not be included in the dynamic symbol table.


Let’s take a look at how the linker processes the relocation entries for pic.o. For quick reference, here are the relocation entries for pic.o from before:

penguin> readelf -r pic.o


Relocation section '.rel.text' at offset 0x3c8 contains 7 entries:

 Offset     Info    Type         Sym.Value  Sym. Name

0000001a  00000a0a R_386_GOTPC   00000000   _GLOBAL_OFFSET_TABLE_

00000020  00000803 R_386_GOT32   00000000   myGlobInt

0000002a  00000704 R_386_PLT32   00000000   otherFunction

0000004f  00000a0a R_386_GOTPC   00000000   _GLOBAL_OFFSET_TABLE_

00000064  00000904 R_386_PLT32   0000000a   _Z4buzzv

0000006e  00000509 R_386_GOTOFF  00000000   .rodata

00000074  00000c04 R_386_PLT32   00000000   printf


Each of the function symbols, including the ones that could be satisfied locally by the function symbols in pic.o, has a relocation entry. The global variable myGlobInt also has a relocation entry. Let’s see what happens when the linker links the object file pic.o and creates an executable.

Note: The linker ld is called by g++. It is usually not a good idea to directly link an executable using ld. We could get away with it here because the source code does not include any C++ features. We will use g++ here as we have for the entire chapter because it allows us to show how ELF handles basic C++ features.

penguin> g++ -o pic pic.o

penguin> readelf -r pic


Relocation section '.rel.dyn' at offset 0x28c contains 2 entries:

 Offset     Info    Type            Sym.Value  Sym. Name

080495d4  00000106 R_386_GLOB_DAT    080494bc   myGlobInt

080495d8  00000606 R_386_GLOB_DAT    00000000   __gmon_start__


Relocation section '.rel.plt' at offset 0x29c contains 2 entries:

 Offset     Info    Type            Sym.Value  Sym. Name

080495cc  00000207 R_386_JUMP_SLOT   080482d4   __libc_start_main

080495d0  00000307 R_386_JUMP_SLOT   080482e4   printf


There are a few differences. The pic.o object file had one relocation section called .rel.text, and the executable “pic” contains two relocation sections called .rel.dyn and .rel.plt. The relocation for function buzz() is also missing from the executable. This is because the reference was satisfied by the function buzz() in the object file.

To see what these relocation entries really do, we need to find out which sections they belong to:


penguin> readelf -S pic |egrep "got|plt"

  [ 9] .rel.plt   REL      0804829c 00029c 000010 08  A  4  b  4

  [11] .plt       PROGBITS 080482c4 0002c4 000030 04 AX  0  0  4

  [21] .got       PROGBITS 080495c0 0005c0 00001c 04 WA  0  0  4


The GLOB_DAT entries have offsets of 0x80495d4 and 0x80495d8, both of which are in the global offset table. The purpose of these entries is to set the address of the symbol for this relocation entry in the corresponding slot of the global offset table. The executable code will be expecting it to be there when the program is loaded. The JUMP_SLOT relocation entries have offsets of 0x80495cc and 0x80495d0, and both of these are also in the GOT. These entries tell the run time linker to set entries in the GOT for the corresponding slots for the same symbol in the PLT. This is required for dynamic linking and in particular, lazy binding. See section “.plt” for more information.
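
The claim that all four offsets fall inside the GOT can be checked against the readelf -S output, which gives .got an address of 0x080495c0 and a size of 0x1c. The following sketch does that range check and also reports which 4-byte slot each offset refers to. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct Section {
    std::uint32_t address; // Addr column of readelf -S
    std::uint32_t size;    // Size column of readelf -S
};

// True when `addr` lies inside the section's address range.
bool contains(const Section& s, std::uint32_t addr)
{
    return addr >= s.address && addr - s.address < s.size;
}

// Index of the 4-byte slot that `addr` refers to within the section.
std::uint32_t slot_index(const Section& s, std::uint32_t addr)
{
    assert(contains(s, addr));
    return (addr - s.address) / 4;
}
```

With Section got{0x080495c0, 0x1c}, the JUMP_SLOT offsets 0x080495cc and 0x080495d0 map to slots 3 and 4, and the GLOB_DAT offsets 0x080495d4 and 0x080495d8 map to slots 5 and 6. That accounts for every slot after the first three, which on i386 are conventionally reserved for the dynamic linker's own use.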

If we link this object file as a shared library, the relocation entries are very different:


penguin> g++ -shared pic.o -o libpic.so

penguin> readelf -r libpic.so


Relocation section '.rel.dyn' at offset 0x5d0 contains 7 entries:

 Offset     Info    Type            Sym.Value  Sym. Name

00001888  00000008 R_386_RELATIVE

0000188c  00000008 R_386_RELATIVE

000019a4  00000008 R_386_RELATIVE

000019a8  00001d06 R_386_GLOB_DAT    00001890   myGlobInt

000019ac  00002206 R_386_GLOB_DAT    00000000   __cxa_finalize

000019b0  00002706 R_386_GLOB_DAT    00000000   _Jv_RegisterClasses

000019b4  00002806 R_386_GLOB_DAT    00000000   __gmon_start__


Relocation section '.rel.plt' at offset 0x608 contains 5 entries:

 Offset     Info    Type            Sym.Value  Sym. Name

00001990  00001a07 R_386_JUMP_SLOT   0000079e   _Z4buzzv

00001994  00002007 R_386_JUMP_SLOT   00000000   printf

00001998  00002207 R_386_JUMP_SLOT   00000000   __cxa_finalize

0000199c  00002307 R_386_JUMP_SLOT   00000794   otherFunction

000019a0  00002707 R_386_JUMP_SLOT   00000000   _Jv_RegisterClasses


There are quite a few more relocation entries for the shared library than for the executable. This includes the function buzz() because the reference for buzz() might not be satisfied by the buzz() contained in the shared library. See the section, “Symbol Resolution,” for more details on how to force a shared library to use the symbols that it contains (“symbolic linking”).

What if we try to create a shared library with the non-PIC object file?

penguin> g++ -shared pic_nopic.o -o libnopic.so

penguin> readelf -r libnopic.so


Relocation section '.rel.dyn' at offset 0x5d0 contains 11 entries:

 Offset     Info    Type            Sym.Value  Sym. Name

000007b0  00000008 R_386_RELATIVE

00001838  00000008 R_386_RELATIVE

0000183c  00000008 R_386_RELATIVE

00001950  00000008 R_386_RELATIVE

0000077a  00002302 R_386_PC32        00000764   otherFunction

00000783  00001d01 R_386_32          00001840   myGlobInt

000007a7  00001a02 R_386_PC32        0000076e   _Z4buzzv

000007b5  00002002 R_386_PC32        00000000   printf

00001954  00002206 R_386_GLOB_DAT    00000000   __cxa_finalize

00001958  00002706 R_386_GLOB_DAT    00000000   _Jv_RegisterClasses

0000195c  00002806 R_386_GLOB_DAT    00000000   __gmon_start__


Relocation section '.rel.plt' at offset 0x628 contains 2 entries:

 Offset     Info    Type            Sym.Value  Sym. Name

00001948  00002207 R_386_JUMP_SLOT   00000000   __cxa_finalize

0000194c  00002707 R_386_JUMP_SLOT   00000000   _Jv_RegisterClasses


The shared library is created, although the relocations are very different. The relocation type for the functions is PC32 and modifies the executable code. How does this work, though? The text segment is always loaded as read-only, and yet these relocations apparently change some values in the text segment. For a better understanding of this special type of relocation, let’s build libfoo.so without using the -fPIC switch and then run the executable foo, seeing how it modifies the text segment of libfoo.so.

penguin> g++ -c foo.C

penguin> g++ -shared foo.o -o libfoo.so

penguin> g++ -o foo main.o -L. -Wl,-rpath,. -lfoo


Now to use strace to see how this works under the covers:


penguin> strace -o foo.st foo

This is a printf format string in baz

This is a printf format string in main


penguin> less foo.st


open("./libfoo.so", O_RDONLY)          = 3

read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360\7\0"..., 1024) = 1024

fstat64(3, {st_mode=S_IFREG|0755, st_size=7113, ...}) = 0

getcwd("/home/wilding/src/Linuxbook/ELF", 128) = 32

mmap2(NULL, 7412, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40014000

mprotect(0x40015000, 3316, PROT_NONE)   = 0

mmap2(0x40015000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x40015000

close(3)                                = 0


mprotect(0x40014000, 4096, PROT_READ|PROT_WRITE) = 0

mprotect(0x40014000, 4096, PROT_READ|PROT_EXEC) = 0



The first part of the output shows where libfoo.so is loaded into the address space. It is loaded at address 0x40014000, as shown by the mmap2 system call. Later on in the run, the program (the run time linker) uses mprotect to change the attributes of part of the text segment to read/write and then back to read/exec. The relocations on the text segment are applied between these two system calls. These two calls to mprotect are not needed for a library that was built from a position-independent object because its relocations act on the GOT, which is in the writable data segment.
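
The same write-enable, patch, write-protect pattern can be reproduced in a small program. The sketch below is only an illustration of the system calls seen in the strace output, not of the run time linker's actual code: the function name is hypothetical, it uses an anonymous mapping as a stand-in for a text page, and it restores PROT_READ rather than PROT_READ|PROT_EXEC to keep the example simple. It assumes a Linux system where MAP_ANONYMOUS is available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Patch one 32-bit value at the start of a read-only page and read it back.
// Returns false if any of the system calls fails.
bool patch_read_only_word(std::uint32_t value, std::uint32_t& read_back)
{
    const long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    const std::size_t length = static_cast<std::size_t>(page_size);

    // Stand-in for a mapped text page; it starts out read-only.
    void* page = mmap(nullptr, length, PROT_READ,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        return false;
    }

    bool ok = false;
    // Step 1: make the page writable, as in the first mprotect call.
    if (mprotect(page, length, PROT_READ | PROT_WRITE) == 0) {
        // Step 2: apply the "relocation" by storing the new value.
        std::memcpy(page, &value, sizeof value);
        // Step 3: remove write permission again, as in the second call.
        if (mprotect(page, length, PROT_READ) == 0) {
            std::memcpy(&read_back, page, sizeof read_back);
            ok = true;
        }
    }
    munmap(page, length);
    return ok;
}
```

Because the mapping is MAP_PRIVATE, the write produces a private copy of the page. The same is true for a text page patched this way in a real process, which means a relocated text page can no longer be shared with other processes that map the same library.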

Even though this special type of relocation works, it is not used much in practice. Non-PIC code is not meant to be position-independent, and forcing it to be part of a shared library is not standard and not recommended.

