CMU Computer Systems: Linking

Static Linking
  • Programs are translated and linked using a compiler driver
    • Source files
    • Separately compiled relocatable object files
    • Fully linked executable object file
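As a rough illustration of the pipeline above, the sketch below shows the kind of two-module program a compiler driver handles. The file names main.c and sum.c, the function names, and the gcc commands in the comments are assumptions for illustration. Both modules are shown in one listing, and the entry point is called run_sum rather than main, so the block compiles on its own.

```c
/*
 * Assumed build steps with gcc as the compiler driver:
 *   gcc -c main.c            -> main.o  (relocatable object file)
 *   gcc -c sum.c             -> sum.o   (relocatable object file)
 *   gcc -o prog main.o sum.o -> prog    (fully linked executable)
 */

/* ---- sum.c (hypothetical module) ---- */
int sum(const int *a, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* ---- main.c (hypothetical module) ---- */
int sum(const int *a, int n);   /* declaration only; defined in sum.c */

int array[2] = {1, 2};

/* Stand-in for main() in a real program. */
int run_sum(void)
{
    return sum(array, 2);
}
```

In the split version, main.o carries an unresolved reference to sum that the linker fills in from sum.o.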
Why linkers
  • Modularity
    • Program can be written as a collection of smaller source files, rather than one monolithic mass
    • Can build libraries of common functions (more on this later)
      • e.g., Math library, standard C library
  • Efficiency
    • Time: Separate compilation
      • Change one source file, compile, and then relink
      • No need to recompile other source files
    • Space: Libraries
      • Common functions can be aggregated into a single file
      • Yet executable files and running memory images contain only code for the functions they actually use
What linkers do
  • Symbol resolution
    • Programs define and reference symbols (global variables and functions)
    • Symbol definitions are stored (by the assembler) in the object file's symbol table
      • Symbol table is an array of structs
      • Each entry includes name, size, and location of symbol
    • During the symbol resolution step, the linker associates each symbol reference with exactly one symbol definition
  • Relocation
    • Merges separate code and data sections into single sections
    • Relocates symbols from their relative locations in the .o files to their final absolute memory locations in the executable
    • Updates all references to these symbols to reflect their new positions
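The two steps above can be sketched with a toy model. Nothing here is a real linker API: struct toy_sym, resolve, and relocate are hypothetical names, and the sketch treats any duplicate name as an error rather than applying the strong/weak rules.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol table entry: name, size, and location. */
struct toy_sym {
    const char *name;
    size_t size;      /* size of the object in bytes */
    size_t offset;    /* location relative to the start of its section */
};

/*
 * Symbol resolution: find the single definition matching a referenced name.
 * Returns NULL if the name is undefined or defined more than once.
 */
const struct toy_sym *resolve(const struct toy_sym *table, size_t n,
                              const char *name)
{
    const struct toy_sym *found = NULL;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            if (found != NULL)
                return NULL;          /* duplicate definition */
            found = &table[i];
        }
    }
    return found;
}

/* Relocation: relative offset in the .o file -> absolute address. */
size_t relocate(const struct toy_sym *sym, size_t section_base)
{
    return section_base + sym->offset;
}
```

A real linker would additionally patch every instruction or data word that refers to the symbol, using the relocation entries recorded in the object file.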
Three Kinds of Object Files (Modules)
  • Relocatable object file (.o file)
    • Contains code and data in a form that can be combined with other relocatable object files to form an executable object file
      • Each .o file is produced from exactly one source (.c) file
  • Executable object file (a.out file)
    • Contains code and data in a form that can be copied directly into memory and then executed
  • Shared object file (.so file)
    • Special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run-time
    • Called Dynamic Link Libraries (DLLs) by Windows
Executable and Linkable Format (ELF)
  • Standard binary format for object files
  • One unified format for
    • Relocatable object files (.o)
    • Executable object files (a.out)
    • Shared object files (.so)
  • Generic name: ELF binaries
ELF Object File Format
  • ELF header
    • Word size, byte ordering, file type (.o, exec, .so), machine type, etc.
  • Segment header table
    • Page size, virtual addresses of memory segments (sections), segment sizes
  • .text section
    • Code
  • .rodata section
    • Read only data: jump tables, …
  • .data section
    • Initialized global variables
  • .bss section
    • Uninitialized global variables
    • “Block Started by Symbol”
    • “Better Save Space”
    • Has a section header but occupies no space in the object file
  • .symtab section
    • Symbol table
    • Procedure and static variable names
    • Section names and locations
  • .rel.text section
    • Relocation info for .text section
    • Addresses of instructions that will need to be modified in the executable
    • Instructions for modifying
  • .rel.data section
    • Relocation info for .data section
    • Addresses of pointer data that will need to be modified in the merged executable
  • .debug section
    • Info for symbolic debugging (gcc -g)
  • Section header table
    • Offsets and sizes of each section
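To make the ELF header entry concrete, here is a minimal sketch that reads the word size and byte ordering from the first 16 bytes of a file image. The struct and function names are hypothetical. The byte positions and values follow the ELF identification array, which the system header elf.h exposes as EI_CLASS and EI_DATA; local constants are used here so the block does not depend on that header.

```c
#include <stddef.h>
#include <string.h>

/* Byte positions within the 16-byte identification array of the ELF header. */
enum {
    IDENT_CLASS = 4,   /* 1 = 32-bit, 2 = 64-bit */
    IDENT_DATA  = 5    /* 1 = little-endian, 2 = big-endian */
};

struct elf_ident {
    int word_bits;       /* 32 or 64, 0 if unknown */
    int little_endian;   /* 1, 0, or -1 if unknown */
};

/* Returns 0 on success, -1 if buf does not start with the ELF magic. */
int parse_elf_ident(const unsigned char *buf, size_t len,
                    struct elf_ident *out)
{
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};

    if (len < 16 || memcmp(buf, magic, sizeof magic) != 0)
        return -1;

    out->word_bits =
        buf[IDENT_CLASS] == 1 ? 32 : buf[IDENT_CLASS] == 2 ? 64 : 0;
    out->little_endian =
        buf[IDENT_DATA] == 1 ? 1 : buf[IDENT_DATA] == 2 ? 0 : -1;
    return 0;
}
```

The same check works on .o files, executables, and .so files, since all three share the one format.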
Linker Symbols
  • Global symbols
    • Symbols defined by module m that can be referenced by other modules
    • E.g., non-static C functions and non-static global variables
  • External symbols
    • Global symbols that are referenced by module m but defined by some other module
  • Local symbols
    • Symbols that are defined and referenced exclusively by module m
    • E.g., C functions and global variables defined with the static attribute
    • Local linker symbols are not local program variables
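The three kinds of symbols, seen from a single module, might look like the sketch below. All names are made up for illustration, and strlen stands in for a symbol defined by some other module (here, the C library).

```c
#include <string.h>   /* declares strlen, which is defined in libc */

/* Global symbol: defined here and visible to other modules. */
int greeting_count = 0;

/* Local symbol: static limits it to this module. */
static const char *pick_greeting(void)
{
    return "hello";
}

/* Global symbol (non-static function). */
size_t greeting_length(void)
{
    int scratch = 0;   /* local program variable: lives on the stack (or in
                          a register) and is not a linker symbol at all */
    (void)scratch;
    greeting_count++;
    /* strlen is an external symbol: referenced here, defined elsewhere.
       An optimizing compiler may fold this particular call away. */
    return strlen(pick_greeting());
}
```

Compiled without optimization, running nm on the resulting object file would typically list greeting_count and greeting_length with uppercase type letters (global), pick_greeting with a lowercase t (local), and strlen as U (undefined).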
Local Symbols
  • Local non-static C variables vs. local static C variables
    • local non-static C variables: stored on the stack
    • local static C variables: stored in either .bss or .data
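A short sketch of the difference, with hypothetical function names. The two static variables share the name x, so the compiler typically emits them as two local linker symbols with distinct, compiler-chosen names.

```c
/* Each function has its own x; neither static x lives on the stack. */
int next_id(void)
{
    static int x = 15;   /* nonzero initializer: stored in .data */
    return x++;
}

int next_ticket(void)
{
    static int x;        /* no initializer: stored in .bss, starts at 0 */
    return x++;
}

int stack_counter(void)
{
    int x = 0;           /* non-static: a fresh variable on every call */
    return x++;
}
```

Because the static variables persist between calls, next_id and next_ticket count upward, while stack_counter returns the same value every time.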
How linker resolves duplicate symbol definitions
  • Program symbols are either strong or weak
    • Strong: procedures and initialized globals
    • Weak: uninitialized globals
Linker’s Symbol Rules
  • Multiple strong symbols are not allowed
    • Each item can be defined only once
    • Otherwise: Linker error
  • Given a strong symbol and multiple weak symbols, choose the strong symbol
    • References to the weak symbol resolve to the strong symbol
  • If there are multiple weak symbols, pick an arbitrary one
    • Can override this with gcc -fno-common, which turns multiple weak definitions into a linker error (the default since GCC 10)
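The rules are easiest to see with two modules. In this sketch the file names a.c and b.c are hypothetical, and b.c is kept inside a comment so that the block compiles as a single file. The behavior described for b.c assumes a toolchain that treats uninitialized globals as weak (for gcc, building with -fcommon).

```c
/* ---- a.c ---- */
int x = 7;            /* strong symbol: initialized global */
int y;                /* uninitialized global: weak under the rules above */

int read_x(void)
{
    return x;
}

/* ---- b.c (shown as a comment only) ----
 *
 *   int x;                            weak: resolves to the strong x in a.c
 *   void clobber(void) { x = 42; }    so this would overwrite a.c's x
 *
 * Changing the first line of b.c to `int x = 1;` gives two strong
 * definitions of x, which the first rule turns into a linker error.
 */
```

The silent sharing in the second rule is why a stray uninitialized global with a common name can corrupt another module's data without any warning from the linker.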
Global Variables
  • Avoid if you can
  • Otherwise
    • Use static if you can
    • Initialize if you define a global variable
    • Use extern if you reference an external global variable
Packaging Commonly Used Functions
  • How to package functions commonly used by programmers
    • Math, I/O, memory management, string manipulation, etc.
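  • Awkward, given linker framework so far
    • Option 1: Put all functions into a single source file
      • Programs link one big object file: wasteful in space and time
    • Option 2: Put each function in a separate source file
      • Programs explicitly link the object files they need: more efficient, but a burden on the programmer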
Old-fashioned Solution: Static Libraries
  • Static libraries (.a archive files)
    • Concatenate related relocatable object files into a single file with an index (called an archive)
    • Enhance linker so that it tries to resolve unresolved external references by looking for the symbols in one or more archives
    • If an archive member file resolves reference, link it into the executable
  • Disadvantages
    • Duplication in the stored executables (nearly every program needs libc)
    • Duplication in the running executables
    • Minor bug fixes of system libraries require each application to explicitly relink
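A small sketch of how an archive might be built and used. The library name libvector.a, the source file names, and the functions are assumptions for illustration. The three modules are shown in one listing, and first_sum stands in for main so the block compiles on its own.

```c
/*
 * Assumed packaging steps:
 *   gcc -c addvec.c multvec.c
 *   ar rcs libvector.a addvec.o multvec.o   (build the archive and its index)
 *   gcc -c main.c
 *   gcc -o prog main.o ./libvector.a        (link against the archive)
 *
 * main.o references only addvec, so the linker copies addvec.o into prog
 * and leaves multvec.o out.
 */

/* ---- addvec.c ---- */
void addvec(const int *x, const int *y, int *z, int n)
{
    for (int i = 0; i < n; i++)
        z[i] = x[i] + y[i];
}

/* ---- multvec.c ---- */
void multvec(const int *x, const int *y, int *z, int n)
{
    for (int i = 0; i < n; i++)
        z[i] = x[i] * y[i];
}

/* ---- main.c: stand-in for main(), uses addvec only ---- */
int first_sum(void)
{
    int x[2] = {1, 2}, y[2] = {3, 4}, z[2];
    addvec(x, y, z, 2);
    return z[0];
}
```

Selection happens per archive member, not per function, which is why library authors tend to keep each member small.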
Modern Solution: Shared Libraries
  • Shared Libraries
    • Object files that contain code and data that are loaded and linked into an application dynamically, at either load-time or run-time
    • Also called: dynamic link libraries, DLLs, .so files
  • Advantages
    • Dynamic linking can occur when executable is first loaded and run (load-time linking)
    • Dynamic linking can also occur after program has begun (run-time linking)
    • Shared library routines can be shared by multiple processes
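Run-time linking can be sketched with the POSIX dlopen interface. The function name below is hypothetical. To avoid assuming any particular library file name, the sketch passes NULL to dlopen, which yields a handle covering the running program and the shared libraries already loaded with it; a more typical use would pass the path of a .so file. It assumes a dynamically linked program, and on older glibc versions the build needs -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Looks up the C library's atoi at run time and calls it.
 * Returns the parsed value, or -1 if the lookup fails.
 */
int parse_with_runtime_lookup(const char *text)
{
    /* NULL: handle for the program itself plus its loaded libraries. */
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL)
        return -1;

    int (*parse)(const char *) = NULL;
    /* POSIX idiom for storing dlsym's void * result in a function pointer. */
    *(void **)(&parse) = dlsym(handle, "atoi");

    int value = (parse != NULL) ? parse(text) : -1;
    dlclose(handle);
    return value;
}
```

The symbol name is an ordinary string, so a program can decide which routine to call only after it has started running.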
Library Interpositioning
  • Library interpositioning: powerful linking technique that allows programmers to intercept calls to arbitrary functions
  • Interpositioning can occur at
    • Compile time: when the source is compiled
    • Link time: when the relocatable object files are statically linked to form an executable object file
    • Load/run time: when an executable object file is loaded into memory, dynamically linked, and then executed
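The compile-time variant is the simplest to show in one listing: a preprocessor macro redirects calls to a wrapper. This is a minimal sketch, and mymalloc, traced_calls, and run_traced are hypothetical names. In practice the macro usually sits in a header that is placed ahead of the system one on the include path, rather than in the same file as the wrapper.

```c
#include <stdio.h>
#include <stdlib.h>

static size_t traced_calls = 0;   /* hypothetical counter for this sketch */

/* Wrapper. It is defined before the macro below, so the malloc it calls
   is still the real library function. */
void *mymalloc(size_t size)
{
    void *ptr = malloc(size);
    traced_calls++;
    fprintf(stderr, "malloc(%zu) = %p\n", size, ptr);
    return ptr;
}

/* From here on, the preprocessor rewrites every malloc(...) call. */
#define malloc(size) mymalloc(size)

/* Application code: written against malloc, but reaches mymalloc. */
size_t run_traced(void)
{
    char *buf = malloc(32);
    free(buf);
    return traced_calls;
}
```

The link-time and load/run-time variants reach the same effect without touching the source, by redirecting the symbol in the linker or the dynamic loader instead.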
Some Interpositioning Applications
  • Security
    • Confinement (sandboxing)
    • Behind the scenes encryption
  • Debugging
    • Code in the SPDY networking stack was writing to the wrong location
    • Solved by intercepting calls to POSIX write functions (write, writev, pwrite)
  • Monitoring and Profiling
    • Count number of calls to functions
    • Characterize call sites and arguments to functions
    • Malloc tracing
      • Detecting memory leaks
      • Generating address traces
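As one possible shape for malloc tracing, the sketch below counts blocks that have been allocated but not yet freed and remembers the call site of the latest allocation. Every name in it is hypothetical, and it uses preprocessor macros to do the interception, which is only one of the options listed earlier.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for this sketch. */
static long live_blocks = 0;          /* allocations not yet freed */
static const char *last_file = NULL;  /* call site of the latest allocation */
static int last_line = 0;

void *traced_malloc(size_t size, const char *file, int line)
{
    void *ptr = malloc(size);
    if (ptr != NULL) {
        live_blocks++;
        last_file = file;
        last_line = line;
    }
    return ptr;
}

void traced_free(void *ptr)
{
    if (ptr != NULL)
        live_blocks--;
    free(ptr);
}

/* Prints a warning if any traced block is still allocated. */
void report_leaks(void)
{
    if (live_blocks > 0)
        fprintf(stderr, "%ld block(s) still allocated; last allocation at %s:%d\n",
                live_blocks, last_file, last_line);
}

/* Route later calls through the wrappers, recording the call site. */
#define malloc(size) traced_malloc((size), __FILE__, __LINE__)
#define free(ptr)    traced_free(ptr)

/* Returns the number of blocks outstanding midway through the work. */
long leak_check_demo(void)
{
    char *a = malloc(16);
    char *b = malloc(16);
    free(a);
    long outstanding = live_blocks;   /* b has not been freed yet */
    free(b);
    return outstanding;
}
```

Calling report_leaks at program exit would flag any block whose free never ran.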