Operating System Development Series

happylzs2008

Article link: https://blog.csdn.net/happylzs2008/article/details/113406434

http://www.brokenthorn.com/Resources/OSDevIndex.html

How statically linked programs run on Linux

https://blog.csdn.net/astrotycoon/article/details/78621164

https://blog.csdn.net/astrotycoon/article/details/78673570

How statically linked programs run on Linux

In this article I want to explore what happens when a statically linked program gets executed on Linux. By statically linked I mean a program that does not require any shared objects to run, not even the ubiquitous libc. In reality, most programs one encounters on Linux aren't statically linked, and do require one or more shared objects to run. However, the running sequence of such programs is more involved, which is why I want to present statically linked programs first. It will serve as a good basis for understanding, allowing me to explore most of the mechanisms involved with fewer details getting in the way. In a future article I will cover the dynamic linking process in detail.

The Linux kernel

Program execution begins in the Linux kernel. To run a program, a process calls a function from the exec family. The functions in this family are all very similar, differing only in small details regarding the manner of passing arguments and environment variables to the invoked program. What they all end up doing is issuing the sys_execve system call to the Linux kernel.

sys_execve does a lot of work to prepare the new program for execution. Explaining it all is far beyond the scope of this article - a good book on kernel internals can be helpful to understand the details [1]. I'll just focus on the parts that matter for the current discussion.

As part of its job, the kernel must read the program's executable file from disk into memory and prepare it for execution. The kernel knows how to handle a lot of binary file formats, and tries to open the file with different handlers until one succeeds (this happens in the function search_binary_handler in fs/exec.c). We're only interested in ELF here, however; for this format the action happens in the function load_elf_binary (in fs/binfmt_elf.c).

The kernel reads the ELF header and the program headers of the program, and looks for a PT_INTERP segment to see if an interpreter was specified. Here the statically linked vs. dynamically linked distinction kicks in. For a dynamically linked program, this segment holds the absolute path of the dynamic linker. A statically linked program contains all of its code and needs no dynamic linker, so it has no PT_INTERP segment. This is the scenario this article covers.

The kernel then goes on to map the program's segments into memory, according to the information contained in the ELF program headers. Finally, it passes execution, by directly modifying the IP register, to the entry address read from the ELF header of the program (e_entry). Arguments are passed to the program on the stack (the code responsible for this is in create_elf_tables). The stack layout when the program is called, for x64, is as follows.

At the top of the stack is argc, the number of command-line arguments. It is followed by all the arguments themselves (each a char*), terminated by a zero pointer. Then, the environment variables are listed (also a char* each), terminated by a zero pointer. The observant reader will notice that this argument layout is not what one usually expects in main. This is because main is not really the entry point of the program, as the rest of the article shows.

Program entry point

So, the Linux kernel reads the program's entry address from the ELF header. Let's now explore how this address gets there, and what code lives at it.

Unless you're doing something very funky, the final program binary image is probably being created by the system linker - ld. By default, ld looks for a special symbol called _start in one of the object files linked into the program, and sets the entry point (e_entry) to the address of that symbol. This is simplest to demonstrate with an example written in assembly (the following is NASM syntax):

section    .text
    ; The _start symbol must be declared for the linker (ld)
    global _start

_start:
    ; Execute sys_exit call. Argument: status -> ebx
    mov     eax, 1
    mov     ebx, 42
    int     0x80

This is a very basic program that simply returns 42. Note that it has the _start symbol defined. Let's build it, then examine the ELF header and the disassembly:

$ nasm -f elf64 nasm_rc.asm -o nasm_rc.o
$ ld -o nasm_rc64 nasm_rc.o
$ readelf -h nasm_rc64
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  ...
  Entry point address:               0x400080
  ...
$ objdump -d nasm_rc64

nasm_rc64:     file format elf64-x86-64


Disassembly of section .text:

0000000000400080 <_start>:
  400080:     b8 01 00 00 00          mov    $0x1,%eax
  400085:     bb 2a 00 00 00          mov    $0x2a,%ebx
  40008a:     cd 80                   int    $0x80

As you can see, the entry point address in the ELF header was set to 0x400080, which is exactly the address of _start.

ld looks for _start by default, but this behavior can be modified either with the --entry command-line flag, or by providing an ENTRY command in a custom linker script.

We're usually not writing our code in assembly, however. For C/C++ the situation is different, because the entry point familiar to users is the main function and not the _start symbol. Now it's time to explain how these two are related.

Let's start with this simple C program, which is functionally equivalent to the assembly shown above:

int main() {
    return 42;
}

I will compile this code into an object file and then attempt to link it with ld, like I did with the assembly:

$ gcc -c c_rc.c
$ ld -o c_rc c_rc.o
ld: warning: cannot find entry symbol _start; defaulting to 00000000004000b0

Whoops, ld can't find the entry point. It tries to guess using a default, but it won't work - the program will segfault when run. ld obviously needs some additional object files where it will find the entry point. But which object files are these? Luckily, we can use gcc to find out. gcc can act as a full compilation driver, invoking the compiler and ld as needed. Let's now use gcc to link our object file into a program. Note that the -static flag is passed to force static linking of the C library and the gcc runtime library:

$ gcc -o c_rc -static c_rc.o
$ ./c_rc; echo $?
42

It works. So how does gcc manage to do the linking correctly? We can pass the -Wl,-verbose flag to gcc, which makes the linker print the list of objects and libraries it was given. Doing this, we'll see additional object files like crt1.o and the whole libc.a static library (which has objects with telling names like libc-start.o). C code does not live in a vacuum. To run, it requires some support libraries such as the gcc runtime and libc.

Since it linked and ran correctly, the program we built with gcc should have a _start symbol at the right place. Let's check [2]:

$ readelf -h c_rc
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00
  Class:                             ELF64
  ...
  Entry point address:               0x4003c0
  ...

$ objdump -d c_rc | grep -A15 "<_start"
00000000004003c0 <_start>:
  4003c0:     31 ed                   xor    %ebp,%ebp
  4003c2:     49 89 d1                mov    %rdx,%r9
  4003c5:     5e                      pop    %rsi
  4003c6:     48 89 e2                mov    %rsp,%rdx
  4003c9:     48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
  4003cd:     50                      push   %rax
  4003ce:     54                      push   %rsp
  4003cf:     49 c7 c0 20 0f 40 00    mov    $0x400f20,%r8
  4003d6:     48 c7 c1 90 0e 40 00    mov    $0x400e90,%rcx
  4003dd:     48 c7 c7 d4 04 40 00    mov    $0x4004d4,%rdi
  4003e4:     e8 f7 00 00 00          callq  4004e0 <__libc_start_main>
  4003e9:     f4                      hlt
  4003ea:     90                      nop
  4003eb:     90                      nop

Indeed, 0x4003c0 is the address of _start and it's the program entry point. However, what is all that code at _start? We certainly didn't write it ourselves, so where does it come from, and what does it mean?

Decoding the start sequence of C code
The startup code shown above comes from glibc - the GNU C library, where for x64 ELF it lives in the file sysdeps/x86_64/start.S [3]. Its goal is to prepare the arguments for a function named __libc_start_main and call it. This function is also part of glibc and lives in csu/libc-start.c. Here is its signature, formatted for clarity, with added comments to explain what each argument means:

int __libc_start_main(
    /* Pointer to the program's main function */
    int (*main) (int, char**, char**),
    /* argc and argv */
    int argc, char **argv,
    /* Pointers to initialization and finalization functions */
    __typeof (main) init, void (*fini) (void),
    /* Finalization function for the dynamic linker */
    void (*rtld_fini) (void),
    /* End of stack */
    void* stack_end);

With this signature and the AMD64 ABI in hand, we can map the arguments passed to __libc_start_main from _start (RSP below refers to the stack pointer on entry to _start):

main: rdi <-- $0x4004d4
argc: rsi <-- [RSP]
argv: rdx <-- RSP + 0x8
init: rcx <-- $0x400e90
fini: r8 <-- $0x400f20
rtld_fini: r9 <-- rdx on entry
stack_end: on stack <-- RSP

You'll also notice that the stack is aligned to 16 bytes and some garbage is pushed on top of it (rax) before pushing rsp itself. The extra push keeps the stack 16-byte aligned at the call, as the AMD64 ABI requires. Also note the hlt instruction at address 0x4003e9. It's a safeguard in case __libc_start_main returns instead of exiting (as we'll see, it should exit). hlt can't be executed in user mode, so this will raise an exception and crash the process.

Examining the disassembly, it's easy to verify that 0x4004d4 is indeed main, 0x400e90 is __libc_csu_init and 0x400f20 is __libc_csu_fini. There's another argument the kernel passes to _start - a finish function for shared libraries to use (in rdx). We'll ignore it in this article.

The C library start function
Now that we understand how it's being called, what does __libc_start_main actually do? Ignoring some details that are probably too specialized to be interesting in the scope of this article, here's a list of things that it does for a statically linked program:

Figure out where the environment variables are on the stack.
Prepare the auxiliary vector, if required.
Initialize thread-specific functionality (pthreads, TLS, etc.)
Perform some security-related bookkeeping (this is not really a separate step, but is trickled all through the function).
Initialize libc itself.
Call the program initialization function through the passed pointer (init).
Register the program finalization function (fini) for execution on exit.
Call main(argc, argv, envp)
Call exit with the result of main as the exit code.

Digression: init and fini
Some programming environments (most notably C++, to construct and destruct static and global objects) require running custom code before and after main. This is implemented by means of cooperation between the compiler/linker and the C library. For example, __libc_csu_init (which, as you can see above, is called before the user's main) calls into special code that's inserted by the linker. The same goes for __libc_csu_fini and finalization.

You can also ask the compiler explicitly to register your function to be executed as one of the constructors or destructors. For example [4]:

#include <stdio.h>

int main() {
    return 43;
}

__attribute__((constructor))
void myconstructor() {
    printf("myconstructor\n");
}

myconstructor will run before main. The linker places its address in a special array of constructors located in the .ctors section (newer toolchains use the .init_array section for the same purpose). __libc_csu_init goes over this array and calls all functions listed in it.

Conclusion

This article demonstrates how a statically linked program is set up to actually run on Linux. In my opinion, this is a very interesting topic to study because it demonstrates how several large components of the Linux ecosystem cooperate to enable the program execution process. In this case, the Linux kernel, the compiler and linker, and the C library are involved. In a future article I will present the more complex case of a dynamically linked program, where another agent joins the game - the dynamic linker. Stay tuned.

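Further reading:

"Linux x86 ELF program startup process"

"Program entry functions, glibc, and C++ global construction and destruction"

"How main() is executed on Linux"

"glibc source analysis: process startup (start.S)"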