
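Source: http://duanple.blog.163.com/blog/static/7097176720111141085197/
1. Preface

I have been wrestling with various .so files lately and ran into a number of problems. At first many of the errors made no sense to me and I had no idea where to start. So I spent a little over a day skimming parts of <<程序员的自我修养—链接、装载与库>> (roughly, "A Programmer's Self-Cultivation: Linking, Loading, and Libraries"), mainly the material on compilation, linking, and loading, and took these notes for later review. With this background, most problems that arise during compilation, linking, and loading can be understood at the root and then fixed. I had read a classic text on the subject back in school and wrote an article called <<链接器和加载器原理>> ("Principles of Linkers and Loaders") at the time, but this time I go through the whole compile, link, and load process in more detail and propose solutions to the problems that come up most often.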

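2. Compilation and Linking
2.1. The Compilation Process

What is loosely called "compiling" actually breaks down into four stages: preprocessing, compilation, assembly, and linking.

Preprocessing expands header files, substitutes macros, selects conditional-compilation branches, strips comments, and so on. gcc -E stops after preprocessing.

Compilation takes the preprocessed file through lexical analysis, syntax analysis, semantic analysis, and optimization, and emits an assembly file. gcc -S stops after compilation.

Assembly translates the assembly code into machine instructions and produces an object file. It is done with gcc -c or with the as command.

Linking produces the final executable from the object files and the libraries they need. Its main job is to resolve references between modules, and it consists of address and space allocation, symbol resolution, and relocation. When the compiler generates an object file it sets external references aside for the time being; those references are settled at link time. The linker looks up each symbol by name in the relevant modules. Once a symbol is resolved, the linker rewrites the previously undetermined addresses that refer to it. This rewriting is relocation.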

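2.1.1. Related Options

-Wl: passes the given arguments through to the linker.

For example, with "-Wl,-soname,my-soname", GCC passes -soname my-soname to the linker, which sets the SO-NAME of the output shared library.

-shared: produces a shared object whose code is relocated at load time. On its own this does not let several processes share one copy of the instructions, because plain load-time relocation patches every absolute address in the program's instructions and data. To share the code among processes, -fPIC is needed as well.

-fPIC: position-independent code, whose purpose is to let multiple processes share one copy of the instructions. The basic idea is to separate out the parts of the instructions that would need patching and keep them with the data. The instruction part then stays unchanged, while the part that varies lives with the data, and each process has its own copy of it.

-export-dynamic: by default, when producing an executable the linker keeps the dynamic symbol table small by including only the symbols that other modules reference. In other words, when shared modules refer back to the main module, only the symbols that some shared module referenced at link time are exported. When a program loads a shared module with dlopen() and that module refers back to a symbol in the main module, the symbol may not have been exported to the dynamic symbol table because nothing referenced it at link time, and the reverse reference fails. This option addresses that problem: it tells the linker to export all global symbols to the dynamic symbol table when producing the executable. When linking through gcc it is usually passed as -rdynamic or -Wl,--export-dynamic.

-soname: sets the SO-NAME of the output shared library.

-I: specifies a header file search path.

-l: links against a library. To link against a library such as libxxx.so.x.y.z, writing -lxxx is enough; the toolchain searches the relevant directories for a library named xxx. xxx is called the library's link name. Different libraries can share a link name, for instance the static and shared versions libxxx.a and libxxx.so. With -lxxx, the linker picks the version that fits the kind of output (dynamic or static): if ld was given -static it uses the static version, and if -Bdynamic is in effect (the default) it uses the shared version.

-L: adds a directory to the library search path used at link time. To add several directories, repeat the option once per directory.

-rpath: specifies the shared-library search path recorded in the output program. A similar option, -rpath-link, differs in that the directories given to -rpath are hard-coded into the executable, while those given to -rpath-link are used only during linking. Both are options of the linker ld. See man ld for more linker options.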

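2.1.2. Compiling and Linking

This section covers how .so files, .a files, and executables are produced. Building a .a needs only the compilation stage, whereas an executable also has to be linked. Producing a static library is simple and takes two steps: first compile the source files into object files with gcc -c, then pack the object files into an archive with ar. So the only requirement is that each source file gets through gcc -c.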

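Producing a shared library is more involved. There are three ways to do it:
1) ld -G
2) gcc -shared
3) libtool
Using ld directly is the most complex; gcc -shared is much simpler, but -shared is not available on every platform. -shared tells the toolchain to produce a dynamic library (it makes the linker emit exported symbols of type T, and sometimes weak symbols of type W); without it, external programs cannot link against the result. GNU provides a better tool, libtool, designed specifically for building all kinds of libraries across platforms. When building a .so such as liba.so, the library may use things from libb.so, yet -lb can still be omitted when producing liba.so, because by default the linker accepts undefined symbols in a shared object and leaves them to be resolved at load time.

Taking GCC as an example, which commands does it actually run when building a static or shared library? Adding -v shows them, for example: gcc -v -shared hello.c -o libhello.so. ld -G produces a .so file, and ld is the command gcc actually invokes at link time. (-G is the spelling used by the Solaris linker; with GNU ld the corresponding option is -shared, which is what the collect2 command line in section 5.3.3 passes.)

When producing an executable, if the library being linked is static, the linker follows the static linking rules and relocates the corresponding symbol references. If it is a shared library, the linker marks the symbol as dynamically linked and does not relocate it; that happens at load time instead. So even with dynamic linking, once the build reaches the link stage the linker must still be able to find a definition of each symbol in the relevant .so, otherwise it reports an "undefined reference to" error. The linker can only tell that a symbol is dynamically linked by reading the .so files, so it needs to read them and find the definition there.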

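2.1.3. Header File Lookup

#include comes in two forms:

#include <> : look for the header directly in the system-specified directories.

#include "" : look first in the directory containing the source file, then in the system-specified directories.

gcc searches for headers in the following order (1 -> 2 -> 3):

1. The paths given with -I when compiling the source file. If several paths are given, they are searched in the order specified. The command takes the form "gcc -I /path/where/theheadfile/in sourcefile.c"; the path can be absolute or relative. For example, with /root/test as the current directory, if include_test.c needs to include the header "include/include_test.h", there are two ways:

1) In include_test.c, write #include "include/include_test.h" or #include "/root/test/include/include_test.h", then simply run gcc include_test.c.

2) In include_test.c, write #include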

// test.cpp
#include <iostream>

using namespace std;

void test()
{
    cout << "test" << endl;
}

// main.cpp
#include "test.h"

int main()
{
    test();
    return 0;
}

5.3.2. Creating a Static Library

gcc -v -c test.cpp; ar r libtest.a test.o

[admin@clu01-gala16.dev.sd.aliyun.com]$gcc -v -c test.cpp

Using built-in specs.

Target: x86_64-redhat-linux

Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux

Thread model: posix

gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)

/usr/libexec/gcc/x86_64-redhat-linux/4.1.2/cc1plus -quiet -v -D_GNU_SOURCE test.cpp -quiet -dumpbase test.cpp -mtune=generic -auxbase test -version -o /tmp/ccdMLquk.s

ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../x86_64-redhat-linux/include"

#include "..." search starts here:

#include <...> search starts here:

/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2

/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/x86_64-redhat-linux

/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/backward

/usr/local/include

/usr/lib/gcc/x86_64-redhat-linux/4.1.2/include

/usr/include

End of search list.

GNU C++ version 4.1.2 20080704 (Red Hat 4.1.2-46) (x86_64-redhat-linux)

    compiled by GNU C version 4.1.2 20080704 (Red Hat 4.1.2-46).

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072

Compiler executable checksum: 927721cb17bef594f560fa66ec50ff62

as -V -Qy -o test.o /tmp/ccdMLquk.s

GNU assembler version 2.17.50.0.6-12.el5 (x86_64-redhat-linux) using BFD version 2.17.50.0.6-12.el5 20061020

As shown, adding -v reveals the exact command run at each step, as well as the header search paths.

5.3.3. Creating a Shared Library

gcc -shared test.cpp -o libtest.so -fPIC -v

[admin@clu01-gala16.dev.sd.aliyun.com]$gcc -shared test.cpp -o libtest.so -fPIC -v

Using built-in specs.

Target: x86_64-redhat-linux

Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux

Thread model: posix

gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)

/usr/libexec/gcc/x86_64-redhat-linux/4.1.2/cc1plus -quiet -v -D_GNU_SOURCE test.cpp -quiet -dumpbase test.cpp -mtune=generic -auxbase test -version -fPIC -o /tmp/ccv7EbFP.s

ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../x86_64-redhat-linux/include"

#include "..." search starts here:

#include <...> search starts here:

/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2

/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/x86_64-redhat-linux

/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/backward

/usr/local/include

/usr/lib/gcc/x86_64-redhat-linux/4.1.2/include

/usr/include

End of search list.

GNU C++ version 4.1.2 20080704 (Red Hat 4.1.2-46) (x86_64-redhat-linux)

    compiled by GNU C version 4.1.2 20080704 (Red Hat 4.1.2-46).

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072

Compiler executable checksum: 927721cb17bef594f560fa66ec50ff62

as -V -Qy -o /tmp/ccmzTeZF.o /tmp/ccv7EbFP.s

GNU assembler version 2.17.50.0.6-12.el5 (x86_64-redhat-linux) using BFD version 2.17.50.0.6-12.el5 20061020

/usr/libexec/gcc/x86_64-redhat-linux/4.1.2/collect2 --eh-frame-hdr -m elf_x86_64 --hash-style=gnu -shared -o libtest.so /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtbeginS.o -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 /tmp/ccmzTeZF.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtendS.o /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/crtn.o

5.3.4. Creating an Executable

LIBRARY_PATH=.;export LIBRARY_PATH

g++ main.cpp -ltest

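gcc header search paths

-iquote: searched only for headers included in the #include "file" form

-I

CPATH: like -I but with lower priority than -I; it applies to preprocessing in any language (e.g. C, C++)

-isystem

C_INCLUDE_PATH: for C; like -isystem but with lower priority than -isystem

CPLUS_INCLUDE_PATH: for C++

OBJC_INCLUDE_PATH: for Objective-C

gcc library search paths

-L

LIBRARY_PATH: specifies library search paths, with lower priority than -L

LD_LIBRARY_PATH: has no effect on gcc at build time; it only matters at load time

Test -static: does it work when there is a .so but no .a? No, a .a library is required.

Test -Bdynamic: does it work when there is a .a but no .so? Yes.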

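5.3.5. Load Tests

Test -rpath: g++ main.cpp -ltest -Wl,-rpath=. Here -rpath affects only loading, not linking; it lets the absolute path of the shared libraries needed at run time be written into the executable.

Test LD_PRELOAD: takes effect at load time; the loader loads the named library whether or not the executable was linked against it.

Test LD_LIBRARY_PATH: takes effect at load time.

Test ldconfig: ldconfig can be applied directly to the current directory, which also adds it to the loader's search.

Test /etc/ld.so.conf: merely adding the path to this file has no effect; ldconfig must be run afterwards.

Test /etc/ld.so.cache: at load time, libraries are looked up directly in this cache.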

6. References

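程序员的自我修养—链接、装载与库 (A Programmer's Self-Cultivation: Linking, Loading, and Libraries)

GCC编译的背后 (Behind GCC Compilation: preprocessing and compilation; assembly and linking)

An Introduction to GCC 学习笔记 (study notes on An Introduction to GCC)

LINUX下如何用GCC编译动态库 (How to Build Shared Libraries with GCC on Linux)

gcc生成静态库和动态库 (Generating Static and Shared Libraries with gcc)

GCC编译优化指南 (GCC Compilation Optimization Guide)

深入理解软件包的配置、编译与安装 (Understanding Package Configuration, Compilation, and Installation in Depth)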

7. Appendix

http://www.gentoo.org/proj/en/base/amd64/howtos/fpic.xml

  1. The Problem

Sometimes it occurs that gcc bails out with an error message like the following:

Code Listing 1.1: A typical gcc error message

.libs/assert.o: relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
.libs/assert.o: could not read symbols: Bad value

There are several different types of causes for such an error. This HOWTO will explain all of them and show how to fix them.

  1. What is PIC?

PIC is an abbreviation for Position-Independent Code. The following is an excerpt of the Wikipedia article about position-independent code:

“In computing, position-independent code (PIC) or position-independent executable (PIE) is object code that can execute at different locations in memory. PIC is commonly used for shared libraries, so that the same library code can be mapped to a location in each application (using the virtual memory system) where it won’t overlap the application or other shared libraries. PIC was also used on older computer systems lacking an MMU, so that the operating system could keep applications away from each other.
Position-independent code can be copied to any memory location without modification and executed, unlike relocatable code, which requires special processing by a link editor or program loader to make it suitable for execution at a given location. Code must generally be written or compiled in a special fashion in order to be position independent. Instructions that refer to specific memory addresses, such as absolute branches, must be replaced with equivalent program counter relative instructions. The extra indirection may cause PIC code to be less efficient, although modern processors are designed to ameliorate this.”
—Wikipedia Encyclopaedia

On certain architectures (AMD64 amongst them), shared libraries must be “PIC-enabled”.

  1. What are “relocations”?

Again, from Wikipedia:

“In computer science, relocation refers to the process of replacing symbolic references or names of libraries with actual usable addresses in memory before running a program. It is typically done by the linker during compilation, although it can be done at run-time by a loader. Compilers or assemblers typically generate the executable with zero as the lower-most, starting address. Before the execution of object code, these addresses should be adjusted so that they denote the correct runtime addresses.”
—Wikipedia Encyclopaedia

With these terms defined, we can finally have a look at the different scenarios where breakage occurs:

  1. Case 1: Broken compiler

At least GCC 3.4 is known to have a broken implementation of the -fvisibility-inlines-hidden flag. The use of this flag is therefore highly discouraged, reported bugs are usually marked as RESOLVED INVALID. See bug 108872 for an example of a typical error message caused by this flag.

  1. Case 2: Broken `-fPIC’ support checks in configure

Many configure tools check whether the compiler supports the -fPIC flag or not. They do so by compiling a minimalistic program with the -fPIC flag and checking stderr. If the compiler prints any warnings, it is assumed that the -fPIC flag is not supported by the compiler and is therefore abandoned. Unfortunately, if the user specifies a non-existing flag (i.e. C++-only flags in CFLAGS or flags introduced by newer versions of GCC but unknown to older ones), GCC prints a warning too, resulting in borkage.

To prevent this kind of breakage, the AMD64 profiles use a bashrc that filters out invalid flags in C[XX]FLAGS.

See bug 122208 for an example.

  1. Case 3: Lack of `-fPIC’ flag in the software to be built

This is the most common case. It is a real bug in the build system and should be fixed in the ebuild, preferably with a patch that is sent upstream. Assuming the error message looks like this:

Code Listing 1.1: A sample error message

.libs/assert.o: relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
.libs/assert.o: could not read symbols: Bad value

This means that the file assert.o was not compiled with the -fPIC flag, which it should. When you fix this kind of error, make sure only objects that are used in shared libraries are compiled with -fPIC.

In this case, globally adding -fPIC to C[XX]FLAGS resolves the issue, although this practice is discouraged because the executables end up being PIC-enabled, too.

Note: Adding the -fPIC flag to the linking command or LDFLAGS won’t help.

  1. Case 4: Linking dynamically against static archives

Sometimes a package tries to build shared libraries using statically built archives which are not PIC-enabled. There are two main reasons why this happens:

Often it is the result of mixing USE=static and USE=-static. If a library package can be built statically by setting USE=static, it usually doesn’t create a .so file but only a .a archive. However, when GCC is given the -l flag to link to said (dynamic or static) library, it falls back to the static archive when it can’t find a shared lib. In this case, the preferred solution is to build the static library using the -fPIC flag too.

Warning: Only build the static archive with -fPIC on AMD64. On other architectures this is unneeded and will have a performance impact at execution time.

See bug 88360 and mysql bug 8796 for an example.

Sometimes it is also the case that a library isn’t intended to be a shared library at all, e.g. because it makes heavy usage of global variables. In this case the solution is to turn the to-be-built shared library into a static one.

See bug 131460 for an example.

Code Listing 1.1: A sample error message

gcc -fPIC -DSHARED_OBJECT -c lex.yy.c
gcc -shared -o html2txt.so lex.yy.o -lfl
/usr/lib/gcc/x86_64-pc-linux-gnu/4.1.1/../../../../x86_64-pc-linux-gnu/bin/ld: /usr/lib/gcc/x86_64-pc-linux-gnu/4.1.1/../../../../lib64/libfl.a(libyywrap.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-pc-linux-gnu/4.1.1/../../../../lib64/libfl.a: could not read symbols: Bad value
