Enhancing Applications by Directing Linker Symbol Processing

By Greg Nakhimovsky, August 2002

Modern linkers, such as the linker in Sun's Solaris Operating Environment (OE), have an interesting option allowing you to change their default symbol processing. The Solaris platform achieves this with linker mapfiles. This article describes the benefits of using this option for building end-user applications, explains how the linker mapfiles work, and shows how easy it is to use this feature.

Comparable features are also available under operating environments other than Solaris. For example, the GNU linker used with Linux and FreeBSD has a "version script" option allowing you to achieve similar goals.
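As a rough sketch of the GNU counterpart, the following shell fragment writes a version script that keeps one symbol global and makes everything else local. The file name app.map and the symbol my_api_entry are hypothetical, and the link command in the comment assumes a GCC driver that passes --version-script through to GNU ld.

```sh
# Write a GNU ld version script (hypothetical file name: app.map).
# my_api_entry is a placeholder for a symbol you want to keep exported.
cat > app.map <<'EOF'
{
    global:
        my_api_entry;
    local:
        *;
};
EOF

# Assumed usage with a GCC driver and GNU ld (not run here):
#   gcc -shared -fPIC -o libapp.so app.c -Wl,--version-script=app.map
```

The layout mirrors the Solaris mapfiles shown later in this article: a "global" list of symbols to export, followed by a "local" wildcard for everything else.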

Background Information: How Solaris Linkers Handle Symbols

The ability to control symbol processing with a linker mapfile while linking an executable or a shared library has been available since Solaris 2.5, released in 1995. Starting with Solaris 2.6, Sun engineers have been using it to limit and control the exported symbols in the Solaris system libraries. However, as far as I know, it has not been used much for end-user applications, simply because most application developers are not yet familiar with this feature and its benefits.

To understand how linker mapfiles can improve your applications, let's consider how the linking process works.

The Solaris OE has two linkers:

A static linker ld(1) (also called link-editor) that you use to link your application binaries after compiling your source code into object files
A dynamic linker ld.so.1(1) performing the runtime linking of dynamic executables and shared libraries

The dynamic linker brings shared libraries into an executing application and handles the symbols in those libraries as well as in the dynamic executable images.

You can link executables with libraries statically or dynamically; see the "-B static" and "-B dynamic" options described in the ld(1) man page. Dynamic linking (which is the default) has been the recommended way for quite a few years now. The system libraries are now available only in the dynamic mode. Also, dynamic linking with system libraries is the only way to make your application upward compatible at the binary level across different Solaris versions. Therefore, in the rest of this article I assume that you link all application binaries with system libraries in the dynamic mode.

By default, the static linker makes all of an application's symbols global in scope (such symbols are also called "exported symbols"). This means it puts the symbols into the dynamic symbol table of the resulting binary so that other binary modules can access them.

At program startup, the dynamic linker runs before the application's executable code and loads all shared libraries that you specified at link time. These are preloaded shared objects that may include both system libraries and your own shared libraries. You can use the ldd(1) utility to find out which shared libraries the dynamic linker will need to load for a particular dynamically linked binary.
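One practical use of ldd(1) is to spot libraries that the dynamic linker would fail to locate. The sketch below assumes that unresolved entries contain the text "not found", which is how ldd commonly reports them. The function name missing_libs and the binary name ./myapp are illustrative.

```sh
# Print the names of shared libraries that ldd reports as not found.
# Reads ldd output on standard input so it can be used in a pipeline.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Assumed usage (./myapp is a placeholder for your binary):
#   ldd ./myapp | missing_libs
```

An empty result suggests that every library named at link time can be located with the current search path settings.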

The dynamic linker maintains a linked list of the so-called link maps in the memory of the executing process, one for each dynamically linked object. The symbol search mechanism, required to bind the objects of an application, traverses this list.

Procedure Linkage Table (PLT) entries are structures the dynamic linker uses to handle calls across shared object boundaries. For example, the call to printf() (which is normally in libc.so) goes through this table.

The dynamic relocations that the dynamic linker performs are only necessary for the global (also known as external or exported) symbols. The static linker resolves references to local symbols (for example, names of static functions) statically when it links the binary.

By default, the dynamic linker also performs the so-called "lazy binding" of the symbols in the shared libraries: It does not search for those symbols and does not bind to them until the application actually uses them. This improves application performance because in most cases the application does not need all available symbols during one run.

When there is any possibility that a required symbol may be in an object different from the currently executing binary, the dynamic linker will have to traverse the list of link maps to determine the necessary binding. This procedure incurs certain performance costs that can be quite significant, especially when you run the executables or shared libraries off the network.

You can override this default "lazy binding" behavior by setting environment variable LD_BIND_NOW to any non-null value (like 1) before running your program. The performance price of this is that the initialization of each run takes longer. This setting shifts the binding performance costs to the application startup time. It is also likely to increase these costs because the dynamic linker has to bind to all symbols regardless of whether the application uses them or not.
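A small wrapper makes it easy to compare the two binding modes without changing your login environment. This is only a sketch: the function name run_bind_now and the program ./myapp are hypothetical.

```sh
# Run a command with immediate binding requested for that command only.
# env sets LD_BIND_NOW in the child's environment; the calling shell's
# own environment is left unchanged.
run_bind_now() {
    env LD_BIND_NOW=1 "$@"
}

# Assumed usage (./myapp is a placeholder); compare startup times:
#   time ./myapp
#   time run_bind_now ./myapp
```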

Also note that "lazy binding" can only work with those shared libraries that you compile with the -KPIC or -Kpic option.

For a number of reasons, Sun recommends that you compile the application source code you intend for shared libraries with -KPIC or -Kpic. These compiler options are described in the C/C++ compiler man pages, but here is the summary.

PIC stands for Position-Independent Code. The -Kpic compiler option (also known as -xcode=pic13) corresponds to the "small model," which is the fastest, but it can only handle up to 2048 global symbols (the limit is 1024 for the 64-bit mode). -KPIC (also known as -xcode=pic32) corresponds to the "large model," which is somewhat slower than -Kpic, but allows up to 1,073,741,824 global symbols. In other words, the "large model" has practically no limit for the number of global symbols. (These PIC/pic specifics correspond to the Solaris Operating Environment SPARC Platform Edition; the specifics are slightly different for the x86 Platform Edition.)

PIC-compiled code allows the linker to keep a read-only version of the text (code) segment for a given shared library. The system can then share this text segment among all running processes that reference the library at a given time.

The Solaris linker can use the non-PIC-compiled code in shared libraries. However, compiling your code this way will prevent "lazy binding," may decrease performance in other ways, and will make it impossible to actually share such libraries among applications running on the system at the same time. It may also reduce the robustness of your application. This is why it's important to compile your shared library code in the position-independent mode.

How Using Linker Mapfiles Can Improve Your Application

Using linker mapfiles, you can enhance your applications in many ways (see the following sections). You can achieve most of these improvements by reducing the scope of all or most of the symbols in your application from "global" to "local."

Avoid Namespace Collisions With Third-Party Libraries

Although modern computer languages allow symbol encapsulation (for example, with C++ namespaces), in many cases namespace collisions are still a serious problem. In particular, third-party shared libraries can create havoc when some of their symbol names coincide with those in your application. Linker mapfiles can help fix these problems.

Note that such namespace collisions can be silent and therefore hard to detect or debug. For example, if a third-party shared library uses a global symbol with the same name as a global symbol in one of your shared libraries, the symbol from the third-party library may interpose on yours and unintentionally change the functionality of your application without any warning.

Improve Performance Due to Faster Runtime Linker Operation

This performance improvement is due to the decreased size of the link maps and reduced number of page faults resulting from symbol scope reduction. It can be especially important when you access application binaries remotely, for example, using NFS.

Reduce Size of Your Application's Binaries

With one large application, I've seen a 7 percent binary size reduction due to reduced symbol scope. This decrease is not earth-shattering, but it helps.

Restrict Functionality Available to External Binaries

By using a linker mapfile requesting symbol scope reduction, you can control exactly which routines in your application are "exported" and can be invoked from other applications, and which are not.

Reduce Risk of Accidental or Malicious Interposition on Global Symbols in Your Application Shared Libraries

Library interposition can be quite useful. For a few examples, see the Solaris Developer Connection article "Debugging and Performance Tuning with Library Interposers".

However, library interposition can also be used to hack into your code. Making most of your symbols local in scope helps secure your application.

Further Improve Security by Hiding Application's Symbol Names

Contrary to popular belief, stripping your binaries with the strip(1) utility is not enough to hide the names of your routines and data items. Stripping eliminates the local symbols but not the global symbols.

Dynamically linked binaries (both executables and shared libraries) use two symbol tables: the static symbol table and the dynamic symbol table. The dynamic symbol table is used by the runtime linker. It has to be present even in stripped executables, or else the dynamic linker is not able to find the symbols it needs. The strip(1) utility can only remove the static symbol table.

Normally, the nm(1) utility can list all symbol names used in the application's binaries. Even if you strip your binaries, "nm -D" will list the global symbols using the dynamic symbol table. In addition to the dynamic linker, other tools that also use the dynamic symbol table include dbx(1), pstack(1), and "Performance Analyzer" (which is included in Forte Developer tools, now a part of Sun ONE Studio software).
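To see how much a stripped binary still reveals, you can count the defined entries in its dynamic symbol table. The sketch below assumes nm's portable output format (-P), in which each line holds the symbol name followed by a type letter, with U marking an undefined reference. The function name count_exported and the library name libmyapp.so are illustrative, and the result is only a rough count.

```sh
# Count defined (exported) symbols in "nm -D -P" output read from stdin.
# In the portable format, field 1 is the name and field 2 the type letter;
# lines whose type is U (undefined reference) are skipped.
count_exported() {
    awk 'NF >= 2 && $2 != "U" { n++ } END { print n + 0 }'
}

# Assumed usage (libmyapp.so is a placeholder):
#   nm -D -P libmyapp.so | count_exported
```

Running this before and after applying a scope-reducing mapfile gives a quick measure of how many names are no longer exposed.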

However, if you use a linker mapfile that makes all or most of the symbols in your application local in scope, the symbol information for such local symbols in a stripped binary is really gone: It is not available at runtime, so no one can extract it.

In some cases, it may be useful to restore the original symbol information for debugging purposes. For a description of how the symbol table information can be restored if you have access to an unstripped version of the same binary, see the Solaris Developer Connection article "Generating and Handling Application Traceback on Crash".

Remove Danger of a DGA "winlock" Timeout

This problem can cause serious graphics and window manager problems.

Solaris uses DGA (Direct Graphics Access) to circumvent the X-server for faster graphics operations. DGA uses a special lock (called winlock) and a timeout for each primitive graphics operation to complete.

For example, a few years ago some Sun customers ran into problems with winlock timeout hanging a large CAD application. When this happened, the following message appeared in the console window:

winlock:  timeout on process xxxx

The application then froze, and the window manager was crippled as well.

A detailed investigation revealed that the problem occurred when the application was mounted across NFS, the network was very slow, and dynamic symbol resolution took longer than three seconds. Correct DGA operation depends on the assumption that each primitive graphics operation performed under a window lock takes less than three seconds. When the timeout expired, DGA refused to operate as intended.

The operation took such a long time because all symbols in the application's dynamic executable were global in scope. The dynamic linker had to go through a long linked list of all those global symbols each time it needed to find a symbol in any system library, such as the OpenGL library. It had to perform this search remotely even before it had a chance to search that OpenGL library. Even though the OpenGL library was local, serious time and network bandwidth were wasted by searching through the remote application's binaries first.

The solution in this case was to use a linker mapfile while building the application's executable and make most of its symbols local in scope. This drastically reduced the size of the link maps, thus resolving the winlock problem and improving application performance at the same time.

Improve Application Performance by Reducing the Number of Instruction-Cache Misses and Page Faults

"Performance Analyzer" is included in Sun Studio Developer Tools (formerly known as Forte and Sun WorkShop). Among other things, Performance Analyzer can produce a linker mapfile that may improve application performance by optimizing the order of function loading in the program's address space. It does this based on the runtime data collected with another tool called the "Collector."

That linker mapfile will look quite different from the scope-reducing mapfiles discussed in the rest of this article. That's not a problem: you can always have more than one linker mapfile using more than one -M directive for the static linker.

As an added benefit, the mapfile created with the Performance Analyzer may also decrease the amount of memory your application uses.

For details on this feature, see Analyzing Program Performance With Sun WorkShop on the Sun Product Documentation site.

Work Around a 64-Bit PLT Limit of 32768

In the 64-bit mode, the Solaris linker currently has a limitation: It can only handle up to 32768 PLT entries. This means that at the moment you can't link very large shared libraries in the 64-bit mode. You will get the following message if this limit is exceeded:

Assertion failed:  pltndx < 0x8000

Actually, this limitation has already been removed in the Solaris 9 OE (at the cost of a slight linker performance degradation when the index exceeds 32768). In the future, linker patches should become available to remove this limit in Solaris 8 and Solaris 7 as well.

In the meantime, a linker mapfile can help you work around this limitation.

The linker only needs PLT entries for the global symbols. If you use a linker mapfile reducing the scope of most of your symbols to local, this limitation is likely to become irrelevant.

Say you have a huge 64-bit mode shared library containing more than 32768 symbols (all of them global by default), such that you can't link this library because of the 32768 PLT entry limit. Assuming that this shared library (let's call it libhuge.so) is to be linked into your main executable, here is one way to determine which symbols can safely be made local.

In this example, I'll assume you use C++, so I'll use the CC driver for linking. In most practical cases it is best to use the compiler driver (such as CC(1) or cc(1)) for linking your application instead of invoking ld(1) directly. The compiler driver will invoke ld(1) as needed. It will also automatically include all necessary objects and libraries, making the linking step easier for you.

1) Try linking your executable without libhuge.so by removing "-lhuge" from the executable's link line. That is, change your usual link command such as:

% CC ... -o executable -lone -ltwo -lhuge

to:

% CC ... -o executable -lone -ltwo

This will result in a number of unresolved reference error messages from the static linker -- one error message for each symbol needed from libhuge.so.

Here's a simple example. Let's pretend test1.cc is the main C++ program that needs some functions from libhuge.so:

% cat test1.cc
void one_from_libhuge(void);
void two_from_libhuge(void);
int main()
{
  one_from_libhuge();
  two_from_libhuge();
}
% CC -filt=no%names test1.cc
Undefined                       first referenced
 symbol                             in file
__1cQtwo_from_libhuge6F_v_          test1.o
__1cQone_from_libhuge6F_v_          test1.o
ld: fatal: Symbol referencing errors. No output written to a.out
%

The undefined symbol names are mangled because this is C++. The "-filt=no%names" flag prevents the Sun C++ compiler from demangling such names in the linker output. You can demangle them yourself to clearly see what they represent, for example, by using the dem(1) utility:

% dem __1cQone_from_libhuge6F_v_
__1cQone_from_libhuge6F_v_ == void one_from_libhuge()

2) Take all the undefined symbols listed in those error messages. In the preceding example, these are:

__1cQtwo_from_libhuge6F_v_
__1cQone_from_libhuge6F_v_

Now put them into the "global" section of your mapfile. Make all the remaining symbols local in scope. See the relevant mapfile syntax in the following section.

3) Create a shared library libhuge.so using the linker mapfile you've just created:

% CC ...  -M/path/mapfile -o libhuge.so ...

This link will most likely be successful because the linker only needs enough PLT entries for the global symbols specified in the mapfile.

4) Now restore the original link line for your executable:

% CC ... -o executable -lone -ltwo -lhuge
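Steps 1 and 2 above can be partly automated. The following sketch turns the static linker's undefined-symbol report into a mapfile. It assumes the report has the two-column layout shown in the example (symbol name, then referencing file) and that the linker writes it to standard error. The function name undefined_to_mapfile and the file name link.err are illustrative.

```sh
# Convert the static linker's "Undefined symbol" report (read from stdin)
# into a mapfile that keeps those symbols global and makes the rest local.
undefined_to_mapfile() {
    awk '
        BEGIN { print "{"; print "\tglobal:" }
        # Skip the two header lines of the report.
        $1 == "Undefined" || $1 == "symbol" { next }
        # Skip the trailing "ld: fatal: ..." message.
        /^ld:/ { next }
        # Remaining two-field lines are "symbol  referencing-file".
        NF == 2 { print "\t\t" $1 ";" }
        END { print "\tlocal:"; print "\t\t*;"; print "};" }
    '
}

# Assumed usage: capture the failed link's messages, then convert them.
#   CC -filt=no%names ... -o executable -lone -ltwo 2> link.err
#   undefined_to_mapfile < link.err > mapfile
```

Review the generated file before using it: any symbol that other binaries need but that this particular link did not reference still has to be added to the "global" list by hand.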

How to Create and Use a Linker Mapfile

The most practical way to use a linker mapfile for an application is to reduce the scope of the application's symbols. A linker mapfile is a simple ASCII file, so you can create it with any text editor. Here is a very simple linker mapfile reducing the scope of all symbols in the binary from global to local:

% cat mapfile
{
	local:
		*;
};
%

Once you create such a file, you can use it while linking your application executable or shared library by adding a -M directive to the link line:

% CC ...  -M/path/mapfile ...

In some cases, the mapfile may have to be a little more elaborate. For example, if your application statically links in its own version of the malloc package and uses a linker mapfile that makes all symbols local, all shared objects (such as libc and other system libraries) will use a version of malloc different from the one used by your application. That is likely to corrupt the heap, leading to very serious problems. Therefore, in this case you'll have to make sure you define the malloc API routines as global. Here is how you can do it:

% cat mapfile
{
	global:
		malloc;
		realloc;
		free;
		memalign;
	local:
		*;
};
%

Also, if your application allows other applications to access some of its functionality by calling functions in your own API, you'll have to specify those external API functions in the list of global symbols defined in your mapfile. Otherwise, those functions will become local and no one else will be able to call them.

Even after you specify a number of symbols to be global in scope, using a linker mapfile like this is likely to drastically reduce the number of global symbols in your application and therefore result in all or some of the benefits listed in this article.

For further information regarding Solaris linker mapfiles and related issues, see the Solaris Linker and Libraries Guide index.

Acknowledgments

Thanks to Michael Walker of Sun Microsystems for his help regarding this article and many other linker-related issues.

About the Author

Greg Nakhimovsky is a Sun Microsystems engineer working with application software vendors to make sure their products run well on Sun systems. He has over 20 years of industry experience developing, performance tuning, and troubleshooting technical computer applications on various systems.
