The Linux GCC HOWTO

Reposted from: http://www.faqs.org/docs/Linux-HOWTO/GCC-HOWTO.html

Daniel Barlow

Linux Documentation Project

May 1999

This document covers how to set up the GNU C compiler and development libraries under Linux, and gives an overview of compiling, linking, running and debugging programs under it. Most of the material in it has been taken from Mitch D'Souza's GCC-FAQ or the ELF-HOWTO - it replaces both documents.

This is the first version to be written in DocBook instead of the old Linuxdoc format, and may contain markup errors. Please let me know if you find anything wrong.

As can be determined from the long times between updates of this document, I don't actually have the time or inclination to maintain it much. If you have, can, and want to, drop me some email describing what you'd do with it and why you think you'd be good at it.


Preliminaries

ELF vs. a.out, libc 5 vs 6

Three years ago when this document was first created, I opened this section by saying "Linux development is in a state of flux right now" and going on to describe how ELF was replacing the older a.out binary format.

It still is in a state of flux. It always will be. Though that particular change is long since past, development of the Linux kernel and the surrounding system continues to happen, and things change for developers as a result. So it's a good idea to know upfront what kind of system you have in front of you.

The possible candidates, in order of age, are

  • libc 4, a.out: very old systems

  • libc 5, ELF: Red Hat 4.2, Debian 2.0

  • libc 6 (a.k.a glibc 2), ELF: Red Hat 5 - 5.2, Debian 2.1

  • libc 6.1,(a.k.a glibc 2.1) ELF: Red Hat 6

How to tell? The simplest approach is to pick a binary that you consider typical (e.g. /bin/ls) and run ldd on it. One of the listed libraries should be libc - check its version number.

$ ldd /bin/ls
        libc.so.6 => /lib/libc.so.6 (0x4000e000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

This document was created on a Debian 2.1 system, so no surprise there.

It's entirely possible that the system you're using may have a mix of different versions on it. What you probably want to know in that case is the version that its C development environment is set up for, so you're best off compiling "hello world" and running ldd on the output thus created. Note that for historical reasons, gcc defaults to an output file called a.out even on ELF systems, so don't assume anything from that.
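
As a complement to running ldd on a freshly compiled binary, the program itself can report which C library its headers came from. The following is only a sketch: it assumes a glibc-style system where including <stdio.h> makes the macros __GLIBC__ and __GLIBC_MINOR__ visible (libc 5 and earlier do not define them), and the function name describe_libc is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write a short description of the C library seen at compile time into
 * buf.  Returns what snprintf() returns: the length of the description.
 * describe_libc is a hypothetical helper, not part of any library. */
int describe_libc(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
#ifdef __GLIBC__
    /* On glibc, <stdio.h> pulls in <features.h>, which defines these. */
    return snprintf(buf, size, "glibc %d.%d", __GLIBC__, __GLIBC_MINOR__);
#else
    return snprintf(buf, size, "not glibc (libc 5 or another C library)");
#endif
}
```

Call it from main() with a buffer of 64 bytes or so and print the result with puts(). On a mixed system the answer reflects the development environment, which may differ from what ldd /bin/ls reports.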


Administrata

The copyright information and like legalese can be found at the end of this document, together with the statutory warnings about asking dumb questions on Usenet, revealing your ignorance of the C language by reporting bugs which aren't, and picking your nose while chewing gum.


Typography

If you're reading this in PostScript, dvi, or html format, you get to see a little more font variation than people with the plain text version. In particular, filenames, commands, command output and source code excerpts are set in some form of typewriter font, whereas `variables' and random things that need emphasizing are emphasized.

You also get a usable index. In dvi or PostScript, the numbers in the index are section numbers. In HTML they're just sequentially assigned numbers that you can click on. In the plain text version, they really are just numbers. Get an upgrade!

The Bourne (rather than C) shell syntax is used in examples. C shell users will want to use

% setenv FOO bar
where I have written
 
$ FOO=bar; export FOO

If the prompt shown is # rather than $, the command shown will probably only work as root. Of course, I accept no responsibility for anything that happens to your system as a result of trying these examples. Have a nice day :-)


Where to get things

In the three years since the first `HOWTO' version of this, useful Linux distributions have become prevalent. So, where once I'd have spent pages listing FTP sites and hours updating (failing to update) version numbers and directory names, now I will simply say - your distribution maintainer should be taking care of this for you. If you don't have, say, gcc installed, find the RPM or the deb packages that contain it, and install it. If that isn't an option because you don't have a friendly distribution, you've almost certainly been using Linux long enough that you don't need me to tell you where to find things anyway.


This document

You're reading it. You probably have it already.

This document is one of the Linux HOWTO series, so is probably already installed somewhere in /usr/doc if you're reading this on a Linux box. Failing that, it is available from all Linux HOWTO repositories (try Metalab) and (possibly in a slightly newer version) at my personal web site www.telent.net.


Other documentation

The official documentation for gcc is in the source distribution (see below) as texinfo files, and as .info files. If you have a fast network connection, a cdrom, or a reasonable amount of patience, you can just untar it and copy the relevant bits into /usr/info. If not, you may find them at tsx-11, but not necessarily always the latest version.

There are two sources of documentation for libc. GNU libc comes with info files which describe Linux libc fairly accurately except for stdio. Also, the manpages archive is written for Linux and describes a lot of system calls (section 2) and libc functions (section 3).


GCC

There are two answers.

(a) The official Linux GCC distribution can always be found in binary (ready-compiled) form at . At the time of writing, 2.7.2 (gcc-2.7.2.bin.tar.gz) is the latest version.

(b) The latest source distribution of GCC from the Free Software Foundation can be had from GNU archives. This is not necessarily always the same version as above, though it is just now. The Linux GCC maintainer(s) have made it easy for you to compile the latest version available yourself --- the configure script should set it all up for you. Check tsx-11 as well, for patches which you may want to apply.

To compile anything non-trivial (and quite a few trivial things also) you will also need the


C library and header files

What you want here depends on (i) whether your system is ELF or a.out, and (ii) which you want it to be. If you're upgrading from libc 4 to libc 5, you are recommended to look at the ELF-HOWTO from approximately the same place as you found this document.

These are available from tsx-11 as above:

libc-5.2.18.bin.tar.gz

--- ELF shared library images, static libraries and include files for the C and maths libraries.

libc-5.2.18.tar.gz

--- Source for the above. You will also need the .bin. package for the header files. If you are deliberating whether to compile the C library yourself or use the binaries, the right answer in nearly all cases is to use the binaries. You will however need to roll your own if you want NYS or shadow password support.

libc-4.7.5.bin.tar.gz

--- a.out shared library images and static libraries for version 4.7.5 of the C library and friends. This is designed to coexist with the libc 5 package above, but is only really necessary if you wish to keep using/developing a.out format programs.


Associated tools (as, ld, ar, strings etc)

From tsx-11, just like everything else so far. The current version is binutils-2.6.0.2.bin.tar.gz.

Note that the binutils are only available in ELF, the current libc version is in ELF and the a.out libc is happiest when used in conjunction with an ELF libc. C library development is moving emphatically ELFwards, and unless you have really good reasons for needing a.out things you're encouraged to follow suit.


GCC installation and setup

GCC versions

You can find out what GCC version you're running by typing gcc -v at the shell prompt. This is also a fairly reliable way to find out whether you are set up for ELF or a.out. On my system it does

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i486-box-linux/2.7.2/specs
gcc version 2.7.2

The key things to note here are

  • i486. This indicates that the gcc you are using was built for a 486 processor --- you might have 386 or 586 instead. All of these chips can run code compiled for each of the others; the difference is that the 486 code has added padding in some places so runs faster on a 486. This has no detrimental performance effect on a 386, but does make the binaries slightly larger.

  • box. This is not at all important, and may say something else (such as slackware or debian) or nothing at all (so that the complete directory name is i486-linux). If you build your own gcc, you can set this at build time for cosmetic effect. Just like I did :-)

  • linux. This may instead say linuxelf or linuxaout, and, confusingly, the meaning of each varies according to the version that you are using.

    • linux means ELF if the version is 2.7.0 or newer, a.out otherwise.

    • linuxaout means a.out. It was introduced as a target when the definition of linux was changed from a.out to ELF, so you won't see any linuxaout gcc older than 2.7.0.

    • linuxelf is obsolete. It is generally a version of gcc 2.6.3 set to produce ELF executables. Note that gcc 2.6.3 has known bugs when producing code for ELF --- an upgrade is advisable.

  • 2.7.2 is the version number.

So, in summary, I have gcc 2.7.2 producing ELF code. Quelle surprise.


Where did it go?

If you installed gcc without watching, or if you got it as part of a distribution, you may like to find out where it lives in the filesystem. The key bits are

  • /usr/lib/gcc-lib/target/version/ (and subdirectories) is where most of the compiler lives. This includes the executable programs that do actual compiling, and some version-specific libraries and include files.

  • /usr/bin/gcc is the compiler driver --- the bit that you can actually run from the command line. This can be used with multiple versions of gcc provided that you have multiple compiler directories (as above) installed. To find out the default version it will use, type gcc -v. To force it to another version, type gcc -V version. For example

    # gcc -v
    Reading specs from /usr/lib/gcc-lib/i486-box-linux/2.7.2/specs
    gcc version 2.7.2
    # gcc -V 2.6.3 -v
    Reading specs from /usr/lib/gcc-lib/i486-box-linux/2.6.3/specs
    gcc driver version 2.7.2 executing gcc version 2.6.3
    
  • /usr/target/(bin|lib|include)/. If you have multiple targets installed (for example, a.out and elf, or a cross-compiler of some sort), the libraries, binutils (as, ld and so on) and header files for the non-native target(s) can be found here. Even if you only have one kind of gcc installed you might find anyway that various bits for it are kept here. If not, they're in /usr/(bin|lib|include).

  • /lib/, /usr/lib and others are library directories for the native system. You will also need /lib/cpp for many applications (X makes quite a lot of use of it) --- either copy it from /usr/lib/gcc-lib/target/version/ or make a symlink pointing there.


Where are the header files?

Apart from whatever you install yourself under /usr/local/include, there are three main sources of header files in Linux:

  • Most of /usr/include/ and its subdirectories are supplied with the libc binary package from H J Lu. I say `most' because you may also have files from other sources (curses and dbm libraries, for example) in here, especially if you are using the newest libc distribution (which doesn't come with curses or dbm, unlike the older ones).

  • /usr/include/linux and /usr/include/asm (for the files <linux #endif

    Use __linux__ for this purpose, not linux. Although the latter is defined, it is not POSIX compliant.


Compiler invocation

The documentation for compiler switches is the gcc info page (in Emacs, use C-h i then select the `gcc' option). Your distributor may not have packed this with your system, or you may have an old version; the best thing to do in this case is to download the gcc source archive from or one of its mirrors, and copy them out of it.

The gcc manual page (gcc.1) is, generally speaking, out of date. It will warn you of this when you try to look at it.


Compiler flags

gcc can be made to optimize its output code by adding -On to its command line, where n is an optional small integer. Meaningful values of n, and their exact effect, vary according to the exact version, but typically it ranges from 0 (no optimization) to 2 (lots) or 3 (lots and lots).

Internally, gcc translates these to a series of -f and -m options. You can see exactly which -O levels map to which options by running gcc with the -v flag and the (undocumented) -Q flag. For example, for -O2, mine says

enabled: -fdefer-pop -fcse-follow-jumps -fcse-skip-blocks
         -fexpensive-optimizations
         -fthread-jumps -fpeephole -fforce-mem -ffunction-cse -finline
         -fcaller-saves -fpcc-struct-return -frerun-cse-after-loop
         -fcommon -fgnu-linker -m80387 -mhard-float -mno-soft-float
         -mno-386 -m486 -mieee-fp -mfp-ret-in-387

Using an optimization level higher than your compiler supports (e.g. -O6) will have exactly the same effect as using the highest level that it does support. Distributing code which is set to compile this way is a poor idea though --- if further optimisations are incorporated into future versions, you (or your users) may find that they break your code.

Users of gcc 2.7.0 through 2.7.2 should note that there is a bug in -O2 on these. Specifically, strength reduction doesn't work. A patch can be had to fix this if you feel like recompiling gcc, otherwise make sure that you always compile with -fno-strength-reduce.


Processor-specific

There are other -m flags which aren't turned on by any variety of -O but are nevertheless useful. Chief among these are -m386 and -m486, which tell gcc to favour the 386 or 486 respectively. Code compiled with one of these will still work on the other; 486 code is bigger, but otherwise not slower on the 386.

There is currently no -mpentium or -m586. Linus suggests using -m486 -malign-loops=2 -malign-jumps=2 -malign-functions=2, to get 486 code optimisations but without the big gaps for alignment (which the pentium doesn't need). Michael Meissner (of Cygnus) says

"My hunch is that -mno-strength-reduce also results in faster code on the x86 (note, I'm not talking about the strength reduction bug, which is another issue). This is because the x86 is rather register starved (and GCC's method of grouping registers into spill registers vs. other registers doesn't help either). Strength reduction typically results in using additional registers to replace multiplications with addition. I also suspect -fcaller-saves may also be a loss."

"Another hunch is that -fomit-frame-pointer might or might not be a win. On the one hand, it can mean that another register is available for allocation. On the other hand, the way the x86 encodes its instruction set, means that stack relative addresses take more space instead of frame relative addresses, which means slightly less Icache available to the program. Also, -fomit-frame-pointer, means that the compiler has to constantly adjust the stack pointer after calls, while with a frame, it can let the stack accumulate for a few calls."

The final word on this subject is from Linus again:

"Note that if you want to get optimal performance, don't believe me: test. There are lots of gcc compiler switches, and it may be that a particular set gives the best optimizations for you."


Internal compiler error: cc1 got fatal signal 11

Signal 11 is SIGSEGV, or `segmentation violation'. Usually it means that the program got its pointers confused and tried to write to memory it didn't own. So, it could be a gcc bug.

gcc is, however, a well tested and reliable piece of software, for the most part. It also uses a large number of complex data structures, and an awful lot of pointers. In short, it's the pickiest RAM tester commonly available. If you can't duplicate the bug --- if it doesn't stop in the same place when you restart the compilation --- it's almost certainly a problem with your hardware (CPU, memory, motherboard or cache). Don't claim it as a bug because your computer passes the power-on checks or runs Windows ok or whatever; these `tests' are commonly and rightly held to be worthless. And don't claim it's a bug because a kernel compile always stops during `make zImage' --- of course it will! `make zImage' is probably compiling over 200 files; we're looking for a slightly smaller place than that.

If you can duplicate the bug, and (better) can produce a short program that exhibits it, you can submit it as a bug report to the FSF, or to the linux-gcc mailing list. See the gcc documentation for details of exactly what information they need.


Portability

It has been said that, these days, if something hasn't been ported to Linux then it is not worth having :-)

Seriously though, in general only minor changes are needed to the sources to get over Linux's 100% POSIX compliance. It is also worthwhile passing back any changes to authors of the code such that in the future only `make' need be called to provide a working executable.


BSDisms (including bsd_ioctl, daemon and <sgtty.h>)

You can compile your program with -I/usr/include/bsd and link it with -lbsd (i.e. add -I/usr/include/bsd to CFLAGS and -lbsd to the LDFLAGS line in your Makefile). There is no need to add -D__USE_BSD_SIGNAL any more if you want BSD type signal behavior, as you get this automatically when you have -I/usr/include/bsd and include <signal.h>.


`Missing' signals (SIGBUS, SIGEMT, SIGIOT, SIGTRAP, SIGSYS etc)

Linux is POSIX compliant. These are not POSIX-defined signals --- ISO/IEC 9945-1:1990 (IEEE Std 1003.1-1990), paragraph B.3.3.1.1 sez:

"The signals SIGBUS, SIGEMT, SIGIOT, SIGTRAP, and SIGSYS were omitted from POSIX.1 because their behavior is implementation dependent and could not be adequately categorized. Conforming implementations may deliver these signals, but must document the circumstances under which they are delivered and note any restrictions concerning their delivery."

The cheap and cheesy way to fix this is to redefine these signals to SIGUNUSED. The correct way is to bracket the code that handles them with appropriate #ifdefs:

#ifdef SIGSYS
/* ... code that handles the non-POSIX SIGSYS goes here ... */
#endif
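
To make the bracketing a little more concrete, here is a sketch that builds a table containing only those of the five signals that the system headers actually define. The struct, the table and the function name are all invented for this example; nothing here is a standard interface.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

struct optional_signal {
    int signo;
    const char *name;
};

/* Signals that POSIX.1-1990 leaves out; each entry is compiled in only
 * if the system headers define it.  Names are made up for this sketch. */
static const struct optional_signal optional_signals[] = {
#ifdef SIGBUS
    { SIGBUS, "SIGBUS" },
#endif
#ifdef SIGEMT
    { SIGEMT, "SIGEMT" },
#endif
#ifdef SIGIOT
    { SIGIOT, "SIGIOT" },
#endif
#ifdef SIGTRAP
    { SIGTRAP, "SIGTRAP" },
#endif
#ifdef SIGSYS
    { SIGSYS, "SIGSYS" },
#endif
    { 0, NULL } /* sentinel, so the array is never empty */
};

/* Count the optional signals this system actually provides. */
size_t optional_signal_count(void)
{
    size_t n = 0;

    while (optional_signals[n].name != NULL)
        n++;
    assert(n <= 5);
    return n;
}
```

Code that installs handlers can then loop over the table instead of naming each signal, and it compiles unchanged on systems that lack some of them.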

K&R Code

GCC is an ANSI compiler; much existing code is not ANSI. There's really not much that can be done about this, except to add -traditional to the compiler flags. There is a certain amount of finer-grained control over which varieties of brain damage to emulate; consult the gcc info page.

Note that -traditional has effects beyond just changing the language that gcc accepts. For example, it turns on -fwritable-strings, which moves string constants into data space (from text space, where they cannot be written to). This increases the memory footprint of the program.


Preprocessor symbols conflict with prototypes in the code

One of the most frequent problems is that some common functions are defined as macros in Linux's header files and the preprocessor will refuse to parse similar prototype definitions in the code. Common ones are atoi() and atol().


sprintf()

Something to be aware of, especially when porting from SunOS, is that sprintf(string, fmt, ...) returns a pointer to string on many unices, whereas Linux (following ANSI) returns the number of characters which were put into the string.
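
A small sketch of the ANSI behaviour follows. The helper name format_value and the "value=%d" format are made up for the example; the only thing it relies on is that sprintf() returns a character count.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "value=<n>" into buf and return how many characters were
 * written, relying on the ANSI return value of sprintf().
 * format_value is a hypothetical helper for this example. */
int format_value(char *buf, size_t size, int n)
{
    int written;

    assert(buf != NULL && size >= 32); /* room for "value=" plus any int */
    written = sprintf(buf, "value=%d", n);

    /* ANSI: the result is a character count, so it matches strlen(buf).
     * Code ported from SunOS that treats the result as a pointer to the
     * string needs changing. */
    assert(written == (int)strlen(buf));
    return written;
}
```

Called with n set to 42, the buffer holds "value=42" and the function returns 8.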


fcntl and friends. Where are the definitions of FD_* stuff?

In <sys/time.h>. If you are using fcntl you probably want to include <unistd.h> too, for the actual prototype.

Generally speaking, the manual page for a function lists the necessary #includes in its SYNOPSIS section.


The select() timeout. Programs start busy-waiting.

The BSD manual page for select(2) used to say "select() should probably return the time remaining from the original timeout, if any, by modifying the time value in place. This may be implemented in future versions of the system. Thus, it is unwise to assume that the timeout pointer will be unmodified by the select() call."

Some versions of Linux do perform this modification. Some don't. It is incredibly unwise to assume one behaviour or the other.

To fix, put the timeout value into that structure every time you call select(). Change code like

      struct timeval timeout;
      timeout.tv_sec = 1; timeout.tv_usec = 0;
      while (some_condition)
            select(n,readfds,writefds,exceptfds,&timeout); 
to, say,
      struct timeval timeout;
      while (some_condition) {
            timeout.tv_sec = 1; timeout.tv_usec = 0;
            select(n,readfds,writefds,exceptfds,&timeout);
      }
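
For a version that can be compiled as it stands, here is a sketch of the same idea wrapped in a function. The name wait_readable is made up, and it assumes a system that provides <sys/select.h>; on older setups the FD_* macros come from <sys/time.h> instead. Note that if a signal interrupts the wait, this sketch simply starts the full timeout again.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait until fd is readable or `seconds` have elapsed.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * wait_readable is a hypothetical helper for this example. */
int wait_readable(int fd, long seconds)
{
    assert(fd >= 0 && fd < FD_SETSIZE);

    for (;;) {
        fd_set readfds;
        struct timeval timeout;
        int ready;

        /* Rebuild both the set and the timeout on every pass: select()
         * overwrites the set, and may overwrite the timeout too. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        timeout.tv_sec = seconds;
        timeout.tv_usec = 0;

        ready = select(fd + 1, &readfds, NULL, NULL, &timeout);
        if (ready < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: try again */
        if (ready < 0)
            return -1;
        return ready > 0 ? 1 : 0;
    }
}
```

A quick way to try it is with a pipe: before anything is written, wait_readable(fds[0], 0) times out; after a write to fds[1] it reports the read end as ready.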

Some versions of Mosaic were at one time notable for this problem. The speed of the spinning globe animation was inversely related to the speed that the data was coming in from the network at!


Interrupted system calls.

Symptom:

When a program is stopped using Ctrl-Z and then restarted - or in other situations that generate signals: Ctrl-C interruption, termination of a child process etc. - it complains about "interrupted system call" or "write: unknown error" or things like that.


Problem:

POSIX systems check for signals a bit more often than some older unices. Linux may execute signal handlers ---

  • asynchronously (at a timer tick)

  • on return from any system call

  • during the execution of the following system calls: select(), pause(), connect(), accept(), read() on terminals, sockets, pipes or files in /proc, write() on terminals, sockets, pipes or the line printer, open() on FIFOs, PTYs or serial lines, ioctl() on terminals, fcntl() with command F_SETLKW, wait4(), syslog(), any TCP or NFS operations.

For other operating systems you may have to include the system calls creat(), close(), getmsg(), putmsg(), msgrcv(), msgsnd(), recv(), send(), wait(), waitpid(), wait3(), tcdrain(), sigpause(), semop() to this list.

If a signal (that the program has installed a handler for) occurs during a system call, the handler is called. When the handler returns (to the system call) it detects that it was interrupted, and immediately returns with -1 and errno = EINTR. The program is not expecting that to happen, so bottles out.

You may choose between two fixes.

(1) For every signal handler that you install, add SA_RESTART to the sigaction flags. For example, change

  signal (sig_nr, my_signal_handler);
to
  signal (sig_nr, my_signal_handler);
  { struct sigaction sa;
    sigaction (sig_nr, (struct sigaction *)0, &sa);
#ifdef SA_RESTART
    sa.sa_flags |= SA_RESTART;
#endif
#ifdef SA_INTERRUPT
    sa.sa_flags &= ~ SA_INTERRUPT;
#endif
    sigaction (sig_nr, &sa, (struct sigaction *)0);
  }
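
If you are writing new code rather than patching old calls to signal(), the same effect can be had by going straight to sigaction(). This is only a sketch under that assumption; the function name install_restarting_handler is invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Install handler for sig so that interrupted system calls are
 * restarted where the system supports it.  Returns 0 on success, -1 on
 * failure.  install_restarting_handler is a hypothetical name. */
int install_restarting_handler(int sig, void (*handler)(int))
{
    struct sigaction sa;

    assert(handler != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);   /* block no extra signals in the handler */
    sa.sa_flags = 0;
#ifdef SA_RESTART
    sa.sa_flags |= SA_RESTART;
#endif
    return sigaction(sig, &sa, NULL);
}
```

Use it wherever the program would otherwise have called signal (sig_nr, my_signal_handler).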

Note that while this applies to most system calls, you must still check for EINTR yourself on read(), write(), ioctl(), select(), pause() and connect(). See below.

(2) Check for EINTR explicitly, yourself:

Here are two examples for read() and ioctl(),

Original piece of code using read()

int result;
while (len > 0) { 
  result = read(fd,buffer,len);
  if (result < 0) break;
  buffer += result; len -= result;
}
becomes
int result;
while (len > 0) { 
  result = read(fd,buffer,len);
  if (result < 0) { if (errno != EINTR) break; }
  else { buffer += result; len -= result; }
}
and a piece of code using ioctl()
int result;
result = ioctl(fd,cmd,addr);
becomes
int result;
do { result = ioctl(fd,cmd,addr); }
while ((result == -1) && (errno == EINTR));
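
The read() loop can also be packaged as a helper. The sketch below is my own arrangement rather than anything standard: read_fully is a made-up name, and it adds one check that the loops above do not have, namely stopping when read() returns 0 at end of file instead of going round again.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to len bytes from fd, retrying when a signal interrupts the
 * call.  Returns the number of bytes read (short only at end of file),
 * or -1 on a real error.  read_fully is a hypothetical helper. */
ssize_t read_fully(int fd, char *buffer, size_t len)
{
    size_t done = 0;

    assert(buffer != NULL);
    while (done < len) {
        ssize_t result = read(fd, buffer + done, len - done);

        if (result < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: just try again */
            return -1;          /* genuine error */
        }
        if (result == 0)
            break;              /* end of file: nothing more will come */
        done += (size_t)result;
    }
    return (ssize_t)done;
}
```

The caller compares the return value with len to tell a complete read from one cut short by end of file.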

Note that in some versions of BSD Unix the default behaviour is to restart system calls. To get system calls interrupted you have to use the SV_INTERRUPT or SA_INTERRUPT flag.


Writable strings (program seg faults randomly)

GCC has an optimistic view of its users, believing that they intend string constants to be exactly that --- constant. Thus, it stores them in the text (code) area of the program, where they can be paged in and out from the program's disk image (instead of taking up swapspace), and any attempt to rewrite them will cause a segmentation fault. This is a feature!

It may cause a problem for old programs that, for example, call mktemp() with a string constant as argument. mktemp() attempts to rewrite its argument in place.

To fix, either (a) compile with -fwritable-strings, to get gcc to put constants in data space, or (b) rewrite the offending parts to allocate a non-constant string and strcpy the data into it before calling.
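
Option (b) might look something like the sketch below. Two things in it are my own choices rather than anything from the text above: it calls mkstemp() instead of mktemp(), since mkstemp() rewrites its argument in the same way but also opens the file, and the name make_temp_file and the /tmp/howtoXXXXXX pattern are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a temporary file and store its name in name_out.
 * Returns an open file descriptor, or -1 on failure.
 * make_temp_file and the pattern below are made up for this example. */
int make_temp_file(char *name_out, size_t size)
{
    static const char pattern[] = "/tmp/howtoXXXXXX";

    assert(name_out != NULL && size >= sizeof pattern);

    /* Copy the constant into caller-owned, writable memory first;
     * handing over the constant itself would fault when it is rewritten. */
    strcpy(name_out, pattern);
    return mkstemp(name_out);
}
```

The caller supplies an ordinary char array (32 bytes is plenty here), and is responsible for closing the descriptor and unlinking the file when done.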


Why does the execl() call fail?

Because you're calling it wrong. The first argument to execl is the program that you want to run. The second and subsequent arguments become the argv array of the program you're calling. Remember: argv[0] is traditionally set even when a program is run with `no' arguments. So, you should be writing

execl("/bin/ls","ls",NULL);
not just
execl("/bin/ls", NULL);

Executing the program with no arguments at all is construed as an invitation to print out its dynamic library dependencies, at least using a.out. ELF does things differently.

(If you want this library information, there are simpler interfaces; see the section on dynamic loading, or the manual page for ldd).
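
For completeness, here is a sketch of execl() used the right way round inside a fork()/waitpid() pair. The helper name run_shell_command is invented, and it assumes /bin/sh exists; the point to notice is that "sh" is passed as argv[0] and that the argument list ends with a null pointer cast to char *.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `command` through /bin/sh and return its exit status, or -1 if
 * it could not be started or did not exit normally.
 * run_shell_command is a hypothetical helper for this example. */
int run_shell_command(const char *command)
{
    pid_t pid;
    int status;

    assert(command != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* First argument: the file to run.  Second: argv[0].  The list
         * ends with a null pointer cast to char *, as execl is variadic. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);             /* only reached if execl failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, run_shell_command("exit 7") returns 7.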


Debugging and Profiling

Preventative maintenance (lint)

There is no widely-used lint for Linux, as most people are satisfied with the warnings that gcc can generate. Probably the most useful is the -Wall switch --- this stands for `Warnings, all' but probably has more mnemonic value if thought of as the thing you bang your head against.

There is a public domain lint available from . I don't know how good it is.


Debugging

How do I get debugging information into a program?

You need to compile and link all its bits with the -g switch, and without the -fomit-frame-pointer switch. Actually, you don't need to recompile all of it, just the bits you're interested in debugging.

On a.out configurations the shared libraries are compiled with -fomit-frame-pointer, which gdb won't get on with. Giving the -g option when you link should imply static linking; this is why.

If the linker fails with a message about not finding libg.a, you don't have /usr/lib/libg.a, which is the special debugging-enabled C library. It may be supplied in the libc binary package, or (in newer C library versions) you may need to get the libc source code and build it yourself. You don't actually need it though; you can get enough information for most purposes simply by symlinking it to /usr/lib/libc.a.


How do I get it out again?

A lot of GNU software comes set up to compile and link with -g, causing it to make very big (and often static) executables. This is not really such a hot idea.

If the program has an autoconf generated configure script, you can usually turn off debugging information by doing ./configure CFLAGS= or ./configure CFLAGS=-O2. Otherwise, check the Makefile. Of course, if you're using ELF, the program is dynamically linked regardless of the -g setting, so you can just strip it.


Available software

Most people use gdb, which you can get in source form from GNU archive sites, or as a binary from tsx-11 or sunsite. xxgdb is an X debugger based on this (i.e. you need gdb installed first). The source may be found at

Also, the UPS debugger has been ported by Rick Sladkey. It runs under X as well, but unlike xxgdb, it is not merely an X front end for a text based debugger. It has quite a number of nice features, and if you spend any time debugging stuff, you probably should check it out. The Linux precompiled version and patches for the stock UPS sources can be found in , and the original source at .

Another tool you might find useful for debugging is `strace', which displays the system calls that a process makes. It has a multiplicity of other uses too, including figuring out what pathnames were compiled into binaries that you don't have the source for, exacerbating race conditions in programs that you suspect contain them, and generally learning how things work. The latest version of strace (currently 3.0.8) can be found at .


Background (daemon) programs

Daemon programs typically execute fork() early, and terminate the parent. This makes for a short debugging session.

The simplest way to get around this is to set a breakpoint for fork, and when the program stops, force it to return 0.

(gdb) list 
1       #include <stdio.h>
2
3       main()
4       {
5         if(fork()==0) printf("child\n");
6         else printf("parent\n");
7       }
(gdb) break fork
Breakpoint 1 at 0x80003b8
(gdb) run
Starting program: /home/dan/src/hello/./fork 
Breakpoint 1 at 0x400177c4

Breakpoint 1, 0x400177c4 in fork ()
(gdb) return 0
Make selected stack frame return now? (y or n) y
#0  0x80004a8 in main ()
    at fork.c:5
5         if(fork()==0) printf("child\n");
(gdb) next
Single stepping until exit from function fork, 
which has no line number information.
child
7       }

Core files

When Linux boots it is usually configured not to produce core files. If you like them, use your shell's builtin command to re-enable them: for C-shell compatibles (e.g. tcsh) this is

% limit core unlimited
while Bourne-like shells (sh, bash, zsh, pdksh) use
$ ulimit -c unlimited
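
A program can also lift the limit for itself, which is handy when you can't control the shell it is started from. This is a sketch of my own rather than something the shells do for you: the name enable_core_files is made up, and it only raises the soft limit as far as the hard limit, which is not necessarily `unlimited'.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit for this
 * process and its children.  Returns 0 on success, -1 on failure.
 * enable_core_files is a hypothetical helper for this example. */
int enable_core_files(void)
{
    struct rlimit limit;

    if (getrlimit(RLIMIT_CORE, &limit) != 0)
        return -1;
    assert(limit.rlim_cur <= limit.rlim_max); /* soft never exceeds hard */
    limit.rlim_cur = limit.rlim_max;    /* unprivileged code cannot go higher */
    return setrlimit(RLIMIT_CORE, &limit);
}
```

Call it early in main(), before anything has had a chance to crash.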

If you want a bit more versatility in your core file naming (for example, if you're trying to conduct a post-mortem using a debugger that's buggy itself) you can make a simple mod to your kernel. Look for the code in fs/binfmt_aout.c and fs/binfmt_elf.c (in newer kernels; you'll have to grep around a little in older ones) that says

        memcpy(corefile,"core.",5);
#if 0
        memcpy(corefile+5,current->comm,sizeof(current->comm));
#else
        corefile[4] = '\0';
#endif

and change the 0s to 1s.


Profiling

Profiling is a way to examine which bits of a program are called most often or run for longest. It is a good way to optimize code and look at where time is being wasted. You must compile all object files that you require timing information for with -p, and to make sense of the output file you will also need gprof (from the binutils package). See the gprof manual page for details.


Linking

Between the two incompatible binary formats, the static vs shared library distinction, and the overloading of the verb `link' to mean both `what happens after compilation' and `what happens when a compiled program is invoked' (and, actually, the overloading of the word `load' in a comparable but opposite sense), this section is complicated. Little of it is much more complicated than that sentence, though, so don't worry too much about it.

To alleviate the confusion somewhat, we refer to what happens at runtime as `dynamic loading' and cover it in the next section. You will also see it described as `dynamic linking', but not here. This section, then, is exclusively concerned with the kind of linking that happens at the end of a compilation.


Shared vs static libraries

The last stage of building a program is to `link' it; to join all the pieces of it together and see what is missing. Obviously there are some things that many programs will want to do --- open files, for example, and the pieces that do these things are provided for you in the form of libraries. On the average Linux system these can be found in /lib and /usr/lib/, among other places.

When using a static library, the linker finds the bits that the program modules need, and physically copies them into the executable output file that it generates. For shared libraries, it doesn't --- instead it leaves a note in the output saying `when this program is run, it will first have to load this library'. Obviously shared libraries tend to make for smaller executables; they also use less memory and mean that less disk space is used. The default behaviour of Linux is to link shared if it can find the shared libraries, static otherwise. If you're getting static binaries when you want shared, check that the shared library files (*.sa for a.out, *.so for ELF) are where they should be, and are readable.

On Linux, static libraries have names like libname.a, while shared libraries are called libname.so.x.y.z where x.y.z is some form of version number. Shared libraries often also have links pointing to them, which are important, and (on a.out configurations) associated .sa files. The standard libraries come in both shared and static formats.

You can find out what shared libraries a program requires by using ldd (List Dynamic Dependencies)

$ ldd /usr/bin/lynx
        libncurses.so.1 => /usr/lib/libncurses.so.1.9.6
        libc.so.5 => /lib/libc.so.5.2.18

This shows that on my system the WWW browser `lynx' depends on the presence of libc.so.5 (the C library) and libncurses.so.1 (used for terminal control). If a program has no dependencies, ldd will say `statically linked' or `statically linked (ELF)'.


Interrogating libraries (`which library is sin() in?')

nm libraryname should list all the symbols that libraryname has references to. It works on both static and shared libraries. Suppose that you want to know where tcgetattr() is defined: you might do

$ nm libncurses.so.1 | grep tcget
         U tcgetattr

The U stands for `undefined' --- it shows that the ncurses library uses but does not define it. You could also do

$ nm libc.so.5 | grep tcget
00010fe8 T __tcgetattr
00010fe8 W tcgetattr
00068718 T tcgetpgrp

The `W' stands for `weak', which means that the symbol is defined, but in such a way that it can be overridden by another definition in a different library. A straightforward `normal' definition (such as the one for tcgetpgrp) is marked by a `T'.

The short answer to the question in the title, by the way, is libm.(so|a). All the functions defined in <math.h> are kept in the maths library; thus you need to link with -lm when using any of them.


Finding files

ld: Output file requires shared library `libfoo.so.1`

The file search strategy of ld and friends varies according to version, but the only default you can reasonably assume is /usr/lib. If you want libraries elsewhere to be searched, specify their directories with the -L option to gcc or ld.

If that doesn't help, check that you have the right file in that place. For a.out, linking with -lfoo makes ld look for libfoo.sa (shared stubs), and if unsuccessful then for libfoo.a (static). For ELF, it looks for libfoo.so then libfoo.a. libfoo.so is usually a symbolic link to libfoo.so.x.


Building your own libraries

Version control

Like any other program, libraries tend to have bugs which get fixed over time. They also may introduce new features, change the effect of existing ones, or remove old ones. This could be a problem for programs using them; what if they were depending on that old feature?

So, we introduce library versioning. We categorise the changes that might be made to a library as `minor' or `major', and we rule that a `minor' change is not allowed to break old programs that are using the library. You can tell the version of a library by looking at its filename (actually, this is, strictly speaking, a lie for ELF; keep reading to find out why): libfoo.so.1.2 has major version 1, minor version 2. The minor version number can be more or less anything --- libc puts a `patchlevel' in it, giving library names like libc.so.5.2.18, and it's also reasonable to put letters, underscores, or more or less any printable ASCII in it.

One of the major differences between ELF and a.out format is in building shared libraries. We look at ELF first, because it's simpler.


ELF? What is it then, anyway?

ELF (Executable and Linking Format) is a binary format originally developed by USL (UNIX System Laboratories) and currently used in Solaris and System V Release 4. Because of its increased flexibility over the older a.out format that Linux was using, the GCC and C library developers decided last year to move to using ELF as the Linux standard binary format also.


Come again?

This section is from the document '/news-archives/comp.sys.sun.misc'.

"ELF ("Executable Linking Format") is the "new, improved" object file format introduced in SVR4. ELF is much more powerful than straight COFF, in that it *is* user-extensible. ELF views an object-file as an arbitrarily long list of sections (rather than an array of fixed size entities), these sections, unlike in COFF, do not HAVE to be in a certain place and do not HAVE to come in any specific order etc. Users can add new sections to object-files if they wish to capture new data. ELF also has a far more powerful debugging format called DWARF (Debugging With Attribute Record Format) - not currently fully supported on linux (but work is underway). A linked list of DWARF DIEs (or Debugging Information Entries) forms the .debug section in ELF. Instead of being a collection of small, fixed-size information records, DWARF DIEs each contain an arbitrarily long list of complex attributes and are written out as a scope-based tree of program data. DIEs can capture a large amount of information that the COFF .debug section simply couldn't (like C++ inheritance graphs etc.)."

"ELF files are accessed via the SVR4 (Solaris 2.0 ?) ELF access library, which provides an easy and fast interface to the more gory parts of ELF. One of the major boons in using the ELF access library is that you will never need to look at an ELF file qua. UNIX file, it is accessed as an Elf *, after an elf_open() call and from then on, you perform elf_foobar() calls on its components instead of messing about with its actual on-disk image (something many COFFers did with impunity)."

The case for/against ELF, and the necessary contortions to upgrade an a.out system to support it, are covered in the ELF-HOWTO and I don't propose to cut/paste them here. The HOWTO should be available in the same place as you found this one.


ELF shared libraries

To build libfoo.so as a shared library, the basic steps look like this:

$ gcc -fPIC -c *.c
$ gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0 *.o
$ ln -s libfoo.so.1.0 libfoo.so.1
$ ln -s libfoo.so.1 libfoo.so
$ LD_LIBRARY_PATH=`pwd`:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH

This will generate a shared library called libfoo.so.1.0, and the appropriate links for ld (libfoo.so) and the dynamic loader (libfoo.so.1) to find it. To test, we add the current directory to LD_LIBRARY_PATH.

When you're happy that the library works, you'll have to move it to, say, /usr/local/lib, and recreate the appropriate links. The link from libfoo.so.1 to libfoo.so.1.0 is kept up to date by ldconfig, which on most systems is run as part of the boot process. The libfoo.so link must be updated manually. If you are scrupulous about upgrading all the parts of a library (e.g. the header files) at the same time, the simplest thing to do is make libfoo.so -> libfoo.so.1, so that ldconfig will keep both links current for you. If you aren't, you're setting yourself up to have all kinds of weird things happen at a later date. Don't say you weren't warned.

$ su
# cp libfoo.so.1.0 /usr/local/lib
# /sbin/ldconfig
# ( cd /usr/local/lib ; ln -s libfoo.so.1 libfoo.so )

Version numbering, sonames and symlinks

Each library has a soname. When the linker finds one of these in a library it is searching, it embeds the soname into the binary instead of the actual filename it is looking at. At runtime, the dynamic loader will then search for a file with the name of the soname, not the library filename. Thus a library called libfoo.so could have a soname libbar.so, and all programs linked to it would look for libbar.so instead when they started.

This sounds like a pointless feature, but it is key to understanding how multiple versions of the same library can coexist on a system. The de facto naming standard for libraries in Linux is to call the library, say, libfoo.so.1.2, and give it a soname of libfoo.so.1. If it's added to a `standard' library directory (e.g. /usr/lib), ldconfig will create a symlink libfoo.so.1 -> libfoo.so.1.2 so that the appropriate image is found at runtime. You also need a link libfoo.so -> libfoo.so.1 so that ld will find the right soname to use at link time.

So, when you fix bugs in the library, or add new functions (any changes that won't adversely affect existing programs), you rebuild it, keeping the soname as it was, and changing the filename. When you make changes to the library that would break existing binaries, you simply increment the number in the soname --- in this case, call the new version libfoo.so.2.0, and give it a soname of libfoo.so.2. Now switch the libfoo.so link to point to the new version and all's well with the world again.

Note that you don't have to name libraries this way, but it's a good convention. ELF gives you the flexibility to name libraries in ways that will confuse the pants off people, but that doesn't mean you have to use it.

Executive summary: supposing that you observe the tradition that major upgrades may break compatibility, minor upgrades may not, then link with

gcc -shared -Wl,-soname,libfoo.so.major -o libfoo.so.major.minor

and everything will be all right.


a.out. Ye olde traditional format

The ease of building shared libraries is a major reason for upgrading to ELF. That said, it's still possible in a.out. Get and read the 20 page document that you will find after unpacking it. I hate to be so transparently partisan, but it should be clear from context that I never bothered myself :-)


ZMAGIC vs QMAGIC

QMAGIC is an executable format just like the old a.out (also known as ZMAGIC) binaries, but which leaves the first page unmapped. This allows for easier NULL dereference trapping as no mapping exists in the range 0-4096. As a side effect your binaries are nominally smaller as well (by about 1K).

Obsolescent linkers support ZMAGIC only, semi-obsolescent support both formats, and current versions support QMAGIC only. This doesn't actually matter, though, as the kernel can still run both formats.

Your `file' command should be able to identify whether a program is QMAGIC.


File Placement

An a.out (DLL) shared library consists of two real files and a symlink. For the `foo' library used throughout this document as an example, these files would be libfoo.sa and libfoo.so.1.2; the symlink would be libfoo.so.1 and would point at the latter of the files. What are these for?

At compile time, ld looks for libfoo.sa. This is the `stub' file for the library, and contains all exported data and pointers to the functions required for run time linking.

At run time, the dynamic loader looks for libfoo.so.1. This is a symlink rather than a real file so that libraries can be updated with newer, bugfixed versions without crashing any application that was using the library at the time. After the new version --- say, libfoo.so.1.3 --- is completely there, running ldconfig will switch the link to point to it in one atomic operation, leaving any program which had the old version still perfectly happy.

DLL libraries (I know that's a tautology --- so sue me) often appear bigger than their static counterparts. They reserve space for future expansion in the form of `holes' which can be made to take no disk space. A simple cp call or using the program makehole will achieve this. You can also strip them after building, as the addresses are in fixed locations. Do not attempt to strip ELF libraries.


``libc-lite''?

A libc-lite is a light-weight version of the libc library built such that it will fit on a floppy and suffice for all of the most menial of UNIX tasks. It does not include curses, dbm, termcap etc code. If your /lib/libc.so.4 is linked to a lite lib, you are advised to replace it with a full version.


Linking: common problems

Send me your linking problems! I probably won't do anything about them, but I will write them up if I get enough ...

Programs link static when you wanted them shared

Check that you have the right links for ld to find each shared library. For ELF this means a libfoo.so symlink to the image, for a.out a libfoo.sa file. A lot of people had this problem after moving from ELF binutils 2.5 to 2.6 --- the earlier version searched more `intelligently' for shared libraries, so they hadn't created all the links. The intelligent behaviour was removed for compatibility with other architectures, and because quite often it got its assumptions wrong and caused more trouble than it solved.

The DLL tool `mkimage' fails to find libgcc, or

As of libc.so.4.5.x and above, libgcc is no longer shared. Hence you must replace occurrences of `-lgcc' on the offending line with `gcc -print-libgcc-file-name` (complete with the backquotes).

Also, delete all /usr/lib/libgcc* files. This is important.

__NEEDS_SHRLIB_libc_4 multiply defined messages

are another consequence of the same problem.

``Assertion failure'' message when rebuilding a DLL ?

This cryptic message most probably means that one of your jump table slots has overflowed because too little space has been reserved in the original jump.vars file. You can locate the culprit(s) by running the `getsize' command provided in the tools-2.17.tar.gz package. Probably the only solution, though, is to bump the major version number of the library, forcing it to be backward incompatible.

ld: output file needs shared library libc.so.4

This usually happens when you are linking with libraries other than libc (e.g. X libraries), and use the -g switch on the link line without also using -static.

The .sa stubs for the shared libraries usually have an undefined symbol _NEEDS_SHRLIB_libc_4 which gets resolved from the libc.sa stub. However with -g you end up linking with libg.a or libc.a and thus this symbol never gets resolved, leading to the above error message.

In conclusion, add -static when compiling with the -g flag, or don't link with -g. Quite often you can get enough debugging information by compiling the individual files with -g, and linking without it.


Dynamic Loading

This section is a tad short right now; it will be expanded over time as I gut the ELF howto


Concepts

Linux has shared libraries, as you will by now be sick of hearing if you read the whole of the last section at a sitting. Some of the matching-names-to-places work which was traditionally done at link time must be deferred to load time.


Error messages

Send me your link errors! I won't do anything about them, but I might write them up ...

can't load library: /lib/libxxx.so, Incompatible version

(a.out only) This means that you don't have the correct major version of the xxx library. No, you can't just make a symlink to another version that you do have; if you are lucky this will cause your program to segfault. Get the new version. A similar situation with ELF will result in a message like

ftp: can't load library 'libreadline.so.2'
warning using incompatible library version xxx

(a.out only) You have an older minor version of the library than the person who compiled the program used. The program will still run. Probably. An upgrade wouldn't hurt, though.


Controlling the operation of the dynamic loader

There are a range of environment variables that the dynamic loader will respond to. Most of these are more use to ldd than they are to the average user, and can most conveniently be set by running ldd with various switches. They include

  • LD_BIND_NOW --- normally, functions are not `looked up' in libraries until they are called. Setting this flag causes all the lookups to happen when the library is loaded, giving a slower startup time. It's useful when you want to test a program to make sure that everything is linked.

  • LD_PRELOAD can be set to a file containing `overriding' function definitions. For example, if you were testing memory allocation strategies, and wanted to replace `malloc', you could write your replacement routine, compile it into malloc.o and then

    $ LD_PRELOAD=malloc.o; export LD_PRELOAD
    $ some_test_program
    
    LD_ELF_PRELOAD and LD_AOUT_PRELOAD are similar, but only apply to the appropriate type of binary. If LD_something_PRELOAD and LD_PRELOAD are set, the more specific one is used.

  • LD_LIBRARY_PATH is a colon-separated list of directories in which to look for shared libraries. It does not affect ld; it only has effect at runtime. Also, it is disabled for programs that run setuid or setgid. Again, LD_ELF_LIBRARY_PATH and LD_AOUT_LIBRARY_PATH can also be used to direct the search differently for different flavours of binary. LD_LIBRARY_PATH shouldn't be necessary in normal operation; add the directories to /etc/ld.so.conf and rerun ldconfig instead.

  • LD_NOWARN applies to a.out only. When set (e.g. with LD_NOWARN=true; export LD_NOWARN) it stops the loader from issuing non-fatal warnings (such as minor version incompatibility messages).

  • LD_WARN applies to ELF only. When set, it turns the usually fatal ``Can't find library'' messages into warnings. It's not much use in normal operation, but important for ldd.

  • LD_TRACE_LOADED_OBJECTS applies to ELF only, and causes programs to think they're being run under ldd:

    $ LD_TRACE_LOADED_OBJECTS=true /usr/bin/lynx
            libncurses.so.1 => /usr/lib/libncurses.so.1.9.6
            libc.so.5 => /lib/libc.so.5.2.18
    

Writing programs with dynamic loading

This is very close to the way that Solaris 2.x dynamic loading support works, if you're familiar with that. It is covered extensively in H J Lu's ELF programming document, and the dlopen(3) manual page, which can be found in the ld.so package. Here's a nice simple example though: link it with -ldl

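#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
  void *libc;
  int (*printf_call)(const char *, ...);

  /* Adjust the path to match the libc that ldd reports on your system. */
  libc = dlopen("/lib/libc.so.5", RTLD_LAZY);
  if (libc == NULL)
  {
    fprintf(stderr, "dlopen: %s\n", dlerror());
    return 1;
  }

  /* Look up printf by name and call it through the pointer. */
  printf_call = (int (*)(const char *, ...)) dlsym(libc, "printf");
  if (printf_call != NULL)
    (*printf_call)("hello, world\n");

  dlclose(libc);
  return 0;
}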

Contacting the developers

Bug reports

Start by narrowing the problem down. Is it specific to Linux, or does it happen with gcc on other systems? Is it specific to the kernel version? Library version? Does it go away if you link static? Can you trim the program down to something short that demonstrates the bug?

Having done that, you'll know what program(s) the bug is in. For GCC, the bug reporting procedure is explained in the info file. For ld.so or the C or maths libraries, send mail to linux-gcc@vger.rutgers.edu. If possible, include a short and self-contained program that exhibits the bug, and a description both of what you want it to do, and what it actually does.


Helping with development

If you want to help with the development effort for GCC or the C library, the first thing to do is join the linux-gcc@vger.rutgers.edu mailing list. If you just want to see what the discussion is about, there are list archives at . The second and subsequent things depend on what you want to do!


The Remains

The Credits

"Only presidents, editors, and people with tapeworms have the right to use the editorial ``we''." (Mark Twain)

This HOWTO is based very closely on Mitchum DSouza's GCC-FAQ; most of the information (not to mention a reasonable amount of the text) in it comes directly from that document. Instances of the first person pronoun in this HOWTO could refer to either of us; generally the ones that say ``I have not tested this; don't blame me if it toasts your hard disk/system/spouse'' apply to both of us.

Contributors to this document have included (in ASCII ordering by first name) Andrew Tefft, Axel Boldt, Bill Metzenthen, Bruce Evans, Bruno Haible, Daniel Barlow, Daniel Quinlan, David Engel, Dirk Hohndel, Eric Youngdale, Fergus Henderson, H.J. Lu, Jens Schweikhardt, Kai Petzke, Michael Meissner, Mitchum DSouza, Olaf Flebbe, Paul Gortmaker, Rik Faith, Steven S. Dick, Tuomas J Lukka, and of course Linus Torvalds, without whom the whole exercise would have been pointless, let alone impossible :-)

Please do not feel offended if your name has not appeared here and you have contributed to this document (either as HOWTO or as FAQ). Email me and I will rectify it.


Translations


Feedback

is welcomed. Mail me at daniel.barlow@linux.org. My PGP public key (ID 5F263625) is available from my web pages, if you feel the need to be secretive about things.


Legalese

All trademarks used in this document are acknowledged as being owned by their respective owners.

This document is copyright (C) 1996,1999 Daniel Barlow <dan@detached.demon.co.uk>. It may be reproduced and distributed in whole or in part, in any medium physical or electronic, as long as this copyright notice is retained on all copies. Commercial redistribution is allowed and encouraged; however, the author would like to be notified of any such distributions.

All translations, derivative works, or aggregate works incorporating any Linux HOWTO documents must be covered under this copyright notice. That is, you may not produce a derivative work from a HOWTO and impose additional restrictions on its distribution. Exceptions to these rules may be granted under certain conditions; please contact the Linux HOWTO coordinator at the address given below.

In short, we wish to promote dissemination of this information through as many channels as possible. However, we do wish to retain copyright on the HOWTO documents, and would like to be notified of any plans to redistribute the HOWTOs.

If you have questions, please contact Tim Bynum, the Linux HOWTO coordinator, at linux-howto@sunsite.unc.edu via email.
