Examining the Compilation Process

Examining the Compilation Process. Part 1.

This article, and the one to follow, are based on a Software Development class I taught a few years ago. The students in this class were non-programmers who had been hired to receive bug reports for a compiler product. As Analysts, they had to understand the software compilation process in some detail, even though some of them had never written a single line of code. It was a fun class to teach, so I'm hoping that the subject translates into interesting reading.

In this article, I'm going to discuss the process that the computer goes through to compile source code into an executable program. I won't be clouding the issue with the Make environment, or Revision Control, like I necessarily did in the class. For this article, we're only going to discuss what happens after you type gcc test.c.

Broadly speaking, the compilation process is broken down into 4 steps: preprocessing, compilation, assembly, and linking. We'll discuss each step in turn.

Before we can discuss compiling a program, we really need to have a program to compile. Our program needs to be simple enough that we can discuss it in detail, but broad enough that it exercises all of the concepts that I want to discuss. Here is a program that I hope fits the bill:

#include <stdio.h>

// This is a comment.

#define STRING "This is a test"
#define COUNT (5)

int main () {
    int i;

    for (i=0; i<COUNT; i++) {
        puts(STRING);
    }

    return 1;
}

If we put this program in a file called test.c, we can compile it with the simple command: gcc test.c. What we end up with is an executable file called a.out. The name a.out has some history behind it. Back in the days of the PDP computer, a.out stood for “assembler output.” Today, it simply means an older executable file format. Modern versions of Unix and Linux use the ELF executable file format. The ELF format is much more sophisticated. So even though the default filename of the output of gcc is “a.out,” it's actually in ELF format. Enough history, let's run our program.

When we type ./a.out, we get:

This is a test

This is a test

This is a test

This is a test

This is a test

This, of course, doesn't come as a surprise, so let's discuss the steps that gcc went through to create the a.out file from the test.c file.

As mentioned earlier, the first thing the compiler does is send our source code through the C Preprocessor. The C Preprocessor is responsible for 3 tasks: text substitution, stripping comments, and file inclusion. Text substitution and file inclusion are requested in our source code using preprocessor directives. The lines in our code that begin with the “#” character are preprocessor directives. The first one requests that a standard header, stdio.h, be included into our source file. The other two request a string substitution to take place in our code. By using gcc's “-E” flag, we can see the results of only running the C preprocessor on our code. The stdio.h file is fairly large, so I'll clean up the results a little.

gcc -E test.c > test.txt

# 1 "test.c"

# 1 "/usr/include/stdio.h" 1 3 4

# 28 "/usr/include/stdio.h" 3 4

# 1 "/usr/include/features.h" 1 3 4

# 330 "/usr/include/features.h" 3 4

# 1 "/usr/include/sys/cdefs.h" 1 3 4

# 348 "/usr/include/sys/cdefs.h" 3 4

# 1 "/usr/include/bits/wordsize.h" 1 3 4

# 349 "/usr/include/sys/cdefs.h" 2 3 4

# 331 "/usr/include/features.h" 2 3 4

# 354 "/usr/include/features.h" 3 4

# 1 "/usr/include/gnu/stubs.h" 1 3 4

# 653 "/usr/include/stdio.h" 3 4

extern int puts (__const char *__s);

int main () {

 int i;

 for (i=0; i<(5); i++) {

 puts("This is a test");

 }

 return 1;

}

The first thing that becomes obvious is that the C Preprocessor has added a lot to our simple little program. Before I cleaned it up, the output was over 750 lines long. So, what was added, and why? Well, our program requested that the stdio.h header be included into our source. Stdio.h, in turn, requested a whole bunch of other header files. So, the preprocessor made a note of the file and line number where the request was made and made this information available to the next steps in the compilation process. Thus, the lines,

# 28 "/usr/include/stdio.h" 3 4

# 1 "/usr/include/features.h" 1 3 4

indicate that the features.h file was requested on line 28 of stdio.h. The preprocessor creates a line number and file name entry before what might be “interesting” to subsequent compilation steps, so that if there is an error, the compiler can report exactly where the error occurred.

When we get to the lines,

# 653 "/usr/include/stdio.h" 3 4

extern int puts (__const char *__s);

We see that puts() is declared as an external function that returns an integer and accepts a single parameter, a pointer to constant characters (that is, a string). If something were to go horribly wrong with this declaration, the compiler could tell us that the function was declared on line 653 of stdio.h. It's interesting to note that puts() isn't defined, only declared. That is, we don't get to see the code that actually makes puts() work. We'll talk about how puts() and other common functions get defined later.

Also notice that none of our program comments are left in the preprocessor output, and that all of the string substitutions have been performed. At this point, the program is ready for the next step of the process, compilation into assembly language.

We can examine the results of the compilation process by using gcc's -S flag.

gcc -S test.c

This command results in a file called test.s that contains the assembly code implementation of our program. Let's take a brief look.

       .file   "test.c"

       .section        .rodata

.LC0:

       .string "This is a test"

       .text

.globl main

       .type   main, @function

main:

       leal    4(%esp), %ecx

       andl    $-16, %esp

       pushl   -4(%ecx)

       pushl   %ebp

       movl    %esp, %ebp

       pushl   %ecx

       subl    $20, %esp

       movl    $0, -8(%ebp)

       jmp     .L2

.L3:

       movl    $.LC0, (%esp)

       call    puts

       addl    $1, -8(%ebp)

.L2:

       cmpl    $4, -8(%ebp)

       jle     .L3

       movl    $1, %eax

       addl    $20, %esp

       popl    %ecx

       popl    %ebp

       leal    -4(%ecx), %esp

       ret

       .size   main, .-main

       .ident  "GCC: (GNU) 4.2.4 (Gentoo 4.2.4 p1.0)"

       .section        .note.GNU-stack,"",@progbits

My assembly language skills are a bit rusty, but there are a few features that we can spot fairly readily. We can see that our message string has been moved to a different part of memory and given the name .LC0. We can also see that there are quite a few steps needed to start and exit our program. You might be able to follow the implementation of the for loop at .L2; it's simply a comparison (cmpl) and a “Jump if Less Than or Equal” (jle) instruction. The comparison is against 4, so the loop repeats while i is at most 4, which is the same test as i<5. The initialization was done in the movl instruction just above the .L3 label. The call to puts() is fairly easy to spot. Somehow the Assembler knows that it can call the puts() function by name and not a funky label like the rest of the memory locations. We'll discuss this mechanism next when we talk about the final stage of compilation, linking. Finally, our program ends with a return (ret).

The next step in the compilation process is to assemble the resulting Assembly code into an object file. We'll discuss object files in more detail when we discuss linking. Suffice it to say that assembling is the process of converting (relatively) human readable assembly language into machine readable machine language.

Linking is the final stage that either produces an executable program file or an object file that can be combined with other object files to produce an executable file. It's at the link stage that we finally resolve the problem with the call to puts(). Remember that puts() was declared in stdio.h as an external function. This means that the function will actually be defined, or implemented, elsewhere. If we had several source files in our program, we might have declared some of our functions as extern and implemented them in different files; such functions would be available anywhere in our source files by nature of having been declared extern. Until the compiler knows exactly where all of these functions are implemented, it simply uses a place-holder for the function call. The linker will resolve all of these dependencies and plug in the actual address of the functions.

The linker also does a few additional tasks for us. It combines our program with some standard routines that are needed to make our program run. For example, there is standard code required at the beginning of our program that sets up the running environment, such as passing in command-line parameters and environment variables. Also, there is code that needs to be run at the end of our program so that it can pass back a return code, among other tasks. It turns out that this is no small amount of code. Let's take a look.

If we compile our example program, as we did above, we get an executable file that is 6885 bytes in size. However, if we instruct the compiler to not go through the linking stage, by using the -c flag (gcc -c test.c -o test.o), we get an object module that is 888 bytes in size. The difference in file size is the code to start up and terminate our program, along with the code that allows us to call the puts() function in libc.so.

At this point, we've looked at the compilation process in some detail. I hope this has been interesting to you. Next time, we'll discuss the linking process in a bit more detail and consider some of the optimization features that gcc provides.

Examining the Compilation Process. Part 2.

In my last article, I discussed, in quite some detail, the process that GCC uses to convert a C source file into an executable program file. These steps included preprocessing the source to remove comments, include other files as required, and perform string substitution. The resulting file was then compiled into assembly language. The assembly language output was then used to create an object file containing machine language, which was then linked with other standardized libraries to create an executable.

As mentioned in the previous article, both that article and this one are based on a software development class I taught a few years ago. Some people are going to find this to be quite a dry subject; others will be delighted to see some of the magic that the compiler performs on our creations in order to make them executable. I happen to fall into the latter category and I hope you do too.

And so, last time I concluded my article with a very light discussion of the linking process. I intend to go a bit deeper into the linking process in this article, and to discuss some of the optimizations that GCC can perform for you.

Before we get too deep into things, let's see a quick example of what the linking process does for us. For this example, we have two files, main.c and funct.c.

main.c:

#include <stdio.h>

extern void funct(void);

int main () {
    funct();
}

Yes, this is a pretty simple program. Notice that we've not defined the function, funct(), only declared it as an external function that accepts no parameters and returns no value. We will define this function in the next file, funct.c:

#include <stdio.h>

void funct (void) {
    puts("Hello World.");
}

Most of you, by now, see where this is headed. This is the proverbial “Hello World” program, only we've broken it up into two separate files, for the sake of instruction. In a real project, you'd use the make program to arrange for all of the files to be compiled, but we're going to do the compilation by hand.

First we compile the main.c into a main.o file:

gcc main.c -c -o main.o

This command tells GCC to compile the source file, but not to run the linker, so that we are left with an object file, which we want named main.o.

Compiling funct.c is much the same:

gcc funct.c -c -o funct.o

Now we can call GCC one more time, only this time, we want it to run the linker:

gcc main.o funct.o -o hello

In this example, we supplied the names of a couple of “.o” object files, requested that they all be linked, and that the resulting executable be named hello.

Would you be surprised if executing ./hello resulted in “Hello World.”?

I didn't think so. So why would we take the simplest program possible and split it into two separate files? Well, because we can. And what we gain from doing it this way is that if we make a change to only one of the files, we don't have to recompile any of the files that didn't change; we simply re-link the already existing object files to the new object file that we created when we compiled the source file that we changed. This is where the make utility comes in handy as it keeps track of what needs to be recompiled based on what files have been changed since the last compilation.

Essentially, let's say that we had a very large software project. We could write it as one file and simply recompile it as needed. However, this would make it difficult for more than one person to work on the project, as only one of us could work at a given time. Also, it would mean that the compilation process would be quite time consuming since it would have to compile several thousands of lines of C source. But if we split the project into several smaller files, more than one person can work on the project and we only have to compile those files that get changed.

The Linux linker is pretty powerful. The linker is capable of linking object files together, as in the example above. It's also able to create shared libraries that can be loaded into our program at run time. While we won't discuss the creation of shared libraries, we will see a few examples that the system already has.

In my last article, I used a source file called test.c for the sake of discussion:

#include <stdio.h>

// This is a comment.

#define STRING "This is a test"

#define COUNT (5)

int main () {

int i;

for (i=0; i<COUNT;i++) {

puts(STRING);

}

return 1;

}

We can compile this program with:

gcc test.c -o test

We can use the ldd command to get a list of shared libraries that our program depends upon.

ldd test

And we see:

linux-gate.so.1 => (0xffffe000)

libc.so.6 => /lib/libc.so.6 (0xb7e3c000)

/lib/ld-linux.so.2 (0xb7f9a000)

The libc.so.6 entry is fairly easy. It's the standard C library that contains such things as puts() and printf(). We can also see which file provides this library, /lib/libc.so.6. The other two are a bit more interesting. The ld-linux.so.2 entry is a library that finds and loads all of the other shared libraries that a program needs in order to run, such as the libc mentioned earlier. The linux-gate.so.1 entry is also interesting. This library is actually just a virtual library created by the Linux kernel that lets a program know how to make system calls. Some systems support the sysenter mechanism, while others make system calls via the interrupt mechanism, which is considerably slower. We'll be talking about system calls next.

System calls are a standardized interface for interacting with the operating system. Long story short, how do you allocate memory? How do you output a string to the console? How do you read a file? These functions are provided by system calls. Let's take a closer look.

We can see what system calls a program uses by using the strace command. For example, let's take a look at our test program above with the strace program.

strace ./test

This command results in output similar to what we see below, except that I've added line numbers for the sake of convenient reference.

1 execve("./test", ["./test"], [/* 56 vars */]) = 0
2 brk(0) = 0x804b000
3 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
4 open("/etc/ld.so.cache", O_RDONLY) = 3
5 fstat64(3, {st_mode=S_IFREG|0644, st_size=149783, ...}) = 0
6 mmap2(NULL, 149783, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f79000
7 close(3) = 0
8 open("/lib/libc.so.6", O_RDONLY) = 3
9 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220g\1\0004\0\0\0"..., 512) = 512
10 fstat64(3, {st_mode=S_IFREG|0755, st_size=1265948, ...}) = 0
11 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f78000
12 mmap2(NULL, 1271376, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7e41000
13 mmap2(0xb7f72000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x131) = 0xb7f72000
14 mmap2(0xb7f75000, 9808, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7f75000
15 close(3) = 0
16 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7e40000
17 set_thread_area({entry_number:-1 -> 6, base_addr:0xb7e406c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
18 mprotect(0xb7f72000, 8192, PROT_READ) = 0
19 mprotect(0x8049000, 4096, PROT_READ) = 0
20 mprotect(0xb7fb9000, 4096, PROT_READ) = 0
21 munmap(0xb7f79000, 149783) = 0
22 fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 3), ...}) = 0
23 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f9d000
24 write(1, "This is a test\n", 15This is a test
25 ) = 15
26 write(1, "This is a test\n", 15This is a test
27 ) = 15
28 write(1, "This is a test\n", 15This is a test
29 ) = 15
30 write(1, "This is a test\n", 15This is a test
31 ) = 15
32 write(1, "This is a test\n", 15This is a test

Lines 1 and 2 are simply the calls needed by the shell to execute an external command. In lines 3 through 8 we see the system trying to load various shared libraries. Line 8 is where the system tries to load libc. Line 9 shows us the results of the first read of the library file. In lines 8-15, we see the system mapping the contents of the libc file into memory. This is how the system loads a library into memory for use by our program; it simply reads the file into memory and gives our program a pointer to the block of memory where the library got loaded. Now our program can call functions in libc as though they were a part of our program.

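Line 22 is where the C library checks what standard output (file descriptor 1) is connected to before sending its output there. The S_IFCHR mode tells it that the output is a character device, in this case a terminal.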

Finally, we see our output getting sent out in lines 24-32. The strace command lets us see what our program is doing under the hood. It's good for learning about the system, as in this article, as well as helping to find what a misbehaved program is attempting to do. I've had numerous occasions to run strace on a program that had apparently “locked up” only to find that it was blocking on some sort of file read or such. Strace is a sure-fire way of locating those types of problems.

Finally, GCC supports various levels of optimization, and I'd like to discuss just what that means.

Let's take a look at another program, test1.c:

#include <stdio.h>

int main () {

int i;

for(i=0;i<4;i++) {

puts("Hello");

}

return 0;

}

When we convert that to assembly language with the gcc -S command, we get:

.file "t2.c"

.section .rodata

.LC0:

.string "Hello"

.text

.globl main

.type main, @function

main:

leal 4(%esp), %ecx

andl $-16, %esp

pushl -4(%ecx)

pushl %ebp

movl %esp, %ebp

pushl %ecx

subl $20, %esp

movl $0, -8(%ebp)

jmp .L2

.L3:

movl $.LC0, (%esp)

call puts

addl $1, -8(%ebp)

.L2:

cmpl $3, -8(%ebp)

jle .L3

movl $0, %eax

addl $20, %esp

popl %ecx

popl %ebp

leal -4(%ecx), %esp

ret

.size main, .-main

.ident "GCC: (GNU) 4.2.4 (Gentoo 4.2.4p1.0)"

.section .note.GNU-stack,"",@progbits

We can see the for loop starting at .L3. It runs until the jle instruction right after .L2. Now let's compile our earlier test.c, the version that loops five times and returns 1, into assembly language, but with -O3 optimization turned on:

gcc -S -O3 test.c

What we get is:

main:

leal 4(%esp), %ecx

andl $-16, %esp

pushl -4(%ecx)

pushl %ebp

movl %esp, %ebp

pushl %ecx

subl $4, %esp

movl $.LC0, (%esp)

call puts

movl $.LC0, (%esp)

call puts

movl $.LC0, (%esp)

call puts

movl $.LC0, (%esp)

call puts

movl $.LC0, (%esp)

call puts

addl $4, %esp

movl $1, %eax

popl %ecx

popl %ebp

leal -4(%ecx), %esp

ret

.size main, .-main

.ident "GCC: (GNU) 4.2.4 (Gentoo 4.2.4p1.0)"

.section .note.GNU-stack,"",@progbits

Here we can see that the for loop has been completely unrolled: gcc has replaced it with 5 separate calls to the puts() library function. The entire for loop is gone! Nice.

GCC is an extremely sophisticated compiler that is even capable of factoring out loop invariants. Consider this code snippet:

for (i=0; i<5; i++) {

x=23;

do_something();

}

If you write a quick program to exercise this code snippet, you will see that the assignment to the x variable gets factored to a point outside of the for loop, as long as the value of x isn't used inside the loop. Essentially, GCC, with -O3, rewrites the code into this:

x=23;

for (i=0; i<5; i++) {

do_something();

}

Very nice.

Bonus points for anyone who can guess what gcc -O3 does to this program:

#include <stdio.h>

int main () {

int i;

int j;

for(i=0;i<4;i++) {

j=j+2;

}

return 0;

}

Nah, I always hated bonus questions, so I'll just give you the answer. GCC factors our program out completely. Since it does nothing, GCC doesn't write anything. Here is the output of that program:

.file "t3.c"

.text

.p2align 4,,15

.globl main

.type main, @function

main:

leal 4(%esp), %ecx

andl $-16, %esp

pushl -4(%ecx)

xorl %eax, %eax

pushl %ebp

movl %esp, %ebp

pushl %ecx

popl %ecx

popl %ebp

leal -4(%ecx), %esp

ret

.size main, .-main

.ident "GCC: (GNU) 4.2.4 (Gentoo 4.2.4p1.0)"

.section .note.GNU-stack,"",@progbits

As you can see, the program starts up and immediately terminates. The for loop is gone, as well as the assignment to the j variable. Very, very nice.

So, GCC is a very sophisticated compiler that is capable of handling very large projects and performing some very sophisticated optimizations on a given source file. I hope that reading this article, and the one before it, has led you to a greater appreciation of just how intelligent the Linux compiler suite actually is, as well as given you some understanding that you can use to debug your own programs.