The 4 Stages of C Compilation

C is a compiled language, which means the source code is translated into machine code ahead of time, at “compile time”, instead of being interpreted when the program runs.

In summary, compilation can be split into 4 stages:

  • preprocessing
  • compilation
  • assembly
  • linking

Figure: Flow of Compilation

To demonstrate these steps I’m going to use the gcc (GNU Compiler Collection) command. Other compilers go through the same stages, though the details of what happens in each one may vary slightly.

This post will walk through each of the 4 stages of compiling the simplest “Hello World” C program:

/*
 * hello.c
 */
#include <stdio.h>

#define HI "Hello, World"
// This comment will be stripped during preprocessing

int main()
{
	printf("%s\n", HI);
}

1. Preprocessing

C provides certain language facilities by means of a preprocessor, which is conceptually a separate first step in compilation. In this stage, lines starting with the # character are interpreted by the preprocessor as preprocessor directives.

The most frequently used features are:

  • #include, to include the contents of a file during compilation, and
  • #define, to replace a token by an arbitrary sequence of characters.
  • Other features include conditional compilation and macros with arguments.

Before interpreting commands, the preprocessor does some initial processing. This includes joining continued lines (lines ending with a \) and stripping comments.

To perform this step using gcc, pass the -E option, and use -o to write the result to the file hello.i:

gcc -E hello.c -o hello.i

Contents in hello.i:

// bunch of lines omitted for brevity

extern int __vsnprintf_chk (char * restrict, size_t, int, size_t,
       const char * restrict, va_list);
# 408 "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/stdio.h" 2 3 4
# 19 "hello.c" 2

int main()
{
    printf("%s\n", "Hello, World");
}

As you can see, the preprocessor simply pasted in the contents of the file named by #include and replaced the macro HI with "Hello, World".

2. Compilation

The second stage of compilation is, confusingly enough, also called compilation. Here the preprocessed code is translated into human-readable assembly instructions for the target processor.

Pass the -S option to perform this step:

gcc -S hello.i -o hello.s

Snippets of hello.s:

_main:                                  ## @main
	.cfi_startproc
## %bb.0:
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	subq	$16, %rsp
	leaq	L_.str(%rip), %rdi
	leaq	L_.str.1(%rip), %rsi
	movb	$0, %al
	callq	_printf
	xorl	%ecx, %ecx
	movl	%eax, -4(%rbp)          ## 4-byte Spill
	movl	%ecx, %eax
	addq	$16, %rsp
	popq	%rbp
	retq
	.cfi_endproc
                                        ## -- End function

3. Assembly

During the third stage, the assembler translates the assembly instructions into object code (machine instructions), which is simply a sequence of 0s and 1s such as 0010111010010.

Pass the -c option to gcc:

gcc -c hello.s -o hello.o

Running the above command generates the object file hello.o. The contents of this file are in a binary format and can be inspected using hexdump:

hexdump hello.o

It will look like:

0000000 cf fa ed fe 07 00 00 01 03 00 00 00 01 00 00 00
0000010 04 00 00 00 08 02 00 00 00 20 00 00 00 00 00 00
0000020 19 00 00 00 88 01 00 00 00 00 00 00 00 00 00 00
0000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000040 a0 00 00 00 00 00 00 00 28 02 00 00 00 00 00 00
0000050 a0 00 00 00 00 00 00 00 07 00 00 00 07 00 00 00
0000060 04 00 00 00 00 00 00 00 5f 5f 74 65 78 74 00 00
0000070 00 00 00 00 00 00 00 00 5f 5f 54 45 58 54 00 00
... 

4. Linking

The object file is composed of machine instructions that the processor understands but some pieces of the program are out of order or missing. To produce an executable program, the existing pieces have to be rearranged and the missing ones filled in. This process is called linking.

The linker will arrange the pieces of object code so that functions in some pieces can successfully call functions in other ones. It will also add pieces containing the instructions for library functions used by the program.

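In the case of the “Hello, World” program, the linker resolves the call to printf against the C standard library: with static linking it copies the object code for printf into the executable, and with dynamic linking (the usual default) it records a reference to the shared library that is resolved when the program is loaded.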

The result of this stage is the final executable program. When run without options, gcc names this file a.out. To name the program something else, pass the -o option:

gcc hello.o -o helloworld	# link the object file into an executable
./helloworld			# run the program

Last but not least

As you have seen, compilation can be explicitly separated into 4 stages. In practice, however, you usually compile source files directly into the final executable by running gcc with no stage option, using only -o to name the program. gcc then performs all 4 stages in a single step:

gcc hello.c -o helloworld
./helloworld