『C』 Program Translation and Execution

爱喝可乐的炸鸡

Category: 『C Language』  Tags: C, preprocessing, compilation, assembly, linking

Article link: https://blog.csdn.net/sss_0916/article/details/85079926

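In any implementation of ANSI C there are two distinct environments. The first is the translation environment, in which source code is converted into executable machine instructions. The second is the execution environment, which actually runs the code. The standard states explicitly that the two environments need not be on the same machine. A cross compiler, for example, runs on one machine but produces executable code that runs on a different type of machine.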

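Program translation and execution

Translation

Each source file that makes up a program is converted separately into object code by the compiler.
The linker then binds the object files together into a single, complete executable program.
The linker also pulls in any functions from the standard C library that the program uses, and it can search the programmer's own libraries and link in whatever functions are needed from them.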

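Execution

A program executes in the following stages:

First, the program must be loaded into memory. In a hosted environment (one with an operating system), the operating system usually does this. In a freestanding environment, loading must be arranged manually, or the executable code may be placed in read-only memory.
Next, execution begins and the main function is called.
Then the program's code runs. The program uses a runtime stack to hold each function's local variables and return address. It can also use static memory; variables stored there keep their values for the entire run of the program.
Finally, the program terminates, either normally when main returns or abnormally.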

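The translation pipeline

Preprocessing

Command: gcc -E ×××.c -o ×××.i
Main tasks:
Copy the contents of the included header files into ×××.i.
Remove comments.
Expand macros.
Process conditional compilation directives.

Code demo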

#include <stdio.h>

#define N 5

/*** test case ***/

int main(){

#if 1
	printf("hello, world! month: %d\n", N);
#else
	printf("hehe, world! month: %d\n", N);
#endif

	return 0;
}

After preprocessing

835 # 943 "/usr/include/stdio.h" 3 4
836 
837 # 2 "pretreatment.c" 2
838 
839 
840 
841 
842 
843 int main(){
844 
845 
846  printf("hello, world! month: %d\n", 5);
847 
848 
849 
850 
851  return 0;
852 }

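The first 800-odd lines are the expansion of the stdio.h header. After preprocessing, the comment has been removed, the macro has been expanded, and the conditional compilation has been resolved.

Predefined symbols

__FILE__	Name of the source file being compiled
__LINE__	Current line number in the file
__DATE__	Date on which the file was compiled
__TIME__	Time at which the file was compiled
__STDC__	1 if the compiler conforms to ANSI C; otherwise undefined

These predefined symbols are all built into the language.

Code demo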

#include <stdio.h>

int main(){

	printf("files: %s\tline: %d\n", __FILE__, __LINE__);
	printf("date: %s\t\ttime: %s\n", __DATE__, __TIME__);
	printf("stdc: %d\n", __STDC__);

	return 0;
}

Output

[sss@aliyun order]$ !gcc
gcc pre_define_symbol.c -o pre_define_symbol
[sss@aliyun order]$ ./pre_define_symbol 
files: pre_define_symbol.c	line: 5
date: May  5 2019		time: 19:59:18
stdc: 1

Macros

Syntax:

#define constant	100
#define reg			register
#define do_forever	for(;;)
#define CASE		break;case
#define DEBUG_PRINT	printf("File: %s. Line: %d."\
							"x = %d, y = %d, z = %d",\
							__FILE__, __LINE__,\
							x, y, z)

Macro constants

Syntax:

#define name stuff

Code example

#include <stdio.h>

#define N 10

int main(){

	printf("hello, world! %d\n", N);

	return 0;
}

Output

[sss@aliyun order]$ gcc macro_constant.c -o macro_constant
[sss@aliyun order]$ ./macro_constant 
hello, world! 10

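Function-like macros

Syntax:

#define name(parameter_list) stuff
Here parameter_list is a comma-separated list of symbols that may appear in stuff.

Note: the opening parenthesis of the parameter list must immediately follow name. If there is any whitespace between them, the parameter list is interpreted as part of stuff.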

Code demo

#include <stdio.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

int main(){
	int a = 1;
	int b = 2;

	printf("max: %d\n", MAX(a, b));

	return 0;
}

Output

[sss@aliyun order]$ gcc macro_function.c -o macro_function
[sss@aliyun order]$ ./macro_function 
max: 2

A caveat with function-like macros

Code demo

#include <stdio.h>

#define SQUARE(X) X * X

int main(){
	int num = 3;

	printf("square(num + 1): %d\n", SQUARE(num + 1));

	return 0;
}

Output

[sss@aliyun order]$ !gcc
gcc macro_function.c -o macro_function
[sss@aliyun order]$ ./macro_function 
square(num + 1): 7

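As this shows, a function-like macro is plain text substitution. SQUARE(num + 1) expands to num + 1 * num + 1, which evaluates to 3 + (1 * 3) + 1 = 7 rather than 16: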

SQUARE(num + 1)
num + 1 * num + 1

#undef

Removes a macro definition.
Syntax:

#define N 10
#undef N

# and ##

First, consider this code:

#include <stdio.h>

int main(){
	char* p = "hello, ""world!\n";

	printf("hello, ""world!\n");
	printf("%s", p);

	return 0;
}

Output

[sss@aliyun order]$ gcc test.c -o test
[sss@aliyun order]$ ./test 
hello, world!
hello, world!

This shows that adjacent string literals are concatenated automatically.

# usage example

#include <stdio.h>

#define PRINT(FORMAT, VALUE)\
		printf("the value is "FORMAT"\n", VALUE);

/*** The # operator turns a macro argument into the corresponding string literal ***/
#define PRINT2(FORMAT, VALUE)\
		printf("the value of "#VALUE" is "FORMAT"\n", VALUE);

int main(){
	int i = 10;

	PRINT("%d", 10);
	PRINT2("%d", i + 3);

	return 0;
}

Output

[sss@aliyun order]$ !gcc
gcc test.c -o test
[sss@aliyun order]$ ./test 
the value is 10
the value of i + 3 is 13

## usage example

#include <stdio.h>

#define PRINT(FORMAT, VALUE)\
		printf("the value is "FORMAT"\n", VALUE);

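/*** The # operator turns a macro argument into the corresponding string literal ***/
#define PRINT2(FORMAT, VALUE)\
		printf("the value of "#VALUE" is "FORMAT"\n", VALUE);

/*  
 * The ## operator pastes the tokens on either side of it into a single token,
 * which lets a macro build identifiers from separate pieces of text
 */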
#define ADD_TO_SUM(num, value) \
	sum##num += value;

int main(){
	int i = 10;
	int sum5 = 0;

	PRINT("%d", 10);
	PRINT2("%d", i + 3);

	ADD_TO_SUM(5, 10);

	printf("sum5: %d\n", sum5);

	return 0;
}

Output

[sss@aliyun order]$ gcc macro_20190505.c -o macro
[sss@aliyun order]$ ./macro
the value is 10
the value of i + 3 is 13
sum5: 10

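Macros versus functions

Property	Macro	Function
Code size	The macro's code is inserted into the program at every use. Unless the macro is very small, the program can grow considerably.	The function's code appears in only one place; every use calls that same code.
Execution speed	Usually faster, since there is no call overhead.	Call and return add overhead, so somewhat slower.
Operator precedence	Macro arguments are evaluated in the context of the surrounding expression. Unless they are parenthesized, the precedence of neighboring operators can produce unexpected results, so use parentheses generously when writing macros.	Function arguments are evaluated once, when the function is called, and the results are passed to the function. The outcome is easier to predict.
Arguments with side effects	An argument may be substituted into several places in the macro body, so an argument with side effects can produce unexpected results.	Arguments are evaluated only once, when they are passed, so the results are easier to control.
Argument types	Macro arguments are type-independent; a macro works with any argument type for which the operations it performs are legal.	Function parameters are typed; if the argument types differ, separate functions are needed even when they perform the same task.
Debugging	Macros are awkward to debug.	Functions can be stepped through statement by statement.
Recursion	Macros cannot be recursive.	Functions can be recursive.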

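Command-line definitions

Many C compilers let you define symbols on the command line that starts the compilation. This is useful when building different versions of a program from the same source file. (Suppose a program declares an array of some length: on a machine with limited memory we want a small array, while on a machine with plenty of memory we want a larger one.)

Code demo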

#include <stdio.h>

int main(){
	int arr[LEN];
	int i = 0;

	for(; i < LEN; ++i){
		arr[i] = i;
	}

	printf("The array is: \n");
	for(i = 0; i < LEN; ++i){
		printf("%d ", arr[i]);
	}
	printf("\n");

	return 0;
}

Output

[sss@aliyun order]$ gcc -DLEN=10 command_line_define.c -o command_line_define
[sss@aliyun order]$ ./command_line_define 
The array is: 
0 1 2 3 4 5 6 7 8 9 
[sss@aliyun order]$ gcc -DLEN=5 command_line_define.c -o command_line_define
[sss@aliyun order]$ ./command_line_define 
The array is: 
0 1 2 3 4

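Conditional compilation

Conditional compilation directives make it easy to choose whether a statement is compiled or left out when building a program.

Code demo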

#include <stdio.h>
#define DEBUG	/* names beginning with two underscores, such as __DEBUG__, are reserved for the implementation */

int main(){
	int i = 0;
	int arr[10] = {0};
	
	for(; i < 10; ++i){
		arr[i] = i;
		#ifdef DEBUG
		printf("%d\n", arr[i]);
		#endif
	}
	
	return 0;
}

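Common conditional compilation directives

#if constant-expression
	// ...
#endif

/*** Multi-branch conditional compilation ***/
#if constant-expression
	// ...
#elif constant-expression
	// ...
#else
	// ...
#endif

/*** Test whether symbol is defined (the first two forms are equivalent, as are the last two) ***/
#if defined(symbol)
#ifdef symbol
#if !defined(symbol)
#ifndef symbol

/*** Nested directives ***/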
#if defined(OS_UNIX)
	#ifdef OPTION1
		unix_version_option1();
	#endif
	#ifdef OPTION2
		unix_version_option2();
	#endif
#elif defined(OS_MSDOS)
	#ifdef OPTION2
		msdos_version_option2();
	#endif
#endif

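File inclusion

As we have already seen, the #include directive causes the contents of another file to be compiled as if they actually appeared in place of the directive. The substitution is simple: the preprocessor removes the directive and replaces it with the contents of the included file. A file that is included 10 times is therefore actually compiled 10 times.

Ways to include header files

Including local header files

#include "×××.h"

With this form, the compiler first searches the directory containing the source file; if the header is not found there, it searches the standard header locations.

Including library header files

#include <×××.h>

With this form, the compiler searches the standard header locations directly and reports an error if the header is not found.
Standard header path on Linux: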

/usr/include

Standard header path in a Visual Studio environment:

C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC/include

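Nested file inclusion

comm.h and comm.c form a common module. test1.h and test1.c use the common module, and so do test2.h and test2.c. test.h and test.c use both the test1 module and the test2 module, so the final program ends up with two copies of the contents of comm.h. The header's contents are included twice.
How can this be solved?
With conditional compilation: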

#ifndef TEST_H	/* avoid names like __TEST_H__: identifiers starting with two underscores are reserved */
#define TEST_H
// header contents
#endif

#pragma once

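Either of the two approaches above prevents a header from being included more than once. Note that #pragma once is not part of the C standard, although most compilers support it.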

Other preprocessor directives

#error
#pragma
#line
...

Compilation

Command: gcc -S ×××.i -o ×××.s
Main task:
Translate the preprocessed source file into assembly code.

Demo

[sss@aliyun order]$ gcc -S hehe.i -o hehe.s
[sss@aliyun order]$ vim hehe.s

	.file	"pretreatment.c"
	.section	.rodata
.LC0:
	.string	"hello, world! month: %d\n"
	.text
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	movl	$5, %esi
	movl	$.LC0, %edi
	movl	$0, %eax
	call	printf
	movl	$0, %eax
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.ident	"GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-36)"
	.section	.note.GNU-stack,"",@progbits

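Assembly

Command: gcc -c ×××.s -o ×××.o
Main task: translate the assembly code into binary machine instructions.
Note that ×××.o cannot be executed yet.

chmod +x ×××.o

Even with execute permission added, it still cannot run.
One more step, linking, is needed to turn it into an executable program.

Demo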

[sss@aliyun order]$ gcc -c hehe.s -o hehe.o
[sss@aliyun order]$ objdump -d hehe.o

hehe.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <main>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	be 05 00 00 00       	mov    $0x5,%esi
   9:	bf 00 00 00 00       	mov    $0x0,%edi
   e:	b8 00 00 00 00       	mov    $0x0,%eax
  13:	e8 00 00 00 00       	callq  18 <main+0x18>
  18:	b8 00 00 00 00       	mov    $0x0,%eax
  1d:	5d                   	pop    %rbp
  1e:	c3                   	retq   
[sss@aliyun order]$ chmod +x hehe.o
[sss@aliyun order]$ ll
total 124
-rw-rw-r-- 1 sss sss   184 May  5 19:44 hehe.c
-rw-rw-r-- 1 sss sss 16918 May  5 19:55 hehe.i
-rwxrwxr-x 1 sss sss  1528 May  6 09:21 hehe.o
-rw-rw-r-- 1 sss sss   501 May  6 09:15 hehe.s
[sss@aliyun order]$ ./hehe.o
-bash: ./hehe.o: cannot execute binary file