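Step One: Understanding the Compile Phase
As we learned in earlier lessons, once you have typed in your source program, the next job is to compile it. When you compile, the computer parses and analyzes the source code you wrote (usually in C or C++) and converts it into machine language. Machine language is a language the computer can understand directly and act on much more quickly than C or C++ source. The compile process looks roughly like this:
1. Write the code in C or C++: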
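#include <stdio.h>

void main(void) {
    int c;
    /* Print the greeting; the string ends up in the .data section. */
    printf("Hello World on Windows!\n");
    /* Read one character, which pauses the program until input arrives. */
    c = getchar();
}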
2. Choose Compile. CodeWarrior compiles the code above and translates it into machine language, with the following result:
SYMBOL NAMES:
1: _main  2: _@8  3: _printf  4: ___files  5: _fwide  6: ___get_char
===============================================================
SECTION SIZE = 0x0000003C; NAME = .text
DEFINED SYMBOLS:
name = _main offset = 0x00000000; type = 0x0020; class = 0x0002
00000000: 68 00 00 00 00        push    offset _@8
00000005: E8 00 00 00 00        call    _printf
0000000A: 59                    pop     ecx
0000000B: 6A FF                 push    -1
0000000D: 68 00 00 00 00        push    offset ___files
00000012: E8 00 00 00 00        call    _fwide
00000017: 85 C0                 test    eax,eax
00000019: 59                    pop     ecx
0000001A: 59                    pop     ecx
0000001B: 7D 1E                 jge     $+32 ; --> 0x003b
0000001D: 83 2D 2C 00 00 00     sub     dword ptr ___files+44,1
00000023: 01
00000024: 72 0A                 jb      $+12 ; --> 0x0030
00000026: FF 05 28 00 00 00     inc     dword ptr ___files+40
0000002C: EB 0D                 jmp     $+15 ; --> 0x003b
0000002E: 89 C0                 mov     eax,eax
00000030: 68 00 00 00 00        push    offset ___files
00000035: E8 00 00 00 00        call    ___get_char
0000003A: 59                    pop     ecx
0000003B: C3                    ret     near
=============================================================
SECTION SIZE = 0x00000019; NAME = .data
00000000: 48 65 6C 6C 6F 20 57 6F 72 6C 64 20 6F 6E 20 57  Hello World on W
00000010: 69 6E 64 6F 77 73 21 0A 00                       indows!
DEFINED SYMBOLS:
name = _@8 offset = 0x00000000; type = 0x0000; class = 0x0003
==============================================================
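The machine code above is hard to read, so don't worry about the details. Machine code is much harder for a person to understand than C or C++, but it is the only thing the computer understands. Compiling (that is, translating) your program into machine code once and then executing it makes it run much faster than translating it every time it runs.
To see the machine language listing for any source file, select the file and choose Disassemble from the Project menu. The listing shown above was produced this way. If you read your C or C++ source code side by side with the disassembled code, the relationship between the two is not hard to see.
A Detailed Look at Compiling Options in CodeWarrior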
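Before it actually compiles your source code, CodeWarrior preprocesses it. This phase prepares the C or C++ code for compilation. While writing a program you often have to type the same code many times, so programmers use shortcuts known as macros to stand in for the repeated text. For example, you can define APPNAME as a macro that stands for "Metrowerks CodeWarrior" to cut down on typing. Preprocessing converts these macros into the code they actually represent, and it also replaces defined symbols (such as #define THREE 3) with the actual source text. To get a better feel for what preprocessing does, you can look at a listing of its output: select a source file in the Project window, then choose Preprocess from the Project menu. The listing shows the source code after preprocessing and before compilation.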
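Customizing the Way CodeWarrior Compiles
In Lesson 2 we looked at a few of the dialog boxes that control how CodeWarrior compiles code. Now let's look in more detail at some standard C and C++ compiler settings. Please follow along in your own copy of CodeWarrior.
Figure 3-1: The C/C++ Language settings that control how CodeWarrior compiles C and C++ code
Open the settings window by choosing Hello World X86 Settings from the Edit menu. Then click the C/C++ Language label under the Language Settings category (Figure 3-1) to see the many options in the C/C++ Language settings dialog box. We'll go through these options one by one to see how they affect compilation.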
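•Activate C++ Compiler: This option lets you compile all source files with the .c extension as if they were C++ files. This is useful when you want to use some C++ language features in your C source code.
•ARM Conformance: The compiler normally expects your code to follow the ANSI C++ standard. Select this option to have the compiler follow the rules in the Annotated C++ Reference Manual (ARM) instead.
•Enable C++ Exceptions: Select this option to use try/catch/throw blocks in your C++ code. These constructs are used for writing error handlers.
•Enable RTTI: RTTI stands for Run Time Type Information. This option makes it possible to determine the type of a C++ object at run time. It is an advanced C++ feature that is useful in many situations. See your C++ manual for more information on RTTI.
•Inline Depth/Auto-Inline/Deferred Inlining: These settings control how functions in your source code are inlined. When a function is inlined, the compiler inserts the function's code directly into the program body instead of generating a call to it. In some situations this can improve performance. This is an advanced compiler setting.
•Pool Strings: Normally the compiler stores each string object in the compiled code in its own data space. Select this option if you want all strings stored in a single data space. If your source code contains a very large number of strings (such as the APPNAME macro mentioned earlier), select this option to save memory. This feature works only when programming for Mac OS on PowerPC.
•Don't Reuse Strings: "Reusing strings" means that when your program contains several identical strings, the compiler stores them all in the same data space. Sometimes, however, you may want to modify one of those strings, which would also change every other string that shares its data space. To avoid this, select this option; identical strings will then be stored in separate data spaces.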
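•Require Function Prototypes: It is a good idea to keep this option selected. Function prototypes help the compiler find errors in your code by checking the types of the arguments you pass to functions. A function prototype is a declaration of a function near the top of the program; that is, you declare the function before you use it. If this option is so useful, when would you turn it off? Typically when you are working with old C code that never declared its functions in advance. In that case you would turn the option off only to check the program code as a whole. You will still want to write function prototypes for the program and turn the option back on, because it catches so many coding errors.
•Enable bool Support: To use the C++ bool type and its true and false keywords, select this option.
•Enable wchar_t Support: To use the C++ built-in type wchar_t rather than char to represent characters, select this option.
•ANSI Strict/ANSI Keywords Only: By default the compiler lets you use Metrowerks extensions and additional keywords in C and C++. If you want the compiler to report an error when you do so, select these two options. This ensures that the code you compile is 100 percent ANSI compatible.
•Expand Trigraphs: By default, trigraphs are not expanded. Select this option to expand them. A trigraph is a three-character sequence beginning with ?? that stands for a single character; for example, ??= stands for #. Trigraphs affect how character constants in your source code are read: the constant '????' ends with the sequence ??', which is the trigraph for ^.
•Multi-Byte Aware: If the language you are working in requires multi-byte characters (such as Kanji or Unicode), select this option so that the compiler handles multi-byte characters in the source code correctly.
•Direct to SOM: This is a Macintosh-only feature that lets you create SOM code directly in CodeWarrior. SOM is a type of code used in Apple's OpenDoc environment, which is no longer in use.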
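•Map Newlines to CR: This option lets you swap '\n' and '\r' (the characters that mark the end of a line of source code). It is useful only for Mac OS programming.
•Relaxed Pointer Type Rules: Selecting this option treats char *, unsigned char *, void * and Ptr as the same type. This is useful when you inherit code from another source that did not manage pointer types properly, or code developed on an old compiler that did not support these types correctly.
•Enums Always Ints: Normally the compiler gives an enumerated type the same amount of space as the closest-fitting type. If you want enumerated types always to be the size of an int, select this option. An enumerated type looks like this: enum {itemone, itemtwo = 7, itemthree}. Here itemone equals 0, itemtwo equals 7, and itemthree equals 8.
•Use Unsigned Chars: Selecting this option treats all char data types as unsigned char.
•EC++ Compatibility Mode: Select this option when using CodeWarrior to compile embedded C++ (EC++) code. Note that templates, exceptions, and some other advanced C++ features are not available in this mode. See your C++ manual for details.
•Enable Objective C: Select this option to use Objective C (the programming language made famous by the NeXT computer operating system). This option is available only on Mac OS.
•Prefix File: To include a header or precompiled header in every source file, enter its file name here. This is convenient when every source file needs access to a particular set of definitions but you do not want to type an #include line for them in each file.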
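Note: Many of the options above are the same in the Mac OS and Windows versions of the CodeWarrior compiler. As the descriptions above point out, some options do differ between the two platforms. However, C and C++ are platform-independent languages, so most of these concepts apply on any platform.
Original text: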
Step One: Understanding the Compile Phase
As I've discussed in previous lessons, compiling your source code is the next step after you type it in. When you compile your source code, your computer parses and analyzes the code that you've written (usually in C or C++) and converts it to machine language. Machine language is a programming language that your computer can understand and act upon much more quickly than C or C++. The life cycle for you and your code looks like this:
1. Write the code in C or C++:
#include <stdio.h>

void main(void) {
    int c;
    printf("Hello World on Windows!\n");
    c = getchar();
}
2. Choose Compile. CodeWarrior then compiles the code and translates it into machine language:
SYMBOL NAMES:
1: _main  2: _@8  3: _printf  4: ___files  5: _fwide  6: ___get_char
===============================================================
SECTION SIZE = 0x0000003C; NAME = .text
DEFINED SYMBOLS:
name = _main offset = 0x00000000; type = 0x0020; class = 0x0002
00000000: 68 00 00 00 00        push    offset _@8
00000005: E8 00 00 00 00        call    _printf
0000000A: 59                    pop     ecx
0000000B: 6A FF                 push    -1
0000000D: 68 00 00 00 00        push    offset ___files
00000012: E8 00 00 00 00        call    _fwide
00000017: 85 C0                 test    eax,eax
00000019: 59                    pop     ecx
0000001A: 59                    pop     ecx
0000001B: 7D 1E                 jge     $+32 ; --> 0x003b
0000001D: 83 2D 2C 00 00 00     sub     dword ptr ___files+44,1
00000023: 01
00000024: 72 0A                 jb      $+12 ; --> 0x0030
00000026: FF 05 28 00 00 00     inc     dword ptr ___files+40
0000002C: EB 0D                 jmp     $+15 ; --> 0x003b
0000002E: 89 C0                 mov     eax,eax
00000030: 68 00 00 00 00        push    offset ___files
00000035: E8 00 00 00 00        call    ___get_char
0000003A: 59                    pop     ecx
0000003B: C3                    ret     near
=============================================================
SECTION SIZE = 0x00000019; NAME = .data
00000000: 48 65 6C 6C 6F 20 57 6F 72 6C 64 20 6F 6E 20 57  Hello World on W
00000010: 69 6E 64 6F 77 73 21 0A 00                       indows!
DEFINED SYMBOLS:
name = _@8 offset = 0x00000000; type = 0x0000; class = 0x0003
==============================================================
That's hard to read, let alone type correctly. Machine code is a bit more difficult for a human to read than C or C++ code. However, it's just what the computer needs, and your application will operate much faster if you compile -- or translate -- it into machine language at the outset than if it has to make that translation every time you run it.
You can view the disassembly (machine code listing), as we have done above, for any source file by simply clicking on it in the Project window and selecting Disassemble from the Project menu. In fact, the assembly code listing shown above was produced this way. By looking at your C or C++ source code and the disassembled code side by side, you will begin to see the relationships between the two.
A Detailed Look at Compiling Options in CodeWarrior
Before your code is actually compiled, CodeWarrior preprocesses it. This preprocessing phase prepares the C or C++ code to be compiled. Programmers have plenty of typing ahead of them, so they sometimes use shortcuts, known as macros. These macros allow you to type things like APPNAME when you really mean "Metrowerks CodeWarrior" and save keystrokes as you type your source code. Preprocessing converts the macro text you've typed into the code it represents. The preprocessor also substitutes defined symbols (such as #define THREE 3) in the source code. To get a better feel for what the preprocessor accomplishes, you can examine a listing of its output. To do this, pick a source file in the Project window and select Preprocess from the Project menu. The output file that appears shows the preprocessed source code as the compiler will actually see it.
Customizing the Way CodeWarrior Compiles
In Lesson 2, I showed you a few of the dialog boxes that control the way CodeWarrior compiles code. Let's take a closer look at some standard C and C++ compiler settings. Please follow along in your copy of CodeWarrior.
Figure 3-1: The C/C++ Language Settings control the way CodeWarrior deals with C and C++ code when compiling.
Open the Settings window by selecting Project Hello World x86 Settings from the Edit menu. Now, click the C/C++ Language label under the Language Settings category (Figure 3-1). You will see that there are numerous options in the C/C++ Language Settings Dialog Box. Let's take a look at some of these items one by one and see how they affect the compilation process.
•Activate C++ Compiler: This option allows you to compile all .c source files as if they were .cpp files. This can be helpful if you want to use some of the features of the C++ programming language in your C source code.
•ARM Conformance: The compiler expects your code to follow the ANSI C++ standard. You can direct the compiler to instead follow the standard specified in the Annotated C++ Reference Manual (ARM) by enabling this feature.
•Enable C++ Exceptions: Enabling this feature allows you to use try/catch/throw blocks in your C++ code. These methods help with writing error handlers.
•Enable RTTI: RTTI stands for Run Time Type Information. This allows the compiler to figure out the type of a C++ object at run-time. This is an advanced feature of C++ that can be helpful in various situations. See your C++ manual for more information on RTTI.
•Inline Depth/Auto-Inline/Deferred Inlining: These options refer to the inlining of functions in your source code. That is, rather than generate the function call, the compiler inserts the function's code directly into the program body. In certain situations, this can improve code performance. This is an advanced compiler feature.
•Pool Strings: Normally, the compiler will store all of your string objects in their own data space within your compiled code. If you would like all strings to be stored in just one data space, you can enable this feature. If you have a large number of strings in your source code (such as the APPNAME macro we discussed earlier), you should enable this feature to save memory. Be aware that although this checkbox is present in the Windows version of CodeWarrior, the feature only works for Mac OS PowerPC program code.
•Don't Reuse Strings: If you have multiple strings that are the same throughout your program, the compiler will store them all in the same data space -- that is, it will reuse them. However, there are times when you may alter the string literal in place, in which case other strings that share its data space would also be altered. If you do not want this to happen, enable this feature, and all strings, even if they are identical, will be stored in separate data spaces.
•Require Function Prototypes: It's a good idea to keep this feature enabled. Function prototypes allow you to more easily find errors in your source code by assisting the compiler in verifying data types that you pass to your functions. A function prototype is also known as a forward declaration. That is, you define (declare) the function before (forward) you use it. If this feature is so good, when might you turn it off? Typically, when you're working with old C programs that never made forward declarations of their function calls. You'd disable this option only to check the integrity of the program code. However, you'd want to write function prototypes for the program and reactivate this feature quickly, because it solves so many coding problems.
•Enable bool Support: To use the C++ bool, true and false keywords, enable this feature.
•Enable wchar_t Support: To use the C++ wchar_t built-in type for characters rather than char, enable this feature.
•ANSI Strict/ANSI Keywords Only: By default, the compiler allows you to use Metrowerks extensions and additional keywords in the C and C++ languages. If you would like an error to be generated when you try to do this, enable the ANSI Strict and/or ANSI Keywords Only feature. This will ensure that you only compile 100 percent ANSI-compatible code.
•Expand Trigraphs: By default, trigraphs are ignored. To expand them, you can enable this feature. A trigraph is a three-character sequence beginning with ?? that stands for a single character; for example, ??= stands for #. Trigraphs have to do with the way character constants are represented in your source code: the constant '????' ends with the sequence ??', which is the trigraph for ^.
•Multi-Byte Aware: If you are programming in a language that requires the use of multi-byte characters (like Kanji or Unicode), you will want to enable this feature. This enables the compiler to properly handle multi-byte characters in the source code.
•Direct to SOM: This is a Macintosh-only feature that allows you to create SOM code directly in CodeWarrior. SOM is a type of code used in the now defunct OpenDoc environment from Apple Computer.
•Map Newlines to CR: This feature allows you to swap '\n' and '\r' (the values for line feed and carriage return, which mark the end of a source code line). This feature is only useful for Mac OS programs.
•Relaxed Pointer Type Rules: Enabling this feature will treat char *, unsigned char *, void * and Ptr as the same type. This can be helpful when you inherit code from another source in which the programmer did not properly manage pointer types, and/or the developer used an old compiler that did not handle these types properly.
•Enums Always Ints: Normally, the compiler will make an enumerated type the size of the closest type. If you would like types to always be the size of an int, you can enable this feature. An enumerated type is something like this: enum {itemone, itemtwo = 7, itemthree}. In this case, itemone would be equal to 0, itemtwo would be equal to 7, and itemthree would be equal to 8.
•Use Unsigned Chars: Enabling this feature will cause all char data types to be treated as if they were unsigned char.
•EC++ Compatibility Mode: Enable this feature to compile embedded C++ (EC++) code. Note that certain C++ goodies such as templates, exceptions, and other advanced C++ features aren't available in EC++. See your C++ manual for information.
•Enable Objective C: To enable Objective C (made famous in the NeXT computer operating system) you can check this checkbox. This is another Mac OS-only language feature.
•Prefix File: To include a header or precompiled header in every source file, type the name here. This can be useful if you have specific definitions that you want all source files to have access to, but do not want to type the #include line in the source files themselves.
Note: Many of these features are identical on both the Mac OS and Windows CodeWarrior compilers. As pointed out in the descriptions of the compiler settings above, there are some differences between the two. However, C and C++ are platform-independent languages, so most of these concepts apply on any platform.