These notes use 编译原理及实践 (Compiler Construction: Principles and Practice) as the textbook.
Chapter Eight. Code Generation
A compiler typically breaks code generation into several steps, involving various intermediate data structures, often including some form of abstract code called intermediate code.
Issues covered in this chapter:
- Two popular forms of intermediate code (three-address code and P-code).
- The basic algorithms for generating intermediate or assembly code.
- Code generation techniques for various language features.
- Application of the techniques from the preceding sections to develop an assembly code generator for the TINY language.
Intermediate representation (IR): a data structure that represents the source program during translation. Two forms are covered:
- Three-Address Code
- P-Code
Three-Address Code: x = y op z
Data structures: four fields are necessary, one for the operation and three for the addresses. This representation is called a quadruple.
A different implementation of three-address code uses the instructions themselves to represent the temporaries, which reduces the address fields from three to two. Such an implementation is called a triple. Triples have one major drawback: because instructions refer to one another by position, moving an instruction to a different position becomes difficult.
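The difference between quadruples and triples can be sketched in Python for the expression 2*a + (b-3). This is only an illustration under stated assumptions: the class names `Quad` and `Triple`, the opcode names, and the two small evaluators are hypothetical and are not the textbook's data structures.

```python
import operator
from typing import NamedTuple, Union

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

class Quad(NamedTuple):
    op: str
    addr1: str
    addr2: str
    result: str              # third address: an explicitly named temporary

class Triple(NamedTuple):
    op: str
    addr1: Union[str, int]   # a name/constant (str) or the index of an earlier triple (int)
    addr2: Union[str, int]

def operand(x, env):
    """Value of an integer literal or a variable name given as a string."""
    return int(x) if x.isdigit() else env[x]

def run_quads(quads, env):
    env = dict(env)          # temporaries are stored by name
    for q in quads:
        env[q.result] = OPS[q.op](operand(q.addr1, env), operand(q.addr2, env))
    return env[quads[-1].result]

def run_triples(triples, env):
    results = []             # result i belongs to the triple at position i
    for t in triples:
        vals = [results[a] if isinstance(a, int) else operand(a, env)
                for a in (t.addr1, t.addr2)]
        results.append(OPS[t.op](*vals))
    return results[-1]

env = {"a": 5, "b": 7}
quads = [Quad("mul", "2", "a", "t1"),
         Quad("sub", "b", "3", "t2"),
         Quad("add", "t1", "t2", "t3")]
triples = [Triple("mul", "2", "a"),    # (0)
           Triple("sub", "b", "3"),    # (1)
           Triple("add", 0, 1)]        # (2) refers to positions 0 and 1

# Inserting an unrelated instruction at the front shifts every position.
shifted_quads = [Quad("add", "a", "1", "t0")] + quads
shifted_triples = [Triple("add", "a", "1")] + triples
```

With `a = 5` and `b = 7`, both original sequences evaluate to 14. After the insertion the quadruples still give 14, while the triples no longer do, because the positional references 0 and 1 now point at the wrong instructions. A real implementation would have to renumber every reference whenever a triple moves.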
P-Code:
- It began as a standard target assembly code produced by a number of Pascal compilers.
- It was designed to be the actual code for a hypothetical stack machine (the P-machine).
Comparison of P-Code to Three-Address Code:
- P-code is in many respects closer to actual machine code than three-address code.
- P-code instructions also require fewer addresses.
- P-code is less compact than three-address code in terms of the number of instructions.
- P-code is not "self-contained", in that its instructions operate implicitly on a stack.
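To make the comparison concrete, here is a minimal sketch of a stack interpreter for a few P-code instructions, run on 2*a + (b-3). It assumes the mnemonics `ldc` (load constant), `lod` (load variable), `adi`, `sbi`, and `mpi` (integer add, subtract, multiply); the function `run_pcode` and the string encoding of instructions are hypothetical.

```python
def run_pcode(program, env):
    """Execute a straight-line P-code program on an explicit stack."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "ldc":                      # push a constant
            stack.append(int(args[0]))
        elif op == "lod":                    # push the value of a variable
            stack.append(env[args[0]])
        elif op in ("adi", "sbi", "mpi"):    # operands are implicit: top two stack cells
            right = stack.pop()
            left = stack.pop()
            if op == "adi":
                stack.append(left + right)
            elif op == "sbi":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {instr}")
    return stack.pop()

pcode = ["ldc 2", "lod a", "mpi", "lod b", "ldc 3", "sbi", "adi"]
three_address = ["t1 = 2 * a", "t2 = b - 3", "t3 = t1 + t2"]
```

The P-code version needs seven instructions against three for three-address code, but each instruction names at most one address, and none of them names the temporaries, which live implicitly on the stack.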
Basic Code Generation Techniques:
- Intermediate Code or Target Code as a Synthesized Attribute
Pseudocode:
procedure genCode(T: treenode);
begin
  if T is not nil then
    generate code to prepare for code of left child of T;
    genCode(left child of T);
    generate code to prepare for code of right child of T;
    genCode(right child of T);
    generate code to implement the action of T;
end;
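The traversal above can be sketched in Python for arithmetic expression trees that emit P-code. This is an illustrative assumption-laden version, not the textbook's code: the `Node` class, its `kind` values, and the `OPCODES` table are hypothetical, and for simple arithmetic the two "prepare" steps are empty.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    kind: str                        # "const", "id", or "op" (hypothetical node kinds)
    value: str                       # constant text, variable name, or operator symbol
    left: Optional["Node"] = None
    right: Optional["Node"] = None

OPCODES = {"+": "adi", "-": "sbi", "*": "mpi"}

def gencode(t: Optional[Node], out: List[str]) -> None:
    """Postorder traversal: code for both children, then code for the node itself."""
    if t is None:
        return
    # No preparation code is needed before either child for arithmetic.
    gencode(t.left, out)
    gencode(t.right, out)
    # Code that implements the action of T.
    if t.kind == "const":
        out.append(f"ldc {t.value}")
    elif t.kind == "id":
        out.append(f"lod {t.value}")
    else:
        out.append(OPCODES[t.value])

# Tree for 2*a + (b-3)
tree = Node("op", "+",
            Node("op", "*", Node("const", "2"), Node("id", "a")),
            Node("op", "-", Node("id", "b"), Node("const", "3")))
code: List[str] = []
gencode(tree, code)
```

Constructs such as assignments or control statements would need nonempty preparation steps (for example, loading a target address or emitting a label) before a child is visited.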
Generation of Target Code from Intermediate Code
The final code generation pass must supply:
- all the actual locations of variables and temporaries
- the code necessary to maintain the runtime environment.
Two standard techniques are used:
- Macro expansion: replacing each kind of intermediate code instruction with an equivalent sequence of target code instructions.
- Static simulation: carrying out a straight-line simulation of the effects of the intermediate code and generating target code to match these effects.
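Both techniques can be sketched by translating between the two intermediate forms, treating one as "intermediate" and the other as "target". The sketch assumes the P-code mnemonics `lda`, `lod`, `ldc`, `sto`, `adi`, `sbi`, and `mpi`; the function names and string encodings are hypothetical.

```python
BINOPS = {"adi": "+", "sbi": "-", "mpi": "*"}
MACRO_OPS = {"+": "adi", "-": "sbi", "*": "mpi"}

def expand_assignment(target, left, op, right):
    """Macro expansion: one three-address instruction `target = left op right`
    is replaced by a fixed template of P-code instructions."""
    def load(x):
        return f"ldc {x}" if x.isdigit() else f"lod {x}"
    return [f"lda {target}", load(left), load(right), MACRO_OPS[op], "sto"]

def pcode_to_tac(pcode):
    """Static simulation: run the P-machine stack at compile time, holding
    operand names instead of values, and emit three-address code as we go."""
    stack = []       # simulated stack of names and constants
    tac = []
    temps = 0
    for instr in pcode:
        op, *args = instr.split()
        if op in ("ldc", "lod"):
            stack.append(args[0])            # nothing emitted, only remembered
        elif op in BINOPS:
            right = stack.pop()
            left = stack.pop()
            temps += 1
            name = f"t{temps}"
            tac.append(f"{name} = {left} {BINOPS[op]} {right}")
            stack.append(name)
        else:
            raise ValueError(f"unsupported instruction: {instr}")
    return tac

tac = pcode_to_tac(["ldc 2", "lod a", "mpi", "lod b", "ldc 3", "sbi", "adi"])
expanded = expand_assignment("a", "b", "+", "c")
```

Macro expansion looks at each instruction in isolation, so it is simple but tends to produce redundant loads and stores. Static simulation tracks state across instructions, which is what allows the seven stack instructions to collapse into three three-address instructions.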
Address Calculations:
- Three-Address Code for Address Calculations
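  The address data structure is extended with an enumerated AddrMode field with possible values None, Address, and Indirect.
- P-Code for Address Calculations
  Two new instructions are introduced: ind (indirect load) and ixa (indexed address).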
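The effect of the two P-code instructions `ind` and `ixa` can be sketched as operations on a stack. The sketch assumes that `ixa s` pops an offset and a base address and pushes base + s * offset, and that `ind k` pops an address and pushes the value stored k locations beyond it. The dictionary-based memory and the concrete addresses are hypothetical.

```python
def ixa(stack, scale):
    """Indexed address: replace (base, offset) on the stack by base + scale * offset."""
    offset = stack.pop()
    base = stack.pop()
    stack.append(base + scale * offset)

def ind(stack, memory, offset):
    """Indirect load: replace an address on the stack by the value at address + offset."""
    address = stack.pop()
    stack.append(memory[address + offset])

# Hypothetical memory: an array of 4-byte integers starting at address 1000.
memory = {1000: 11, 1004: 22, 1008: 33}

stack = [1000, 2]        # base address of the array, then the subscript 2
ixa(stack, 4)            # stack now holds the address of element 2
address_of_element = stack[-1]
ind(stack, memory, 0)    # stack now holds the value of element 2
value_of_element = stack[-1]
```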
Array references: the address of each element of an array a must be computed from:
- the base address of a (its starting address in memory)
- an offset that depends linearly on the value of the subscript.
The offset is computed from the subscript value as follows:
- An adjustment must be made to the subscript value if the subscript range does not begin at 0.
- The adjusted subscript value must be multiplied by a scale factor.
- The resulting scaled subscript is added to the base address to get the final address.
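The three steps above amount to the formula base + (subscript - lower_bound) * elem_size. A minimal sketch follows: the first function computes the address numerically, and the second emits a three-address sequence for the same computation. The function names, the temporary naming scheme, and the use of `&a` for the base address are assumptions made for illustration.

```python
def element_address(base, subscript, lower_bound, elem_size):
    """Address of an array element, following the three steps in order."""
    adjusted = subscript - lower_bound       # 1. adjust for a nonzero lower bound
    scaled = adjusted * elem_size            # 2. multiply by the scale factor
    return base + scaled                     # 3. add the base address

def gen_element_address_tac(array, subscript, lower_bound, elem_size):
    """Emit three-address code that leaves the element address in the last temporary."""
    code = []
    temps = 0

    def new_temp():
        nonlocal temps
        temps += 1
        return f"t{temps}"

    current = subscript
    if lower_bound != 0:                     # the adjustment is skipped for 0-based arrays
        t = new_temp()
        code.append(f"{t} = {current} - {lower_bound}")
        current = t
    t = new_temp()
    code.append(f"{t} = {current} * {elem_size}")
    scaled = t
    t = new_temp()
    code.append(f"{t} = &{array} + {scaled}")
    return code
```

For example, `element_address(1000, 3, 1, 4)` gives 1008 for an array of 4-byte elements whose subscripts start at 1. The lower bound and element size are known at compile time, which is why they appear as constants in the generated code.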
Record Structure and Pointer References:
- The base address of the structure variable is computed.
- The (usually fixed) offset of the named field is found.
- The two are added to get the resulting address.
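A field address computation can be sketched as follows, assuming fields are laid out sequentially with no alignment padding. The sizes (2-byte int, 1-byte char), the record layout, and the function names are hypothetical choices for the example.

```python
def field_offsets(fields):
    """Assign each field the running sum of the sizes of the fields before it.
    `fields` is a list of (name, size_in_bytes) pairs in declaration order."""
    offsets = {}
    offset = 0
    for name, size in fields:
        offsets[name] = offset
        offset += size
    return offsets

def field_address(base, offsets, field):
    """Base address of the record variable plus the fixed offset of the field."""
    return base + offsets[field]

# Hypothetical record: struct { int i; char c; int j; } with 2-byte ints.
offsets = field_offsets([("i", 2), ("c", 1), ("j", 2)])
address_of_j = field_address(100, offsets, "j")
```

The offsets are fixed once the record type is declared, so a compiler can compute the table once and emit the offset as a constant at each field reference.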
Code Generation for If- and While-Statements
Code Generation of Logical Expressions
Code Generation of Procedure and Function Calls