Chapter 1 Introduction 1
1.1 Overview and History 1
1.2 What Do Compilers Do? 3
1.3 The Structure of a Compiler 8
1.4 The Syntax and Semantics of Programming Languages 14
1.5 Compiler Design and Programming Language Design 16
1.6 Compiler Classifications 18
1.7 Influences on Computer Design 19
Exercises 21
Chapter 2 A Simple Compiler 23
2.1 The Structure of a Micro Compiler 24
2.2 A Micro Scanner 25
2.3 The Syntax of Micro 30
2.4 Recursive Descent Parsing 33
2.5 Translating Micro 38
2.5.1 Target Language 38
2.5.2 Temporaries 39
2.5.3 Action Symbols 39
2.5.4 Semantic Information 40
2.5.5 Action Symbols for Micro 41
Exercises 47
Chapter 3 Scanning Theory and Practice 50
3.1 Overview 50
3.2 Regular Expressions 52
3.3 Finite Automata and Scanners 55
3.4 Using a Scanner Generator 59
3.4.1 ScanGen 59
3.4.2 Lex 64
3.5 Practical Considerations 70
3.5.1 Reserved Words 70
3.5.2 Compiler Directives and Listing Source Lines 72
3.5.3 Entry of Identifiers into the Symbol Table 73
3.5.4 Scanner Termination 74
3.5.5 Multicharacter Lookahead 74
3.5.6 Lexical Error Recovery 76
3.6 Translating Regular Expressions into Finite Automata 78
3.6.1 Creating Deterministic Automata 80
3.6.2 Optimizing Finite Automata 84
Exercises 86
Chapter 4 Grammars and Parsing 91
4.1 Context-Free Grammars: Concepts and Notation 91
4.2 Errors in Context-Free Grammars 95
4.3 Transforming Extended BNF Grammars 98
4.4 Parsers and Recognizers 98
4.5 Grammar Analysis Algorithms 100
Exercises 108
Chapter 5 LL(1) Grammars and Parsers 111
5.1 The LL(1) Predict Function 112
5.2 The LL(1) Parse Table 115
5.3 Building Recursive Descent Parsers from LL(1) Tables 116
5.4 An LL(1) Parser Driver 120
5.5 LL(1) Action Symbols 121
5.6 Making Grammars LL(1) 123
5.7 The If-Then-Else Problem in LL(1) Parsing 127
5.8 The LLGen Parser Generator 129
5.9 Properties of LL(1) Parsers 133
5.10 LL(k) Parsing 134
Exercises 137
Chapter 6 LR Parsing 140
6.1 Shift-Reduce Parsers 141
6.2 LR Parsers 144
6.2.1 LR(0) Parsing 145
6.2.2 How Can We Be Sure LR(0) Parsers Work Correctly? 153
6.3 LR(1) Parsing 155
6.3.1 Correctness of LR(1) Parsing 159
6.4 SLR(1) Parsing 161
6.4.1 Correctness of SLR(1) Parsing 164
6.4.2 Limitations of the SLR(1) Technique 165
6.5 LALR(1) 167
6.5.1 Building LALR(1) Parsers 171
6.5.2 Correctness of LALR(1) Parsing 177
6.6 Calling Semantic Routines in Shift-Reduce Parsers 178
6.7 Using a Parser Generator 180
6.7.1 The LALRGen Parser Generator 180
6.7.2 Yacc 184
6.7.3 Uses (and Misuses) of Controlled Ambiguity 187
6.8 Optimizing Parse Tables 190
6.9 Practical LR(1) Parsers 194
6.10 Properties of LR Parsing 197
6.11 LL(1) or LALR(1), That Is the Question 198
6.12 Other Shift-Reduce Techniques 202
6.12.1 Extended Lookahead Techniques 202
6.12.2 Precedence Techniques 203
6.12.3 General Context-Free Parsers 205
Exercises 208
Chapter 7 Semantic Processing 216
7.1 Syntax-directed Translation 217
7.1.1 Using a Syntax Tree Representation of a Parse 217
7.1.2 Compiler Organization Alternatives 219
7.1.3 Parsing, Checking, and Translation in a Single Pass 225
7.2 Semantic Processing Techniques 227
7.2.1 LL Parsers and Action Symbols 227
7.2.2 LR Parsers and Action Symbols 228
7.2.3 Semantic Record Representations 230
7.2.4 Implementing Action-controlled Semantic Stacks 232
7.2.5 Parser-controlled Semantic Stacks 236
7.3 Intermediate Representations and Code Generation 246
7.3.1 Intermediate Representations versus Direct Code Generation 246
7.3.2 Forms of Intermediate Representations 247
7.3.3 A Tuple Language 250
Exercises 252
Chapter 8 Symbol Tables 254
8.1 A Symbol Table Interface 255
8.2 Basic Implementation Techniques 256
8.2.1 Binary Search Trees 257
8.2.2 Hash Tables 257
8.2.3 String Space Arrays 259
8.3 Block-Structured Symbol Tables 261
8.4 Extensions to Block-Structured Symbol Tables 267
8.4.1 Fields and Records 267
8.4.2 Export Rules 269
8.4.3 Import Rules 274
8.4.4 Altered Search Rules 277
8.5 Implicit Declarations 279
8.6 Overloading 280
8.7 Forward References 282
8.8 Summary 284
Exercises 284
Chapter 9 Run-Time Storage Organization 287
9.1 Static Allocation 288
9.2 Stack Allocation 289
9.2.1 Displays 292
9.2.2 Block-level and Procedure-level Activation Records 295
9.3 Heap Allocation 296
9.3.1 No Deallocation 297
9.3.2 Explicit Deallocation 298
9.3.3 Implicit Deallocation 298
9.3.4 Managing Heap Space 301
9.4 Program Layout in Memory 302
9.5 Static and Dynamic Chains 305
9.6 Formal Procedures 307
9.6.1 Static Chains 309
9.6.2 Displays 311
9.6.3 Perspective 312
Exercises 313
Chapter 10 Processing Declarations 319
10.1 Declaration Processing Fundamentals 320
10.1.1 Attributes in the Symbol Table 320
10.1.2 Type Descriptor Structures 321
10.1.3 Lists in the Semantic Stack 323
10.2 Action Routines for Simple Declarations 326
10.2.1 Variable Declarations 326
10.2.2 Type Definitions, Declarations, and References 330
10.2.3 Record Types 335
10.2.4 Static Arrays 338
10.3 Action Routines for Advanced Features 340
10.3.1 Variable and Constant Declarations 340
10.3.2 Enumeration Types 343
10.3.3 Subtypes 346
10.3.4 Array Types 349
10.3.5 Variant Records 359
10.3.6 Access Types 364
10.3.7 Packages 366
10.3.8 The attributes and semantic record Structures 370
Exercises 375
Chapter 11 Processing Expressions and Data Structure References 378
11.1 Introduction 378
11.2 Action Routines for Simple Names, Expressions, and Data Structures 380
11.2.1 Handling Simple Identifiers and Literal Constants 380
11.2.2 Processing Expressions 382
11.2.3 Simple Record and Array References 387
11.2.4 Record and Array Example 390
11.2.5 Strings 390
11.3 Action Routines for Advanced Features 394
11.3.1 Multidimensional Array Organization and References 394
11.3.2 Records with Dynamic Objects 406
11.3.3 Variant Records 411
11.3.4 Access-Type References 411
11.3.5 Other Uses of Names in Ada 413
11.3.6 Record and Array Aggregates 416
11.3.7 Overload Resolution 418
Exercises 422
Chapter 12 Translating Control Structures 426
12.1 if Statements 427
12.2 Loops 431
12.2.1 while loops 432
12.2.2 for loops 433
12.3 Compiling exits 440
12.4 The case Statement 445
12.5 Compiling goto Statements 452
12.6 Exception Handling 457
12.7 Short-circuit Boolean Expressions 463
12.7.1 One-address Short-circuit Evaluation 471
Exercises 479
Chapter 13 Translating Procedures and Functions 484
13.1 Simple Subprograms 485
13.1.1 Declaring Subprograms without Parameters 485
13.1.2 Calling Parameterless Procedures 488
13.2 Passing Parameters to Subprograms 489
13.2.1 Value, Result, and Value-Result Parameters 490
13.2.2 Reference and Read-only Parameters 492
13.2.3 Semantic Routines for Parameter Declarations 493
13.3 Processing Subprogram Calls and Parameter Lists 495
13.4 Subprogram Invocation 498
13.4.1 Saving and Restoring Registers 498
13.4.2 Subprogram Entry and Exit 500
13.5 Label Parameters 503
13.6 Name Parameters 506
Exercises 508
Chapter 14 Attribute Grammars and Multipass Translation 510
14.1 Attribute Grammars 511
14.1.1 Simple Assignment Form and Action Symbols 514
14.1.2 Tree-Walk Attribute Evaluators 515
14.1.3 On-the-Fly Attribute Evaluators 524
14.1.4 An Attribute Grammar Example 531
14.2 Tree-structured Intermediate Representations 534
14.2.1 Interfaces to Abstract Syntax Trees 536
14.2.2 Abstract Interfaces to Syntax Trees 538
14.2.3 Implementing Trees 544
Exercises 544
Chapter 15 Code Generation and Local Code Optimization 546
15.1 An Overview 547
15.2 Register and Temporary Management 548
15.2.1 Classes of Temporaries 550
15.2.2 Allocating and Freeing Temporaries 551
15.3 A Simple Code Generator 551
15.4 Interpretive Code Generation 555
15.4.1 Optimizing Address Calculation 556
15.4.2 Avoiding Redundant Computations 559
15.4.3 Register Tracking 562
15.5 Peephole Optimization 572
15.6 Generating Code from Trees 574
15.7 Generating Code from Dags 578
15.7.1 Aliasing 586
15.8 Code Generator Generators 589
15.8.1 Grammar-Based Code Generators 593
15.8.2 Using Semantic Attributes in Code Generators 597
15.8.3 Generation of Peephole Optimizers 602
15.8.4 Code Generator Generators Based on Tree Rewriting 605
Exercises 605
Chapter 16 Global Optimization 614
16.1 An Overview--Goals and Limits 614
16.1.1 An Idealized Optimizing Compiler Structure 617
16.1.2 Putting Optimization in Perspective 621
16.2 Optimizing Subprogram Calls 622
16.2.1 Inline Expansion of Subprogram Calls 622
16.2.2 Optimizing Calls of Closed Subroutines 625
16.2.3 Interprocedural Data Flow Analysis 630
16.3 Loop Optimization 636
16.3.1 Factoring Loop-invariant Expressions 637
16.3.2 Strength Reduction in Loops 641
16.4 Global Data Flow Analysis 645
16.4.1 Any-Path Flow Analysis 645
16.4.2 All-Paths Flow Analysis 650
16.4.3 A Taxonomy of Data Flow Problems 652
16.4.4 Other Important Data Flow Problems 652
16.4.5 Global Optimizations Using Data Flow Information 655
16.4.6 Solving Data Flow Equations 660
16.5 Putting It All Together 673
Exercises 675
Chapter 17 Parsing in the Real World 685
17.1 Compacting Tables 686
17.1.1 Compacting LL(1) Parse Tables 691
17.2 Syntactic Error Recovery and Repair 691
17.2.1 Immediate Error Detection 694
17.2.2 Error Recovery in Recursive Descent Parsers 695
17.2.3 Error Recovery in LL(1) Parsers 698
17.2.4 The FMQ LL(1) Error-Repair Algorithm 698
17.2.5 Adding Deletions to the FMQ Repair Algorithm 703
17.2.6 Extensions to the FMQ Algorithm 706
17.2.7 Error Repair Using LLGen 712
17.2.8 LR Error Recovery 713
17.2.9 Error Recovery in Yacc 714
17.2.10 Automatically Generated LR Repair Techniques 715
17.2.11 Error Repair Using LALRGen 723
17.2.12 Other LR Error-Repair Techniques 724
Exercises 726
Appendix A Definition of Ada/CS 730
Appendix B ScanGen 759
Appendix C LLGen User Manual 768
Appendix D LALRGen User Manual 777
Appendix E Error-Repair Features of LLGen and LALRGen 787
Appendix F Compiler Development Utilities 792
Bibliography 799
Index 806