Syntax: a set of precise rules that define what legal programs look like before they run; it does not specify what the program will do when it runs
Semantics: the meaning of code, i.e., what it does when it runs
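A minimal sketch of the distinction, using Python's standard `ast` module as a stand-in for "checking syntax before running". The helper name `is_syntactically_legal` is made up for this illustration.

```python
import ast

def is_syntactically_legal(source: str) -> bool:
    """Return True if `source` parses as Python; the code is never run."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# Illegal syntax: rejected before anything runs.
print(is_syntactically_legal("x = = 1"))        # False

# Legal syntax, but its semantics (what happens at run time) is an error.
legal_but_fails = "x = 1 / 0"
print(is_syntactically_legal(legal_but_fails))  # True
try:
    exec(legal_but_fails)
except ZeroDivisionError:
    print("runtime error")
```

The second snippet passes the syntax check because syntax says nothing about what the code does when it runs.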
Imperative languages (C, C++, Java)
Functional languages (Lisp and Haskell)
Programming paradigm: a particular style or approach to programming that is independent of the language.
Programming methodology: how to write good code and make the best use of the language, so it depends on the particular programming language
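As a rough illustration that a paradigm is a style rather than a language, the same computation (a sum of squares, chosen only for this example) can be written both imperatively and functionally in one language:

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Imperative style: a sequence of statements that update state.
total = 0
for n in numbers:
    total += n * n

# Functional style: one expression built from functions, no variable updates.
total_fn = reduce(lambda acc, n: acc + n * n, numbers, 0)

print(total, total_fn)  # 30 30
```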
Compilation
The compiler translates the high-level source program into an equivalent target program (machine language). Then at some arbitrary later time, the user tells the operating system to run the target program.
- Upper graph: a program written in a high-level language --> a compiler (itself in machine language) translates it into that same machine language --> output is the corresponding program in machine language.
- Lower graph: the corresponding machine-language program plus the input for the original program --> executes as the original program would --> produces the original program's output
- Cross-compilers: compilers written and executed in machine language M that translate programs to a different machine language M', because the M' machine may not have enough memory, etc., to host a compiler (i.e., it is not capable of compiling)
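The two-phase picture above (translate once, run at some later time with the actual input) can be sketched with Python's built-in `compile()` and `exec()`. This is only an analogy: `compile()` produces Python bytecode, not machine language, and the variable names are chosen for this example.

```python
source = "result = sum(range(n))"

# Translation step: done once, before any input is available.
target = compile(source, "<example>", "exec")

# Execution step: at some later time, run the target program on real input.
for n in (5, 10):
    env = {"n": n}
    exec(target, env)
    print(env["result"])  # 10, then 45
```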
Compilation Methods (for any new language L')
- Generic way: Code in L' --> write Executable Compiler L' to M --> Code in M
- Alt. 1 (utilize intermediary language L): Write Compiler L' to M (written in L) --> Executable Compiler L to M --> Executable Compiler L' to M. The goal is to write a compiler for the new language in an existing language, then use the existing language's compiler to obtain a compiler for the new language in machine language.
- Alt. 2 (utilize intermediary language L): Write Compiler L' to L --> Executable Compiler L to M --> Executable Compiler L' to L. Code in L' passes through Compiler L' to L, which outputs Code in L; that code then passes through Compiler L to M, and we get Code in M.
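A toy sketch of the Alt. 2 pipeline, under loud assumptions: L' is a made-up reverse-Polish arithmetic notation, L is Python source text, and Python's built-in `compile()` plays the role of the existing Compiler L to M (it emits bytecode, not real machine language). The function names are hypothetical.

```python
def compile_lprime_to_l(rpn: str) -> str:
    """Hypothetical Compiler L' to L: RPN text -> Python expression source."""
    stack = []
    for tok in rpn.split():
        if tok in {"+", "-", "*"}:
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {tok} {right})")
        else:
            stack.append(str(int(tok)))  # only integer literals are supported
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]

def compile_l_to_m(py_source: str):
    """Existing Compiler L to M: here, Python's built-in compile()."""
    return compile(py_source, "<l>", "eval")

code_in_l = compile_lprime_to_l("2 3 + 4 *")  # Code in L' -> Code in L
code_in_m = compile_l_to_m(code_in_l)         # Code in L  -> Code in "M"
print(code_in_l)        # ((2 + 3) * 4)
print(eval(code_in_m))  # 20
```

Only the first function had to be written for the new language; the second stage is reused as-is.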
Compilation Methods (for any new Machine M') -- Use intermediary language Int
Implement Compiler L to Int, and implement Executable Compiler Int to M. Pass Code in L to Compiler L to Int, which outputs Code in Int; pass that to Compiler Int to M, and we get Code in M.
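A small sketch of why the intermediary language helps, with everything here assumed for illustration: L is a tiny arithmetic subset parsed with Python's `ast` module, Int is a list of stack instructions, and the two "machines" are invented output formats. Supporting another machine then only requires another back end from Int.

```python
import ast

def front_end(expr: str) -> list:
    """Compiler L to Int: arithmetic expression -> stack instructions."""
    ops = {ast.Add: "ADD", ast.Mult: "MUL"}

    def walk(node):
        if isinstance(node, ast.Constant):
            return [("PUSH", node.value)]
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            return walk(node.left) + walk(node.right) + [(ops[type(node.op)], None)]
        raise ValueError("unsupported construct")

    return walk(ast.parse(expr, mode="eval").body)

def back_end_m(code: list) -> list:
    """Compiler Int to M: hypothetical machine with textual mnemonics."""
    return [op.lower() if arg is None else f"{op.lower()} {arg}" for op, arg in code]

def back_end_m_prime(code: list) -> list:
    """Compiler Int to M': hypothetical machine with numeric opcodes."""
    opcodes = {"PUSH": 1, "ADD": 2, "MUL": 3}
    return [(opcodes[op], 0 if arg is None else arg) for op, arg in code]

code_in_int = front_end("1 + 2 * 3")
print(back_end_m(code_in_int))        # ['push 1', 'push 2', 'push 3', 'mul', 'add']
print(back_end_m_prime(code_in_int))  # [(1, 1), (1, 2), (1, 3), (3, 0), (2, 0)]
```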
Interpreter
- “To interpret” means “to execute” without first compiling, i.e., without translating to machine language
- Code in L --> create one "exec" function in the interpreter for each kind of statement in the code --> the interpreter loops through the code and calls the matching exec function --> all statements executed, interpretation ends
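The loop described above might look like the following sketch. The three-statement language (`set`, `add`, `print`) and all function names are invented for this example; a real interpreter would also parse and validate its input.

```python
def exec_set(env, name, value):
    env[name] = int(value)

def exec_add(env, name, value):
    env[name] += int(value)

def exec_print(env, name):
    env["_out"].append(env[name])  # collect output instead of printing directly

# One "exec" function per kind of statement.
EXEC = {"set": exec_set, "add": exec_add, "print": exec_print}

def interpret(program: str) -> list:
    """Loop through the code and call the exec function for each statement."""
    env = {"_out": []}
    for line in program.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        EXEC[parts[0]](env, *parts[1:])
    return env["_out"]

program = """
set x 1
add x 41
print x
"""
print(interpret(program))  # [42]
```

No translated version of `program` is ever produced; each statement is executed as it is read.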
Using an existing interpreter (L)
- Write an interpreter for L' in L (not directly executable), and utilize the existing executable interpreter for L
- Use the existing interpreter for L to interpret (i.e., execute) the interpreter for L'
- While the interpreter for L' is executing, pass in Code in L'; it will run just fine on the machine
**Note: this is less efficient, because the machine has to execute the interpreter for L' (through the interpreter for L), which in turn executes Code in L'. Every statement of Code in L' therefore passes through both interpreters.
Just-in-time (JIT) compiling and the javac compiler
javac translates .java source code into Java bytecode, a machine-independent intermediate form. The bytecode can then either be interpreted and executed by the JVM, or be compiled by a just-in-time (JIT) compiler that translates bytecode into machine language immediately before each execution of the program.
- Can we combine the security advantages of the JVM with the speed advantage of compilation?
- Selective compilation: if a segment such as a loop consumes a long time, the JVM compiles that segment into machine code and runs it on the machine (instead of interpreting it every time, which is slow); the rest of the program is interpreted.
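A toy model of selective compilation, not how any real JVM is implemented: an expression is interpreted by walking its syntax tree until it has run more than an assumed `THRESHOLD` number of times, after which a compiled form is cached and reused. Python's `compile()` stands in for "compile to machine code" (it actually emits bytecode), and the class name `HotExpr` is made up.

```python
import ast
import operator

OPS = {ast.Add: operator.add, ast.Mult: operator.mul}

def interpret(node, env):
    """Slow path: walk the syntax tree on every evaluation."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](interpret(node.left, env), interpret(node.right, env))
    raise ValueError("unsupported construct")

class HotExpr:
    THRESHOLD = 3  # assumed cutoff for treating the code as "hot"

    def __init__(self, source: str):
        self.source = source
        self.tree = ast.parse(source, mode="eval").body
        self.calls = 0
        self.compiled = None

    def run(self, env: dict):
        self.calls += 1
        if self.compiled is None and self.calls > self.THRESHOLD:
            # Hot segment: compile once, reuse on every later run.
            self.compiled = compile(self.source, "<hot>", "eval")
        if self.compiled is not None:
            return eval(self.compiled, {}, env)
        return interpret(self.tree, env)

expr = HotExpr("x * x + 1")
results = [expr.run({"x": i}) for i in range(5)]
print(results)                    # [1, 2, 5, 10, 17]
print(expr.compiled is not None)  # True
```

The first three runs take the interpreted path; from the fourth run on, the cached compiled form is used.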
Other Information
- Compilation advantage: better performance. Part of the work that would otherwise happen during execution is done once at compile time, so the program runs faster.
- The compiler is itself a machine language program, presumably created by compiling some other high-level program.
- When written to a file in a format understood by the operating system, machine language is commonly known as object code.
- The interpreter, on the other hand, stays around for the execution of the application, which leads to greater flexibility and better diagnostics
- The interpreter implements a virtual machine whose “machine language” is the high-level programming language. The interpreter reads statements in that language more or less one at a time, executing them as it goes along.
- Delaying decisions about program implementation until run time is known as late binding.
- Mixing compilation and interpretation: We generally say that a language is “interpreted” when the initial translator is simple. We say that a language is “compiled” if the translator analyzes it thoroughly (rather than effecting some “mechanical” transformation), and if the intermediate program does not bear a strong resemblance to the source.
Implementation Strategies:
- Initial translator (a preprocessor) in interpreted languages: removes comments and white space; groups characters together into tokens such as keywords, identifiers, numbers, and symbols; expands abbreviations in the style of a macro assembler; and identifies higher-level syntactic structures, such as loops and subroutines. The goal is to produce an intermediate form that mirrors the structure of the source, but can be interpreted more efficiently.
- Compiler + Linker: The compiler relies on a separate program, known as a linker, to merge the appropriate library routines into the final program.
- Compilers that generate assembly language instead of machine language: facilitates debugging, easier for people to read, isolates the compiler from changes in the format of machine language files.
- C compiler: Preprocessor + Compiler. The preprocessor provides a conditional compilation facility that allows several versions of a program to be built from the same source
- Source-to-source translation: the translator generates its output in some high-level language, commonly C or some simplified version of the input language. That output is then compiled (e.g., by a C compiler) into assembly.
- Self-hosting compilers: compilers written in the language they compile, e.g., C compilers in C. Such compilers are themselves built by bootstrapping: start with a simple implementation (often an interpreter) and use it to build progressively more sophisticated versions. Translate the interpreter by hand to machine language, run a higher-level compiler on the interpreter, and repeat with each newly built compiler until the bootstrapping is finished.
- The JVM is like a machine (that runs on bytecode), an assembler is like a compiler (it compiles assembly to machine code), and the CPU is like an interpreter (that executes machine code)
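The first two jobs of the initial translator listed above (removing comments and white space, and grouping characters into tokens) can be sketched as follows. The token categories follow the notes; the `#` comment syntax, the keyword set, and the symbol set are assumptions made for this example.

```python
import re

TOKEN_RE = re.compile(r"""
    (?P<comment>\#[^\n]*)   |   # comment: from '#' to end of line
    (?P<space>\s+)          |   # white space
    (?P<number>\d+)         |   # integer literal
    (?P<word>[A-Za-z_]\w*)  |   # keyword or identifier
    (?P<symbol>[-+*/=()<>])     # single-character symbol
""", re.VERBOSE)

KEYWORDS = {"while", "if", "print"}  # hypothetical keyword set

def tokenize(source: str) -> list:
    """Drop comments and white space; group the rest into (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r}")
        pos = m.end()
        kind = m.lastgroup
        if kind in ("comment", "space"):
            continue
        if kind == "word":
            kind = "keyword" if m.group() in KEYWORDS else "identifier"
        tokens.append((kind, m.group()))
    return tokens

for token in tokenize("while x < 10  # loop\n  x = x + 1"):
    print(token)
```

The resulting token list mirrors the structure of the source but is cheaper to process repeatedly than raw characters.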