1. Unicode is supported.
2. UTF-8 is supported.
3. An example follows:
    VarDeclaration<ref int adr>     (. string name; TypeDesc type; .)
    = Ident<out name>               (. Obj x = symTab.Enter(name);
                                       int n = 1; .)
      { ',' Ident<out name>         (. Obj y = symTab.Enter(name);
                                       x.next = y; x = y;
                                       n++; .)
      }
      ':' Type<out type>            (. adr += n * type.size;
                                       for (int a = adr; x != null; x = x.next) {
                                         a -= type.size;
                                         x.adr = a;
                                       } .)
      ';' .
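The core of this specification is the EBNF production

    VarDeclaration = Ident {',' Ident} ':' Type ';'.

which is augmented with attributes and semantic actions. From such an attributed
production Coco/R generates a recursive descent parsing method. For the C# version
it looks like this: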
    void VarDeclaration(ref int adr) {
        string name; TypeDesc type;
        Ident(out name);
        Obj x = symTab.Enter(name);
        int n = 1;
        while (la.kind == comma) {
            Get();
            Ident(out name);
            Obj y = symTab.Enter(name);
            x.next = y; x = y;
            n++;
        }
        Expect(colon);
        Type(out type);
        adr += n * type.size;
        for (int a = adr; x != null; x = x.next) {
            a -= type.size;
            x.adr = a;
        }
        Expect(semicolon);
    }
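
The way the EBNF iteration {',' Ident} turns into a while loop can be tried out in isolation. The following is a minimal sketch in Java (Coco/R also has a Java version), not the code Coco/R emits: the class VarDeclSketch, the pre-split token list, and the simplification of Type to a single identifier are all assumptions made for illustration, and the symbol table and address bookkeeping are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated parser. It recognizes
//   VarDeclaration = Ident {',' Ident} ':' Type ';'.
// over a list of pre-split tokens and returns the declared names.
final class VarDeclSketch {
    private final List<String> tokens;
    private int pos = 0;

    private VarDeclSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Lookahead token, or "" at the end of the input.
    private String la() {
        return pos < tokens.size() ? tokens.get(pos) : "";
    }

    // Consumes the expected terminal or reports a syntax error.
    private void expect(String terminal) {
        if (!la().equals(terminal)) {
            throw new IllegalStateException(
                "expected '" + terminal + "' but found '" + la() + "'");
        }
        pos++;
    }

    // Consumes an identifier (assumed here: any token starting with a letter).
    private String ident() {
        String t = la();
        if (t.isEmpty() || !Character.isLetter(t.charAt(0))) {
            throw new IllegalStateException("identifier expected, found '" + t + "'");
        }
        pos++;
        return t;
    }

    static List<String> parse(List<String> tokens) {
        VarDeclSketch p = new VarDeclSketch(tokens);
        List<String> names = new ArrayList<>();
        names.add(p.ident());
        while (p.la().equals(",")) {   // {',' Ident} becomes a while loop
            p.pos++;                   // corresponds to Get()
            names.add(p.ident());
        }
        p.expect(":");
        p.ident();                     // Type, simplified to one identifier
        p.expect(";");
        return names;
    }
}
```

Calling VarDeclSketch.parse on the tokens a , b : int ; yields the names a and b, while omitting the comma raises an IllegalStateException at the point where the colon is expected.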
4. The expression rules are as follows:
    ident  = letter {letter | digit}.
    number = digit {digit}.
    string = '"' {anyButQuote} '"'.
    char   = '\'' anyButApostrophe '\''.

Strings and character constants may contain the following escape sequences:

    \\  backslash        \r  carriage return   \f      form feed
    \'  apostrophe       \n  new line          \a      bell
    \"  quote            \t  horizontal tab    \b      backspace
    \0  null character   \v  vertical tab      \uxxxx  hex char value
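
To make the escape sequences concrete, here is a small Java sketch that decodes them in the body of a string or character constant. It is an illustration under assumptions, not Coco/R's own scanner code: the class EscapeSketch and the method unescape are hypothetical names, and the decision to reject unknown escapes with an exception is a choice made for this example.

```java
// Hypothetical helper: replaces each escape sequence listed above
// by the character it denotes.
final class EscapeSketch {
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i++);
            if (c != '\\') {          // ordinary character, copy it
                sb.append(c);
                continue;
            }
            if (i >= s.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = s.charAt(i++);
            switch (e) {
                case '\\': sb.append('\\'); break;
                case '\'': sb.append('\''); break;
                case '"':  sb.append('"');  break;
                case '0':  sb.append((char) 0);  break;  // null character
                case 'r':  sb.append('\r'); break;
                case 'n':  sb.append('\n'); break;
                case 't':  sb.append('\t'); break;
                case 'v':  sb.append((char) 11); break;  // vertical tab
                case 'f':  sb.append('\f'); break;
                case 'a':  sb.append((char) 7);  break;  // bell
                case 'b':  sb.append('\b'); break;
                case 'u':
                    // backslash-u must be followed by four hex digits
                    if (i + 4 > s.length()) {
                        throw new IllegalArgumentException("incomplete hex escape");
                    }
                    sb.append((char) Integer.parseInt(s.substring(i, i + 4), 16));
                    i += 4;
                    break;
                default:
                    throw new IllegalArgumentException("unknown escape: " + e);
            }
        }
        return sb.toString();
    }
}
```

For instance, the six characters \u0041 decode to the single letter A, and a backslash followed by t decodes to a horizontal tab.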
The following identifiers are reserved keywords (in the C# version of Cocol/R the
identifier using is also a keyword, in the Java version the identifier import):
    ANY         CONTEXT  IGNORE      PRAGMAS      TOKENS
    CHARACTERS  END      IGNORECASE  PRODUCTIONS  WEAK
    COMMENTS    FROM     NESTED      SYNC
    COMPILER    IF       out         TO
Comments are enclosed in /* and */ and may be nested. Alternatively they can start
with // and go to the end of the line.
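
Because block comments may be nested, a scanner cannot stop at the first closing delimiter; it has to keep a nesting depth. The following Java sketch shows one way to do that. The class CommentSketch and the method skipComment are hypothetical names chosen for this example, and the treatment of an unterminated comment as an error is an assumption.

```java
// Hypothetical helper that skips one comment starting at index 'start'.
final class CommentSketch {
    // Returns the index just past the comment, or 'start' itself
    // if no comment begins at that position.
    static int skipComment(String src, int start) {
        if (src.startsWith("//", start)) {
            // Line comment: runs up to, but not including, the line end.
            int eol = src.indexOf('\n', start);
            return eol < 0 ? src.length() : eol;
        }
        if (!src.startsWith("/*", start)) {
            return start;
        }
        int depth = 1;            // one block comment is open
        int i = start + 2;
        while (i < src.length()) {
            if (src.startsWith("/*", i)) {
                depth++;          // nested comment opens
                i += 2;
            } else if (src.startsWith("*/", i)) {
                depth--;          // innermost comment closes
                i += 2;
                if (depth == 0) {
                    return i;
                }
            } else {
                i++;
            }
        }
        throw new IllegalArgumentException("unterminated comment");
    }
}
```

Applied to a block comment that contains another block comment, the method returns the position after the outer closing delimiter rather than after the inner one.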
EBNF
All syntax descriptions in Cocol/R are written in Extended Backus-Naur Form
(EBNF) [Wirth77]. By convention, identifiers starting with a lower case letter denote
terminal symbols, identifiers starting with an upper case letter denote nonterminal
symbols. Strings denote themselves. The following meta-characters are used:
    symbol  meaning                               example
    =       separates the sides of a production   A = a b c .
    .       terminates a production               A = a b c .
    |       separates alternatives                a b | c | d e   means a b or c or d e
    ( )     groups alternatives                   (a | b) c       means a c or b c
    [ ]     option                                [a] b           means a b or b
    { }     iteration (0 or more times)           {a} b           means b or a b or a a b or ...
Attributes are written between < and >. Semantic actions are enclosed in (. and .).
The operators + and - are used to form character sets.
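
The set operators can be read as union (+) and difference (-). As a rough model in Java, character sets can be represented with java.util.BitSet. The class CharSetSketch and its methods range, plus and minus are hypothetical names for this sketch, and the 16-bit character range used to stand for ANY is an assumption.

```java
import java.util.BitSet;

// Hypothetical model of character sets built with + and -.
final class CharSetSketch {
    // The set of all characters from 'from' to 'to', both inclusive.
    static BitSet range(char from, char to) {
        BitSet s = new BitSet();
        s.set(from, to + 1);   // upper bound of BitSet.set is exclusive
        return s;
    }

    // a + b: every character that is in a or in b.
    static BitSet plus(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.or(b);
        return r;
    }

    // a - b: every character of a that is not in b.
    static BitSet minus(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.andNot(b);
        return r;
    }
}
```

With these helpers, a letter set is plus(range('A', 'Z'), range('a', 'z')), and a set like anyButQuote is the full character range minus the single character '"'.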
2.2 Overall Structure
A Cocol/R compiler description has the following structure:
    Cocol =
      [Imports]
      "COMPILER" ident
      [GlobalFieldsAndMethods]
      ScannerSpecification
      ParserSpecification
      "END" ident '.'
    .