An LL(1) Java grammar extracted from Sun's javac source code

This grammar was extracted from the parser of javac in [url=http://download.java.net/openjdk/jdk6/]OpenJDK 6[/url] build 17, [url=http://hg.openjdk.java.net/jdk6/jdk6/langtools/file/536fbf4fba1f/src/share/classes/com/sun/tools/javac/parser/Parser.java]j2se/src/share/classes/com/sun/tools/javac/parser/Parser.java[/url].
The code is open source under the [url=http://www.gnu.org/licenses/gpl-2.0.html]GPLv2[/url] license.

Note the comment on the Parser class:
/** The parser maps a token sequence into an abstract syntax
* tree. It operates by recursive descent, with code derived
* systematically from an LL(1) grammar. For efficiency reasons, an
* operator precedence scheme is used for parsing binary operation
* expressions.
*
* <p><b>This is NOT part of any API supported by Sun Microsystems. If
* you write code that depends on this, you do so at your own risk.
* This code and its internal interfaces are subject to change or
* deletion without notice.</b>
*/

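javac in Sun JDK 6 parses with a hybrid of recursive descent and operator precedence. It is mainly recursive descent; operator precedence is used only when parsing binary operation expressions, to make parsing more efficient.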
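
To make the hybrid concrete, here is a minimal sketch of the idea, not a reproduction of javac's code. It assumes a toy expression language with integer literals, parentheses, and the operators + - * / %, and it evaluates the expression directly instead of building a syntax tree. The class PrecedenceDemo and all of its method names are hypothetical. Operands are parsed by ordinary recursive descent, while a single loop driven by a precedence table handles every binary operator level (the variant often called precedence climbing).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo class; none of these names come from javac. */
class PrecedenceDemo {
    private final List<String> tokens;
    private int pos = 0;

    PrecedenceDemo(String src) {
        tokens = tokenize(src);
    }

    /** Splits the input into integer literals and single-character symbols. */
    private static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (Character.isDigit(c)) {
                int j = i;
                while (j < src.length() && Character.isDigit(src.charAt(j))) j++;
                out.add(src.substring(i, j));
                i = j;
            } else {
                out.add(String.valueOf(c));
                i++;
            }
        }
        return out;
    }

    /** Current token, or the empty string at end of input. */
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "";
    }

    /** Binding strength of a binary operator, or -1 if the token is not one. */
    private static int prec(String op) {
        switch (op) {
            case "+": case "-": return 1;
            case "*": case "/": case "%": return 2;
            default: return -1;
        }
    }

    /** Recursive descent part: an integer literal or a parenthesized expression. */
    private int primary() {
        if (pos >= tokens.size()) throw new IllegalArgumentException("unexpected end of input");
        String t = tokens.get(pos++);
        if (t.equals("(")) {
            int v = binary(1);
            if (!peek().equals(")")) throw new IllegalArgumentException("expected )");
            pos++;
            return v;
        }
        return Integer.parseInt(t);
    }

    /** Operator precedence part: one loop covers every binary operator level. */
    private int binary(int minPrec) {
        int left = primary();
        while (prec(peek()) >= minPrec) {
            String op = tokens.get(pos++);
            // Parsing the right operand one level tighter makes operators left-associative.
            int right = binary(prec(op) + 1);
            left = apply(op, left, right);
        }
        return left;
    }

    private static int apply(String op, int a, int b) {
        switch (op) {
            case "+": return a + b;
            case "-": return a - b;
            case "*": return a * b;
            case "/": return a / b;
            default:  return a % b;
        }
    }

    /** Parses and evaluates a whole expression. */
    static int eval(String src) {
        PrecedenceDemo p = new PrecedenceDemo(src);
        int v = p.binary(1);
        if (p.pos != p.tokens.size()) throw new IllegalArgumentException("unexpected " + p.peek());
        return v;
    }
}
```

For example, PrecedenceDemo.eval("1+2*3") yields 7 and PrecedenceDemo.eval("10-4-3") yields 3. A pure LL(1) recursive descent parser would need one method per precedence level; here the table in prec replaces that chain of methods, which is where the efficiency gain comes from.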

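Below is the LL(1) grammar extracted from the comments that precede each parsing method. The order has been adjusted.
The grammar uses [url=http://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form]EBNF[/url] notation: [] means optional (0 or 1), {} means any number (0 or more), () means grouping, text inside double quotes is a literal, and a name not enclosed in double quotes is the name of a grammar rule.
A few rules, such as SuperSuffix, may have more than one definition. They were extracted from the comments of different methods and should not contradict one another; I have not yet looked closely enough to tell what exactly is going on here.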
CompilationUnit = [ { "@" Annotation } PACKAGE Qualident ";"] {ImportDeclaration} {TypeDeclaration}

AnnotationsOpt = { '@' Annotation }

ImportDeclaration = IMPORT [ STATIC ] Ident { "." Ident } [ "." "*" ] ";"

TypeDeclaration = ClassOrInterfaceOrEnumDeclaration
| ";"

ClassOrInterfaceOrEnumDeclaration = ModifiersOpt
(ClassDeclaration | InterfaceDeclaration | EnumDeclaration)

ModifiersOpt = { Modifier }
Modifier = PUBLIC | PROTECTED | PRIVATE | STATIC | ABSTRACT | FINAL
| NATIVE | SYNCHRONIZED | TRANSIENT | VOLATILE | "@"
| "@" Annotation

Annotation = "@" Qualident [ "(" AnnotationFieldValues ")" ]

AnnotationFieldValues = "(" [ AnnotationFieldValue { "," AnnotationFieldValue } ] ")"

AnnotationFieldValue = AnnotationValue
| Identifier "=" AnnotationValue

AnnotationValue = ConditionalExpression
| Annotation
| "{" [ AnnotationValue { "," AnnotationValue } ] [","] "}"

ClassDeclaration = CLASS Ident TypeParametersOpt [EXTENDS Type]
[IMPLEMENTS TypeList] ClassBody

InterfaceDeclaration = INTERFACE Ident TypeParametersOpt
[EXTENDS TypeList] InterfaceBody

EnumDeclaration = ENUM Ident [IMPLEMENTS TypeList] EnumBody

EnumBody = "{" { EnumeratorDeclarationList } [","]
[ ";" {ClassBodyDeclaration} ] "}"

EnumeratorDeclaration = AnnotationsOpt [TypeArguments] IDENTIFIER [ Arguments ] [ "{" ClassBody "}" ]

TypeList = Type {"," Type}

ClassBody = "{" {ClassBodyDeclaration} "}"
InterfaceBody = "{" {InterfaceBodyDeclaration} "}"

ClassBodyDeclaration =
";"
| [STATIC] Block
| ModifiersOpt
( Type Ident
( VariableDeclaratorsRest ";" | MethodDeclaratorRest )
| VOID Ident MethodDeclaratorRest
| TypeParameters (Type | VOID) Ident MethodDeclaratorRest
| Ident ConstructorDeclaratorRest
| TypeParameters Ident ConstructorDeclaratorRest
| ClassOrInterfaceOrEnumDeclaration
)
InterfaceBodyDeclaration =
";"
| ModifiersOpt Type Ident
( ConstantDeclaratorsRest | InterfaceMethodDeclaratorRest ";" )

MethodDeclaratorRest =
FormalParameters BracketsOpt [Throws TypeList] ( MethodBody | [DEFAULT AnnotationValue] ";")
VoidMethodDeclaratorRest =
FormalParameters [Throws TypeList] ( MethodBody | ";")
InterfaceMethodDeclaratorRest =
FormalParameters BracketsOpt [THROWS TypeList] ";"
VoidInterfaceMethodDeclaratorRest =
FormalParameters [THROWS TypeList] ";"
ConstructorDeclaratorRest =
"(" FormalParameterListOpt ")" [THROWS TypeList] MethodBody

QualidentList = Qualident {"," Qualident}

Qualident = Ident { DOT Ident }

TypeParametersOpt = ["<" TypeParameter {"," TypeParameter} ">"]

TypeParameter = TypeVariable [TypeParameterBound]
TypeParameterBound = EXTENDS Type {"&" Type}
TypeVariable = Ident

FormalParameters = "(" [ FormalParameterList ] ")"
FormalParameterList = [ FormalParameterListNovarargs , ] LastFormalParameter
FormalParameterListNovarargs = [ FormalParameterListNovarargs , ] FormalParameter

FormalParameter = { FINAL | '@' Annotation } Type VariableDeclaratorId
LastFormalParameter = { FINAL | '@' Annotation } Type '...' Ident | FormalParameter

MethodBody = Block

Statement =
Block
| IF ParExpression Statement [ELSE Statement]
| FOR "(" ForInitOpt ";" [Expression] ";" ForUpdateOpt ")" Statement
| FOR "(" FormalParameter : Expression ")" Statement
| WHILE ParExpression Statement
| DO Statement WHILE ParExpression ";"
| TRY Block ( Catches | [Catches] FinallyPart )
| SWITCH ParExpression "{" SwitchBlockStatementGroups "}"
| SYNCHRONIZED ParExpression Block
| RETURN [Expression] ";"
| THROW Expression ";"
| BREAK [Ident] ";"
| CONTINUE [Ident] ";"
| ASSERT Expression [ ":" Expression ] ";"
| ";"
| ExpressionStatement
| Ident ":" Statement

Block = "{" BlockStatements "}"

BlockStatements = { BlockStatement }
BlockStatement = LocalVariableDeclarationStatement
| ClassOrInterfaceOrEnumDeclaration
| [Ident ":"] Statement
LocalVariableDeclarationStatement
= { FINAL | '@' Annotation } Type VariableDeclarators ";"

ParExpression = "(" Expression ")"

ForInit = StatementExpression MoreStatementExpressions
| { FINAL | '@' Annotation } Type VariableDeclarators

ForUpdate = StatementExpression MoreStatementExpressions

VariableDeclarators = VariableDeclarator { "," VariableDeclarator }

VariableDeclaratorsRest = VariableDeclaratorRest { "," VariableDeclarator }
ConstantDeclaratorsRest = ConstantDeclaratorRest { "," ConstantDeclarator }

VariableDeclarator = Ident VariableDeclaratorRest
ConstantDeclarator = Ident ConstantDeclaratorRest

VariableDeclaratorRest = BracketsOpt ["=" VariableInitializer]
ConstantDeclaratorRest = BracketsOpt "=" VariableInitializer

VariableDeclaratorId = Ident BracketsOpt

CatchClause = CATCH "(" FormalParameter ")" Block

SwitchBlockStatementGroups = { SwitchBlockStatementGroup }
SwitchBlockStatementGroup = SwitchLabel BlockStatements
SwitchLabel = CASE ConstantExpression ":" | DEFAULT ":"

MoreStatementExpressions = { COMMA StatementExpression }

Expression = Expression1 [ExpressionRest]
ExpressionRest = [AssignmentOperator Expression1]
AssignmentOperator = "=" | "+=" | "-=" | "*=" | "/=" |
"&=" | "|=" | "^=" |
"%=" | "<<=" | ">>=" | ">>>="
Type = Type1
TypeNoParams = TypeNoParams1
StatementExpression = Expression
ConstantExpression = Expression

Expression1 = Expression2 [Expression1Rest]
Type1 = Type2
TypeNoParams1 = TypeNoParams2

Expression1Rest = ["?" Expression ":" Expression1]

Expression2 = Expression3 [Expression2Rest]
Type2 = Type3
TypeNoParams2 = TypeNoParams3

Expression2Rest = {infixop Expression3}
| Expression3 INSTANCEOF Type
infixop = "||"
| "&&"
| "|"
| "^"
| "&"
| "==" | "!="
| "<" | ">" | "<=" | ">="
| "<<" | ">>" | ">>>"
| "+" | "-"
| "*" | "/" | "%"

Expression3 = PrefixOp Expression3
| "(" Expr | TypeNoParams ")" Expression3
| Primary {Selector} {PostfixOp}
Primary = "(" Expression ")"
| Literal
| [TypeArguments] THIS [Arguments]
| [TypeArguments] SUPER SuperSuffix
| NEW [TypeArguments] Creator
| Ident { "." Ident }
[ "[" ( "]" BracketsOpt "." CLASS | Expression "]" )
| Arguments
| "." ( CLASS | THIS | [TypeArguments] SUPER Arguments | NEW [TypeArguments] InnerCreator )
]
| BasicType BracketsOpt "." CLASS
PrefixOp = "++" | "--" | "!" | "~" | "+" | "-"
PostfixOp = "++" | "--"
Type3 = Ident { "." Ident } [TypeArguments] {TypeSelector} BracketsOpt
| BasicType
TypeNoParams3 = Ident { "." Ident } BracketsOpt
Selector = "." [TypeArguments] Ident [Arguments]
| "." THIS
| "." [TypeArguments] SUPER SuperSuffix
| "." NEW [TypeArguments] InnerCreator
| "[" Expression "]"
TypeSelector = "." Ident [TypeArguments]
SuperSuffix = Arguments | "." Ident [Arguments]

SuperSuffix = Arguments | "." [TypeArguments] Ident [Arguments]

BasicType = BYTE | SHORT | CHAR | INT | LONG | FLOAT | DOUBLE | BOOLEAN

ArgumentsOpt = [ Arguments ]

Arguments = "(" [Expression { COMMA Expression }] ")"

TypeArgumentsOpt = [ TypeArguments ]

TypeArguments = "<" TypeArgument {"," TypeArgument} ">"

TypeArgument = Type
| "?"
| "?" EXTENDS Type {"&" Type}
| "?" SUPER Type

BracketsOpt = {"[" "]"}

BracketsSuffixExpr = "." CLASS
BracketsSuffixType =

Creator = Qualident [TypeArguments] ( ArrayCreatorRest | ClassCreatorRest )

InnerCreator = Ident [TypeArguments] ClassCreatorRest

ArrayCreatorRest = "[" ( "]" BracketsOpt ArrayInitializer
| Expression "]" {"[" Expression "]"} BracketsOpt )

ClassCreatorRest = Arguments [ClassBody]

ArrayInitializer = "{" [VariableInitializer {"," VariableInitializer}] [","] "}"

VariableInitializer = ArrayInitializer | Expression

Ident = IDENTIFIER

Literal =
INTLITERAL
| LONGLITERAL
| FLOATLITERAL
| DOUBLELITERAL
| CHARLITERAL
| STRINGLITERAL
| TRUE
| FALSE
| NULL


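For now I have only extracted the comments from the source as they are, and have not yet checked for extraction errors or omissions. In any case, I am noting it down here first and will go through it slowly.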

P.S. The OpenJDK project includes a [url=http://openjdk.java.net/projects/compiler-grammar/]Compiler Grammar[/url] project, which contains a grammar file written in ANTLR, [url=http://hg.openjdk.java.net/compiler-grammar/compiler-grammar/langtools/file/dc7563e2b917/src/share/classes/com/sun/tools/javac/antlr/Java.g]Java.g[/url], that is worth consulting.