Basic concepts
- A sentence is a string of characters over some alphabet.
- A language is a set of sentences.
- A lexeme is the lowest-level syntactic unit of a language (e.g., *, sum, begin).
- A token is a category of lexemes (e.g., identifier).
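The difference between a lexeme and a token can be illustrated with a tiny tokenizer. This is only a minimal sketch: the token category names (KEYWORD, IDENTIFIER, INT_LITERAL, OPERATOR), the regular expressions, and the function name tokenize are assumptions chosen for illustration and do not come from the notes.

```python
import re

# Hypothetical token categories and their patterns (illustrative assumptions).
# Order matters: KEYWORD is tried before IDENTIFIER.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:begin|end)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("INT_LITERAL", r"\d+"),
    ("OPERATOR", r"[*+=]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (token, lexeme) pairs; whitespace and unknown characters are skipped."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(text)]


for token, lexeme in tokenize("begin sum = sum * 2 end"):
    print(f"{lexeme!r:10} is a lexeme of token {token}")
```

Here sum and begin are lexemes, while IDENTIFIER and KEYWORD are the tokens (categories) they belong to.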
BNF fundamentals
- In BNF, abstractions are used to represent classes of syntactic structures; they act like syntactic variables (also called nonterminal symbols, or just nonterminals).
- Terminals are lexemes or tokens.
- A rule has a left-hand side (LHS), which is a nonterminal, and a right-hand side (RHS), which is a string of terminals and/or nonterminals.
- Nonterminals are often enclosed in angle brackets. Examples of BNF rules:
<ident_list> -> identifier | identifier, <ident_list>
<if_stmt> -> if <logic_expr> then <stmt>
- Grammar: a finite non-empty set of rules.
- A start symbol is a special element of the nonterminals of a grammar.
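One possible way to hold such a grammar in a program is as plain data. The sketch below encodes the <ident_list> rule from the examples above; the dictionary layout and the helper names is_nonterminal and symbols are assumptions made for illustration, not a standard representation.

```python
# Each nonterminal (LHS) maps to a list of alternatives; each alternative (RHS)
# is a list of terminals and/or nonterminals.
GRAMMAR = {
    "<ident_list>": [
        ["identifier"],
        ["identifier", ",", "<ident_list>"],
    ],
}
START = "<ident_list>"  # the start symbol must be one of the nonterminals


def is_nonterminal(symbol):
    """By the angle-bracket convention, <name> denotes a nonterminal."""
    return symbol.startswith("<") and symbol.endswith(">")


def symbols(grammar):
    """Return (nonterminals, terminals) used anywhere in the grammar."""
    used = {s for alts in grammar.values() for rhs in alts for s in rhs}
    nonterminals = set(grammar) | {s for s in used if is_nonterminal(s)}
    terminals = {s for s in used if not is_nonterminal(s)}
    return nonterminals, terminals


nonterminals, terminals = symbols(GRAMMAR)
print("nonterminals:", sorted(nonterminals))
print("terminals:", sorted(terminals))
```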
BNF rules
- An abstraction (or nonterminal symbol) can have more than one RHS.
e.g. <stmt> -> <single_stmt> | begin <stmt_list> end (a statement can be a single statement or a list of statements enclosed in begin and end)
- Syntactic lists are described using recursion.
<ident_list> -> ident | ident, <ident_list>
- A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols).
e.g. Suppose a language has the following grammar (figure omitted in the source)
Then the following is one derivation (figure omitted in the source)
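A derivation can be sketched in code as repeatedly replacing a nonterminal with one of its RHS alternatives. The example below uses the recursive <ident_list> rule from above; the helper names expand and derive are hypothetical, and the grammar is deliberately tiny.

```python
GRAMMAR = {
    "<ident_list>": [["ident"], ["ident", ",", "<ident_list>"]],
}


def expand(form, position, rhs):
    """Replace the nonterminal at `position` with `rhs`, checking that the rule exists."""
    lhs = form[position]
    if rhs not in GRAMMAR.get(lhs, []):
        raise ValueError(f"no rule {lhs} -> {' '.join(rhs)}")
    return form[:position] + rhs + form[position + 1:]


def derive(start, choices):
    """Apply the (position, rhs) choices in order; return every intermediate form."""
    forms = [[start]]
    for position, rhs in choices:
        forms.append(expand(forms[-1], position, rhs))
    return forms


steps = derive("<ident_list>", [
    (0, ["ident", ",", "<ident_list>"]),  # recursive alternative
    (2, ["ident", ",", "<ident_list>"]),  # recursive alternative again
    (4, ["ident"]),                       # base case ends the list
])
for form in steps:
    print("=>", " ".join(form))
```

Each use of the recursive alternative adds one more element to the list, and the non-recursive alternative terminates it.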
Derivations
- Every string of symbols in a derivation is a sentential form.
- A sentence is a sentential form that has only terminal symbols.
- A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded (as in the example above); a rightmost derivation is defined analogously.
- A derivation may be neither leftmost nor rightmost (the order of expansion does not matter).
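The definitions above can be made concrete with a short sketch of a leftmost derivation. The grammar here (<expr> -> <term> + <expr> | <term>, <term> -> a | b) is a hypothetical one picked for illustration, as are the function names.

```python
# Hypothetical grammar used only for this illustration.
GRAMMAR = {
    "<expr>": [["<term>", "+", "<expr>"], ["<term>"]],
    "<term>": [["a"], ["b"]],
}


def is_nonterminal(symbol):
    return symbol in GRAMMAR


def is_sentence(form):
    """A sentence is a sentential form with no nonterminals left."""
    return not any(is_nonterminal(s) for s in form)


def leftmost_derivation(start, alternative_indexes):
    """At each step expand the leftmost nonterminal with the chosen alternative.

    Assumes a nonterminal remains whenever another choice is supplied.
    """
    forms = [[start]]
    for alt in alternative_indexes:
        form = forms[-1]
        pos = next(i for i, s in enumerate(form) if is_nonterminal(s))
        forms.append(form[:pos] + GRAMMAR[form[pos]][alt] + form[pos + 1:])
    return forms


forms = leftmost_derivation("<expr>", [0, 0, 1, 1])
for form in forms:
    kind = "sentence" if is_sentence(form) else "sentential form"
    print(" ".join(form), f"({kind})")
```

Choosing the rightmost nonterminal instead (the last index where is_nonterminal holds) would give a rightmost derivation of the same sentence.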
Parse Tree
- A hierarchical representation of a derivation.
e.g. (figure omitted in the source)
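As a rough illustration of a parse tree, the sketch below builds one by hand for the sentence a + b under a hypothetical grammar (<expr> -> <term> + <expr> | <term>, <term> -> a | b). The nested-tuple layout and the helper names render and frontier are assumptions for this example only.

```python
# A node is (symbol, children); terminals are leaves with no children.
# Internal nodes are nonterminals, and each node's children spell out the
# RHS of the rule applied at that step of the derivation.
tree = ("<expr>", [
    ("<term>", [("a", [])]),
    ("+", []),
    ("<expr>", [
        ("<term>", [("b", [])]),
    ]),
])


def render(node, depth=0):
    """Return the tree as indented lines, one symbol per line."""
    symbol, children = node
    lines = ["  " * depth + symbol]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


def frontier(node):
    """Leaves read left to right: the sentence that the tree derives."""
    symbol, children = node
    if not children:
        return [symbol]
    return [leaf for child in children for leaf in frontier(child)]


print("\n".join(render(tree)))
print("sentence:", " ".join(frontier(tree)))
```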
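Ambiguity in Grammars
- A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees.
e.g. The following grammar is ambiguous (figure omitted in the source)
- How can ambiguity be avoided?
Answer: a rule of thumb for expression grammars is to avoid having the same nonterminal appear two or more times in a single RHS alternative, such as <expr> in the grammar above.
With a small modification, the grammar above becomes unambiguous (figure omitted in the source)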
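Ambiguity can be checked for a particular sentence by counting its parse trees. The sketch below does this for two hypothetical grammars, which are illustrative assumptions and not necessarily the grammars from the original example: AMBIGUOUS repeats <expr> within one alternative, while UNAMBIGUOUS introduces a separate <term> nonterminal. The function count_parse_trees is a simple memoized counter that assumes no empty right-hand sides and no cycles of unit rules.

```python
from functools import lru_cache

# Hypothetical grammars for illustration.
AMBIGUOUS = {
    "<expr>": [["<expr>", "+", "<expr>"], ["<expr>", "*", "<expr>"], ["a"]],
}
UNAMBIGUOUS = {
    "<expr>": [["<expr>", "+", "<term>"], ["<term>"]],
    "<term>": [["<term>", "*", "a"], ["a"]],
}


def count_parse_trees(grammar, start, tokens):
    """Count distinct parse trees of `tokens` rooted at `start`.

    Assumes no empty right-hand sides and no cycles of unit rules.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def symbol_count(symbol, i, j):
        if symbol not in grammar:  # terminal: must match exactly one token
            return int(j - i == 1 and tokens[i] == symbol)
        return sum(sequence_count(tuple(rhs), i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def sequence_count(rhs, i, j):
        if len(rhs) == 1:
            return symbol_count(rhs[0], i, j)
        # The first symbol covers tokens[i:k], the remaining symbols tokens[k:j];
        # every symbol must cover at least one token.
        return sum(
            symbol_count(rhs[0], i, k) * sequence_count(rhs[1:], k, j)
            for k in range(i + 1, j - len(rhs) + 2)
        )

    return symbol_count(start, 0, len(tokens))


sentence = "a + a * a".split()
print("ambiguous grammar:  ", count_parse_trees(AMBIGUOUS, "<expr>", sentence))
print("unambiguous grammar:", count_parse_trees(UNAMBIGUOUS, "<expr>", sentence))
```

Under the first grammar the sentence can be grouped either as a + (a * a) or as (a + a) * a, so it has two parse trees; the second grammar admits only one. A single sentence with one tree does not prove a grammar unambiguous in general, so this check can only demonstrate ambiguity, not rule it out.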
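EBNF is an extended version of BNF that adds more convenient notation (such as optional and repeated elements). For details, see the Wikipedia article on EBNF.
Reposted from: https://my.oschina.net/Bruce370/blog/889420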