Boost.Spirit User Manual Translation (9): The Rule

qingcairousi

The Rule

The rule is a polymorphic parser that acts as a named placeholder capturing the behavior of the EBNF expression assigned to it. Naming an EBNF expression allows it to be referenced later. The rule is a class template parameterized by the type of the scanner (ScannerT), the rule's context (ContextT) and its tag (TagT). Default template parameters are provided to make the rule easy to use.

    template<
        typename ScannerT = scanner<>,
        typename ContextT = parser_context<>,
        typename TagT = parser_address_tag>
    class rule;

Default template parameters are supplied to handle the most common case. ScannerT defaults to scanner<>, a plain vanilla scanner that acts on char const* iterators and does nothing special other than iterate through the null-terminated input one character at a time. The rule tag, TagT, typically used with ASTs, identifies a rule; it is explained under Parser Tags below. In trivial cases, declaring a rule as rule<> is enough. You need not be concerned with the ContextT template parameter unless you wish to tweak the low-level behavior of the rule. Detailed information on the ContextT template parameter is provided elsewhere.

Order of parameters

As of v1.8.0, ScannerT, ContextT and TagT can be specified in any order. If a template parameter is missing, its default is assumed. Examples:

    rule<> rx1;
    rule<scanner<> > rx2;
    rule<parser_context<> > rx3;
    rule<parser_context<>, parser_address_tag> rx4;
    rule<parser_address_tag> rx5;
    rule<parser_address_tag, scanner<>, parser_context<> > rx6;
    rule<parser_context<>, scanner<>, parser_address_tag> rx7;

Multiple scanners

As of v1.8.0, rules can use one or more scanner types. There are cases, for instance, where we need a rule that can work at both the phrase and character levels. Rule/scanner mismatch has been a source of confusion and is the number one FAQ. To address this issue, rules now support multiple scanners. Example:

    typedef scanner_list<scanner<>, phrase_scanner_t> scanners;

    rule<scanners>  r = +anychar_p;
    assert(parse("abcdefghijk", r).full);
    assert(parse("a b c d e f g h i j k", r, space_p).full);

Notice how rule r is used at both the phrase and character levels.

By default, support for multiple scanners is disabled. The macro BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT must be defined as the maximum number of scanners allowed in a scanner_list. The value must be greater than 1 to enable multiple scanners. Given the example above, to define a limit of two scanners for the list, the following line must be inserted into the source file before the inclusion of the Spirit headers:

    #define BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT 2

See the techniques section for an example of a grammar using a multiple-scanner-enabled rule, lexeme_scanner and as_lower_scanner.

Rule Declarations

The rule class models EBNF's production rule. Example:

    rule<> a_rule = *(a | b) & +(c | d | e);

The type and behavior of the right-hand side (rhs) EBNF expression, which may be arbitrarily complex, is encoded in the rule named a_rule. a_rule may now be referenced elsewhere in the grammar:

    rule<> another_rule = f >> g >> h >> a_rule;

Referencing rules

When a rule is referenced anywhere in the right-hand side of an EBNF expression, the expression holds the rule by reference. It is the responsibility of the client to ensure that the referenced rule stays in scope and does not get destructed while it is being referenced.

    a = int_p;
    b = a;
    c = int_p >> b;
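
The by-reference behavior can be illustrated with a toy model that does not use Spirit at all. This is only a minimal sketch under stated assumptions: ToyRule, digits, letters, prefixed and demo_reference are hypothetical names invented for this illustration, and the real rule class is implemented differently. The point it shows is that an expression built from a rule keeps a reference to it, so redefining the referenced rule changes what the referencing rule matches.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// A matcher returns the number of leading characters matched, or -1.
using Matcher = std::function<long(const std::string&)>;

// Toy stand-in for a rule (hypothetical, not Spirit's implementation):
// a named slot that holds a replaceable matcher.
struct ToyRule {
    Matcher def;
    long parse(const std::string& in) const {
        return def ? def(in) : -1;  // an empty slot matches nothing
    }
};

// Match one or more leading decimal digits.
long digits(const std::string& in) {
    std::size_t n = 0;
    while (n < in.size() && std::isdigit(static_cast<unsigned char>(in[n]))) ++n;
    return n > 0 ? static_cast<long>(n) : -1;
}

// Match one or more leading alphabetic characters.
long letters(const std::string& in) {
    std::size_t n = 0;
    while (n < in.size() && std::isalpha(static_cast<unsigned char>(in[n]))) ++n;
    return n > 0 ? static_cast<long>(n) : -1;
}

// Build "prefix followed by r". The lambda captures r by reference,
// so r must outlive every matcher built from it.
Matcher prefixed(char prefix, const ToyRule& r) {
    return [prefix, &r](const std::string& in) -> long {
        if (in.empty() || in[0] != prefix) return -1;
        long n = r.parse(in.substr(1));
        return n < 0 ? -1 : n + 1;
    };
}

// Returns true if c follows a redefinition of a.
bool demo_reference() {
    ToyRule a, c;
    a.def = digits;
    c.def = prefixed('#', a);  // c refers to a, it does not copy it
    bool before = c.parse("#123") == 4 && c.parse("#abc") == -1;
    a.def = letters;           // redefine a; c sees the new definition
    bool after = c.parse("#abc") == 4 && c.parse("#123") == -1;
    return before && after;
}
```

In this sketch, destroying a while c is still in use would leave c with a dangling reference, which is the same hazard the text warns about.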

Copying Rules

The rule is a weird C++ citizen, unlike any other C++ object. It does not have proper copy and assignment semantics and cannot be stored and passed around by value. If you need to copy a rule, you have to explicitly call its member function copy():

    r.copy();

However, be warned that copying a rule will not deep copy the other rules referenced by the source rule. This might lead to dangling references. Again, it is the responsibility of the client to ensure that all referenced rules stay in scope and do not get destructed while they are being referenced. Caveat emptor.

If you copy a rule, you will want to store the copy somewhere. The problem is how? The storage can't be another rule:

    rule<> r2 = r.copy(); // BAD!

because rules are weird and do not have the expected C++ copy-constructor and assignment semantics! As a general rule: don't put a copied rule into another rule! Instead, use the stored_rule for that purpose.

Forward declarations

A rule may be declared before being defined to allow the cyclic structures typically found in BNF declarations. Example:

    rule<> a, b, c;

    a = b | a;
    b = c | a;

Recursion

The right-hand side of a rule may reference other rules, including itself. The limitation is that direct or indirect left recursion is not allowed (this is an unchecked run-time error that results in an infinite loop). This is typical of top-down parsers. Example:

    a = a | b; // infinite loop!

What is left recursion?

Left recursion happens when a rule calls itself before anything else, that is, when the first element of its right-hand side is the rule itself. A top-down parser will go into an infinite loop when this happens. See the FAQ for details on how to eliminate left recursion.
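
As an illustration of the usual fix, consider the left-recursive grammar expr = expr '-' num | num. It can be rewritten as expr = num ('-' num)*, which consumes a number before it loops. The sketch below is a hand-written recursive-descent version of the rewritten grammar in plain C++; parse_num and parse_expr are hypothetical helper names, not part of Spirit. In Spirit notation the rewritten rule would presumably look something like expr = num >> *('-' >> num).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Parse one or more digits starting at pos. On success, advance pos
// past the digits and store the number in value.
bool parse_num(const std::string& in, std::size_t& pos, long& value) {
    std::size_t start = pos;
    value = 0;
    while (pos < in.size() && std::isdigit(static_cast<unsigned char>(in[pos]))) {
        value = value * 10 + (in[pos] - '0');
        ++pos;
    }
    return pos > start;
}

// Parse expr = num ('-' num)* over the whole input.
// The loop replaces the left-recursive call and keeps '-' left-associative.
bool parse_expr(const std::string& in, long& result) {
    std::size_t pos = 0;
    if (!parse_num(in, pos, result)) return false;
    while (pos < in.size() && in[pos] == '-') {
        ++pos;  // consume '-'
        long rhs = 0;
        if (!parse_num(in, pos, rhs)) return false;
        result -= rhs;
    }
    return pos == in.size();  // succeed only on a full match
}
```

Because every iteration consumes at least one character before parsing continues, the parser always terminates, and "10-3-2" still evaluates as (10-3)-2.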

Undefined rules

An undefined rule matches nothing and is semantically equivalent to nothing_p.

Redeclarations

Like any other C++ assignment, a second assignment to a rule is destructive and will redefine it. The old definition is lost. Rules are dynamic. A rule can change its definition at any time:

    r = a_definition;
    r = another_definition;

Rule r loses the old definition when the second assignment is made. As mentioned, an undefined rule matches nothing and is semantically equivalent to nothing_p.

Dynamic Parsers

Hosting declarative EBNF in imperative C++ yields an interesting blend. We have the best of both worlds. We can conveniently modify the grammar at run time using imperative constructs such as if and else statements. Example:

    if (feature_is_available)
        r = add_this_feature;

Rules are essentially dynamic parsers. A dynamic parser is characterized by its ability to modify its behavior at run time. Initially, an undefined rule matches nothing. At any time, the rule may be defined and redefined, thus dynamically altering its behavior.
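
The life cycle described here (undefined, defined, redefined) can be mimicked with a small toy model in plain C++. This is a sketch under stated assumptions, not Spirit's implementation: DynRule, literal and demo_dynamic are hypothetical names, and the "definition" is just a replaceable function.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Toy stand-in for a dynamic rule (hypothetical, not Spirit's code):
// the definition maps an input string to "did it match in full?".
struct DynRule {
    std::function<bool(const std::string&)> def;

    // With no definition the rule matches nothing, like nothing_p.
    bool parse(const std::string& in) const { return def && def(in); }
};

// Build a definition that matches exactly one literal string.
std::function<bool(const std::string&)> literal(std::string text) {
    return [text](const std::string& in) { return in == text; };
}

// Returns true if the rule behaves as described at every stage.
bool demo_dynamic(bool feature_is_available) {
    DynRule r;
    bool ok = !r.parse("on");      // undefined: matches nothing
    r.def = literal("off");
    ok = ok && r.parse("off");     // first definition is in effect
    if (feature_is_available)
        r.def = literal("on");     // redefinition; the old one is lost
    ok = ok && (r.parse("on") == feature_is_available);
    ok = ok && (r.parse("off") != feature_is_available);
    return ok;
}
```

The if statement plays the same role as in the fragment above: ordinary imperative code decides at run time which definition the rule ends up with.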

No start rule

Typically, parsers have what is called a start symbol, chosen to be the root of the grammar where parsing starts. The Spirit parser framework has no notion of a start symbol. Any rule can be a start symbol. This feature promotes step-wise creation of parsers. We can build parsers from the bottom up while fully testing each level or module, until we get to the top-most level.

Parser Tags

Rules may be tagged for identification purposes. This is necessary especially when dealing with parse trees and ASTs, to see which rule created a specific AST/parse tree node. Each rule has an ID of type parser_id. This ID can be obtained through the rule's id() member function:

    my_rule.id(); //  get my_rule's id

The parser_id class is declared as:

    class parser_id
    {
    public:
                    parser_id();
        explicit    parser_id(void const* p);
                    parser_id(std::size_t l);
    
        bool        operator==(parser_id const& x) const;
        bool        operator!=(parser_id const& x) const;
        bool        operator<(parser_id const& x) const;
        std::size_t to_long() const;
    };

parser_address_tag

The rule's TagT template parameter supplies this ID. This defaults to parser_address_tag. The parser_address_tag uses the address of the rule as its ID. This is often not the most convenient, since it is not always possible to get the address of a rule to compare against.

parser_tag

It is possible to have specific constant integers to identify a rule. For this purpose, we can use parser_tag<N>, where N is a constant integer:

    rule<parser_tag<123> > my_rule; //  set my_rule's id to 123

dynamic_parser_tag

The parser_tag<N> can only specify a static ID, which is defined at compile time. If you need the ID to be dynamic (changeable at run time), you can use the dynamic_parser_tag class as the TagT template parameter. This template parameter enables the set_id() function, which may be used to set the required ID at run time:

    rule<dynamic_parser_tag> my_dynrule;
    my_dynrule.set_id(1234);    // set my_dynrule's id to 1234

If the set_id() function isn't called, the parser ID defaults to the address of the rule, just like the parser_address_tag template parameter would do.
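
The three tagging strategies can be compared side by side with a toy policy-based model. This is only a sketch: AddressTag, StaticTag, DynamicTag, TaggedRule and demo_tags are hypothetical names made up for illustration, and Spirit's own tag classes and parser_id type are not used or reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Each toy tag policy answers: "what is the ID of the rule at self?".

// Like parser_address_tag: the ID is derived from the rule's address.
struct AddressTag {
    std::uintptr_t id(const void* self) const {
        return reinterpret_cast<std::uintptr_t>(self);
    }
};

// Like parser_tag<N>: the ID is a compile-time constant.
template <std::uintptr_t N>
struct StaticTag {
    std::uintptr_t id(const void*) const { return N; }
};

// Like dynamic_parser_tag: the address by default, overridable at run time.
struct DynamicTag {
    bool has_id = false;
    std::uintptr_t value = 0;
    void set_id(std::uintptr_t v) { has_id = true; value = v; }
    std::uintptr_t id(const void* self) const {
        return has_id ? value : reinterpret_cast<std::uintptr_t>(self);
    }
};

// A rule-like class that inherits its ID behavior from the tag policy.
template <typename TagT = AddressTag>
struct TaggedRule : TagT {
    std::uintptr_t id() const { return TagT::id(this); }
};

// Returns true if each policy yields the ID described in the text.
bool demo_tags() {
    TaggedRule<> a, b;
    TaggedRule<StaticTag<123> > s;
    TaggedRule<DynamicTag> d;
    bool ok = a.id() != b.id();   // distinct objects, distinct IDs
    ok = ok && s.id() == 123;     // fixed at compile time
    ok = ok && d.id() == reinterpret_cast<std::uintptr_t>(&d);
    d.set_id(1234);               // override the address-based default
    return ok && d.id() == 1234;
}
```

Only the DynamicTag policy exposes set_id(), which mirrors how the set_id() function becomes available when dynamic_parser_tag is chosen as TagT.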

Copyright © 1998-2003 Joel de Guzman

Use, modification and distribution is subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)