Boost.Spirit User Manual Translation (13): The Grammar

qingcairousi


The Grammar

The grammar encapsulates a set of rules. The grammar class is a protocol base class: essentially an interface contract. It is a template class parameterized by its derived class, DerivedT, and its context, ContextT. The template parameter ContextT defaults to parser_context, a predefined context.

You need not be concerned with the ContextT template parameter unless you wish to tweak the low-level behavior of the grammar. Detailed information on ContextT is provided elsewhere. The grammar relies on the template parameter DerivedT, a grammar subclass, to define the actual rules.

Presented below is the public API. There may actually be more template parameters after ContextT. Anything after the ContextT parameter should not concern the client and is strictly for internal use.

    template<
        typename DerivedT,
        typename ContextT = parser_context<> >
    struct grammar;

Grammar definition

A concrete subclass inheriting from grammar is expected to have a nested template class (or struct) named definition:

It is a nested template class with a typename ScannerT parameter.

Its constructor defines the grammar rules.

Its constructor is passed a reference to the actual grammar, self.

It has a member function named start that returns a reference to the start rule.

Grammar skeleton

    struct my_grammar : public grammar<my_grammar>
    {
        template <typename ScannerT>
        struct definition
        {
            rule<ScannerT>  r;
            definition(my_grammar const& self)  { r = /*..define here..*/; }
            rule<ScannerT> const& start() const { return r; }
        };
    };

Decoupling the scanner type from the rules that form a grammar allows the grammar to be used in different contexts, possibly with different scanners. We do not care what scanner we are dealing with: the user-defined my_grammar can be used with any type of scanner. Unlike the rule, the grammar is not tied to a specific scanner type. See "Scanner Business" to see why this is important and to gain further understanding of this scanner-rule coupling problem.
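To make the decoupling concrete, here is a minimal toy model that does not use Spirit at all. The names toy_grammar, string_scanner, pointer_scanner and digits are hypothetical, and this is not how the library is implemented. It only sketches the idea that a base class parameterized by its derived class can instantiate the nested definition template for whatever scanner type it is handed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for grammar<DerivedT>; NOT Spirit's implementation.
template <typename DerivedT>
struct toy_grammar
{
    // Instantiates DerivedT::definition for the caller's scanner type,
    // passes the grammar in as "self", and runs the start rule.
    template <typename ScannerT>
    bool parse(ScannerT& scan) const
    {
        DerivedT const& self = static_cast<DerivedT const&>(*this);
        typename DerivedT::template definition<ScannerT> def(self);
        return def.start()(scan);
    }
};

// Two unrelated, hypothetical scanner types.
struct string_scanner
{
    std::string text;
    std::size_t pos;
    bool at_end() const { return pos >= text.size(); }
    char next() { return text[pos++]; }
};

struct pointer_scanner
{
    char const* cur;
    char const* end;
    bool at_end() const { return cur == end; }
    char next() { return *cur++; }
};

// Written once, usable with either scanner: "one or more decimal digits".
struct digits : toy_grammar<digits>
{
    template <typename ScannerT>
    struct definition
    {
        // The toy "rule" is tied to one scanner type, just like rule<ScannerT>.
        typedef bool (*rule_t)(ScannerT&);

        explicit definition(digits const& /*self*/) : r(&definition::match) {}
        rule_t const& start() const { return r; }

        static bool match(ScannerT& scan)
        {
            std::size_t count = 0;
            while (!scan.at_end())
            {
                char c = scan.next();
                if (c < '0' || c > '9')
                    return false;
                ++count;
            }
            return count > 0;
        }

        rule_t r;
    };
};

bool digits_in_string(std::string const& s)
{
    string_scanner scan = { s, 0 };
    return digits().parse(scan);
}

bool digits_in_range(char const* first, char const* last)
{
    pointer_scanner scan = { first, last };
    return digits().parse(scan);
}

// Small self-check showing both scanners driving the same grammar.
void demo_toy_grammar()
{
    char const buf[] = "123";
    assert(digits_in_string("2024"));
    assert(digits_in_range(buf, buf + 3));
}
```

The rule type inside definition depends on ScannerT, but digits itself does not, which is the property the paragraph above describes for my_grammar.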

Instantiating and using my_grammar

Our grammar above may be instantiated and put into action:

    my_grammar g;

    if (parse(first, last, g, space_p).full)
        cout << "parsing succeeded\n";
    else
        cout << "parsing failed\n";

my_grammar IS-A parser and can be used anywhere a parser is expected, even referenced by another rule:

    rule<>  r = g >> str_p("cool huh?");

Referencing grammars

Like the rule, the grammar is held by reference when it is placed on the right-hand side of an EBNF expression. It is the responsibility of the client to ensure that the referenced grammar stays in scope and is not destroyed while it is being referenced.

Full Grammar Example

Recalling our original calculator example, here it is rewritten using a grammar:

    struct calculator : public grammar<calculator>
    {
        template <typename ScannerT>
        struct definition
        {
            definition(calculator const& self)
            {
                group       = '(' >> expression >> ')';
                factor      = integer | group;   // integer: an integer parser defined elsewhere (for example int_p)
                term        = factor >> *(('*' >> factor) | ('/' >> factor));
                expression  = term >> *(('+' >> term) | ('-' >> term));
            }

            rule<ScannerT> expression, term, factor, group;

            rule<ScannerT> const&
            start() const { return expression; }
        };
    };

A fully working example with semantic actions is part of the Spirit distribution.

self

You might notice that the definition of the grammar has a constructor that accepts a const reference to the outer grammar. In the example above, calculator::definition takes in a calculator const& self. While self is unused in that example, it is very useful in many cases. The self argument is the definition's window to the outside world. For example, the calculator class might hold a reference to some state information that the definition can update, through semantic actions, while parsing proceeds.

Grammar Capsules

As a grammar becomes complicated, it is a good idea to group parts into logical modules. For instance, when writing a language, it might be wise to put expressions and statements into separate grammar capsules. The grammar takes advantage of the encapsulation properties of C++ classes, and the declarative nature of classes makes them a good fit for the definition of grammars. Since the grammar is nothing more than a class declaration, we can conveniently publish it in header files. The idea is that once written and fully tested, a grammar can be reused in many contexts. We now have the notion of grammar libraries.

Reentrancy and multithreading

An instance of a grammar may be used in different places multiple times without any problem. The implementation is tuned to allow this at the expense of some overhead. However, we can save considerable cycles and bytes if we are certain that a grammar will only ever have a single instance. If this is desired, simply define BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE before including any Spirit header files.

    #define BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE

On the other hand, if a grammar is intended to be used in multithreaded code, we should define BOOST_SPIRIT_THREADSAFE before including any Spirit header files. In this case it is also required to link against Boost.Threads.

    #define BOOST_SPIRIT_THREADSAFE

Using more than one grammar start rule

Sometimes it is desirable to have more than one visible entry point to a grammar (apart from the start rule). To allow additional start points, Spirit provides a helper template, grammar_def, which may be used as a base class for the definition subclass of your grammar. Here's an example:

    // this header has to be explicitly included
    #include <boost/spirit/utility/grammar_def.hpp> 

    struct calculator2 : public grammar<calculator2>
    {
        enum 
        {
            expression = 0,
            term = 1,
            factor = 2,
        };

        template <typename ScannerT>
        struct definition
        : public grammar_def<rule<ScannerT>, same, same>
        {
            definition(calculator2 const& self)
            {
                group       = '(' >> expression >> ')';
                factor      = integer | group;   // integer: an integer parser defined elsewhere (for example int_p)
                term        = factor >> *(('*' >> factor) | ('/' >> factor));
                expression  = term >> *(('+' >> term) | ('-' >> term));

                this->start_parsers(expression, term, factor); 
            }

            rule<ScannerT> expression, term, factor, group;
        };
    };

The grammar_def template has to be instantiated with the types of all the rules you wish to make visible from outside the grammar:

    grammar_def<rule<ScannerT>, same, same>

The shorthand notation same indicates that the same type should be used as specified by the previous template parameter (e.g. rule<ScannerT>). Obviously, same may not be used as the first template parameter.

grammar_def start types

It may not be obvious, but it is interesting to note that aside from rule<>s, any parser type may be specified (e.g. chlit<>, strlit<>, int_parser<>, etc.).

Using the grammar_def class, there is no need to provide a start() member function anymore. Instead, you have to insert a call to this->start_parsers() (a member function of the grammar_def template) to define the start symbols for your grammar. Note that the number and the sequence of the rules passed to start_parsers() should match the types specified in the grammar_def template:

    this->start_parsers(expression, term, factor);

The grammar entry point may be specified using the following syntax:

    g.use_parser<N>() // Where g is your grammar and N is the Nth entry.

This sample shows how to use the term rule from the calculator2 grammar above as the entry point:

    calculator2 g;

    if (parse(
            first, last, 
            g.use_parser<calculator2::term>(),
            space_p
        ).full)
    {
        cout << "parsing succeeded\n";
    }
    else {
        cout << "parsing failed\n";
    }

The template parameter for the use_parser<> template should be the zero-based index into the list of rules specified in the start_parsers() function call.

use_parser<0>

Note that using 0 (zero) as the template parameter to use_parser is equivalent to using the start rule exported by conventional means through the start() function, as shown in the first calculator sample above. So this notation may be used even for grammars that export only one rule through the start() function. On the other hand, calling a grammar without the use_parser notation will execute the rule specified as the first parameter to the start_parsers() function.

The maximum number of usable start rules is limited by the preprocessor constant:

    BOOST_SPIRIT_GRAMMAR_STARTRULE_TYPE_LIMIT // defaults to 3

Copyright © 1998-2003 Joel de Guzman
Copyright © 2003-2004 Hartmut Kaiser

Use, modification and distribution is subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)