In-depth: The Parsers



What makes Spirit tick? Now on to some details... The parser class is the most fundamental entity in the framework. A parser accepts a scanner comprised of a first-last iterator pair and returns a match object as its result. The iterators delimit the data currently being parsed. The match object evaluates to true if the parse succeeds, in which case the input is advanced accordingly. Each parser can represent a specific pattern or algorithm, or it can be a more complex parser formed as a composition of other parsers.


All parsers inherit from the base template class, parser:


    template <typename DerivedT>
    struct parser
    {
        /*...*/

        DerivedT& derived();
        DerivedT const& derived() const;
    };

This class is a protocol base class for all parsers. The parser class does not really know how to parse anything but instead relies on the template parameter DerivedT to do the actual parsing. This technique is known as the "Curiously Recurring Template Pattern" in template meta-programming circles. This inheritance strategy gives us the power of polymorphism without the virtual function overhead. In essence this is a way to implement compile time polymorphism.
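The static dispatch described above can be illustrated with a minimal sketch. The names toy_parser, toy_digits, accepts, accepts_impl and run are hypothetical and are not part of Spirit; only the derived() accessors mirror the declaration shown earlier.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the protocol base class; not Spirit's actual parser.
template <typename DerivedT>
struct toy_parser
{
    DerivedT& derived() { return *static_cast<DerivedT*>(this); }
    DerivedT const& derived() const { return *static_cast<DerivedT const*>(this); }

    // Resolved at compile time: forwards to DerivedT without a virtual call.
    bool accepts(std::string const& input) const
    {
        return derived().accepts_impl(input);
    }
};

// Concrete parser: accepts non-empty strings made only of decimal digits.
struct toy_digits : toy_parser<toy_digits>
{
    bool accepts_impl(std::string const& input) const
    {
        if (input.empty()) return false;
        for (std::string::size_type i = 0; i < input.size(); ++i)
            if (input[i] < '0' || input[i] > '9') return false;
        return true;
    }
};

// Generic code sees only the base template yet reaches the derived implementation.
template <typename DerivedT>
bool run(toy_parser<DerivedT> const& p, std::string const& input)
{
    return p.accepts(input);
}

// Usage: both calls dispatch statically to toy_digits::accepts_impl.
inline void crtp_usage()
{
    assert(run(toy_digits(), "12345"));
    assert(!run(toy_digits(), "12a"));
}
```

Because the base class is parameterized on the derived type, each concrete parser gets its own base instantiation and the compiler can inline the forwarded call.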



Each derived parser has a typedef parser_category_t that defines its category. By default, if one is not specified, it will inherit from the base parser class which typedefs its parser_category_t as plain_parser_category. Some template classes are provided to distinguish different types of parsers. The following categories are the most generic. More specific types may inherit from these.


Parser categories

plain_parser_category     Your plain vanilla parser
binary_parser_category    A parser that has subjects a and b (e.g. alternative)
unary_parser_category     A parser that has a single subject (e.g. kleene star)
action_parser_category    A parser with an attached semantic action

    struct plain_parser_category {};
    struct binary_parser_category : plain_parser_category {};
    struct unary_parser_category : plain_parser_category {};
    struct action_parser_category : unary_parser_category {};
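One common way to use such category tags is overload selection (tag dispatch). The following is a sketch only: describe, category_of and the four toy_ parser types are hypothetical names introduced for illustration, and the text does not say that Spirit dispatches in exactly this way.

```cpp
#include <cassert>
#include <string>

// Category tags, repeated from the declarations above.
struct plain_parser_category {};
struct binary_parser_category : plain_parser_category {};
struct unary_parser_category : plain_parser_category {};
struct action_parser_category : unary_parser_category {};

// One overload per generic category. A more specific tag with no overload
// of its own falls back to its closest base category.
inline std::string describe(plain_parser_category)  { return "plain"; }
inline std::string describe(unary_parser_category)  { return "unary"; }
inline std::string describe(binary_parser_category) { return "binary"; }

// Hypothetical parsers that carry only the typedef used for dispatch.
struct toy_chlit       { typedef plain_parser_category  parser_category_t; };
struct toy_kleene_star { typedef unary_parser_category  parser_category_t; };
struct toy_alternative { typedef binary_parser_category parser_category_t; };
struct toy_action      { typedef action_parser_category parser_category_t; };

// Selects the overload from the parser's category at compile time.
template <typename ParserT>
std::string category_of(ParserT const&)
{
    return describe(typename ParserT::parser_category_t());
}

// Usage: the action parser is handled by the unary overload through inheritance.
inline void category_usage()
{
    assert(category_of(toy_alternative()) == "binary");
    assert(category_of(toy_action()) == "unary");
}
```

The inheritance between tags is what lets generic code written for unary parsers also handle action parsers without naming them.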


Each parser has a typedef embed_t. This typedef specifies how a parser is embedded in a composite. By default, if one is not specified, the parser will be embedded by value. That is, a copy of the parser is placed as a member variable of the composite. Most parsers are embedded by value. In certain situations, however, this is not desirable or possible. One particular example is the rule. The rule, unlike other parsers, is embedded by reference.
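The difference between the two embedding policies can be sketched with a toy composite. All names here (toy_value_parser, toy_rule, toy_unary, the id member) are hypothetical, and spelling the reference policy as a const reference typedef is an assumption made for this illustration.

```cpp
#include <cassert>

// Hypothetical leaf parser that uses the default policy: embedded by value.
struct toy_value_parser
{
    typedef toy_value_parser embed_t;   // composites keep their own copy
    int id;
};

// Hypothetical rule-like parser: embedded by reference.
struct toy_rule
{
    typedef toy_rule const& embed_t;    // composites keep a reference
    int id;
};

// A unary composite stores its subject exactly as the subject's embed_t dictates.
template <typename SubjectT>
struct toy_unary
{
    explicit toy_unary(SubjectT const& s) : subject(s) {}
    typename SubjectT::embed_t subject;
};

// Usage: later changes are visible only through the by-reference embedding.
inline void embed_usage()
{
    toy_value_parser v = {1};
    toy_rule r = {1};
    toy_unary<toy_value_parser> by_value(v);
    toy_unary<toy_rule> by_reference(r);

    v.id = 2;
    r.id = 2;
    assert(by_value.subject.id == 1);      // the copy is unaffected
    assert(by_reference.subject.id == 2);  // the reference observes the change
}
```

Embedding by reference means the referenced parser must outlive every composite that refers to it, which is the price paid for being able to define it after it has been used.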


The match


The match holds the result of a parser. A match object evaluates to true when a successful match is found, otherwise false. The length of the match is the number of characters (or tokens) successfully matched. This can be queried through its length() member function. A negative value means that the match is unsuccessful.


Each parser may have an associated attribute. On a successful parse, this attribute is returned to the client through the match object. We can get this attribute via the match's value() member function. Be warned, though, that the match's attribute may be invalid, in which case getting the attribute will result in an exception. The member function has_valid_attribute() can be queried to know if it is safe to get the match's attribute. The attribute may be set at any time through the member function value(v), where v is the new attribute value.


A match attribute is valid:


  • on a successful match
  • when its value is set through the value(val) member function
  • if it is assigned or copied from a compatible match object (e.g. match<double> from match<int>) with a valid attribute. A match object A is compatible with another match object B if the attribute type of A can be assigned from the attribute type of B (i.e. a = b; must compile).

The match attribute is undefined:


  • on an unsuccessful match
  • when an attempt is made to copy or assign from another match object with an incompatible attribute type (e.g. match<std::string> from match<int>).

The match class:


    template <typename T>
    class match
    {
    public:

        typedef T attr_t;

        operator safe_bool() const; // convertible to a bool
        int length() const;
        bool has_valid_attribute() const;
        void value(T const&) const;
        T const& value();
    };
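The validity rules listed above can be made concrete with a simplified model. This toy_match class is hypothetical and is not Spirit's implementation: it assumes T is default constructible, uses std::logic_error as the exception type, and uses an explicit conversion to bool in place of safe_bool.

```cpp
#include <cassert>
#include <stdexcept>

// Simplified, hypothetical model of a match; Spirit's real class differs in detail.
template <typename T>
class toy_match
{
public:
    // Unsuccessful match: negative length, no valid attribute.
    toy_match() : len_(-1), valid_(false), val_() {}

    // Successful match carrying an attribute.
    toy_match(int length, T const& v) : len_(length), valid_(true), val_(v) {}

    // Copy from a compatible match: compiles only if T is assignable from U.
    template <typename U>
    toy_match(toy_match<U> const& other)
        : len_(other.length()), valid_(other.has_valid_attribute()), val_()
    {
        if (valid_) val_ = other.value();
    }

    explicit operator bool() const { return len_ >= 0; }
    int length() const { return len_; }
    bool has_valid_attribute() const { return valid_; }

    // Setting the attribute makes it valid.
    void value(T const& v) { val_ = v; valid_ = true; }

    // Getting an invalid attribute raises an exception.
    T const& value() const
    {
        if (!valid_) throw std::logic_error("match attribute is not valid");
        return val_;
    }

private:
    int len_;
    bool valid_;
    T val_;
};

// Usage: a match<int>-like object converts to a compatible match<double>-like one.
inline void match_usage()
{
    toy_match<int> ok(3, 42);
    toy_match<double> widened(ok);
    assert(widened.length() == 3);
    assert(widened.has_valid_attribute() && widened.value() == 42.0);
}
```

Checking has_valid_attribute() before calling value() avoids the exception path entirely.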


It has been mentioned repeatedly that the parser returns a match object as its result. This is a simplification. For the sake of genericity, parsers are not hard-coded to return a match object. More accurately, a parser returns an object that adheres to a conceptual interface, of which the match is an example. Nevertheless, we shall call the result type of a parser a match object regardless of whether it is actually a match class, a derivative or a totally unrelated type.




What are meta-functions?

We all know what functions look like. In simplest terms, a function accepts some arguments and returns a result. Here is the function we all love so much:


int identity_func(int arg)
{ return arg; } // return the argument arg

Meta-functions are essentially the same. These beasts also accept arguments and return a result. However, while functions work at runtime on values, meta-functions work at compile time on types (or constants, but we shall deal only with types). The meta-function is a template class (or struct). The template parameters are the arguments to the meta-function and a typedef within the class is the meta-function's return type. Here is the corresponding meta-function:


template <typename ArgT>
struct identity_meta_func
{ typedef ArgT type; }; // return the argument ArgT

The meta-function above is invoked as:


typename identity_meta_func<ArgT>::type

By convention, meta-functions return the result through the typedef type. Take note that typename is only required within templates.
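As a small illustration of the typename rule, the sketch below invokes the identity meta-function both outside and inside a template. The names plain_int and holder are hypothetical; std::is_same from the standard header type_traits is used only to check the results at compile time.

```cpp
#include <type_traits>

template <typename ArgT>
struct identity_meta_func
{ typedef ArgT type; }; // return the argument ArgT

// Outside a template the argument is a concrete type, so no typename is needed.
typedef identity_meta_func<int>::type plain_int;

// Inside a template the result is a dependent name, so typename is required.
template <typename ArgT>
struct holder
{
    typedef typename identity_meta_func<ArgT>::type value_type;
    value_type value;
};

// Both invocations "return" their argument unchanged.
static_assert(std::is_same<plain_int, int>::value,
              "identity_meta_func<int>::type is int");
static_assert(std::is_same<holder<double>::value_type, double>::value,
              "identity_meta_func<ArgT>::type is ArgT inside a template");
```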


The actual match type used by the parser depends on two types: the parser's attribute type and the scanner type. match_result is the meta-function that returns the desired match type given an attribute type and a scanner type.


    typename match_result<ScannerT, T>::type

The meta-function basically answers the question "given a scanner type ScannerT and an attribute type T, what is the desired match type?" [ typename is only required within templates ].


The parse member function


Concrete sub-classes inheriting from parser must have a corresponding member function parse(...) compatible with the conceptual interface:


    template <typename ScannerT>
    RT
    parse(ScannerT const& scan) const;

where RT is the desired return type of the parser.


The parser result


Concrete sub-classes inheriting from parser in most cases need to have a nested meta-function result that returns the result type of the parser's parse member function, given a scanner type. The meta-function has the form:


    template <typename ScannerT>
    struct result
    {
        typedef RT type;
    };

where RT is the desired return type of the parser. This is usually, but not always, dependent on the template parameter ScannerT. For example, given an attribute type int, we can use the match_result metafunction:


    template <typename ScannerT>
    struct result
    {
        typedef typename match_result<ScannerT, int>::type type;
    };

If a parser does not supply a result metafunction, a default is provided by the base parser class. The default is declared as:


    template <typename ScannerT>
    struct result
    {
        typedef typename match_result<ScannerT, nil_t>::type type;
    };

Notice that without a result metafunction, the parser's attribute defaults to nil_t (i.e. the parser has no attribute).
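How a nested result overrides the inherited default can be sketched at the type level. Every name below (demo_nil, demo_match, demo_scanner, demo_match_result, demo_parser, demo_space, demo_int) is a hypothetical stand-in, and this demo_match_result ignores the scanner type, which the real match_result is not assumed to do.

```cpp
#include <type_traits>

struct demo_nil {};                              // stand-in for nil_t
template <typename T> struct demo_match {};      // stand-in for match<T>
struct demo_scanner {};                          // stand-in for a scanner type

// Simplified match_result: this sketch always yields demo_match<T>.
template <typename ScannerT, typename T>
struct demo_match_result { typedef demo_match<T> type; };

// The base class supplies the default result metafunction.
template <typename DerivedT>
struct demo_parser
{
    template <typename ScannerT>
    struct result
    {
        typedef typename demo_match_result<ScannerT, demo_nil>::type type;
    };
};

// Declares no result of its own, so it inherits the default (no attribute).
struct demo_space : demo_parser<demo_space> {};

// Declares its own result, which hides the default: the attribute is int.
struct demo_int : demo_parser<demo_int>
{
    template <typename ScannerT>
    struct result
    {
        typedef typename demo_match_result<ScannerT, int>::type type;
    };
};

typedef demo_space::result<demo_scanner>::type space_result_t;
typedef demo_int::result<demo_scanner>::type int_result_t;

static_assert(std::is_same<space_result_t, demo_match<demo_nil> >::value,
              "the default result carries the nil attribute");
static_assert(std::is_same<int_result_t, demo_match<int> >::value,
              "the overriding result carries an int attribute");
```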



Given a scanner type ScannerT and a parser type ParserT, what will be the actual result of the parser? The answer to this question is provided by the parser_result meta-function.




    typename parser_result<ParserT, ScannerT>::type

In general, the meta-function just forwards the invocation to the parser's result meta-function:


    template <typename ParserT, typename ScannerT>
    struct parser_result
    {
        typedef typename ParserT::template result<ScannerT>::type type;
    };

This is similar to a global function calling a member function. Most of the time, the usage above is equivalent to:


    typename ParserT::template result<ScannerT>::type

Yet, this should not be relied upon to be true all the time because the parser_result metafunction might be specialized for specific parser and/or scanner types.


The parser_result metafunction makes the signature of the required parse member function almost canonical:


    template <typename ScannerT>
    typename parser_result<self_t, ScannerT>::type
    parse(ScannerT const& scan) const;

where self_t is a typedef to the parser.
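Putting the pieces together, here is a toy single-character parser whose parse member function uses this near-canonical signature. It is a sketch under simplifying assumptions: mini_scanner, mini_match, mini_parser_result and mini_chlit are hypothetical, the scanner works on raw character pointers, and the match carries no attribute.

```cpp
#include <cassert>

// Hypothetical scanner: refers to the caller's current position, plus the end of input.
struct mini_scanner
{
    mini_scanner(char const*& first_, char const* last_) : first(first_), last(last_) {}
    char const*& first;
    char const* last;
};

// Minimal match without an attribute: a negative length signals failure.
struct mini_match
{
    explicit mini_match(int length_ = -1) : length(length_) {}
    explicit operator bool() const { return length >= 0; }
    int length;
};

// Forwarding metafunction, mirroring the general form of parser_result.
template <typename ParserT, typename ScannerT>
struct mini_parser_result
{
    typedef typename ParserT::template result<ScannerT>::type type;
};

// Parser that matches one specific character.
struct mini_chlit
{
    typedef mini_chlit self_t;

    explicit mini_chlit(char ch_) : ch(ch_) {}

    template <typename ScannerT>
    struct result { typedef mini_match type; };

    // The return type is computed from the parser and scanner types.
    template <typename ScannerT>
    typename mini_parser_result<self_t, ScannerT>::type
    parse(ScannerT const& scan) const
    {
        if (scan.first != scan.last && *scan.first == ch)
        {
            ++scan.first;          // consume the matched character
            return mini_match(1);
        }
        return mini_match();       // failure: the input position is left untouched
    }

    char ch;
};

// Usage: a successful parse advances the input, a failed one does not.
inline void mini_usage()
{
    char const text[] = "ab";
    char const* first = text;
    mini_scanner scan(first, text + 2);

    assert(mini_chlit('a').parse(scan).length == 1);
    assert(first == text + 1);
    assert(mini_chlit('x').parse(scan).length < 0);
    assert(first == text + 1);
}
```

Because the return type is obtained through the metafunction rather than written out, the same parse declaration works unchanged for any scanner type.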


parser class declaration


    template <typename DerivedT>
    struct parser
    {
        typedef DerivedT embed_t;
        typedef DerivedT derived_t;
        typedef plain_parser_category parser_category_t;

        template <typename ScannerT>
        struct result
        {
            typedef typename match_result<ScannerT, nil_t>::type type;
        };

        DerivedT& derived();
        DerivedT const& derived() const;

        template <typename ActionT>
        action<DerivedT, ActionT>
        operator[](ActionT const& actor) const;
    };





