Boost.Spirit User Manual Translation (8): Numerics

Latest recommended article published 2008-01-17 09:02:00

qingcairousi


Numerics

Like chlit, strlit and the other character-level parsers, numeric parsers are primitives. They are given a section of their own so that this important building block gets better focus. The framework includes a couple of predefined objects for parsing signed and unsigned integers and real numbers. These parsers are fully parametric: most of the important aspects of numeric parsing can be finely adjusted to suit, including the radix base, the minimum and maximum number of allowable digits, the exponent and the fraction. Policies control the real number parsers' behavior. There are some predefined policies covering the most common real number formats, but users can supply their own when needed.

uint_parser

This class is the simplest among the members of the numerics package. The uint_parser can parse unsigned integers of arbitrary length and size. It can be used to parse ordinary primitive C/C++ integers or even user-defined scalars such as bigints (unlimited precision integers). Like most of the classes in Spirit, the uint_parser is a template class. Template parameters fine-tune its behavior. The uint_parser is so flexible that the other numeric parsers are implemented using it as the backbone.

    template <
        typename T = unsigned,
        int Radix = 10,
        unsigned MinDigits = 1,
        int MaxDigits = -1>
    struct uint_parser { /*...*/ };

uint_parser template parameters
T	The numeric base type of the numeric parser. Defaults to `unsigned`
Radix	The radix base. This can be 2 (binary), 8 (octal), 10 (decimal) or 16 (hexadecimal). Defaults to 10 (decimal)
MinDigits	The minimum number of digits allowable
MaxDigits	The maximum number of digits allowable. If this is -1, the maximum is unbounded

Predefined uint_parsers
bin_p	`uint_parser<unsigned, 2, 1, -1> const`
oct_p	`uint_parser<unsigned, 8, 1, -1> const`
uint_p	`uint_parser<unsigned, 10, 1, -1> const`
hex_p	`uint_parser<unsigned, 16, 1, -1> const`
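
To make the roles of Radix, MinDigits and MaxDigits concrete, here is a minimal sketch in plain C++17 that imitates the matching rule described above. It is not Spirit's implementation: `digit_value`, `uint_match` and `parse_uint` are hypothetical names invented for this illustration, the value type is fixed to `unsigned`, and overflow checking is left out.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper (not part of Spirit): the value of one digit in the
// given radix, or -1 if the character is not a digit in that radix.
inline int digit_value(char c, int radix) {
    int v = -1;
    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
    return (v >= 0 && v < radix) ? v : -1;
}

struct uint_match {
    unsigned value;      // the accumulated number
    std::size_t length;  // how many characters were consumed
};

// Imitates uint_parser<unsigned, Radix, MinDigits, MaxDigits>: consume up to
// MaxDigits digits (no limit when MaxDigits is -1) and fail when fewer than
// MinDigits digits were found. Overflow checking is omitted for brevity.
template <int Radix = 10, unsigned MinDigits = 1, int MaxDigits = -1>
std::optional<uint_match> parse_uint(std::string_view in) {
    unsigned value = 0;
    std::size_t i = 0;
    while (i < in.size() &&
           (MaxDigits < 0 || i < static_cast<std::size_t>(MaxDigits))) {
        const int d = digit_value(in[i], Radix);
        if (d < 0) break;  // first non-digit ends the number
        value = value * Radix + static_cast<unsigned>(d);
        ++i;
    }
    if (i < MinDigits) return std::nullopt;  // too few digits: no match
    return uint_match{value, i};
}
```

With these definitions, `parse_uint<16>("ff")` yields the value 255, and `parse_uint<10, 3, 3>("12")` reports no match because exactly three digits are required.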

The following example shows how the uint_parser can be used to parse thousand separated numbers. The example can correctly parse numbers such as 1,234,567,890.

    uint_parser<unsigned, 10, 1, 3> uint3_p;        //  1..3 digits
    uint_parser<unsigned, 10, 3, 3> uint3_3_p;      //  exactly 3 digits
    ts_num_p = (uint3_p >> *(',' >> uint3_3_p));    //  our thousand separated number parser
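
What this composition accepts can be spelled out with a small recognizer in plain C++17. This is only a sketch of the matching rule, not Spirit code: `count_digits` and `match_thousands` are hypothetical names, and the function reports just the length of the matched prefix.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Counts consecutive decimal digits starting at pos, stopping at max_digits.
inline std::size_t count_digits(std::string_view in, std::size_t pos,
                                std::size_t max_digits) {
    std::size_t n = 0;
    while (n < max_digits && pos + n < in.size() &&
           std::isdigit(static_cast<unsigned char>(in[pos + n]))) {
        ++n;
    }
    return n;
}

// Imitates uint3_p >> *(',' >> uint3_3_p).
// Returns the length of the matched prefix, or 0 when nothing matches.
inline std::size_t match_thousands(std::string_view in) {
    std::size_t len = count_digits(in, 0, 3);  // uint3_p: 1..3 digits
    if (len == 0) return 0;
    // Each further group needs a comma followed by exactly three digits.
    // A comma with fewer digits after it is not consumed.
    while (len < in.size() && in[len] == ',' &&
           count_digits(in, len + 1, 3) == 3) {
        len += 4;
    }
    return len;
}
```

For `"1,234,567,890"` the whole input (13 characters) matches, while for `"123,45"` only the prefix `"123"` matches because the last group is incomplete.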

bin_p, oct_p, uint_p and hex_p are parser generator objects designed to be used within expressions. Here's an example of a rule that parses a comma delimited list of numbers (we've seen this before):

    list_of_numbers = real_p >> *(',' >> real_p);

Later, we shall see how to extract the actual numbers parsed by the numeric parsers. We shall deal with this when we get to the section on specialized actions.

int_parser

The int_parser can parse signed integers of arbitrary length and size. This is almost the same as the uint_parser. The only difference is the additional task of parsing the '+' or '-' sign preceding the number. The class interface is the same as that of the uint_parser.

A predefined int_parser
int_p	`int_parser<int, 10, 1, -1> const`

real_parser

The real_parser can parse real numbers of arbitrary length and size, limited only by its parametric type T. The real_parser is a template class with two template parameters. Here's the real_parser template interface:

    template<
        typename T = double,
        typename RealPoliciesT = ureal_parser_policies<T> >
    struct real_parser;

The first template parameter is its numeric base type T. This defaults to double.

Parsing special numeric types

Notice that the numeric base type T can be specified by the user. This means that we can use the numeric parsers to parse user-defined numeric types such as fixed_point (fixed point reals) and bigint (unlimited precision integers).

The second template parameter is a class that groups all the policies; it defaults to ureal_parser_policies<T>. Policies control the real number parsers' behavior. The default policies provided are designed to parse C/C++ style real numbers of the form nnn.fffEeee, where nnn is the whole number part, fff is the fractional part, E is 'e' or 'E' and eee is the exponent, optionally preceded by '-' or '+'. This corresponds to the following grammar, with the exception that plain integers without the decimal point are also accepted by default.

    floatingliteral
        =   fractionalconstant >> !exponentpart
        |  +digit_p >> exponentpart
        ;

    fractionalconstant
        =  *digit_p >> '.' >> +digit_p
        |  +digit_p >> '.'
        ;

    exponentpart
        =   ('e' | 'E') >> !('+' | '-') >> +digit_p
        ;

The default policies are provided to take care of the most common case (there are many ways to represent, and hence parse, real numbers). In most cases, the default setting of the real_parser is sufficient and can be used straight out of the box. In fact, there are four real_parsers predefined for immediate use:

Predefined real_parsers
ureal_p	`real_parser<double, ureal_parser_policies<double> > const`
real_p	`real_parser<double, real_parser_policies<double> > const`
strict_ureal_p	`real_parser<double, strict_ureal_parser_policies<double> > const`
strict_real_p	`real_parser<double, strict_real_parser_policies<double> > const`

We've seen real_p before. ureal_p is its unsigned variant.

Strict Reals

Integer numbers are considered a subset of real numbers, so real_p and ureal_p recognize integer numbers (without a dot) as real numbers. strict_real_p and strict_ureal_p are the equivalent parsers that require a dot to be present for a number to be considered a successful match.

Advanced: real_parser policies

The parser policies break down real number parsing into six steps:

1	parse_sign	Parse the prefix sign
2	parse_n	Parse the integer at the left of the decimal point
3	parse_dot	Parse the decimal point
4	parse_frac_n	Parse the fraction after the decimal point
5	parse_exp	Parse the exponent prefix (e.g. 'e')
6	parse_exp_n	Parse the actual exponent

The interaction of these sub-parsing tasks is further controlled by these three policies:

1	allow_leading_dot	Allow a leading dot to be present (".1" becomes equivalent to "0.1")
2	allow_trailing_dot	Allow a trailing dot to be present ("1." becomes equivalent to "1.0")
3	expect_dot	Require a dot to be present (disallows "1" to be equivalent to "1.0")
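
The way the six steps and the three flags fit together can be sketched as a small recognizer in plain C++17. This is a simplified model written from the description above, not Spirit's implementation: the `real_flags` family and `match_real` are hypothetical names, the sign step is always enabled (as for real_p), and the library's handling of corner cases, such as an exponent without a dot under expect_dot, may differ.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical flag sets mirroring the three policy constants.
struct real_flags {
    static constexpr bool allow_leading_dot  = true;
    static constexpr bool allow_trailing_dot = true;
    static constexpr bool expect_dot         = false;
};
struct strict_real_flags : real_flags {
    static constexpr bool expect_dot = true;
};
struct no_leading_dot_flags : real_flags {
    static constexpr bool allow_leading_dot = false;
};

inline std::size_t skip_digits(std::string_view in, std::size_t pos) {
    while (pos < in.size() &&
           std::isdigit(static_cast<unsigned char>(in[pos]))) ++pos;
    return pos;
}

// Returns the number of characters matched, or 0 for no match.
template <typename Flags>
std::size_t match_real(std::string_view in) {
    std::size_t pos = 0;

    // 1. parse_sign: optional '+' or '-'
    if (pos < in.size() && (in[pos] == '+' || in[pos] == '-')) ++pos;

    // 2. parse_n: digits to the left of the decimal point
    const std::size_t n_end = skip_digits(in, pos);
    const bool got_n = n_end > pos;
    if (!got_n && !Flags::allow_leading_dot) return 0;
    pos = n_end;

    // 3. parse_dot
    if (pos < in.size() && in[pos] == '.') {
        ++pos;
        // 4. parse_frac_n: digits to the right of the decimal point
        const std::size_t frac_end = skip_digits(in, pos);
        const bool got_frac = frac_end > pos;
        if (!got_frac && (!got_n || !Flags::allow_trailing_dot)) return 0;
        pos = frac_end;
    } else if (!got_n || Flags::expect_dot) {
        return 0;  // no digits at all, or a dot was required but is missing
    }

    // 5. parse_exp: exponent prefix 'e' or 'E'
    if (pos < in.size() && (in[pos] == 'e' || in[pos] == 'E')) {
        std::size_t p = pos + 1;
        if (p < in.size() && (in[p] == '+' || in[p] == '-')) ++p;
        // 6. parse_exp_n: exponent digits are mandatory after the prefix
        const std::size_t exp_end = skip_digits(in, p);
        if (exp_end == p) return 0;
        pos = exp_end;
    }
    return pos;
}
```

In this model `match_real<real_flags>("42")` matches two characters, whereas `match_real<strict_real_flags>("42")` fails because no dot is present.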

[ From here on, required reading: The Scanner, In-depth The Parser and In-depth The Scanner ]

sign_parser and sign_p

Before we move on, a small utility parser is included here to ease the parsing of the '-' or '+' sign. While it is easy to write one:

    sign_p = (ch_p('+') | '-');

it is not possible to extract the actual sign (positive or negative) without resorting to semantic actions. The sign_p parser has a bool attribute, returned to the caller through the match object, which is set to true after parsing if the parsed sign is negative. Examples:

    bool is_negative;
    r = sign_p[assign_a(is_negative)];

or simply...

    // directly extract the result from the match result's value
    bool is_negative = sign_p.parse(scan).value();

The sign_p parser expects attached semantic actions to have a signature (see Specialized Actions for further detail) compatible with:

Signature for functions:

    void func(bool is_negative);

Signature for functors:

    struct ftor
    {
        void operator()(bool is_negative) const;
    };
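
As an illustration of what "compatible signature" means here, the following plain C++17 sketch calls a user-supplied action with a single bool, once with a free function and once with a functor. It only models the calling convention and is not Spirit code: `parse_sign`, `record_sign`, `g_last_sign_negative` and `sign_counter` are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <string_view>

// Minimal model of sign_p with an attached action: when the input starts
// with '+' or '-', the action is called with true for '-' and false for '+'.
// Returns the number of characters matched (0 or 1).
template <typename Action>
std::size_t parse_sign(std::string_view in, Action action) {
    if (in.empty() || (in[0] != '+' && in[0] != '-')) return 0;
    action(in[0] == '-');
    return 1;
}

// A plain function with the expected signature.
inline bool g_last_sign_negative = false;  // demo state, hypothetical
inline void record_sign(bool is_negative) { g_last_sign_negative = is_negative; }

// A functor with the expected signature. operator() is const, so any
// state it updates lives outside the functor object itself.
struct sign_counter {
    int* negatives;
    void operator()(bool is_negative) const {
        if (is_negative) ++*negatives;
    }
};
```

Either `parse_sign("-5", record_sign)` or `parse_sign("-5", sign_counter{&count})` works, because both callables accept one bool argument.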

ureal_parser_policies

    template <typename T>
    struct ureal_parser_policies
    {
        typedef uint_parser<T, 10, 1, -1>   uint_parser_t;
        typedef int_parser<T, 10, 1, -1>    int_parser_t;

        static const bool allow_leading_dot  = true;
        static const bool allow_trailing_dot = true;
        static const bool expect_dot         = false;

        template <typename ScannerT>
        static typename match_result<ScannerT, nil_t>::type
        parse_sign(ScannerT& scan)
        { return scan.no_match(); }

        template <typename ScannerT>
        static typename parser_result<uint_parser_t, ScannerT>::type
        parse_n(ScannerT& scan)
        { return uint_parser_t().parse(scan); }

        template <typename ScannerT>
        static typename parser_result<chlit<>, ScannerT>::type
        parse_dot(ScannerT& scan)
        { return ch_p('.').parse(scan); }

        template <typename ScannerT>
        static typename parser_result<uint_parser_t, ScannerT>::type
        parse_frac_n(ScannerT& scan)
        { return uint_parser_t().parse(scan); }

        template <typename ScannerT>
        static typename parser_result<chlit<>, ScannerT>::type
        parse_exp(ScannerT& scan)
        { return as_lower_d['e'].parse(scan); }

        template <typename ScannerT>
        static typename parser_result<int_parser_t, ScannerT>::type
        parse_exp_n(ScannerT& scan)
        { return int_parser_t().parse(scan); }
    };

The default ureal_parser_policies uses the lower level integer numeric parsers to do its job.

real_parser_policies

    template <typename T>
    struct real_parser_policies : public ureal_parser_policies<T>
    {
        template <typename ScannerT>
        static typename parser_result<sign_parser, ScannerT>::type
        parse_sign(ScannerT& scan)
        { return sign_p.parse(scan); }
    };

Notice how real_parser_policies replaces the parse_sign of ureal_parser_policies, from which it is derived. The default real_parser_policies simply uses sign_p instead of scan.no_match() in the parse_sign step.

strict_ureal_parser_policies and strict_real_parser_policies

    template <typename T>
    struct strict_ureal_parser_policies : public ureal_parser_policies<T>
    {
        static const bool expect_dot = true;
    };

    template <typename T>
    struct strict_real_parser_policies : public real_parser_policies<T>
    {
        static const bool expect_dot = true;
    };

Again, these policies replace only the members they want to differ from their base classes.
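
The mechanism at work is ordinary name hiding on static members. The following self-contained C++17 sketch shows it with trimmed-down stand-ins: `base_policies`, `strict_policies` and `accepts_plain_integer` are hypothetical names used only for illustration, not part of Spirit.

```cpp
// Hypothetical, trimmed-down policy classes carrying only two flags.
struct base_policies {
    static constexpr bool allow_trailing_dot = true;
    static constexpr bool expect_dot         = false;
};

// Redeclaring a member hides the base-class value for any code that names
// strict_policies; everything else is inherited unchanged.
struct strict_policies : base_policies {
    static constexpr bool expect_dot = true;
};

// A consumer written against the policy interface, in the same spirit as
// real_parser being parameterized on its policies class.
template <typename Policies>
constexpr bool accepts_plain_integer() {
    return !Policies::expect_dot;
}
```

Here `accepts_plain_integer<base_policies>()` is true and `accepts_plain_integer<strict_policies>()` is false, while `strict_policies::allow_trailing_dot` still comes from the base class.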

Specialized real parser policies can reuse some of the defaults while replacing a few. For example, the following is a real number parser policy that parses thousands separated numbers with at most two decimal places and no exponent.

The full source code can be viewed here.

    template <typename T>
    struct ts_real_parser_policies : public ureal_parser_policies<T>
    {
        //  These policies can be used to parse thousand separated
        //  numbers with at most 2 decimal digits after the decimal
        //  point. e.g. 123,456,789.01

        typedef uint_parser<int, 10, 1, 2>  uint2_t;
        typedef uint_parser<T, 10, 1, -1>   uint_parser_t;
        typedef int_parser<int, 10, 1, -1>  int_parser_t;

        //  2 decimal places Max
        template <typename ScannerT>
        static typename parser_result<uint2_t, ScannerT>::type
        parse_frac_n(ScannerT& scan)
        { return uint2_t().parse(scan); }

        //  No exponent
        template <typename ScannerT>
        static typename parser_result<chlit<>, ScannerT>::type
        parse_exp(ScannerT& scan)
        { return scan.no_match(); }

        //  No exponent
        template <typename ScannerT>
        static typename parser_result<int_parser_t, ScannerT>::type
        parse_exp_n(ScannerT& scan)
        { return scan.no_match(); }

        //  Thousands separated numbers
        template <typename ScannerT>
        static typename parser_result<uint_parser_t, ScannerT>::type
        parse_n(ScannerT& scan)
        {
            typedef typename parser_result<uint_parser_t, ScannerT>::type RT;
            static uint_parser<unsigned, 10, 1, 3> uint3_p;
            static uint_parser<unsigned, 10, 3, 3> uint3_3_p;

            if (RT hit = uint3_p.parse(scan))
            {
                T n;
                typedef typename ScannerT::iterator_t iterator_t;
                iterator_t save = scan.first;
                while (match<> next = (',' >> uint3_3_p[assign_a(n)]).parse(scan))
                {
                    hit.value() *= 1000;
                    hit.value() += n;
                    scan.concat_match(hit, next);
                    save = scan.first;
                }
                scan.first = save;
                return hit;

                // Note: On erroneous input such as "123,45", the result should
                // be a partial match "123". 'save' is used to make sure that
                // the scanner position is placed at the last *valid* parse
                // position.
            }
            return scan.no_match();
        }
    };
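
To see what a parser built on these policies is meant to compute, here is a plain C++17 model of the same rules: a thousands separated whole part, an optional dot with at most two fraction digits, and no exponent. It is a sketch under simplifying assumptions, not the Spirit code above: `read_digits`, `ts_real_match` and `parse_ts_real` are hypothetical names, the result type is fixed to double, and input with a leading dot is not modeled.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <optional>
#include <string_view>

struct ts_real_match {
    double value;        // the parsed number
    std::size_t length;  // how many characters were consumed
};

// Reads up to max_digits decimal digits at pos, stores their value in out
// and returns how many digits were read.
inline std::size_t read_digits(std::string_view in, std::size_t pos,
                               std::size_t max_digits, double& out) {
    std::size_t n = 0;
    out = 0;
    while (n < max_digits && pos + n < in.size() &&
           std::isdigit(static_cast<unsigned char>(in[pos + n]))) {
        out = out * 10 + (in[pos + n] - '0');
        ++n;
    }
    return n;
}

inline std::optional<ts_real_match> parse_ts_real(std::string_view in) {
    double value = 0, group = 0;

    // parse_n: 1..3 digits, then any number of ",ddd" groups.
    std::size_t pos = read_digits(in, 0, 3, value);
    if (pos == 0) return std::nullopt;
    while (pos < in.size() && in[pos] == ',' &&
           read_digits(in, pos + 1, 3, group) == 3) {
        value = value * 1000 + group;  // same accumulation as in parse_n above
        pos += 4;
    }

    // parse_dot and parse_frac_n: at most two fraction digits.
    if (pos < in.size() && in[pos] == '.') {
        double frac = 0;
        const std::size_t digits = read_digits(in, pos + 1, 2, frac);
        value += frac * std::pow(10.0, -static_cast<double>(digits));
        pos += 1 + digits;  // a trailing dot without digits is still consumed
    }

    // parse_exp and parse_exp_n never match, so an exponent is left unread.
    return ts_real_match{value, pos};
}
```

For `"123,456,789.01"` the model consumes all 14 characters, and for the erroneous `"123,45"` it stops after `"123"`, which is the partial-match behavior the comment in parse_n describes.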

Copyright © 1998-2002 Joel de Guzman

Use, modification and distribution is subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)