Boost.Spirit User Manual Translation (31): Character Sets

Character Sets


The character set chset matches a set of characters over a finite range bounded by the limits of its template parameter CharT. This class is an optimization of a parser that acts on a set of single characters. The template class is parameterized by the character type CharT and works efficiently with 8-, 16-, 32- and even 64-bit characters.

    template <typename CharT = char>
    class chset;

A chset is constructed from literals (e.g. 'x'), ch_p or chlit<>, range_p or range<>, anychar_p and nothing_p (see primitives), or copy-constructed from another chset. The chset class uses a copy-on-write scheme, so instances can be passed around cheaply by value.
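To make the copy-on-write idea concrete, here is a minimal sketch. It is not Spirit's chset code: the class name cow_charset and its members are hypothetical, the storage is fixed to a 256-entry std::bitset, and the use_count() check assumes single-threaded use.

```cpp
#include <bitset>
#include <cassert>
#include <memory>

// Hypothetical illustration of copy-on-write; not Spirit's actual chset.
class cow_charset {
public:
    cow_charset() : bits_(std::make_shared<std::bitset<256>>()) {}

    // Copying is cheap: the implicitly generated copy constructor only
    // copies the shared_ptr, so both objects share one bitset.

    // Writing first detaches from any other owner, then modifies.
    void set(unsigned char ch) {
        detach();
        bits_->set(ch);
    }

    bool test(unsigned char ch) const { return bits_->test(ch); }

    // True while this object and `other` still share the same storage.
    bool shares_storage_with(const cow_charset& other) const {
        return bits_ == other.bits_;
    }

private:
    // Make a private copy before writing if the storage is shared.
    void detach() {
        assert(bits_ && "a moved-from object has no storage");
        if (bits_.use_count() > 1)
            bits_ = std::make_shared<std::bitset<256>>(*bits_);
    }

    std::shared_ptr<std::bitset<256>> bits_;
};
```

Under this scheme, passing a set by value costs one reference-count increment, and the full copy is made only by the first write to a shared instance.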

Sparse bit vectors

To accommodate 16-, 32- and 64-bit characters, the chset class statically switches between two implementations: a std::bitset when the character type is no wider than 8 bits, and otherwise a sparse bit/boolean set that uses a sorted vector of disjoint ranges (range_run). The set is built from ranges such that adjacent or overlapping ranges are coalesced.
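A static switch of this kind can be sketched with std::conditional_t (C++14). The names dense_storage, sparse_storage and charset_storage are hypothetical and only illustrate compile-time selection on the width of CharT; Spirit's real implementation is organized differently.

```cpp
#include <bitset>
#include <climits>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical storage types, named for illustration only.
struct dense_storage {
    std::bitset<(1u << CHAR_BIT)> bits;   // one bit per possible character
};

template <typename CharT>
struct sparse_storage {
    std::vector<std::pair<CharT, CharT>> runs;   // sorted, disjoint ranges
};

// Chosen at compile time: the dense bitset for character types of at
// most 8 bits, the sparse range list for anything wider.
template <typename CharT>
using charset_storage = std::conditional_t<
    (sizeof(CharT) * CHAR_BIT <= 8),
    dense_storage,
    sparse_storage<CharT>>;
```

Because the choice is made by the type system, code using an 8-bit character type carries no run-time branch and no trace of the sparse representation.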

A range_run is very space-economical when the set consists of many ranges and only a few isolated values. Searching is O(log n), where n is the number of ranges.
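The data structure described above can be sketched as follows. This is a simplified stand-in, not Spirit's range_run: the name simple_range_run and its interface are hypothetical, and the arithmetic assumes CharT is narrower than long long.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a range_run; not Spirit's code.
template <typename CharT>
class simple_range_run {
public:
    struct range { CharT first, last; };   // inclusive bounds

    // Insert [lo, hi], coalescing overlapping or adjacent ranges.
    void set(CharT lo, CharT hi) {
        assert(lo <= hi);
        using wide = long long;   // assumes CharT is narrower than this
        // First stored range that overlaps or is adjacent to [lo, hi].
        auto first = std::lower_bound(
            runs_.begin(), runs_.end(), lo,
            [](const range& r, CharT value) {
                return wide(r.last) + 1 < wide(value);
            });
        // Absorb every following range that still touches [lo, hi].
        auto last = first;
        while (last != runs_.end() && wide(last->first) <= wide(hi) + 1) {
            lo = std::min(lo, last->first);
            hi = std::max(hi, last->last);
            ++last;
        }
        first = runs_.erase(first, last);
        runs_.insert(first, range{lo, hi});
    }

    // O(log n) membership test by binary search over the ranges.
    bool test(CharT ch) const {
        auto it = std::lower_bound(
            runs_.begin(), runs_.end(), ch,
            [](const range& r, CharT value) { return r.last < value; });
        return it != runs_.end() && it->first <= ch;
    }

    std::size_t size() const { return runs_.size(); }   // number of ranges

private:
    std::vector<range> runs_;   // sorted, disjoint, non-adjacent
};
```

Storage grows with the number of ranges rather than with the size of the character type, which is why a set such as "all letters" stays small even for 32-bit characters.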

Examples:

    chset<> s1('x');
    chset<> s2(anychar_p - s1);

Optionally, a character set may also be constructed from a definition string whose syntax resembles POSIX-style regular expression character sets, except that double quotes delimit the set elements instead of square brackets and there is no special negation character ^.

    range = anychar_p >> '-' >> anychar_p;
    set = *(range_p | anychar_p);
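As a rough illustration of how such a definition string could be interpreted, the sketch below expands it into a 256-entry bitset. The helper make_set is hypothetical and is not Spirit's parsing code; it also assumes that a '-' with nothing after it is treated as a literal character, which the grammar above implies but does not state.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical helper: expands a definition string such as "a-zA-Z"
// into a 256-entry bitset. Not Spirit's actual parsing code.
inline std::bitset<256> make_set(const char* definition) {
    assert(definition != nullptr);
    std::bitset<256> result;
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(definition);
    while (*p) {
        unsigned char lo = *p++;
        if (p[0] == '-' && p[1] != '\0') {
            // "lo-hi": a range, tried before the single-character case.
            unsigned char hi = p[1];
            p += 2;
            assert(lo <= hi);
            for (unsigned c = lo; c <= hi; ++c) result.set(c);
        } else {
            // A single character, including a trailing '-'.
            result.set(lo);
        }
    }
    return result;
}
```

For example, make_set("0-9a-fA-F") marks the 22 hexadecimal digits, while make_set("a-") marks only 'a' and '-'.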

Since the set is defined using a C string, the usual C/C++ string literal syntax rules apply. Examples:

    chset<> s1("a-zA-Z");       // alphabetic characters
    chset<> s2("0-9a-fA-F");    // hexadecimal characters
    chset<> s3("actgACTG");     // DNA identifiers
    chset<> s4("\x7f\x7e");     // Hexadecimal 0x7F and 0x7E

The standard Spirit set operators apply (see operators), plus an additional character-set-specific inverse (negation ~) operator:

Character set operators

    ~a       Set inverse
    a | b    Set union
    a & b    Set intersection
    a - b    Set difference
    a ^ b    Set xor

Here operands a and b are both chsets, or one of the operands is a literal character, ch_p or chlit, range_p or range, anychar_p or nothing_p. Special optimized overloads are provided for anychar_p and nothing_p operands. A nothing_p operand is converted to an empty set, while an anychar_p operand is converted to a set containing the full range of the character type used (e.g. 0-255 for unsigned 8-bit chars).
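The semantics of these operators can be tried out on a plain std::bitset, which serves here as a hypothetical stand-in for an 8-bit chset. The alias charset and the helpers range_set and difference are illustrative names, not part of Spirit; an empty bitset plays the role of nothing_p and a fully set one the role of anychar_p.

```cpp
#include <bitset>
#include <cassert>

using charset = std::bitset<256>;   // hypothetical stand-in for chset<char>

// Build the set of all characters in the inclusive range [lo, hi].
inline charset range_set(unsigned char lo, unsigned char hi) {
    assert(lo <= hi);
    charset s;
    for (unsigned c = lo; c <= hi; ++c) s.set(c);
    return s;
}

// std::bitset already provides ~, |, & and ^. It has no binary minus,
// so set difference is written as a & ~b.
inline charset difference(const charset& a, const charset& b) {
    return a & ~b;
}
```

With lower = range_set('a', 'z') and hex = range_set('a', 'f'), both difference(lower, hex) and lower ^ hex give the letters g to z, since hex is a subset of lower.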

A special case is ~anychar_p, which yields nothing_p, whereas ~nothing_p is illegal. Inversion of anychar_p is asymmetrical: a one-way trip comparable to converting T* to void*.

Special conversions

    chset<CharT>(nothing_p)    empty set
    chset<CharT>(anychar_p)    full range of CharT (e.g. 0-255 for unsigned 8-bit chars)
    ~anychar_p                 nothing_p
    ~nothing_p                 illegal



