The C++ regex library is available through #include <regex>
and was introduced in the C++11 standard.
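The regex library provides three basic regular-expression functions:
regex_match -> full match (the pattern must cover the entire string)
regex_search -> partial match (the pattern may match anywhere in the string)
regex_replace -> match, then replace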
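regex is built on a few basic classes (all of them are in fact class templates):
basic_regex -> stores a regular expression
match_results -> stores the results of a match (a sequence of sub_match objects)
sub_match -> a single matched subsequence; converts implicitly to a string
regex_iterator -> a convenience wrapper that applies regex_search repeatedly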
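A basic_regex is constructed from a regular expression string. The default grammar is ECMAScript. Other grammars are also supported (POSIX basic, POSIX extended, awk, grep, egrep) and are selected through the second constructor argument.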
// explicit basic_regex ( const charT* str, flag_type flags = ECMAScript );
std::regex seventh ("[0-9A-Z]+", std::regex::ECMAScript);
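match_results stores the result of a match as a sequence of sub_match objects. If the match fails, it is empty. Otherwise element 0 is the entire matched text, and elements 1 onward are the capture groups, in order.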
regex_match succeeds only when the regular expression matches the entire string.
// regex_match example
#include <iostream>
#include <string>
#include <regex>

int main ()
{
  if (std::regex_match ("subject", std::regex("(sub)(.*)") ))
    std::cout << "string literal matched\n";

  const char cstr[] = "subject";
  std::string s ("subject");
  std::regex e ("(sub)(.*)");

  if (std::regex_match (s,e))
    std::cout << "string object matched\n";

  if ( std::regex_match ( s.begin(), s.end(), e ) )
    std::cout << "range matched\n";

  std::cmatch cm;    // same as std::match_results<const char*> cm;
  std::regex_match (cstr,cm,e);
  std::cout << "string literal with " << cm.size() << " matches\n";

  std::smatch sm;    // same as std::match_results<string::const_iterator> sm;
  std::regex_match (s,sm,e);
  std::cout << "string object with " << sm.size() << " matches\n";

  std::regex_match ( s.cbegin(), s.cend(), sm, e);
  std::cout << "range with " << sm.size() << " matches\n";

  // using explicit flags:
  std::regex_match ( cstr, cm, e, std::regex_constants::match_default );

  std::cout << "the matches were: ";
  for (unsigned i=0; i<cm.size(); ++i) {
    std::cout << "[" << cm[i] << "] ";
  }
  std::cout << std::endl;

  return 0;
}
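regex_search finds only the first partial match. The unmatched text before and after it is available through the prefix() and suffix() member functions of match_results, which makes it easy to search repeatedly.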
// regex_search example
#include <iostream>
#include <string>
#include <regex>

int main ()
{
  std::string s ("this subject has a submarine as a subsequence");
  std::smatch m;
  std::regex e ("\\b(sub)([^ ]*)");   // matches words beginning by "sub"

  std::cout << "Target sequence: " << s << std::endl;
  std::cout << "Regular expression: /\\b(sub)([^ ]*)/" << std::endl;
  std::cout << "The following matches and submatches were found:" << std::endl;

  while (std::regex_search (s,m,e)) {
    for (auto x:m) std::cout << x << " ";
    std::cout << std::endl;
    s = m.suffix().str();
  }

  return 0;
}
regex_iterator repeatedly applies regex_search across a string.
// regex_iterator constructor
#include <iostream>
#include <string>
#include <regex>

int main ()
{
  std::string s ("this subject has a submarine as a subsequence");
  std::regex e ("\\b(sub)([^ ]*)");   // matches words beginning by "sub"

  std::regex_iterator<std::string::iterator> rit ( s.begin(), s.end(), e );
  std::regex_iterator<std::string::iterator> rend;

  while (rit!=rend) {
    std::cout << rit->str() << std::endl;
    ++rit;
  }

  return 0;
}
regex_replace replaces the matched text.
// regex_replace example
#include <iostream>
#include <string>
#include <regex>
#include <iterator>

int main ()
{
  std::string s ("there is a subsequence in the string\n");
  std::regex e ("\\b(sub)([^ ]*)");   // matches words beginning by "sub"

  // using string/c-string (3) version:
  std::cout << std::regex_replace (s,e,"sub-$2");

  // using range/c-string (6) version:
  std::string result;
  std::regex_replace (std::back_inserter(result), s.begin(), s.end(), e, "$2");
  std::cout << result;

  // with flags:
  std::cout << std::regex_replace (s,e,"$1 and $2",std::regex_constants::format_no_copy);
  std::cout << std::endl;

  return 0;
}
Note that the replacement format string fmt can be written as follows: