std::match_results
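(Match results are stored in this object.)
result[0] is the entire match and result[1] is the text captured by the first group. If the regular expression has n capture groups, a successful match leaves match_results with a size of n + 1.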
This is a specialized allocator-aware container. It can only be default constructed, obtained from std::regex_iterator, or modified by std::regex_search or std::regex_match. Because std::match_results holds std::sub_match objects, each of which is a pair of iterators into the original character sequence that was matched, it is undefined behavior to examine std::match_results if the original character sequence was destroyed or iterators to it were invalidated for other reasons.
Type | Definition |
---|---|
std::cmatch | std::match_results<const char*> |
std::wcmatch | std::match_results<const wchar_t*> |
std::smatch | std::match_results<std::string::const_iterator> |
std::wsmatch | std::match_results<std::wstring::const_iterator> |
std::pmr::cmatch (C++17) | std::pmr::match_results<const char*> |
std::pmr::wcmatch (C++17) | std::pmr::match_results<const wchar_t*> |
std::pmr::smatch (C++17) | std::pmr::match_results<std::string::const_iterator> |
std::pmr::wsmatch (C++17) | std::pmr::match_results<std::wstring::const_iterator> |
std::sub_match
Used to inspect the individual entries of a match_results.
The class template std::sub_match is used by the regular expression engine to denote sequences of characters matched by marked sub-expressions.
std::regex_match
Returns true if the regular expression matches the entire target character sequence, false otherwise.
#include <iostream>
#include <regex>
#include <string>

int main()
{
    // Simple regular expression matching
    const std::string fnames[] = {"foo.txt", "bar.txt", "baz.dat", "zoidberg"};
    const std::regex txt_regex("[a-z]+\\.txt");
    for (const auto &fname : fnames)
        std::cout << fname << ": " << std::regex_match(fname, txt_regex) << '\n';
    /*
    foo.txt: 1
    bar.txt: 1
    baz.dat: 0
    zoidberg: 0
    */

    // Extraction of a sub-match
    const std::regex base_regex("([a-z]+)\\.txt");
    std::smatch base_match;
    for (const auto &fname : fnames){
        if (std::regex_match(fname, base_match, base_regex)){
            // The first sub_match is the whole string; the next
            // sub_match is the first parenthesized expression.
            if (base_match.size() == 2){
                std::ssub_match base_sub_match = base_match[1];
                std::string base = base_sub_match.str();
                std::cout << fname << " has a base of " << base << '\n';
            }
        }
    }
    /*
    foo.txt has a base of foo
    bar.txt has a base of bar
    */

    // Extraction of several sub-matches
    const std::regex pieces_regex("([a-z]+)\\.([a-z]+)");
    std::smatch pieces_match;
    for (const auto &fname : fnames){
        if (std::regex_match(fname, pieces_match, pieces_regex)){
            std::cout << fname << '\n';
            for (size_t i = 0; i < pieces_match.size(); ++i){
                std::ssub_match sub_match = pieces_match[i];
                std::string piece = sub_match.str();
                std::cout << " submatch " << i << ": " << piece << '\n';
            }
        }
    }
}
/*
foo.txt
 submatch 0: foo.txt
 submatch 1: foo
 submatch 2: txt
bar.txt
 submatch 0: bar.txt
 submatch 1: bar
 submatch 2: txt
baz.dat
 submatch 0: baz.dat
 submatch 1: baz
 submatch 2: dat
*/
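std::regex_search
std::regex_search searches for the regular expression but does not require the whole character sequence to match. It performs a single search and stops at the first match; it does not repeat the search to find further matches.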
Determines if there is a match between the regular expression e and some subsequence in the target character sequence.
1- Analyzes generic range [first, last). Match results are returned in m.
2- Analyzes a null-terminated string pointed to by str. Match results are returned in m.
3- Analyzes a string s. Match results are returned in m.
4-6- Equivalent to (1-3), just omits the match results.
7- Deleted overload: overload (3) is prohibited from accepting temporary strings, because otherwise the function would populate match_results m with string iterators that become invalid immediately.
regex_search will successfully match any subsequence of the given sequence, whereas std::regex_match will only return true if the regular expression matches the entire sequence.
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string lines[] = {"Roses are #ff0000",
                           "violets are #0000ff",
                           "all of my base are belong to you"};
    std::regex color_regex("#([a-f0-9]{2})"
                           "([a-f0-9]{2})"
                           "([a-f0-9]{2})");

    // simple match
    for (const auto &line : lines) {
        std::cout << line << ": " << std::boolalpha
                  << std::regex_search(line, color_regex) << '\n';
    }
    std::cout << '\n';

    // show contents of marked subexpressions within each match
    std::smatch color_match;
    for (const auto& line : lines) {
        if(std::regex_search(line, color_match, color_regex)) {
            std::cout << "matches for '" << line << "'\n";
            std::cout << "Prefix: '" << color_match.prefix() << "'\n";
            for (size_t i = 0; i < color_match.size(); ++i)
                std::cout << i << ": " << color_match[i] << '\n';
            std::cout << "Suffix: '" << color_match.suffix() << "\'\n\n";
        }
    }

    // repeated search (see also std::regex_iterator)
    std::string log(R"(
        Speed: 366
        Mass: 35
        Speed: 378
        Mass: 32
        Speed: 400
        Mass: 30)");
    // \s+ accepts either a tab or spaces after the colon, so the loop
    // works regardless of how the log text above is separated.
    std::regex r(R"(Speed:\s+\d*)");
    std::smatch sm;
    while(regex_search(log, sm, r))
    {
        std::cout << sm.str() << '\n';
        // Continue with the text after the match. The suffix is converted
        // to a temporary string before log is overwritten.
        log = sm.suffix();
    }

    // C-style string demo
    std::cmatch cm;
    if(std::regex_search("this is a test", cm, std::regex("test")))
        std::cout << "\nFound " << cm[0] << " at position " << cm.prefix().length() << '\n';
}
std::regex_replace
1) Copies characters in the range [first, last) to out, replacing any sequences that match re with characters formatted by fmt. In other words:
- Constructs a std::regex_iterator object i as if by std::regex_iterator<BidirIt, CharT, traits> i(first, last, re, flags), and uses it to step through every match of re within the sequence [first, last).
- For each such match m, copies the non-matched subsequence (m.prefix()) into out as if by out = std::copy(m.prefix().first, m.prefix().second, out) and then replaces the matched subsequence with the formatted replacement string as if by calling out = m.format(out, fmt, flags).
- When no more matches are found, copies the remaining non-matched characters to out as if by out = std::copy(last_m.suffix().first, last_m.suffix().second, out) where last_m is a copy of the last match found.
- If there are no matches, copies the entire sequence into out as-is, by out = std::copy(first, last, out).
- If flags contains std::regex_constants::format_no_copy, the non-matched subsequences are not copied into out.
- If flags contains std::regex_constants::format_first_only, only the first match is replaced.
2) Same as 1), but the formatted replacement is performed as if by calling out = m.format(out, fmt, fmt + std::char_traits<CharT>::length(fmt), flags).
3-4) Constructs an empty string result of type std::basic_string<CharT, ST, SA> and calls std::regex_replace(std::back_inserter(result), s.begin(), s.end(), re, fmt, flags).
5-6) Constructs an empty string result of type std::basic_string<CharT> and calls std::regex_replace(std::back_inserter(result), s, s + std::char_traits<CharT>::length(s), re, fmt, flags).
Return value
1-2) Returns a copy of the output iterator out after all the insertions.
3-6) Returns the string result which contains the output.
#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main()
{
    std::string text = "Quick brown fox";
    std::regex vowel_re("a|e|i|o|u");

    // write the results to an output iterator
    std::regex_replace(std::ostreambuf_iterator<char>(std::cout),
                       text.begin(), text.end(), vowel_re, "*");

    // construct a string holding the results
    std::cout << '\n' << std::regex_replace(text, vowel_re, "[$&]") << '\n';
}
std::regex_iterator
It is the programmer’s responsibility to ensure that the std::basic_regex object passed to the iterator’s constructor outlives the iterator. Because the iterator stores a pointer to the regex, incrementing the iterator after the regex was destroyed accesses a dangling pointer.
If the part of the regular expression that matched is just an assertion (^, $, \b, \B), the match stored in the iterator is a zero-length match, that is, match[0].first == match[0].second.
#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main()
{
    const std::string s = "Quick brown fox.";
    std::regex words_regex("[^\\s]+");
    auto words_begin =
        std::sregex_iterator(s.begin(), s.end(), words_regex);
    auto words_end = std::sregex_iterator();

    std::cout << "Found "
              << std::distance(words_begin, words_end)
              << " words:\n";

    for (std::sregex_iterator i = words_begin; i != words_end; ++i)
    {
        std::smatch match = *i;
        std::string match_str = match.str();
        std::cout << match_str << '\n';
    }
}