Regular Expression Study Guide (17): Atomic Grouping

Atomic Grouping

An atomic group is a group that, when the regex engine exits from it, automatically throws away all backtracking positions remembered by any tokens inside the group. Atomic groups are non-capturing. The syntax is (?>group). Lookaround groups are also atomic. Atomic grouping is supported by most modern regular expression flavors, including the JGsoft flavor, Java, PCRE, .NET, Perl and Ruby. The first three of these also support possessive quantifiers, which are essentially a notational convenience for atomic grouping.

An example will make the behavior of atomic groups clear. The regular expression a(bc|b)c (capturing group) matches abcc and abc. The regex a(?>bc|b)c (atomic group) matches abcc but not abc.

When applied to abc, both regexes will match a to a, bc to bc, and then c will fail to match at the end of the string. Here their paths diverge. The regex with the capturing group has remembered a backtracking position for the alternation. The group gives up its match, b then matches b, and c matches c. Match found!

The regex with the atomic group, however, exited from an atomic group after bc was matched. At that point, all backtracking positions for tokens inside the group are discarded. In this example, the alternation's option to try b at the second position in the string is discarded. As a result, when c fails, the regex engine has no alternatives left to try.

Of course, the above example isn't very useful. But it does illustrate very clearly how atomic grouping eliminates certain matches. Or more importantly, it eliminates certain match attempts.

Regex Optimization Using Atomic Grouping

Consider the regex \b(integer|insert|in)\b and the subject integers. Obviously, because of the word boundaries, these don't match. What's not so obvious is that the regex engine will spend quite some effort figuring this out.

\b matches at the start of the string, and integer matches integer. The regex engine makes note that there are two more alternatives in the group, and continues with \b. This fails to match between the r and the s. So the engine backtracks to try the second alternative inside the group. The second alternative, insert, matches in, but then its s fails to match the t in the subject. So the engine backtracks once more to the third alternative. in matches in. \b fails between the n and the t this time. The regex engine has no more remembered backtracking positions, so it declares failure.

This is quite a lot of work to figure out integers isn't in our list of words. We can optimize this by telling the regular expression engine that if it can't match \b after it matched integer, then it shouldn't bother trying any of the other words. The word we've encountered in the subject string is a longer word, and it isn't in our list.

We can do this by turning the capturing group into an atomic group: \b(?>integer|insert|in)\b. Now, when integer matches, the engine exits from an atomic group, and throws away the backtracking positions it stored for the alternation. When \b fails, the engine gives up immediately. This savings can be significant when scanning a large file for a long list of keywords. This savings will be vital when your alternatives contain repeated tokens (not to mention repeated groups) that lead to catastrophic backtracking.

Don't be too quick to make all your groups atomic. As we saw in the first example above, atomic grouping can exclude valid matches too. Compare how \b(?>integer|insert|in)\b and \b(?>in|integer|insert)\b behave when applied to insert. The former regex matches, while the latter fails. If the groups weren't atomic, both regexes would match. Remember that alternation tries its alternatives from left to right. If the second regex matches in, it won't try the two other alternatives due to the atomic group.
