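Excellent (and difficult) question!
First, with the PCRE regex engine, (?R) behaves like an atomic group (unlike Perl?). Once it matches (or fails to match), whatever happened inside the recursive call is final, and all backtracking breadcrumbs saved within the recursive call are discarded. However, the regex engine does remember what the whole (?R) expression matched, and it can give that text back and try the other alternative to achieve an overall match.
To describe what is happening, let's change your example slightly so that it is easier to talk about and keep track of what is being matched at each step. Instead of aaaa as the subject text, let's use abcd, and let's change the regex from '#a(?:(?R)|a?)a#' to '#.(?:(?R)|.?).#'. The regex engine's matching behavior is the same.
Matching regex: /.(?:(?R)|.?)./ to: "abcd"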
answer = r'''
Step Depth Regex          Subject  Comment
1    0     .(?:(?R)|.?).  abcd     Dot matches "a". Advance pointers.
           ^              ^
2    0     .(?:(?R)|.?).  abcd     Try 1st alt. Recursive call (to depth 1).
               ^           ^
3    1     .(?:(?R)|.?).  abcd     Dot matches "b". Advance pointers.
           ^               ^
4    1     .(?:(?R)|.?).  abcd     Try 1st alt. Recursive call (to depth 2).
               ^            ^
5    2     .(?:(?R)|.?).  abcd     Dot matches "c". Advance pointers.
           ^                ^
6    2     .(?:(?R)|.?).  abcd     Try 1st alt. Recursive call (to depth 3).
               ^             ^
7    3     .(?:(?R)|.?).  abcd     Dot matches "d". Advance pointers.
           ^                 ^
8    3     .(?:(?R)|.?).  abcd     Try 1st alt. Recursive call (to depth 4).
               ^              ^
9    4     .(?:(?R)|.?).  abcd     Dot fails to match end of string.
           ^                  ^    DEPTH 4 (?R) FAILS. Return to step 8 depth 3.
                                   Give back text consumed by depth 4 (?R) = ""
10   3     .(?:(?R)|.?).  abcd     Try 2nd alt. Optional dot matches EOS.
                    ^         ^    Advance regex pointer.
11   3     .(?:(?R)|.?).  abcd     Required dot fails to match end of string.
                       ^      ^    DEPTH 3 (?R) FAILS. Return to step 6 depth 2
                                   Give back text consumed by depth 3 (?R) = "d"
12   2     .(?:(?R)|.?).  abcd     Try 2nd alt. Optional dot matches "d".
                    ^        ^     Advance pointers.
13   2     .(?:(?R)|.?).  abcd     Required dot fails to match end of string.
                       ^      ^    Backtrack to step 12 depth 2
14   2     .(?:(?R)|.?).  abcd     Match zero "d" (give it back).
                    ^        ^     Advance regex pointer.
15   2     .(?:(?R)|.?).  abcd     Dot matches "d". Advance pointers.
                       ^     ^     DEPTH 2 (?R) SUCCEEDS.
                                   Return to step 4 depth 1
16   1     .(?:(?R)|.?).  abcd     Required dot fails to match end of string.
                       ^      ^    Backtrack to try other alternative. Give back
                                   text consumed by depth 2 (?R) = "cd"
17   1     .(?:(?R)|.?).  abcd     Optional dot matches "c". Advance pointers.
                    ^       ^
18   1     .(?:(?R)|.?).  abcd     Required dot matches "d". Advance pointers.
                       ^     ^     DEPTH 1 (?R) SUCCEEDS.
                                   Return to step 2 depth 0
19   0     .(?:(?R)|.?).  abcd     Required dot fails to match end of string.
                       ^      ^    Backtrack to try other alternative. Give back
                                   text consumed by depth 1 (?R) = "bcd"
20   0     .(?:(?R)|.?).  abcd     Try 2nd alt. Optional dot matches "b".
                    ^      ^       Advance pointers.
21   0     .(?:(?R)|.?).  abcd     Dot matches "c". Advance pointers.
                       ^    ^      SUCCESSFUL MATCH of "abc"
'''
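There is nothing wrong with the regex engine. The correct match is abc (or aaa for the original question). A similar (albeit much longer) sequence of steps can be worked out for the other, longer subject string in the question.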
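
The trace above can be checked with a small hand-written model. This is only a sketch, not PCRE itself: Python's built-in re module has no (?R), so the pattern T(?:(?R)|T?)T is hard-coded as a generator, and the names fits, ends and first_match are made up for this illustration. With atomic=True only the first result of the recursive call is kept, which is the behavior described in this answer. atomic=False is an assumption added for contrast: it models an engine that may backtrack into the recursion, as Perl does and as PCRE2 documents from release 10.30 onward.

```python
from itertools import islice


def fits(subject, pos, token=None):
    """True if one pattern token matches at pos.

    token=None models '.', here simplified to "any character"
    (a real dot excludes newline by default).
    """
    return pos < len(subject) and (token is None or subject[pos] == token)


def ends(subject, pos, token=None, atomic=True):
    """Yield, in priority order, the end offsets of T(?:(?R)|T?)T started at pos."""
    if not fits(subject, pos, token):          # leading token
        return
    p = pos + 1
    inner = ends(subject, p, token, atomic)    # 1st alternative: (?R)
    if atomic:
        inner = islice(inner, 1)               # keep only the first inner match
    for end in inner:
        if fits(subject, end, token):          # trailing token after (?R)
            yield end + 1
    # 2nd alternative: greedy optional token, then the trailing token
    if fits(subject, p, token) and fits(subject, p + 1, token):
        yield p + 2                            # optional token consumed one character
    if fits(subject, p, token):
        yield p + 1                            # optional token matched empty


def first_match(subject, token=None, atomic=True):
    """Return the first overall match, scanning start positions left to right."""
    for start in range(len(subject)):
        for end in ends(subject, start, token, atomic):
            return subject[start:end]
    return None


print(first_match("abcd"))                # abc
print(first_match("aaaa", token="a"))     # aaa
print(first_match("abcd", atomic=False))  # abcd
```

The first two calls reproduce the abc and aaa results discussed above. The third shows that, in this model, the full four-character match only appears once the recursive call is allowed to hand back an alternative result.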