Python regular expression multiline matching: matching across multiple lines with re.DOTALL

姗胖胖Joyce

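I am trying to parse a multiline string.

Suppose it is:

text = '''
Section1
stuff belonging to section1
stuff belonging to section1
stuff belonging to section1
Section2
stuff belonging to section2
stuff belonging to section2
stuff belonging to section2
'''

I want to use the finditer method of the re module to get dictionaries like these:

{'section': 'Section1', 'section_data': 'stuff belonging to section1\nstuff belonging to section1\nstuff belonging to section1\n'}

{'section': 'Section2', 'section_data': 'stuff belonging to section2\nstuff belonging to section2\nstuff belonging to section2\n'}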

I tried the following:

import re

re_sections = re.compile(r"(?P<section>Section\d)\s*(?P<section_data>.+)", re.DOTALL)

sections_it = re_sections.finditer(text)

for m in sections_it:
    print(m.groupdict())

But this results in:

{'section': 'Section1', 'section_data': 'stuff belonging to section1\nstuff belonging to section1\nstuff belonging to section1\nSection2\nstuff belonging to section2\nstuff belonging to section2\nstuff belonging to section2\n'}

So section_data also matches Section2.

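I also tried telling the second group to match everything except the first group, but that produces no output at all.

re_sections = re.compile(r"(?P<section>Section\d)\s+(?P<section_data>^(?P=section))", re.DOTALL)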
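A likely reason this attempt prints nothing is that, outside a character class, ^ is a start-of-string anchor rather than a negation, so it cannot match after a section header has already been consumed. The sketch below uses a shortened sample string (an assumption made for brevity) to show the empty result, along with one possible way to express "everything that does not start a new header" using a negative lookahead. The names anchored and negated are illustrative only.

```python
import re

# Shortened sample input (an assumption for brevity, not the full string above).
text = """
Section1
stuff belonging to section1
Section2
stuff belonging to section2
"""

# Outside a character class, ^ anchors to the start of the string; it does not negate.
anchored = re.compile(
    r"(?P<section>Section\d)\s+(?P<section_data>^(?P=section))", re.DOTALL
)
no_matches = list(anchored.finditer(text))
print(no_matches)

# A negative lookahead checked before every character consumes text
# only while no new "Section<digit>" header starts at that position.
negated = re.compile(
    r"(?P<section>Section\d)\s+(?P<section_data>(?:(?!Section\d).)+)", re.DOTALL
)
sections = [m.groupdict() for m in negated.finditer(text)]
for s in sections:
    print(s)
```

The lookahead variant still names the header pattern, but it does not need to describe what the section body looks like.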

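I know I could use the following, but I am looking for a version that does not require me to spell out what the second group looks like.

re_sections = re.compile(r"(?P<section>Section\d)\s+(?P<section_data>[a-z12\s]+)", re.DOTALL)

Thank you very much!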

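Solution

Use a lookahead to match everything up to the next section header or the end of the string:

re_sections = re.compile(r"(?P<section>Section\d)\s*(?P<section_data>.+?)(?=(?:Section\d|$))", re.DOTALL)

Note that this also needs the non-greedy .+? quantifier; otherwise it would still match all the way to the end.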
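To see why the non-greedy quantifier matters, here is a minimal sketch comparing the greedy and lazy forms with the same lookahead. The short sample string and the variable names are assumptions chosen for illustration.

```python
import re

# Small illustrative input (an assumption, not the string from the question).
text = "Section1\nalpha\nbeta\nSection2\ngamma\n"

# Same lookahead as in the answer: stop before the next header or at the end.
lookahead = r"(?=(?:Section\d|$))"

greedy = re.compile(
    r"(?P<section>Section\d)\s*(?P<section_data>.+)" + lookahead, re.DOTALL
)
lazy = re.compile(
    r"(?P<section>Section\d)\s*(?P<section_data>.+?)" + lookahead, re.DOTALL
)

# Greedy .+ runs to the end of the string first, where $ already satisfies
# the lookahead, so the first match swallows Section2 as well.
greedy_data = [m.group("section_data") for m in greedy.finditer(text)]

# Lazy .+? grows one character at a time and stops at the first position
# where the lookahead succeeds.
lazy_data = [m.group("section_data") for m in lazy.finditer(text)]

print(greedy_data)
print(lazy_data)
```

With the greedy form there is a single match covering both sections; with the lazy form each section gets its own match.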

Demo:

>>> re_sections = re.compile(r"(?P<section>Section\d)\s*(?P<section_data>.+?)(?=(?:Section\d|$))", re.DOTALL)
>>> for m in re_sections.finditer(text): print(m.groupdict())
...
{'section': 'Section1', 'section_data': 'stuff belonging to section1\nstuff belonging to section1\nstuff belonging to section1\n'}
{'section': 'Section2', 'section_data': 'stuff belonging to section2\nstuff belonging to section2\nstuff belonging to section2'}
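Note that the last section_data in the demo lacks its trailing newline: without re.MULTILINE, $ also matches just before a newline at the very end of the string, so the lazy match stops one character early. If the trailing newline should be kept, one option is to use \Z, which matches only at the absolute end of the string. The following is a sketch on a shortened sample string (an assumption), not part of the original answer.

```python
import re

# Shortened sample input (an assumption for brevity).
text = "Section1\nalpha\nSection2\ngamma\n"

# $ can match just before the final newline, so the lazy group stops there.
with_dollar = re.compile(
    r"(?P<section>Section\d)\s*(?P<section_data>.+?)(?=Section\d|$)", re.DOTALL
)

# \Z matches only at the very end of the string, so the final newline is kept.
with_end = re.compile(
    r"(?P<section>Section\d)\s*(?P<section_data>.+?)(?=Section\d|\Z)", re.DOTALL
)

dollar_last = [m.group("section_data") for m in with_dollar.finditer(text)][-1]
end_last = [m.group("section_data") for m in with_end.finditer(text)][-1]

print(repr(dollar_last))
print(repr(end_last))
```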
