python—正则表达式

最新推荐文章于 2024-02-24 22:07:12 发布

Py-Frank

最新推荐文章于 2024-02-24 22:07:12 发布

阅读量200

点赞数

分类专栏： python基础知识文章标签： python 正则表达式

本文链接：https://blog.csdn.net/FrankGavin/article/details/119120968

版权

本文详细介绍了Python中的正则表达式，包括常用的正则语法、贪婪与非贪婪模式的差异，以及re模块的match、search、findall、finditer、split和sub等关键函数的使用。此外，还提到了正则表达式的修饰符选项。

摘要由CSDN通过智能技术生成

python—正则表达式

官方中文文档：https://docs.python.org/zh-cn/3/library/re.html#search-vs-match

正则语法表

以下为常用语法，部分不常用语法（先行断言(lookahead)和后行断言(lookbehind) ）见补充：https://www.runoob.com/w3cnote/reg-lookahead-lookbehind.html

范例所用文本：

"""<link rel="dns-prefetch" href="//wl.jd.com" />
<title>eastman - 商品搜索 - 京东</title>
<meta name="Keywords" content="eastman，京东eastman" />
<meta name="description" content="在京东找到了eastman1313件eastman的类似商品，其中包含了eastman价格、eastman评论、eastman导购、eastman图片等相关信息" />"""

在这里插入图片描述

贪婪模式与非贪婪模式

python默认贪婪模式。
贪婪模式：在匹配结果有二义时，尽量匹配最大长度字符串。
'’, ‘+’，和 ‘?’ 修饰符都是贪婪的，它们在字符串进行尽可能多的匹配。有时候并不需要这种行为。如果正则式 <.> 希望找到 ’ b '，它将会匹配整个字符串，而不仅是 ‘’。
在修饰符之后添加 ? 将使样式以非贪婪`方式进行匹配，尽量少的字符将会被匹配。使用正则式 <.*?> 将会仅仅匹配 ‘’。
python中常用正则函数

re.match() 与 re.search()

语法：re.match(pattern, string, flags=0) 
	尝试从字符串的起始位置匹配一个模式，如果不是起始位置匹配成功的话，match()就返回none。
即便是 MULTILINE 多行模式， re.match() 也只匹配字符串的开始位置，而不匹配每行开始。

import re

text = '''<link rel="dns-prefetch" href="//wl.jd.com" />
<title>eastman - 商品搜索 - 京东</title>
'''
mo = re.match('<\w\w\w',text)
print(mo)#<re.Match object; span=(0, 4), match='<lin'>
print(mo.group())#<lin
print(mo.start())#0
print(mo.span())#(0, 4)
mo1 = re.