Python built-in library: re

伊洛克

Category: Python. Tags: regular expressions, python

Article link: https://blog.csdn.net/caohang1981/article/details/112730060

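There are two ways to use re:

The first is to compile the pattern with re.compile() to get a re.Pattern object, then call that object's matching methods. Precompiling can improve efficiency when the same pattern is matched in a loop or shared across functions.
The second is to call the module-level functions of re directly. Internally they also compile the pattern first and then call the corresponding matching method, so this style is simpler to use.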
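
As a rough illustration of the two styles, the sketch below runs the same pattern both ways and checks that the results agree. The sample string and variable names are made up for this example.

```python
import re

text = "hello help hi"  # sample data, invented for this sketch

# Style 1: compile once, then reuse the Pattern object in a loop.
pattern = re.compile(r"h.")
compiled_result = [pattern.findall(word) for word in text.split()]

# Style 2: module-level function; re compiles the pattern internally
# (recently used patterns are cached by the module).
direct_result = [re.findall(r"h.", word) for word in text.split()]

print(compiled_result)                   # [['he'], ['he'], ['hi']]
print(compiled_result == direct_result)  # True
```

Both styles return the same matches; the compiled form mainly saves the repeated pattern lookup and gives the pattern a reusable name.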

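Using a compiled pattern

>>> import re
>>> p = re.compile(r'h.')   # compile to get a <class 're.Pattern'> object
>>> m = p.match('hello')    # match() returns a re.Match object (None if nothing matches)
>>> m
<re.Match object; span=(0, 2), match='he'>
>>> m.group()   # the matched string
'he'
>>> m.start()   # start position of the match
0
>>> m.end()     # end position of the match
2
>>> m.span()    # tuple of the (start, end) positions of the match
(0, 2)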

Calling the re module-level functions directly

>>> re.match(r'h.', 'hello')
<re.Match object; span=(0, 2), match='he'>
>>> re.search(r'h.', 'hello')
<re.Match object; span=(0, 2), match='he'>
>>> re.findall(r'h.', 'hello')
['he']
>>> re.finditer(r'h.', 'hello')
<callable_iterator object at 0x10e735c18>

Common re.Pattern methods

Method/Attribute	Purpose
match()	Determine if the RE matches at the beginning of the string.
search()	Scan through a string, looking for any location where this RE matches.
findall()	Find all substrings where the RE matches, and return them as a list.
finditer()	Find all substrings where the RE matches, and return them as an iterator.
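
The difference between these four methods is easiest to see on one string. The following is a small sketch; the pattern and sample text are chosen only for illustration.

```python
import re

p = re.compile(r"l.")
s = "hello world"

at_start = p.match(s)        # None: the string does not start with "l"
first = p.search(s)          # first match anywhere in the string
all_strings = p.findall(s)   # every match, as plain strings
spans = [m.span() for m in p.finditer(s)]  # every match, as Match objects

print(at_start)       # None
print(first.group())  # ll
print(all_strings)    # ['ll', 'ld']
print(spans)          # [(2, 4), (9, 11)]
```

finditer() is the one to reach for when the positions are needed as well as the matched text.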

Common re.Match methods

Method/Attribute	Purpose
group()	Return the string matched by the RE
start()	Return the starting position of the match
end()	Return the ending position of the match
span()	Return a tuple containing the (start, end) positions of the match
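
Because match() and search() return None when nothing matches, code that calls these Match methods usually checks for None first. A minimal sketch, where describe() is a hypothetical helper written only for this example:

```python
import re

def describe(pattern, text):
    """Return (matched_text, start, end), or None when nothing matches.

    Hypothetical helper for illustration; not part of the re module.
    """
    m = re.search(pattern, text)
    if m is None:
        return None
    return m.group(), m.start(), m.end()

found = describe(r"h.", "say hello")
missing = describe(r"h.", "world")

print(found)    # ('he', 4, 6)
print(missing)  # None
```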

Regular expression special characters

Special characters fall into four main groups: character classes, sets, boundaries, and quantifiers.

Here's a complete list of the metacharacters:

. ^ $ * + ? { } [ ] \ | ( )

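Character class table

Special character	Equivalent set	Meaning
.	[^\n]	Matches any single character except a newline
\d	[0-9]	Matches a digit 0-9
\D	[^0-9]	Matches a non-digit character
\w	[a-zA-Z0-9_]	Matches a letter, a digit, or an underscore
\W	[^a-zA-Z0-9_]	Matches any character that is not a letter, a digit, or an underscore
\s	[ \t\n\r\f\v]	Matches any whitespace character
\S	[^ \t\n\r\f\v]	Matches any non-whitespace character
\t		Matches a horizontal tab
\v		Matches a vertical tab
\r		Matches a carriage return
\n		Matches a newline
\f		Matches a form feed

Note: for str patterns, \d, \w, and \s (and their negations) also cover Unicode digits, word characters, and whitespace. The equivalent sets above are exact only with the re.ASCII flag or for bytes patterns.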
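
A short sketch of a few of these classes in action. The sample strings are invented for illustration; the last two lines show how \d behaves on a non-ASCII digit with and without re.ASCII.

```python
import re

sample = "id_42\tok\n"  # made-up sample text

digits = re.findall(r"\d", sample)      # single digits
words = re.findall(r"\w+", sample)      # runs of word characters
spaces = re.findall(r"\s", sample)      # whitespace characters
no_newline = re.findall(r".", "a\nb")   # "." skips the newline

print(digits)      # ['4', '2']
print(words)       # ['id_42', 'ok']
print(spaces)      # ['\t', '\n']
print(no_newline)  # ['a', 'b']

# U+0663 is ARABIC-INDIC DIGIT THREE: a digit, but outside [0-9].
unicode_match = re.fullmatch(r"\d", "\u0663") is not None
ascii_match = re.fullmatch(r"\d", "\u0663", re.ASCII) is not None
print(unicode_match, ascii_match)  # True False
```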

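Set table

Special character	Equivalent set	Meaning
[abc]		Matches one character listed in the brackets
[a-z]		Matches one character within the range
[^abc]		Matches one character that is not in the set
[^a-z]		Matches one character that is not within the range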
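
The four set forms can be compared side by side. This is only a sketch with made-up input strings:

```python
import re

in_set = re.findall(r"[abc]", "cabbage")            # one character per match
in_range = re.findall(r"[a-c]+", "cabbage")         # "+" groups consecutive hits
not_in_set = re.findall(r"[^abc]", "cabbage")       # everything except a, b, c
not_in_range = re.findall(r"[^a-z]", "Hello, World!")  # anything but lowercase a-z

print(in_set)        # ['c', 'a', 'b', 'b', 'a']
print(in_range)      # ['cabba']
print(not_in_set)    # ['g', 'e']
print(not_in_range)  # ['H', ',', ' ', 'W', '!']
```

Note that a set always matches exactly one character; a quantifier is needed to match a run of them.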

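Boundary table

Special character	Equivalent set	Meaning
^		Matches the start of the string. As the first character inside a set ([]), ^ means negation instead
$		Matches the end of the string, or just before a newline at the end of the string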
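
The sketch below shows ^ and $ on a two-line string, with and without the re.MULTILINE flag (which makes both anchors also match at line boundaries). The sample text is invented for illustration.

```python
import re

text = "cat\ncatalog\n"  # made-up two-line sample

start_only = re.findall(r"^cat", text)                # start of the string only
each_line = re.findall(r"^cat", text, re.MULTILINE)   # start of every line

# "$" matches just before the trailing newline at the end of the string.
end_match = re.search(r"log$", text)

# "cat" is followed by a newline that is not at the end of the string,
# so "$" fails without MULTILINE and succeeds with it.
end_default = re.search(r"cat$", text)
end_multiline = re.search(r"cat$", text, re.MULTILINE)

print(start_only)            # ['cat']
print(each_line)             # ['cat', 'cat']
print(end_match.span())      # (8, 11)
print(end_default)           # None
print(end_multiline.span())  # (0, 3)
```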

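Quantifier table

Special character	Equivalent quantifier	Meaning
a*	{0,}	Matches a repeated zero or more times
a+	{1,}	Matches a repeated one or more times
a?	{0,1}	Matches a zero or one time
a(?!x)		Matches a only when it is not followed by x (negative lookahead)
a|b		Matches a or b
a{n}		Matches exactly n consecutive a characters
a{n,}		Matches at least n consecutive a characters
a{n,m}		Matches between n and m consecutive a characters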
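
To make the table concrete, here is a small sketch that exercises several of these forms. The input strings are made up for this example.

```python
import re

star_empty = re.fullmatch(r"a*", "") is not None  # zero repetitions are allowed
plus_empty = re.fullmatch(r"a+", "") is not None  # at least one "a" is required

optional = re.findall(r"colou?r", "color colour")    # "u" is optional
bounded = re.findall(r"a{2,3}", "a aa aaa aaaa")     # greedy: takes up to 3
either = re.findall(r"a|b", "abc")                   # alternation

# Negative lookahead: keep only the "a" characters not followed by "x".
not_before_x = [m.start() for m in re.finditer(r"a(?!x)", "ax ab a")]

print(star_empty, plus_empty)  # True False
print(optional)                # ['color', 'colour']
print(bounded)                 # ['aa', 'aaa', 'aaa']
print(either)                  # ['a', 'b']
print(not_before_x)            # [3, 6]
```

In the a{2,3} case the last run "aaaa" yields one match of three characters; the single "a" left over is too short to match again.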