Regular Expressions 2

Latest recommended article published on 2022-04-08 11:28:52

Rich_Z_b_f

Category: Python Learning. Tags: python, regular expressions

Article link: https://blog.csdn.net/weixin_43525185/article/details/116404717

Regular expressions

- Matching the start and end
- Matching groups
- Other uses of the re module

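Matching the start and end

Character	Function
^	Matches the start of the string
$	Matches the end of the string

re.match already anchors at the start of the string, so ^ is implicit; only $ is demonstrated here.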

In [17]: re.match("12a$","12a").group()
Out[17]: '12a'

In [18]: re.match("12a$","12ab").group()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-6c5a080b2ad6> in <module>()
----> 1 re.match("12a$","12ab").group()

AttributeError: 'NoneType' object has no attribute 'group'
# Without the $ anchor, the match succeeds and no error is raised
In [19]: re.match("12a","12ab").group()
Out[19]: '12a'
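
The anchors are easier to see with re.search, which scans the whole string instead of starting at position 0. The sketch below is illustrative only; the variable names are made up for this example, and re.fullmatch is shown as a convenient way to require that the entire string matches.

```python
import re

text = "x12a"

# re.search looks for a match anywhere, so ^ and $ change the result.
unanchored = re.search(r"12a", text)       # finds "12a" inside "x12a"
start_anchored = re.search(r"^12a", text)  # None: the string starts with "x"
end_anchored = re.search(r"12a$", text)    # matches: the string ends with "12a"

# re.fullmatch only succeeds when the pattern covers the whole string.
whole_ok = re.fullmatch(r"12a", "12a")
whole_bad = re.fullmatch(r"12a", "12ab")   # None: the trailing "b" is left over

for name, m in [
    ("unanchored", unanchored),
    ("start_anchored", start_anchored),
    ("end_anchored", end_anchored),
    ("whole_ok", whole_ok),
    ("whole_bad", whole_bad),
]:
    # Print the matched text, or None when there was no match.
    print(name, m.group() if m else None)
```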

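Matching groups

Character	Function
|	Matches either the expression on its left or the one on its right
(ab)	Treats the characters inside the parentheses as a group
\num	Refers back to the string matched by group number num
(?P<name>)	Gives the group a name
(?P=name)	Refers back to the string matched by the group called name

| works like "or": the pattern matches if either side matches.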

In [22]: re.match("12a|bb","12a").group()
Out[22]: '12a'

In [23]: re.match("12a|bb","bb").group()
Out[23]: 'bb'
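
One thing the examples above do not show is how far | reaches. As a small sketch (the input "12bb" is chosen only for illustration), an unparenthesised | splits the entire pattern, while parentheses limit it to the group:

```python
import re

# Without parentheses, | splits the whole pattern: "12a" OR "bb".
# Neither alternative matches at the start of "12bb", so the result is None.
whole = re.match(r"12a|bb", "12bb")

# With parentheses, | applies only inside the group: "12" followed by "a" or "bb".
grouped = re.match(r"12(a|bb)", "12bb")

print(whole)             # None
print(grouped.group())   # 12bb
print(grouped.group(1))  # bb
```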

Using (ab)

In [34]: re.match(r"([^-]*)-(\d+)","111-1234").group()
Out[34]: '111-1234'
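
Calling .group() with no argument returns the whole match, so the call above does not yet show what each pair of parentheses captured. A minimal sketch of reading the individual groups, reusing the same pattern and input (the variable names are illustrative):

```python
import re

m = re.match(r"([^-]*)-(\d+)", "111-1234")

whole = m.group(0)   # same as m.group(): the entire match
first = m.group(1)   # text captured by the first ()
second = m.group(2)  # text captured by the second ()
both = m.groups()    # all captured groups as a tuple

print(whole)   # 111-1234
print(first)   # 111
print(second)  # 1234
print(both)    # ('111', '1234')
```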

Using \num: the text matched by the first () group can be referred to as \1,
the text matched by the second () group as \2, and so on.

In [41]: re.match(r"<(\w*)><(\w*)>.*</\2></\1>","<html><h1>123456</h1></html>" ).group()
Out[41]: '<html><h1>123456</h1></html>'

In [42]: re.match(r"<(\w*)><(\w*)>.*</\2></\1>","<html><h1>www.itcast.cn</h2></html>" ).group()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-42-b24647114061> in <module>()
----> 1 re.match(r"<(\w*)><(\w*)>.*</\2></\1>","<html><h1>www.itcast.cn</h2></html>" ).group()

AttributeError: 'NoneType' object has no attribute 'group'
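
The AttributeError above appears because re.match returns None when nothing matches. One way to avoid the traceback is to check the result before calling .group(). The helper name match_or_none below is hypothetical and exists only for this sketch:

```python
import re

# Same pattern as above, compiled once for reuse.
TAG_PAIR = re.compile(r"<(\w*)><(\w*)>.*</\2></\1>")


def match_or_none(text):
    """Return the matched text, or None if the closing tags do not pair up."""
    m = TAG_PAIR.match(text)
    return m.group() if m is not None else None


good = match_or_none("<html><h1>123456</h1></html>")
bad = match_or_none("<html><h1>www.itcast.cn</h2></html>")

print(good)  # <html><h1>123456</h1></html>
print(bad)   # None
```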

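Using (?P<name>) and (?P=name)
Until now, the text matched by a () group was referred to with \1.
(?P<name>) gives the group, and the text it matched, a name,
and (?P=name) uses that name to refer back to the text matched earlier.

Note that the P must be uppercase.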

In [48]: re.match(r"<(?P<aa>\w*)><(?P<aaa>\w*)>.*</(?P=aaa)></(?P=aa)>", "<html><h1>153</h1></html>").group()
Out[48]: '<html><h1>153</h1></html>'
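
Named groups can also be read back by name from the match object. The sketch below uses the group names outer, inner and body, which are chosen for this example and are not from the original pattern:

```python
import re

pattern = r"<(?P<outer>\w*)><(?P<inner>\w*)>(?P<body>.*)</(?P=inner)></(?P=outer)>"
m = re.match(pattern, "<html><h1>153</h1></html>")

outer = m.group("outer")  # look a group up by its name
body = m.group("body")
by_number = m.group(1)    # named groups still have a number as well
names = m.groupdict()     # all named groups as a dict

print(outer)      # html
print(body)       # 153
print(by_number)  # html
print(names)      # {'outer': 'html', 'inner': 'h1', 'body': '153'}
```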

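Other uses of the re module

Unlike re.match, none of the functions in the table below require the match to start at the beginning of the string.

Function	Purpose
search	Finds the first match anywhere in the string, not only at the start
findall	Returns every non-overlapping match
sub	Replaces the matched text
split	Splits the string wherever the pattern matches

Using search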

In [50]: re.search(r"\d+", "abcd 1516").group()
Out[50]: '1516'
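
To make the difference from re.match concrete, here is a short comparison on the same input. It is only a sketch; the second string "a1 b22" is made up to show that search stops at the first match.

```python
import re

text = "abcd 1516"

at_start = re.match(r"\d+", text)   # None: the string does not start with a digit
anywhere = re.search(r"\d+", text)  # finds the digits later in the string

print(at_start)          # None
print(anywhere.group())  # 1516
print(anywhere.span())   # (5, 9): start and end index of the match

# search returns only the first match, even when more are available.
first_only = re.search(r"\d+", "a1 b22").group()
print(first_only)        # 1
```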

Using findall. findall returns a list directly, so there is no need to call .group().

In [52]: re.findall(r"\d+", "python = 9999, c = 7890, c++ = 12345")
Out[52]: ['9999', '7890', '12345']
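
When the pattern contains groups, findall returns the captured groups rather than the whole match. A small sketch on the same input; the pattern [\w+]+ is an assumption made here so that "c++" is captured as a name:

```python
import re

text = "python = 9999, c = 7890, c++ = 12345"

# One group: a list of the captured strings.
numbers = re.findall(r"= (\d+)", text)

# Two groups: a list of tuples, one tuple per match.
pairs = re.findall(r"([\w+]+) = (\d+)", text)

print(numbers)  # ['9999', '7890', '12345']
print(pairs)    # [('python', '9999'), ('c', '7890'), ('c++', '12345')]
```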

Using sub
Format: re.sub(r"pattern", "replacement", "string to search")
Returns the modified string.

In [55]: re.sub(r"\d+", "78", "dadad 99")
Out[55]: 'dadad 78'
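
re.sub accepts a few more forms than the basic call above. The sketch below shows three of them: limiting the number of replacements with count, referring to groups in the replacement string, and passing a function as the replacement. The helper add_one is a made-up name for this example.

```python
import re

text = "python = 9999, c = 7890"

# count limits how many matches are replaced.
first_only = re.sub(r"\d+", "0", text, count=1)

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r"(\w+) = (\d+)", r"\2 = \1", text)


def add_one(match):
    """Replacement function: receives the match object, returns the new text."""
    return str(int(match.group()) + 1)


incremented = re.sub(r"\d+", add_one, text)

print(first_only)   # python = 0, c = 7890
print(swapped)      # 9999 = python, 7890 = c
print(incremented)  # python = 10000, c = 7891
```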

Using split
Returns a list.

In [56]: re.split(r":","aaa:33:bbb")
Out[56]: ['aaa', '33', 'bbb']
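
Splitting on a single ":" could also be done with str.split, so re.split is mainly useful when the delimiter varies. A brief sketch with made-up input strings:

```python
import re

# Split on any run of colons, commas or whitespace.
mixed = re.split(r"[:,\s]+", "aaa:33, bbb  ccc")

# maxsplit limits the number of splits; the rest of the string stays intact.
limited = re.split(r":", "aaa:33:bbb", maxsplit=1)

# A capturing group in the pattern keeps the delimiters in the result.
with_delims = re.split(r"(:)", "aaa:33")

print(mixed)        # ['aaa', '33', 'bbb', 'ccc']
print(limited)      # ['aaa', '33:bbb']
print(with_delims)  # ['aaa', ':', '33']
```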
