Regular Expression Matching in Python: the re Module

老街下着雨

Last modified 2022-02-17 16:58:59

Tags: python, regular expressions

First published 2020-06-22 14:50:36

Article link: https://blog.csdn.net/l15767016983/article/details/106900679

I. Metacharacters

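Symbol	Description	Example
\d	Matches a digit	matchObj = re.match(r'\d', line)
\D	Matches a non-digit	matchObj = re.match(r'\D', line)
\w	Matches a word character: a digit, letter, or underscore, plus other Unicode word characters such as Chinese when the pattern is a Python 3 str	matchObj = re.match(r'\w', line)
\W	Matches any character that is not a word character	matchObj = re.match(r'\W', line)
\s	Matches a whitespace character	matchObj = re.match(r'\s', line)
\S	Matches a non-whitespace character	matchObj = re.match(r'\S', line)
\t	Matches a tab character	matchObj = re.match(r'\t', line)
\n	Matches a newline character	matchObj = re.match(r'\n', line)
.	Matches any single character except a newline	matchObj = re.match(r'.', line)
re.S	Flag: makes . match every character, including newlines	matchObj = re.match(r'.*', line, re.S)
re.I	Flag: ignores case when matching	matchObj = re.match(r'[a-z]*', line, re.I)
^	Matches at the start of the string	matchObj = re.match(r'(^A)', line)
$	Matches at the end of the string	matchObj = re.match(r'.*Z$', line)
A|B	Matches A or B	matchObj = re.match(r'(A|B)', line)
()	Groups and captures a subpattern	matchObj = re.match(r'(.*)', line)
(?P<name>.*)	Captures a group and names it "name"; read it with group('name'), or get all named groups as a dict with groupdict()	matchObj = re.match(r'.*(?P<name>\d+)', line)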
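
As a quick check of the table above, here is a minimal sketch that runs a few of the metacharacters, both flags, and a named group against one sample string. The variable `line` and its contents are invented for this illustration.

```python
import re

# Sample text invented for this illustration
line = "Order 42\nshipped to Zhang"

# re.match only tries at the start of the string, so \d fails here
print(re.match(r"\d", line))             # None, the string starts with "O"

# \w+ matches a run of word characters from the start
print(re.match(r"\w+", line).group())    # Order

# Without re.S the dot stops at the newline; with re.S it does not
dot_default = re.match(r".*", line).group()
dot_all = re.match(r".*", line, re.S).group()
print(repr(dot_default))                 # 'Order 42'
print(repr(dot_all))                     # 'Order 42\nshipped to Zhang'

# re.I lets [a-z] match uppercase letters as well
print(re.match(r"[a-z]+", line, re.I).group())   # Order

# A named group is read with group("name") or groupdict()
m = re.match(r"\D*(?P<number>\d+)", line)
print(m.group("number"))                 # 42
print(m.groupdict())                     # {'number': '42'}
```

Note that `groupdict()` returns an empty dict when the pattern has no named groups, so it is mostly useful together with `(?P<name>...)`.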

II. Common Character Classes

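Symbol	Description
[0-9], [a-z], [A-Z], [0-9A-Za-z]	Matches a digit, a lowercase letter, an uppercase letter, or any digit or letter
[^abc]	Matches any single character except a, b, or c
2[0-9]	Matches the two-character strings 20 through 29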
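
To see how these character classes behave, here is a small sketch built on `re.fullmatch`, which requires the whole string to match. The helper name `is_match` and the sample inputs are assumptions made for this example. It also shows why a range such as `A-z` is risky: it spans the ASCII punctuation that sits between `Z` and `a`.

```python
import re

def is_match(pattern, text):
    """Return True when the whole of text matches pattern (illustrative helper)."""
    return re.fullmatch(pattern, text) is not None

print(is_match(r"[0-9]", "7"))       # True
print(is_match(r"[a-z]", "Q"))       # False, Q is uppercase
print(is_match(r"[^abc]", "d"))      # True, anything except a, b, c
print(is_match(r"[^abc]", "a"))      # False
print(is_match(r"2[0-9]", "25"))     # True
print(is_match(r"2[0-9]", "30"))     # False

# The range A-z also covers [ \ ] ^ _ ` because they lie between Z and a in ASCII
print(is_match(r"[A-z]", "_"))       # True, probably not what was intended
print(is_match(r"[A-Za-z]", "_"))    # False
```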

III. Quantifiers

Symbol	Description
{n}, {n,}, {n,m}	Repeat exactly n times, at least n times, or between n and m times
?	0 or 1 time
+	1 or more times
*	0 or more times
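
The quantifiers are easiest to compare side by side. The sketch below checks each one against a whole string; the helper `full` and the sample patterns are made up for illustration.

```python
import re

def full(pattern, text):
    """True if the entire text matches the pattern (illustrative helper)."""
    return bool(re.fullmatch(pattern, text))

print(full(r"\d{3}", "123"))        # True, exactly 3 digits
print(full(r"\d{3}", "1234"))       # False
print(full(r"\d{2,}", "12345"))     # True, at least 2 digits
print(full(r"\d{2,4}", "12345"))    # False, more than 4 digits
print(full(r"colou?r", "color"))    # True, the u is optional
print(full(r"colou?r", "colour"))   # True
print(full(r"ab+", "a"))            # False, + needs at least one b
print(full(r"ab*", "a"))            # True, * allows zero b
```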

IV. Greedy and Non-Greedy Matching

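Symbol	Description
a.*b	Greedy: matches from an a to the last possible b, taking as many characters in between as it can
a.*?b	Non-greedy: matches from an a to the nearest b, taking as few characters in between as it can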

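import re

s = "IP地址192.168.2"
# Greedy: .* swallows as much as it can, so the group gets only the shortest tail that still fits
url = re.match(r".*(?P<ip>\d+\.\d+\.\d+)", s)
print(url.group("ip"))  # 2.168.2
# Non-greedy: .*? swallows as little as it can, so the group starts at the first digit
url = re.match(r".*?(?P<ip>\d+\.\d+\.\d+)", s)
print(url.group("ip"))  # 192.168.2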
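
Another common place where the difference shows up is text with repeated delimiters. The following is a minimal sketch using a made-up `html` string; it is only meant to contrast the two modes, not to suggest parsing HTML with regular expressions.

```python
import re

# Sample markup invented for this illustration
html = "<b>bold</b> and <i>italic</i>"

greedy = re.search(r"<.*>", html).group()    # runs to the last ">"
lazy = re.search(r"<.*?>", html).group()     # stops at the first ">"

print(greedy)   # <b>bold</b> and <i>italic</i>
print(lazy)     # <b>

# The ? suffix makes +, ? and {n,m} non-greedy in the same way
tags = re.findall(r"<.+?>", html)
print(tags)     # ['<b>', '</b>', '<i>', '</i>']
```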

V. Using the re Functions

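1. re.match()

re.match() tries to match the pattern at the beginning of the string only. On success it returns a match object; on failure it returns None. Calling group() on the match object returns the matched text.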

import re
ret = re.match("hello", "hello world")
print(ret.group())  # group() with no argument returns the whole match: hello
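
Because re.match() returns None on failure, calling group() directly raises AttributeError when nothing matches. A minimal sketch of the usual guard follows; the helper name `leading_word` is hypothetical.

```python
import re

def leading_word(text):
    """Return the word at the start of text, or None if there is none (hypothetical helper)."""
    m = re.match(r"[A-Za-z]+", text)
    if m is None:           # re.match returns None when the start does not match
        return None
    return m.group()

print(leading_word("hello world"))   # hello
print(leading_word("123 hello"))     # None

# group(0) is the whole match; group(1), group(2), ... are the parenthesised groups
m = re.match(r"(\w+) (\w+)", "hello world")
print(m.group(0))    # hello world
print(m.group(1))    # hello
print(m.groups())    # ('hello', 'world')
```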

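2. re.search()

re.search() scans the string from left to right and can match at any position, not just the start. It returns a match object for the first match only, even if the pattern occurs several times, and None if there is no match.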

import re
ret = re.search("hello", "ahello world")  # succeeds even though "hello" is not at the start
print(ret.group())  # hello
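
To make the contrast with re.match() concrete, the sketch below runs both functions on the same made-up `text` and also shows where the match was found.

```python
import re

# Sample text invented for this illustration
text = "ahello world, hello again"

print(re.match("hello", text))     # None, the text starts with "a"

m = re.search("hello", text)
print(m.group())                   # hello
print(m.span())                    # (1, 6), only the first occurrence is reported

# Anchoring with ^ makes search behave like match on a single-line string
print(re.search("^hello", text))   # None
```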

3. re.findall()

re.findall() scans the whole string from left to right and returns a list of all non-overlapping matches as strings, not match objects.

import re
ret = re.findall(r'\d+', 'hello123,world456')
print(ret)  # ['123', '456']
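
What re.findall() returns depends on whether the pattern contains groups, which is easy to trip over. Here is a short sketch with a made-up `text`, plus re.finditer() for cases where match objects or positions are needed.

```python
import re

# Sample text invented for this illustration
text = "hello123,world456"

# No groups: a list of the matched strings
print(re.findall(r"\d+", text))             # ['123', '456']

# One group: a list of that group's text only
print(re.findall(r"[a-z]+(\d+)", text))     # ['123', '456']

# Several groups: a list of tuples
pairs = re.findall(r"([a-z]+)(\d+)", text)
print(pairs)                                # [('hello', '123'), ('world', '456')]

# finditer yields match objects, which also carry positions
spans = [(m.group(), m.span()) for m in re.finditer(r"\d+", text)]
print(spans)                                # [('123', (5, 8)), ('456', (14, 17))]
```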

4. re.compile()

re.compile() compiles a regular expression ahead of time into a pattern object, which can then be reused for matching.

import re
rule = re.compile(r'\d+')
ret = rule.findall('hello world456')
print(ret)  # findall returns a list: ['456']
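
A compiled pattern is mostly useful when the same expression is applied many times. The sketch below assumes a small made-up list `lines` and two illustrative pattern names, `number` and `word`.

```python
import re

# Compile once, reuse many times; flags are given at compile time
number = re.compile(r"\d+")
word = re.compile(r"[a-z]+", re.I)

# Sample data invented for this illustration
lines = ["hello world456", "No digits here", "7 and 8"]

results = [(number.findall(line), word.match(line) is not None) for line in lines]
for r in results:
    print(r)
# (['456'], True)
# ([], True)
# (['7', '8'], False)

# The pattern object offers the same operations as the module-level functions
print(number.search("abc42").group())   # 42
print(number.sub("#", "a1b22"))         # a#b#
print(number.pattern)                   # \d+
```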

5. re.split()

re.split() works like the string split() method, except that the separator is a regular expression. It returns a list.

import re
strs = 'zhangsan123wangwu345we'
parts = re.split(r'\d+', strs)  # named parts rather than set, which would shadow the built-in
print(parts)  # ['zhangsan', 'wangwu', 'we']
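
re.split() has a few options that str.split() lacks. The following sketch reuses the same sample string to show `maxsplit`, keeping the separators with a capturing group, and splitting on several delimiters at once; the last input string is made up.

```python
import re

text = "zhangsan123wangwu345we"

print(re.split(r"\d+", text))               # ['zhangsan', 'wangwu', 'we']

# maxsplit limits the number of splits
print(re.split(r"\d+", text, maxsplit=1))   # ['zhangsan', 'wangwu345we']

# A capturing group keeps the separators in the result
print(re.split(r"(\d+)", text))             # ['zhangsan', '123', 'wangwu', '345', 'we']

# A character class splits on several different delimiters in one pass
fields = re.split(r"[,;\s]+", "a, b;c  d")
print(fields)                               # ['a', 'b', 'c', 'd']
```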

6. re.sub()

re.sub() replaces every match of a pattern in a string. For example, it can turn 'hello world' into 'HELLO world'.

import re
strs = 'hello world'
ret = re.sub('hello', 'HELLO', strs)
print(ret)  # HELLO world
# Replace using a regular expression
strs = 'abcd123efg345hi'
ret = re.sub(r'\d+', 'HELLO', strs)
print(ret)  # abcdHELLOefgHELLOhi
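
Beyond fixed replacement strings, re.sub() also accepts a `count`, backreferences to groups, and a function as the replacement. Here is a minimal sketch of each; the helper `double` is hypothetical.

```python
import re

text = "abcd123efg345hi"

# count limits how many replacements are made
print(re.sub(r"\d+", "#", text, count=1))   # abcd#efg345hi

# \1 in the replacement refers to the first captured group
print(re.sub(r"(\d+)", r"[\1]", text))      # abcd[123]efg[345]hi

# The replacement can be a function that receives the match object
def double(m):
    """Return the matched number multiplied by two, as a string (hypothetical helper)."""
    return str(int(m.group()) * 2)

print(re.sub(r"\d+", double, text))         # abcd246efg690hi

# subn returns the new string together with the number of replacements
print(re.subn(r"\d+", "#", text))           # ('abcd#efg#hi', 2)
```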