Python: Regular Expressions (Part 3) - Contents of the re Module

Article link: https://blog.csdn.net/qq_14908027/article/details/77841473

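Python's re module provides support for regular expressions and contains several commonly used functions for working with them. The Python examples below are meant to make these functions easier to understand.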

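Function	Description
compile(pattern[,flags])	Creates a pattern object from a string containing a regular expression
search(pattern,string[,flags])	Searches for the pattern anywhere in the string
match(pattern,string[,flags])	Matches the pattern at the beginning of the string
split(pattern,string[,maxsplit=0])	Splits the string at each match of the pattern
findall(pattern,string)	Returns a list of all matches of the pattern in the string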
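The optional flags argument in the table changes how a pattern is interpreted. As a minimal sketch, the example below uses re.IGNORECASE, one of the standard flags; the sample text is made up for illustration.

```python
import re

text = "Visit WWW.Python.ORG today"  # hypothetical sample text

# Without flags the match is case-sensitive, so nothing is found.
strict = re.search(r"python\.org", text)

# re.IGNORECASE makes the same pattern match regardless of letter case.
relaxed = re.search(r"python\.org", text, re.IGNORECASE)

print(strict)           # None
print(relaxed.group())  # Python.ORG
```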

re.compile(), re.search(), and re.match()

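re.compile() converts a regular expression written as a string into a pattern object, which makes matching more efficient when the same pattern is reused. If you pass the regular expression as a string when calling search or match, those functions also convert the string into a pattern object internally. Once compile has done the conversion, the pattern no longer has to be converted each time it is used.
So re.search(pat, string) (where pat is a regular expression written as a string) is equivalent to pat.search(string) (where pat is a pattern object created with compile).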

#coding=utf-8
'''
@author=chengww3
'''
import re

result1 = re.search(r"(http://)?(www\.)?python\.org", "http://www.python.org")
print(result1)

pattern_text = r"(http://)?(www\.)?python\.org"  # string containing the regular expression
pat = re.compile(pattern_text)  # convert it to a pattern object
result2 = pat.search("http://www.python.org")
print(result2)

Both calls produce the same result.
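The benefit of compiling shows up when one pattern is applied to several strings. The sketch below reuses the URL pattern from the example above; the list of candidate strings and the variable names are assumptions added for illustration.

```python
import re

# Compile once, then reuse the pattern object for several strings.
url_pat = re.compile(r"(http://)?(www\.)?python\.org")

candidates = ["http://www.python.org", "python.org", "www.example.com"]
matched = [s for s in candidates if url_pat.search(s)]
print(matched)  # ['http://www.python.org', 'python.org']

# A match object exposes the matched text and the captured groups.
m = url_pat.search("see www.python.org for docs")
print(m.group(0))  # www.python.org
print(m.group(1))  # None: the optional "http://" group did not take part
print(m.span())    # (4, 18)
```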

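The match function tries to match the regular expression at the beginning of the given string. So re.match('p', 'python') returns a true value (a match object), whereas re.match('p', 'www.python.org') returns a false value (None).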

#coding=utf-8
'''
@author=chengww3
'''
import re

res1 = re.match('p','python')
res2 = re.match('p','www.python.org')
print(res1)
print(res2)
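To make the difference from search concrete, here is a small sketch that runs both functions on the same string. The variable names are illustrative only.

```python
import re

text = "www.python.org"

at_start = re.match("p", text)   # None: the string starts with "w", not "p"
anywhere = re.search("p", text)  # scans forward and finds the first "p"
print(at_start)
print(anywhere.start())          # 4

# A compiled pattern's match() accepts a start position as its second argument.
from_pos = re.compile("p").match(text, 4)
print(from_pos.span())           # (4, 5)
```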


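If the pattern has to match the whole string, append a dollar sign to the pattern passed to match. The dollar sign matches at the end of the string, so the match is forced to extend all the way to the end. The code below is a small variation on the previous example that uses search to show the effect of the anchor: 'p' matches the first p in the string, while 'p$' can only match the final p.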

#coding=utf-8
'''
@author=chengww3
'''
import re

res1 = re.search('p','pythonp')
res2 = re.search('p$','pythonp')
print(res1)
print(res2)
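As a rough sketch of whole-string matching, the example below compares match with a trailing $ against re.fullmatch, which is available in the standard library from Python 3.4. The boolean variable names are illustrative.

```python
import re

# match() only anchors the start, so a matching prefix is enough.
prefix_ok = re.match("python", "pythonp") is not None

# Adding $ also requires the match to end at the end of the string.
dollar_ok = re.match("python$", "pythonp") is not None

# fullmatch() requires the whole string to match, with no $ needed.
full_ok = re.fullmatch("python", "python") is not None

# Difference: $ also matches just before a trailing newline, fullmatch() does not.
dollar_newline = re.match("python$", "python\n") is not None
full_newline = re.fullmatch("python", "python\n") is not None

print(prefix_ok, dollar_ok, full_ok)  # True False True
print(dollar_newline, full_newline)   # True False
```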


re.split()

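The function re.split() splits a string at each match of the pattern. It is similar to the string method split, but it accepts a full regular expression in place of a fixed separator string. For example, re.split lets you split a string on runs of commas and spaces of any length.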

#coding=utf-8
'''
@author=chengww3
'''
import re

text = "alpha, beta,,,,gamma delta"
# split on spaces
res1 = re.split('[ ]+', text)
print(res1)
# split on commas
res2 = re.split('[,]+', text)
print(res2)
# split on commas and spaces
res3 = re.split('[, ]+', text)
print(res3)
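For comparison with the string method, the following sketch splits the same sample string both ways, which shows why a regular expression is convenient when separators repeat.

```python
import re

text = "alpha, beta,,,,gamma delta"

# str.split uses one fixed separator, so repeated commas leave empty strings.
plain = text.split(",")
print(plain)
# ['alpha', ' beta', '', '', '', 'gamma delta']

# re.split treats each run of commas and spaces as a single separator.
by_regex = re.split("[, ]+", text)
print(by_regex)
# ['alpha', 'beta', 'gamma', 'delta']
```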

The maxsplit parameter sets the maximum number of splits to perform. Modifying the code above:

#coding=utf-8
'''
@author=chengww3
'''
import re

text = "alpha, beta,,,,gamma delta"
# add maxsplit
res4 = re.split('[, ]+', text, maxsplit=2)
print(res4)
res5 = re.split('[, ]+', text, maxsplit=1)
print(res5)
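A common use of maxsplit is to peel off a leading field and keep the remainder intact. The sketch below first repeats the two calls above with their results, then applies the same idea to a made-up log line; the log line and its variable names are assumptions for illustration.

```python
import re

text = "alpha, beta,,,,gamma delta"
two_splits = re.split("[, ]+", text, maxsplit=2)
one_split = re.split("[, ]+", text, maxsplit=1)
print(two_splits)  # ['alpha', 'beta', 'gamma delta']
print(one_split)   # ['alpha', 'beta,,,,gamma delta']

# Split only at the first run of whitespace; the rest is left untouched.
line = "ERROR   disk full,  retry later"  # hypothetical log line
level, message = re.split(r"\s+", line, maxsplit=1)
print(level)    # ERROR
print(message)  # disk full,  retry later
```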


re.findall()

re.findall returns all matches of the given pattern as a list. For example, to find all the words in a string:

#coding=utf-8
'''
@author=chengww3
'''
import re

pat = '[a-zA-Z]+'
text = '"Hm... Err -- are you sure?" he said, sounding insecure.'
res = re.findall(pat, text)
print(res)
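The same call works for other character classes. As a small sketch, the example below extracts the words and then the punctuation runs from the same sentence; the punctuation pattern is an addition for illustration and is not part of the original example.

```python
import re

text = '"Hm... Err -- are you sure?" he said, sounding insecure.'

words = re.findall(r"[a-zA-Z]+", text)
print(words)
# ['Hm', 'Err', 'are', 'you', 'sure', 'he', 'said', 'sounding', 'insecure']

# Inside a character class, the hyphen is escaped so it is read literally.
punctuation = re.findall(r'[.?\-",]+', text)
print(punctuation)
# ['"', '...', '--', '?"', ',', '.']
```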
