9. Regular expression functions
findall search match split sub subn finditer compile
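findall: finds every non-overlapping match of the pattern and returns them in a list
search: scans the string and returns a match object for the first match (None if there is none); group() returns the matched text, groups() returns the captured groups as a tuple
match: matches only at the start of the string, which makes it handy for validating user input; search with a leading ^ in the pattern behaves the same way (as long as re.M is not set)
split: splits the string; more powerful than str.split because the separator is a pattern
sub: replaces matches; more powerful than str.replace (usage: pattern, replacement string, original string, [optional maximum number of replacements])
subn: replaces matches, used the same way as sub, but returns a tuple (new string, number of replacements)
finditer: finds the matches in the string and returns an iterator of match objects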
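compile: compiles a pattern once into a reusable pattern object;
Normally, a pattern passed as a string has to be compiled (or looked up in the re module's internal cache) on every call, which wastes resources such as memory and CPU when the same pattern is used repeatedly
With compile, the pattern is compiled once and can then be reused any number of times without recompiling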
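The list above describes findall, but what it returns depends on how many capture groups the pattern has. The following is a minimal sketch of that behavior; the sample text and variable names are invented for illustration.

```python
import re

# Sample text invented for illustration.
text = "a1 b22 c333"

# No capture group: each list item is the whole match.
whole = re.findall(r"\d+", text)
print(whole)   # ['1', '22', '333']

# One capture group: each item is the text of that group only.
one_group = re.findall(r"[a-z](\d+)", text)
print(one_group)   # ['1', '22', '333']

# Two or more groups: each item is a tuple of the groups.
pairs = re.findall(r"([a-z])(\d+)", text)
print(pairs)   # [('a', '1'), ('b', '22'), ('c', '333')]
```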
import re
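# search
strvar = "1+2 3*4 "
# A raw string keeps \d from being read as a string escape
ret = re.search(r"\d+(.*?)\d+", strvar)
print(ret.group(), type(ret.group()))  # 1+2 <class 'str'>
ret1 = ret.groups()
print(ret1, type(ret1))  # ('+',) <class 'tuple'>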
# match
# strvar = "s12346579"  # no leading digits: both calls below would return None
strvar = "465879132"
ret = re.search(r"^\d+", strvar)
print(ret.group())  # 465879132
ret = re.match(r"\d+", strvar)
print(ret.group())  # 465879132
# split
strvar = "eric|dsf|sdf&sdf-asf"
ret = re.split("[|&-]", strvar)
print(ret)
strvar = "13af4534fasf33saf34afggg"
ret = re.split(r"\d+", strvar)
print(ret)  # ['', 'af', 'fasf', 'saf', 'afggg']
# sub
strvar = "asdf|faf&sadf-asf"
ret = re.sub("[|&-]", "%", strvar)
print(ret)
ret = re.sub("[|&-]", "%", strvar, count=1)
print(ret)
# subn
ret = re.subn("[|&-]", "%", strvar)
print(ret)
ret = re.subn("[|&-]", "%", strvar, count=1)
print(ret)
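The notes call sub more powerful than str.replace. One way to see why, as a rough sketch: the replacement may refer back to captured groups, or be a function that computes each replacement. The sample strings and variable names here are made up for illustration.

```python
import re

# Sample strings invented for illustration.
date = "2024-01-15"

# Backreferences (\1, \2, \3) in the replacement reorder the captured groups.
swapped = re.sub(r"(\d+)-(\d+)-(\d+)", r"\3/\2/\1", date)
print(swapped)   # 15/01/2024

# A function receives each match object and returns the replacement text.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples, 10 pears")
print(doubled)   # 6 apples, 20 pears

# subn returns the new string together with the number of replacements made.
result, count = re.subn(r"[|&-]", "%", "asdf|faf&sadf-asf")
print(result, count)   # asdf%faf%sadf%asf 3
```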
# finditer
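from collections.abc import Iterator
strvar = "sdfasasdf1234654撒法dfsa132"
it = re.finditer(r"\d+", strvar)
print(isinstance(it, Iterator))  # True
# Pull the first two matches by hand
for _ in range(2):
    print(next(it).group())
# This loop resumes where next() stopped; the string has only two matches, so nothing is left
for i in it:
    print(i.group())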
# compile
strvar = "adsf12346学位DVD133"
pattern = re.compile(r"\d+")
print(pattern)
ret = pattern.search(strvar)
print(ret.group())
ret = pattern.findall(strvar)
print(ret)
ret = re.compile(r"\d+").findall(strvar)
print(ret)
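Since match is described as a tool for validating input, it may help to see that match only anchors the start of the string. The sketch below compares match, search, and fullmatch on a precompiled pattern; the "digits only" rule and the helper name is_all_digits are assumptions made for the example, not part of the original notes.

```python
import re

# Compile once, reuse for every check below.
digits = re.compile(r"\d+")

def is_all_digits(s):
    # Hypothetical helper: fullmatch succeeds only if the whole string matches.
    return digits.fullmatch(s) is not None

starts_ok = digits.match("123abc") is not None
print(starts_ok)                    # True: match checks only the start
print(is_all_digits("123abc"))      # False: trailing letters are rejected
print(is_all_digits("465879132"))   # True

found = digits.search("s12346579").group()
print(found)                        # 12346579: search scans ahead
print(digits.match("s12346579"))    # None: the string does not start with a digit
```

If the check must cover the whole input, fullmatch (or an explicit $ at the end of the pattern) is usually the safer choice than match alone.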
10. Flags (modifiers)
re.I re.M re.S
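re.I: makes matching case-insensitive
re.M: multi-line mode; ^ and $ match at the start and end of every line instead of only at the start and end of the whole string
re.S: makes . match any character, including the newline \n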
# re.I
strvar = "<h1>123</H1>"
pattern = re.compile("<h1>(.*?)</h1>", flags=re.I)
ret = pattern.search(strvar)
print(ret)
print(ret.group())
print(ret.groups())
# re.M
strvar = """<h1>123</H1>
<p>123</p>
<div>123</div>
"""
pattern = re.compile("^<.*?>(?:.*?)<.*?>$", flags=re.M)
print(pattern.findall(strvar))
ret = re.compile("^<.*?>(?:.*?)<.*?>$", flags=re.M).findall(strvar)
print(ret)
# re.S
strvar = """give
123213mezdmefive
"""
pattern = re.compile("(.*?)mefive", flags=re.S)
ret = pattern.search(strvar)
print(ret.group())
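To make the effect of each flag easier to see, the following sketch runs the same pattern with and without the flag, and then combines two flags with the | operator. The sample text and variable names are invented for illustration.

```python
import re

# Sample text invented for illustration.
text = "one\nTwo\nthree"

# Without re.M, ^ matches only at the very start of the string.
first_only = re.findall(r"^\w+", text)
print(first_only)   # ['one']

# With re.M, ^ also matches right after each newline.
each_line = re.findall(r"^\w+", text, flags=re.M)
print(each_line)    # ['one', 'Two', 'three']

# Without re.S, . does not match a newline; with re.S it does.
no_dotall = re.search(r"one.Two", text)
print(no_dotall)    # None
with_dotall = re.search(r"one.Two", text, flags=re.S).group()
print(repr(with_dotall))   # 'one\nTwo'

# Flags can be combined with |.
both = re.findall(r"^t\w+", text, flags=re.I | re.M)
print(both)         # ['Two', 'three']
```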