re.match
re.match tries to match a pattern at the start of the string. If the pattern does not match at the start, match() returns None.
re.match(pattern,string,flags=0)
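pattern ----- the regular expression to match
string ----- the string to match against
flags ----- optional flags that modify how matching works (for example re.I for case-insensitive matching, re.S to let . match newlines)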
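As a rough illustration of the flags parameter, the sketch below compares a call without flags to one with re.I. The sample strings and variable names are made up for this example.

```python
import re

# Without flags, matching is case-sensitive, so 'hello' does not match 'Hello'.
no_flag = re.match(r'hello', 'Hello World')

# re.I (re.IGNORECASE) makes the same pattern match regardless of case.
with_flag = re.match(r'hello', 'Hello World', flags=re.I)

print(no_flag)            # None
print(with_flag.group())  # Hello
```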
eg.
import re
content = 'Hello 1234567 World_This is a Regex Demo'
print(len(content))
# Raw strings (r'...') keep backslash sequences such as \s and \d intact.
result = re.match(r'^Hello\s\d{3}\d{4}\s\w{10}.*Demo$', content)
# result = re.match(r'^Hello.*Demo$', content)
# result = re.match(r'^Hello\s(\d+)\sWorld.*Demo$', content)
print(result)
print(result.group())
print(result.span())
40
<_sre.SRE_Match object; span=(0, 40), match='Hello 1234567 World_This is a Regex Demo'>
Hello 1234567 World_This is a Regex Demo
(0, 40)
eg. Greedy matching
import re
content = 'Hello 1234567 World_This is a Regex Demo'
result = re.match(r'^He.*(\d+).*Demo$', content)
print(result)
print(result.group(1))
<_sre.SRE_Match object; span=(0, 40), match='Hello 1234567 World_This is a Regex Demo'>
7
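The first .* is greedy and consumes everything up to and including 123456, leaving only 7 for (\d+); .* matches as many characters as possible.
result = re.match(r'^He.*?(\d+).*Demo$', content)
Adding a question mark makes .*? non-greedy, so it matches as few characters as possible.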
print(result.group(1))
1234567
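To see the two behaviours side by side, the following sketch runs both patterns on the same string and also prints where group 1 starts and ends. The variable names greedy and lazy are chosen only for this illustration.

```python
import re

content = 'Hello 1234567 World_This is a Regex Demo'

# Greedy: the first .* takes as much as it can and gives back only one digit.
greedy = re.match(r'^He.*(\d+).*Demo$', content)

# Non-greedy: .*? stops as early as possible, so (\d+) gets all the digits.
lazy = re.match(r'^He.*?(\d+).*Demo$', content)

print(greedy.group(1), greedy.span(1))  # 7 (12, 13)
print(lazy.group(1), lazy.span(1))      # 1234567 (6, 13)
```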
eg. Matching flags
import re
content = '''Hello 1234567 World_This
is a Regex Demo
'''
result = re.match(r'^Hello.*?(\d+).*?Demo$', content)
print(result)  # None
By default the dot (.) does not match a newline character.
import re
content = '''Hello 1234567 World_This
is a Regex Demo
'''
result = re.match(r'^Hello.*?(\d+).*?Demo$', content, re.S)
print(result.group(1))
With re.S, the dot also matches newline characters.
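The effect of re.S can be checked directly. The sketch below, which writes the newlines as \n for compactness, also shows two equivalent spellings: the long name re.DOTALL and the inline flag (?s) placed at the start of the pattern.

```python
import re

content = 'Hello 1234567 World_This\nis a Regex Demo\n'
pattern = r'^Hello.*?(\d+).*?Demo$'

# Without re.S the dot stops at the first newline, so the match fails.
plain = re.match(pattern, content)

# Three equivalent ways to let the dot match newlines.
short_flag = re.match(pattern, content, re.S)
long_flag = re.match(pattern, content, re.DOTALL)
inline_flag = re.match('(?s)' + pattern, content)

print(plain)                 # None
print(short_flag.group(1))   # 1234567
print(long_flag.group(1))    # 1234567
print(inline_flag.group(1))  # 1234567
```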
eg. Escaping
import re
content ='price is $5.00'
result = re.match('price is $5.00',content)
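print(result)  # None
Nothing matches, because $ and . are special characters in a regex; to match them literally they must be escaped.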
import re
content ='price is $5.00'
result = re.match(r'price is \$5\.00', content)
print(result)
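When the literal text is long or comes from a variable, escaping by hand is error-prone. One possible alternative is re.escape, which escapes the special characters for you. The sketch below also shows why the dot needs escaping: left unescaped, it would accept any character in that position. The string 'price is $5x00' is a made-up counterexample.

```python
import re

content = 'price is $5.00'

# re.escape builds a pattern that matches the given text literally.
pattern = re.escape('price is $5.00')
result = re.match(pattern, content)
print(result.group())  # price is $5.00

# An unescaped dot matches any character, so this unintended string matches too.
loose = re.match(r'price is \$5.00', 'price is $5x00')
print(loose is not None)  # True

# The fully escaped pattern rejects it.
strict = re.match(pattern, 'price is $5x00')
print(strict)  # None
```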
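Summary:
Prefer general patterns such as .*, use parentheses to capture the target, prefer non-greedy matching, and pass re.S when the text contains newlines.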
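A minimal sketch that applies these tips together, assuming we want both the number and the word after it. The second capture group (\w+) is an addition for this illustration only.

```python
import re

content = '''Hello 1234567 World_This
is a Regex Demo
'''

# General pattern (.*?), parentheses for the targets, non-greedy, and re.S for the newline.
result = re.match(r'Hello.*?(\d+)\s(\w+).*?Demo', content, re.S)

print(result.group(1))  # 1234567
print(result.group(2))  # World_This
print(result.groups())  # ('1234567', 'World_This')
```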
re.search
re.search scans the entire string and returns the first successful match.
eg.
import re
content = 'Extra stings Hello 1234567 World_This is a Regex Demo Extra stings'
result = re.match(r'^Hello.*?(\d+).*?Demo', content)
print(result)  # None
No match: re.match only matches at the start of the string, and content does not start with Hello.
import re
content = 'Extra stings Hello 1234567 World_This is a Regex Demo Extra stings'
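# No ^ here: it would anchor the pattern to the start of the string and search would return None too.
result = re.search(r'Hello.*?(\d+).*?Demo', content)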
print(result.group(1))
print(result)
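The difference between the two functions, and the effect of a leading ^, can be sketched as follows. The variable names are chosen only for this illustration.

```python
import re

content = 'Extra stings Hello 1234567 World_This is a Regex Demo Extra stings'
pattern = r'Hello.*?(\d+).*?Demo'

# match only tries position 0, where the string starts with 'Extra'.
matched = re.match(pattern, content)

# search tries every position and finds 'Hello' at index 13.
found = re.search(pattern, content)

# A leading ^ pins the pattern to the start, so even search fails.
anchored = re.search('^' + pattern, content)

print(matched)         # None
print(found.group(1))  # 1234567
print(found.span())    # (13, 53)
print(anchored)        # None
```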
Summary:
For convenience, prefer search over match whenever you can.
re.sub
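Replaces every matching substring in the string and returns the resulting string.
The first argument is the regular expression.
The second argument is the replacement string.
The third argument is the original string.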
import re
content = 'Extra stings Hello 1234567 World_This is a Regex Demo Extra stings'
content = re.sub(r'\d+', 'Replacement', content)
print(content)
import re
content = 'Extra stings Hello 1234567 World_This is a Regex Demo Extra stings'
content = re.sub(r'(\d+)', r'\1 8910', content)
print(content)
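For reference, the sketch below shows what the two substitutions produce, plus the optional count argument of re.sub, which limits how many matches are replaced. The count example is an addition for illustration.

```python
import re

content = 'Extra stings Hello 1234567 World_This is a Regex Demo Extra stings'

# Replace the whole run of digits with a fixed string.
replaced = re.sub(r'\d+', 'Replacement', content)
print(replaced)
# Extra stings Hello Replacement World_This is a Regex Demo Extra stings

# \1 refers back to the first captured group, so the digits are kept and extended.
extended = re.sub(r'(\d+)', r'\1 8910', content)
print(extended)
# Extra stings Hello 1234567 8910 World_This is a Regex Demo Extra stings

# count=3 replaces only the first three single-digit matches.
limited = re.sub(r'\d', '#', content, count=3)
print(limited)
# Extra stings Hello ###4567 World_This is a Regex Demo Extra stings
```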