Regular Expressions
Finding Patterns in Text
The most common use of regular expressions is to search a piece of text for matches. For example:
import re
patterns = ['this', 'that']
text = 'does this text match the patterns?'
for pattern in patterns:
    print('looking for "%s" in "%s" ->' % (pattern, text))
    if re.search(pattern, text):
        print('found a match')
    else:
        print('no match')
# search() returns a Match object; if no match is found, it returns None.
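Because search() returns None when nothing matches, it is usually safer to test the result before calling methods on it. The following is a minimal sketch of that check; the helper name describe is hypothetical and used only for illustration.

```python
import re

def describe(pattern, text):
    """Hypothetical helper: report whether pattern occurs in text."""
    match = re.search(pattern, text)
    if match is None:
        # No match: calling match.group() here would raise AttributeError.
        return 'no match'
    # group(0) is the full substring that matched the pattern.
    return 'matched %r' % match.group(0)

text = 'does this text match the patterns?'
print(describe('this', text))
print(describe('that', text))
```

Testing with "is None" makes the intent explicit, although a plain truth test such as "if match:" behaves the same way for this purpose.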
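The Match object returned by search() holds basic information about the match, including the original input text, the regular expression that was used, and the start and end positions of the match within the text. For example: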
import re
pattern = 'this'
text = 'does this text match the patterns?'
match = re.search(pattern, text)
s = match.start()
e = match.end()
print('found "%s" in "%s" from %d to %d ("%s")' %
      (match.re.pattern, match.string, s, e, text[s:e]))
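As a small complement to the listing above, the start and end positions can also be read in a single call with span(), and the matched substring with group(0). This is only a sketch that reuses the same sample text; the variable names start and end are chosen here for illustration.

```python
import re

text = 'does this text match the patterns?'
match = re.search('this', text)

if match is not None:
    # span() returns the (start, end) pair as one tuple.
    start, end = match.span()
    # Slicing the input with those positions yields the same
    # substring that group(0) reports.
    print(match.group(0), start, end, text[start:end] == match.group(0))
```

The end position is exclusive, so text[start:end] is exactly the matched text.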
If a program uses the same expression repeatedly, you can compile it first with re.compile() to obtain a regular expression object, and then use that object to perform the search:
import re
regexes = [re.compile(p) for p in ['this', 'that']]
text = 'does this text match the patterns?'
for regex in regexes