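You are probably familiar with searching for text by pressing Ctrl+F.
Regular expressions go a step further: they let you specify a pattern to search for, not just a fixed string.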
1. Plain search (without regular expressions)
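def isPhoneNumber(text):
    # A valid number has the form ddd-ddd-dddd, which is exactly 12 characters.
    if len(text) != 12:
        return False
    # Characters 0-2: the three-digit area code.
    for i in range(0, 3):
        if not text[i].isdecimal():
            return False
    # Character 3: the first hyphen.
    if text[3] != '-':
        return False
    # Characters 4-6: the next three digits.
    for i in range(4, 7):
        if not text[i].isdecimal():
            return False
    # Character 7: the second hyphen.
    if text[7] != '-':
        return False
    # Characters 8-11: the last four digits.
    for i in range(8, 12):
        if not text[i].isdecimal():
            return False
    return True

print(isPhoneNumber('415-555-4242'))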
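To find a number inside a longer string without regular expressions, every 12-character slice has to be tested in turn. The sketch below illustrates this; is_phone_number is a compact, hypothetical stand-in for isPhoneNumber above, and the sample message is made up for illustration.

```python
def is_phone_number(text):
    """Compact stand-in for isPhoneNumber: checks the ddd-ddd-dddd layout."""
    if len(text) != 12:
        return False
    for i, ch in enumerate(text):
        if i in (3, 7):
            # Positions 3 and 7 must be hyphens.
            if ch != '-':
                return False
        elif not ch.isdecimal():
            # Every other position must be a digit.
            return False
    return True

message = 'Call me at 415-555-1011 tomorrow. 415-555-9999 is my office.'
found = []
# Slide a 12-character window over the message and test each chunk.
for i in range(len(message) - 11):
    chunk = message[i:i + 12]
    if is_phone_number(chunk):
        found.append(chunk)
print(found)
```

Supporting another format (for example an extension or parentheses around the area code) would mean writing more of this checking code by hand, which is the problem regular expressions address.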
2. Searching with a regular expression
import re
phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
mo = phoneNumRegex.search('My number is 415-555-8888.')
print(mo.group())
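Note that search() returns None when the pattern is not found, so calling group() on the result directly would raise an AttributeError. A minimal sketch of guarding against that follows; the helper name find_phone_number is hypothetical and not part of the original notes.

```python
import re

phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

def find_phone_number(text):
    """Return the first phone number found in text, or None if there is none."""
    mo = phone_num_regex.search(text)
    # search() returns None when nothing matches, so check before calling group().
    return mo.group() if mo is not None else None

print(find_phone_number('My number is 415-555-8888.'))
print(find_phone_number('No number here.'))
```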
3. Matching more patterns
Grouping with parentheses
import re
phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex.search('My number is 415-555-4242.')
print(mo.group(1))
print(mo.group(2))
print(mo.group(0))
print(mo.group(1)+mo.group(2))
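When all the groups are needed at once, the match object's groups() method returns them as a tuple, which can be unpacked into separate variables. A short sketch using the same pattern (the variable names area_code and main_number are chosen here for illustration):

```python
import re

phone_num_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phone_num_regex.search('My number is 415-555-4242.')

# groups() returns every captured group as a tuple, in order.
area_code, main_number = mo.groups()
print(area_code)
print(main_number)
```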
Matching alternatives with the pipe
import re
heroRegex = re.compile(r'Batman|Tina Fey')  # matches either 'Batman' or 'Tina Fey'
mo1 = heroRegex.search('Batman and Tina Fey.')
print(mo1.group())
mo2 = heroRegex.search('Tina Fey and Batman.')
print(mo2.group())
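When both alternatives occur in the text, search() returns whichever appears first, which is why the two calls above print different names. The pipe can also be placed inside parentheses so that the alternatives share a common prefix. A small sketch, with a pattern and sample text chosen only for illustration:

```python
import re

bat_regex = re.compile(r'Bat(man|mobile|copter)')
mo = bat_regex.search('Batmobile lost a wheel')

print(mo.group())   # the full matched text
print(mo.group(1))  # only the alternative that matched inside the parentheses
```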
Optional matching with the question mark
import re
batRegex = re.compile(r'Bat(wo)?man')
mo1 = batRegex.search('The Adventures of Batman')
print(mo1.group())
mo2 = batRegex.search('The Adventures of Batwoman')
print(mo2.group())
The ? means the preceding group is matched zero times or one time.
Matching zero or more with the star
import re
batRegex = re.compile(r'Bat(wo)*man')
mo1 = batRegex.search('The Adventures of Batman')
print(mo1.group())
mo2 = batRegex.search('The Adventures of Batwowowowowowoman')
print(mo2.group())
The * means the preceding group is matched zero or more times.
Matching one or more with the plus
import re
batRegex = re.compile(r'Bat(wo)+man')
mo1 = batRegex.search('The Adventures of Batwoman')
print(mo1.group())
mo2 = batRegex.search('The Adventures of Batwowowowowowoman')
print(mo2.group())
mo3 = batRegex.search('The Adventures of Batman')
print(mo3 is None)
The + means one or more: the group before the plus must appear at least once.
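The three quantifiers can be compared side by side. The sketch below is only an illustration: it uses re.fullmatch(), which succeeds only when the entire string fits the pattern, and the test strings are chosen here to have zero, one, and two repetitions of "wo".

```python
import re

patterns = {'?': r'Bat(wo)?man', '*': r'Bat(wo)*man', '+': r'Bat(wo)+man'}
texts = ['Batman', 'Batwoman', 'Batwowoman']

results = {}
for symbol, pattern in patterns.items():
    # Record, for each text, whether the whole string fits the pattern.
    results[symbol] = [re.fullmatch(pattern, t) is not None for t in texts]
    print(symbol, results[symbol])
```

Only * accepts all three strings; ? rejects the repeated "wowo", and + rejects the plain "Batman".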
Matching a specific number of repetitions with curly braces
import re
haRegex = re.compile(r'(Ha){3}')
mo1 = haRegex.search('HaHaHa')
print(mo1.group())
mo2 = haRegex.search('HaHa')
print(mo2 is None)
A number in curly braces matches the preceding group exactly that many times.
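Put differently, (Ha){3} behaves like writing the group out three times. The sketch below checks this on a few sample strings; the variable names and the list of test strings are assumptions made for illustration.

```python
import re

counted = re.compile(r'(Ha){3}')
spelled_out = re.compile(r'(Ha)(Ha)(Ha)')

results = []
for text in ['HaHaHa', 'HaHa', 'HaHaHaHa']:
    mo1 = counted.search(text)
    mo2 = spelled_out.search(text)
    # Both patterns should find the same text, or both should find nothing.
    found1 = mo1.group() if mo1 else None
    found2 = mo2.group() if mo2 else None
    results.append((text, found1, found2))
    print(text, found1, found2)
```

Note that search() still finds three repetitions inside the four-repetition string, because it looks for the pattern anywhere in the text.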
4. Greedy matching
import re
greedyHaRegex = re.compile(r'(Ha){3,5}')
mo1 = greedyHaRegex.search('HaHaHaHaHa')
print(mo1.group())
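The pattern (Ha){3,5} accepts three, four, or five repetitions, and by default Python's regular expressions are greedy: they take the longest match available. Adding a ? after the closing brace makes the quantifier non-greedy, so it takes the shortest match instead. A brief sketch of the contrast (the name nongreedy_regex is chosen here for illustration):

```python
import re

greedy_regex = re.compile(r'(Ha){3,5}')
nongreedy_regex = re.compile(r'(Ha){3,5}?')

text = 'HaHaHaHaHa'
greedy_match = greedy_regex.search(text).group()
nongreedy_match = nongreedy_regex.search(text).group()

print(greedy_match)     # longest allowed match: five repetitions
print(nongreedy_match)  # shortest allowed match: three repetitions
```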
5. The findall() method
import re
phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
mo = phoneNumRegex.findall('Cell:415-555-9999 work:212-555-0000')
print(mo)
phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)')
mo = phoneNumRegex.findall('Cell:415-555-9999 work:212-555-0000')
print(mo)
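Unlike search(), which returns a match object for the first match only, findall() returns every match in the text. The shape of the result depends on the pattern: without groups it is a list of strings, and with groups it is a list of tuples, one item per group. A short sketch of both cases, plus one possible way to rebuild the full numbers from the tuples (the name rebuilt is an illustrative choice):

```python
import re

text = 'Cell:415-555-9999 work:212-555-0000'

# No groups in the pattern: a list of matched strings.
no_groups = re.findall(r'\d\d\d-\d\d\d-\d\d\d\d', text)
# Groups in the pattern: a list of tuples, one string per group.
with_groups = re.findall(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)', text)

print(no_groups)
print(with_groups)

# Join each tuple back into a full phone number.
rebuilt = ['-'.join(parts) for parts in with_groups]
print(rebuilt)
```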