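Earlier we saw that a group in a regular expression can be referenced by its group number. That looks simple, but it is hard to maintain: if you later need to insert a new group into the expression, every group number after it shifts, and you have to update each reference one by one. Is there a more maintainable way? Yes: refer to the group by name instead. The following example shows how: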
#python 3.6
#蔡军生
#http://blog.csdn.net/caimouse/article/details/51749579
#
import re

# Raw string so that \w, \s and \. reach the regex engine unchanged.
address = re.compile(
    r'''
    # The regular name
    (?P<first_name>\w+)
    \s+
    (([\w.]+)\s+)?      # optional middle name or initial
    (?P<last_name>\w+)
    \s+
    <
    # The address: first_name.last_name@domain.tld
    (?P<email>
      (?P=first_name)   # must repeat the text captured as first_name
      \.
      (?P=last_name)    # must repeat the text captured as last_name
      @
      ([\w\d.]+\.)+     # domain name prefix
      (com|org|edu)     # limit the allowed top-level domains
    )
    >
    ''',
    re.VERBOSE | re.IGNORECASE)

candidates = [
    u'cai junsheng <cai.junsheng@example.com>',
    u'Different Name <cai.junsheng@example.com>',
    u'Cai Middle junsheng <cai.junsheng@example.com>',
    u'Cai M. junsheng <cai.junsheng@example.com>',
]

for candidate in candidates:
    print('Candidate:', candidate)
    match = address.search(candidate)
    if match:
        print(' Match name :', match.groupdict()['first_name'],
              end=' ')
        print(match.groupdict()['last_name'])
        print(' Match email:', match.groupdict()['email'])
    else:
        print(' No match')
The output is as follows:
Candidate: cai junsheng <cai.junsheng@example.com>
 Match name : cai junsheng
 Match email: cai.junsheng@example.com
Candidate: Different Name <cai.junsheng@example.com>
 No match
Candidate: Cai Middle junsheng <cai.junsheng@example.com>
 Match name : Cai junsheng
 Match email: cai.junsheng@example.com
Candidate: Cai M. junsheng <cai.junsheng@example.com>
 Match name : Cai junsheng
 Match email: cai.junsheng@example.com
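In this example, the backreferences are written as (?P=first_name) and (?P=last_name). Each one matches the same text that the group with that name captured earlier, so the email address must repeat the first and last name. Because re.IGNORECASE is set, "Cai" in the name still matches "cai" in the address.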
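The maintenance problem described at the start can be seen in a much smaller case. The sketch below is not from the original example; the patterns and variable names (`numbered`, `numbered_shifted`, `named`, `named_shifted`) are made up for illustration. It compares a numbered backreference with a named one after an extra group is inserted in front.

```python
import re

# Numbered backreference: \1 refers to the first capturing group.
numbered = re.compile(r'(\w+) \1')

# A new group is inserted in front. \1 now points at the digits group,
# so the pattern silently changes meaning unless \1 is edited to \2.
numbered_shifted = re.compile(r'(\d+) (\w+) \1')

# Named backreference: (?P=word) always refers to the group named "word",
# no matter how many groups are inserted before it.
named = re.compile(r'(?P<word>\w+) (?P=word)')
named_shifted = re.compile(r'(\d+) (?P<word>\w+) (?P=word)')

print(bool(numbered.search('hello hello')))             # True
print(bool(numbered_shifted.search('42 hello hello')))  # False
print(bool(named.search('hello hello')))                # True
print(bool(named_shifted.search('42 hello hello')))     # True
```

With the numbered form, the reference has to be renumbered by hand after the insertion. With the named form, the reference needs no change.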