Python re.compile and named groups: (?P<name>...)

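Regular expressions can also match like this...
The geeksquiz site (https://www.geeksforgeeks.org/functions-python-gq/) offers coding quizzes that are handy for testing how well you know a language. While working through the Python questions today I made an interesting discovery: a regular expression can also be written like this >>>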

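import re  

sentence = 'cats are fast'  
# Raw string, so that \w reaches the regex engine unchanged  
regex = re.compile(r'(?P<animal>\w+) (?P<verb>\w+) (?P<adjective>\w+)')  
matched = regex.search(sentence)  
print(matched.groupdict())  
# output (Python 3.7+; key order may differ on older versions):  
# {'animal': 'cats', 'verb': 'are', 'adjective': 'fast'}  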

The Python documentation explains it as follows:

The syntax for a named group is one of the Python-specific extensions: (?P<name>...). name is, obviously, the name of the group. Named groups also behave exactly like capturing groups, and additionally associate a name with a group. The match object methods that deal with capturing groups all accept either integers that refer to the group by number or strings that contain the desired group’s name. Named groups are still given numbers, so you can retrieve information about a group in two ways:

>>> p = re.compile(r'(?P<word>\b\w+\b)')  
>>> m = p.search( '(((( Lots of punctuation )))' )  
>>> m.group('word')  
'Lots'  
>>> m.group(1)  
'Lots'  
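
As a small sketch of the point above (match object methods accept either a number or a name), the following uses a made-up key=value pattern; the sample text and the group names key and value are assumptions chosen only for illustration:

```python
import re

# Hypothetical sample pattern and text, not taken from the documentation.
p = re.compile(r'(?P<key>\w+)=(?P<value>\d+)')
m = p.search('timeout=30 retries=5')

# The same group can be fetched by name or by number.
by_name = (m.group('key'), m.group('value'))
by_number = (m.group(1), m.group(2))
print(by_name, by_number)            # ('timeout', '30') ('timeout', '30')

# Position methods such as span(), start() and end() accept names too.
value_span = m.span('value')
print(value_span)                    # (8, 10)

# groupindex maps each group name to its number.
value_number = p.groupindex['value']
print(value_number)                  # 2
```

Names mostly help when a pattern grows: inserting a new group shifts the numbers of the groups after it, while the names stay the same.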

The syntax for backreferences in an expression such as (...)\1 refers to the number of the group. There’s naturally a variant that uses the group name instead of the number. This is another Python extension: (?P=name) indicates that the contents of the group called name should again be matched at the current point. The regular expression for finding doubled words, (\b\w+)\s+\1 can also be written as (?P<word>\b\w+)\s+(?P=word):

>>> p = re.compile(r'(?P<word>\b\w+)\s+(?P=word)')  
>>> p.search('Paris in the the spring').group()  
'the the'  
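
To make the equivalence concrete, here is a minimal sketch that runs the numbered and the named form on the same sentence. The last step goes slightly beyond the quoted text: it assumes you also want to collapse the doubled word, which uses the \g<name> syntax that re.sub accepts in replacement strings.

```python
import re

text = 'Paris in the the spring'

# Numbered and named backreferences describe the same pattern.
numbered = re.compile(r'(\b\w+)\s+\1')
named = re.compile(r'(?P<word>\b\w+)\s+(?P=word)')

numbered_hit = numbered.search(text).group()
named_hit = named.search(text).group()
print(numbered_hit, '|', named_hit)   # the the | the the

# Inside a replacement string, a named group is referenced as \g<name>.
collapsed = named.sub(r'\g<word>', text)
print(collapsed)                      # Paris in the spring
```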

A summary from the regular expression HOWTO: https://docs.python.org/2/howto/regex.html
Commonly used:

^ Matches the beginning of a line

$ Matches the end of the line

. Matches any character (except a newline, by default)

\s Matches whitespaces

\S Matches any non-whitespace character

* Repeats a character 0 or more times

*? Repeats a character 0 or more times(non-greedy)

+ Repeats a character one or more times

+? Repeats a character one or more times(non-greedy)

( Indicates where string extraction is to start

) Indicates where string extraction is to end

\d Matches any decimal digit

\D Matches any non-digit character

\w Matches any alphanumeric character (letters, digits, and the underscore)

\W Matches any non-alphanumeric character (anything \w does not match)
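
The difference between the greedy and non-greedy repeats in the list above is easiest to see side by side. The sketch below uses invented sample strings (an HTML-like snippet and an order number), so treat the inputs as assumptions, not as part of the HOWTO:

```python
import re

html = '<b>bold</b> and <i>italic</i>'

# Greedy: .* grabs as much as possible, up to the last '>'.
greedy = re.search(r'<.*>', html).group()
# Non-greedy: .*? stops at the first '>' that lets the match succeed.
lazy = re.search(r'<.*?>', html).group()
print(greedy)   # <b>bold</b> and <i>italic</i>
print(lazy)     # <b>

# The same contrast with + and +? on digits.
all_digits = re.search(r'\d+', 'order 12345').group()
one_digit = re.search(r'\d+?', 'order 12345').group()
print(all_digits, one_digit)   # 12345 1

# ^, $, \S, \s, \w and parentheses combined: extract two fields.
fields = re.search(r'^(\S+)\s+(\w+)$', 'cats_99   fast').groups()
print(fields)   # ('cats_99', 'fast')
```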
