Python regular expressions: positive and negative lookahead and lookbehind, with differences and examples

小龙在山东

Article link: https://blog.csdn.net/lilongsy/article/details/78505309

Differences between the zero-width assertions

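| Assertion | Meaning | Syntax | Example |
| --- | --- | --- | --- |
| Positive lookahead (zero-width positive lookahead assertion) | Matches a position that is followed by exp | `(?=exp)` | Searching `I'm singing while you're dancing.` with `\b\w+(?=ing\b)` matches `sing` and `danc` |
| Negative lookahead (zero-width negative lookahead assertion) | Matches a position that is not followed by exp | `(?!exp)` | `\d{3}(?!\d)` matches three digits that are not followed by another digit; `\b((?!abc)\w)+\b` matches words that do not contain the substring abc |
| Positive lookbehind (zero-width positive lookbehind assertion) | Matches a position that is preceded by exp | `(?<=exp)` | Searching `reading a book` with `(?<=\bre)\w+\b` yields `ading`. Searching `1234567890` with `((?<=\d)\d{3})+\b` yields `234567890`. `(?<=<(\w+)>).*(?=<\/\1>)` matches the content inside a simple HTML tag that has no attributes, but only in engines that allow variable-width lookbehind (such as .NET); Python's `re` module rejects it because a lookbehind must have a fixed width |
| Negative lookbehind (zero-width negative lookbehind assertion) | Matches a position that is not preceded by exp | `(?<!exp)` | `(?<![a-z])\d{7}` matches seven digits that are not preceded by a lowercase letter |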
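
The table's examples can be checked directly in Python. The following is a minimal sketch using the standard `re` module; the sample strings for the `\d{3}(?!\d)` and `(?<![a-z])\d{7}` patterns are made up for illustration and do not come from the original table.

```python
import re

# Positive lookahead: word stems that are followed by "ing".
stems = re.findall(r"\b\w+(?=ing\b)", "I'm singing while you're dancing.")

# Negative lookahead: three digits not followed by another digit.
# (The input string is an illustrative assumption.)
last_three = re.findall(r"\d{3}(?!\d)", "1234 567")

# Positive lookbehind: the rest of a word that starts with "re".
rest = re.findall(r"(?<=\bre)\w+\b", "reading a book")

# Positive lookbehind inside a repeated group: every group of three
# digits must be preceded by a digit, so the leading "1" is left out.
grouped = re.search(r"((?<=\d)\d{3})+\b", "1234567890").group(0)

# Negative lookbehind: seven digits not preceded by a lowercase letter.
# (The input string is an illustrative assumption.)
seven = re.findall(r"(?<![a-z])\d{7}", "a1234567 B7654321")

print(stems)       # ['sing', 'danc']
print(last_three)  # ['234', '567']
print(rest)        # ['ading']
print(grouped)     # 234567890
print(seven)       # ['7654321']
```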

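These assertions match only a position and do not consume any characters.
A `<` turns the assertion into a lookbehind: it tests the text before the current position, so it is normally written in front of the expression to be matched. Without `<` the assertion is a lookahead and is normally written after the expression.
`!` means negation: the given pattern must not be present.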
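
To make "matches a position without consuming characters" concrete, here is a small sketch. The input strings are illustrative assumptions; only standard `re` functions are used.

```python
import re

# A bare lookahead matches an empty string at a position.
m = re.search(r"(?=ing)", "sing")
span = m.span()      # start == end, nothing was consumed
empty = m.group(0)

# Because the lookahead does not consume the following digit,
# that digit is still available to start the next match.
overlapping = re.findall(r"\d(?=\d)", "123")
consuming = re.findall(r"\d\d", "123")

# Combining a lookbehind and a lookahead selects pure positions,
# here the places where a thousands separator belongs.
with_commas = re.sub(r"(?<=\d)(?=(\d{3})+$)", ",", "1234567")

print(span, repr(empty))  # (1, 1) ''
print(overlapping)        # ['1', '2']
print(consuming)          # ['12']
print(with_commas)        # 1,234,567
```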

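Positive lookahead example

import re

# The lookahead accepts the rest of the line only when the address is
# wrapped in a matching pair of angle brackets or has no brackets at all.
address = re.compile(
    r'''
    # A name: words that may contain "." and ",", separated by whitespace
    ((?P<name>([\w.,]+\s+)*[\w.,]+)\s+)

    # Look ahead without consuming anything
    (?= (<.*>$)          # remainder is wrapped in angle brackets
      | ([^<].*[^>]$)    # remainder has no bracket at either end
    )

    <?    # optional opening angle bracket

    # The address itself: username@domain.tld
    (?P<email>[\w\d.+-]+@([\w\d.]+\.)+(com|org|edu))

    >?    # optional closing angle bracket
    ''',
    re.VERBOSE)

candidates = [
    'First Last <first.last@example.com>',
    'No Brackets first.last@example.com',
    'Open Bracket <first.last@example.com',
    'Close Bracket first.last@example.com>',
]

for candidate in candidates:
    print('Candidate:', candidate)
    match = address.search(candidate)
    if match:
        print('  Name :', match.group('name'))
        print('  Email:', match.group('email'))
    else:
        print('  No match')

Output: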

Candidate: First Last <first.last@example.com>
  Name : First Last
  Email: first.last@example.com
Candidate: No Brackets first.last@example.com
  Name : No Brackets
  Email: first.last@example.com
Candidate: Open Bracket <first.last@example.com
  No match
Candidate: Close Bracket first.last@example.com>
  No match
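
To see what the lookahead contributes, the sketch below compares the pattern with and without it. The constants `NAME`, `LOOKAHEAD` and `EMAIL` and the two compiled names are hypothetical helpers introduced only for this comparison; the sub-patterns themselves are copied from the example above.

```python
import re

# Building blocks copied from the example pattern (names are hypothetical).
NAME = r"(?P<name>([\w.,]+\s+)*[\w.,]+)\s+"
LOOKAHEAD = r"(?=(<.*>$)|([^<].*[^>]$))"
EMAIL = r"<?(?P<email>[\w\d.+-]+@([\w\d.]+\.)+(com|org|edu))>?"

with_check = re.compile(NAME + LOOKAHEAD + EMAIL)
without_check = re.compile(NAME + EMAIL)

candidate = "Open Bracket <first.last@example.com"

# With the lookahead, the unbalanced bracket is rejected.
checked = with_check.search(candidate)

# Without it, the optional "<" and ">" are matched independently,
# so the unbalanced input is accepted.
unchecked = without_check.search(candidate)

print(checked)                   # None
print(unchecked.group("email"))  # first.last@example.com
```

The optional `<?` and `>?` cannot express "both or neither" on their own; the lookahead adds that constraint without consuming any of the input.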

Negative lookahead example

import re

address = re.compile(
    r'''
    ^

    # An address: username@domain.tld

    # Ignore noreply addresses
    (?!noreply@.*$)

    [\w\d.+-]+       # username
    @
    ([\w\d.]+\.)+    # domain name prefix
    (com|org|edu)    # limit the allowed top-level domains

    $
    ''',
    re.VERBOSE)

candidates = [
    'first.last@example.com',
    'noreply@example.com',
]

for candidate in candidates:
    print('Candidate:', candidate)
    match = address.search(candidate)
    if match:
        print('  Match:', candidate[match.start():match.end()])
    else:
        print('  No match')

Output:

Candidate: first.last@example.com
  Match: first.last@example.com
Candidate: noreply@example.com
  No match
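
The `^` anchor in the pattern above matters. A negative lookahead only rejects the position where it is tested, and `search()` is free to try later positions. The following sketch uses a simplified form of the pattern (an assumption made for brevity) to show the effect:

```python
import re

# Simplified variants of the address pattern, with and without "^".
anchored = re.compile(r"^(?!noreply@)[\w.+-]+@([\w.]+\.)+(com|org|edu)$")
unanchored = re.compile(r"(?!noreply@)[\w.+-]+@([\w.]+\.)+(com|org|edu)$")

text = "noreply@example.com"

anchored_match = anchored.search(text)

# Position 0 is rejected by the lookahead, but position 1 is not,
# so the search simply starts one character later.
unanchored_match = unanchored.search(text)

print(anchored_match)             # None
print(unanchored_match.group(0))  # oreply@example.com
```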

Negative lookbehind example

import re

# The lookbehind is tested right before "@": the match fails when the
# username ends in "noreply".
pattern = r'^[\w\d\.+-]+(?<!noreply)@([\w\d.]+\.)+(com|org|edu)$'
ls = ['first.last@example.com', 'noreply@example.com']

for txt in ls:
    print('Candidate:', txt)
    match = re.search(pattern, txt)
    if match:
        print('    Match:', match.group(0))
    else:
        print('    No match')

Output:

Candidate: first.last@example.com
    Match: first.last@example.com
Candidate: noreply@example.com
    No match
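
The lookahead and lookbehind versions agree on these two candidates, but they are not equivalent. The sketch below uses simplified forms of both patterns and a made-up third address, `xnoreply@example.com`, to show where they differ:

```python
import re

# Simplified forms of the two patterns from the examples above.
lookahead = re.compile(r"^(?!noreply@)[\w.+-]+@([\w.]+\.)+(com|org|edu)$")
lookbehind = re.compile(r"^[\w.+-]+(?<!noreply)@([\w.]+\.)+(com|org|edu)$")

candidates = [
    "first.last@example.com",
    "noreply@example.com",
    "xnoreply@example.com",   # hypothetical address for illustration
]

results = {
    c: (bool(lookahead.search(c)), bool(lookbehind.search(c)))
    for c in candidates
}

for candidate, (ahead, behind) in results.items():
    print(candidate, ahead, behind)
# first.last@example.com True True
# noreply@example.com False False
# xnoreply@example.com True False
```

The lookahead checks whether the whole address starts with `noreply@`, while the lookbehind checks whether the username ends in `noreply`, so only the lookbehind version rejects `xnoreply@example.com`.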

Positive lookbehind example

import re

# Match a handle only when it is preceded by "@"; the "@" itself is
# not part of the match.
pattern = re.compile(r'(?<=@)([\w\d_]+)')
text = '''This text includes two Twitter handles.
One for @caimouse, and one for the author, @caijunsheng.
'''

print(text)
for match in pattern.findall(text):
    print(match)

Output:

This text includes two Twitter handles.
One for @caimouse, and one for the author, @caijunsheng.

caimouse
caijunsheng
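
One Python-specific limitation is worth checking: the `re` module only accepts lookbehind patterns of fixed width. The HTML pattern from the table, `(?<=<(\w+)>).*(?=<\/\1>)`, therefore does not compile in Python. The sketch below shows the error and one possible workaround that uses ordinary capturing groups instead; the sample HTML string is an assumption.

```python
import re

# A variable-width lookbehind is rejected at compile time.
rejected = False
try:
    re.compile(r"(?<=<(\w+)>).*(?=</\1>)")
except re.error as exc:
    rejected = True
    print("rejected:", exc)

# Workaround: consume the tags and capture the content in a group.
m = re.search(r"<(\w+)>(.*)</\1>", "<b>bold text</b>")
content = m.group(2)
print(content)  # bold text
```

The workaround does consume the tags, so `m.group(0)` includes them; only group 2 holds the inner content.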

References

http://deerchao.net/tutorials/regex/regex.htm
http://blog.csdn.net/caimouse/article/details/78481084
