Python Regular Expressions 1.3: Regular Expressions and Python (Part 2)

中文过六级再取名

Category: Python Core Programming

Article link: https://blog.csdn.net/w666667/article/details/103821220

Python Regular Expressions 1.3

Four re module functions:

1. Finding every occurrence with findall() and finditer()

findall()

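findall() looks up all non-overlapping occurrences of a regular expression pattern in a string. It resembles search() in that it scans the string, but unlike match() and search(), findall() always returns a list. If no match is found, the list is empty; otherwise it contains every successful match, in left-to-right order.
For example: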

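import re

def re_find():
    # Each call returns a list of every non-overlapping match of 'car'.
    print(re.findall('car', 'car'))
    print(re.findall('car', 'scary'))
    print(re.findall('car', 'carry the barcardi to the car'))

if __name__ == '__main__':
    re_find()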

The output is:

['car']
['car']
['car', 'car', 'car']

2.finditer()

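finditer() is a variant of findall() that uses less memory. Apart from returning an iterator instead of a list, the main difference from findall() and the other variants is that finditer() iterates over match objects rather than returning the matched strings.

Note: any extra work you do with finditer() (for example, calling group() on each match object) is only needed to make its output look like the output of findall().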
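The difference between the two functions can be sketched as follows. The sample sentence and the variable names are chosen here for illustration only; the point is that each item produced by finditer() is a match object, so the position of a match is available in addition to its text.

```python
import re

text = 'carry the barcardi to the car'

# findall() builds the complete list of matched strings in one go.
all_matches = re.findall('car', text)

# finditer() yields match objects one at a time; each exposes
# group() for the matched text and start() for its position.
positions = [(m.group(), m.start()) for m in re.finditer('car', text)]

print(all_matches)
print(positions)
```

When only the matched strings are needed, findall() is the shorter choice; finditer() is worth the extra step when positions are needed or the input is large.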

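Searching and replacing with sub() and subn()

sub() and subn() are almost identical: both replace every part of a string that matches a regular expression. The replacement is usually a string, but it can also be a function that returns the replacement string. subn() additionally returns the total number of substitutions made, packed together with the new string in a tuple.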

>>> re.sub('X','Mr.Smith','attn:X\n\nDear X,\n')
'attn:Mr.Smith\n\nDear Mr.Smith,\n'
>>> re.subn('X','Mr.Smith','attn:X\n\nDear X,\n')
('attn:Mr.Smith\n\nDear Mr.Smith,\n', 2)
>>> print(re.sub('X','Mr.Smith','attn:X\n\nDear X,\n'))
attn:Mr.Smith

Dear Mr.Smith,

>>> re.sub('[ae]','X','abcdef')
'XbcdXf'
>>> re.subn('[ae]','X','abcdef')
('XbcdXf', 2)
>>>
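The text above mentions that the replacement can be a function. A minimal sketch of that form, assuming a made-up helper named double (not from the original article): the function receives each match object and returns the string to substitute.

```python
import re

def double(match):
    # Hypothetical helper: takes a match object and returns
    # the replacement text, here the matched number doubled.
    return str(int(match.group()) * 2)

# subn() returns the new string together with the substitution count.
result, count = re.subn(r'\d+', double, 'a1 b22 c333')
print(result, count)
```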

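As mentioned earlier, besides using the group() method to extract a matched group by number, you can refer to a group in the replacement string with \N, where N is the group number. The code below simply converts the American date format MM/DD/YY{,YY} to DD/MM/YY{,YY}, the format commonly used in other countries.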

>>> import re
>>> re.sub(r'(\d{1,2})/(\d{1,2})/(\d{2}|\d{4})', r'\2/\1/\3', '2/20/91')
'20/2/91'
>>> re.sub(r'(\d{1,2})/(\d{1,2})/(\d{2}|\d{4})', r'\2/\1/\3', '2/20/1991')
'20/2/1991'
>>>
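The same conversion can also be written with named groups, which some readers find easier to follow than numbered backreferences. This is only a sketch: the names US_DATE and to_day_first are invented for illustration, and the year alternatives are listed as \d{4}|\d{2} so that a four-digit year is captured whole.

```python
import re

# (?P<name>...) defines a named group; \g<name> refers to it
# in the replacement string.
US_DATE = re.compile(
    r'(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4}|\d{2})'
)

def to_day_first(text):
    # Swap month and day, keep the year unchanged.
    return US_DATE.sub(r'\g<day>/\g<month>/\g<year>', text)

short_year = to_day_first('2/20/91')
long_year = to_day_first('Due 2/20/1991.')
print(short_year)
print(long_year)
```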

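Splitting strings with split() on a delimiting pattern

Suppose you are writing a simple parser for a web site (something like Google or Yahoo! Maps). How would you do it? Users may enter a city and state, a city plus ZIP code, or all three. That calls for something more powerful than plain string splitting, as shown below: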

import re

DATA = (
    'Mountain View, CA 94040',
    'Sunnyvale, CA',
    'Los Altos, 94023',
    'Cupertino 95014',
    'Palo Alto CA',
)

def demo_1():
    for datum in DATA:
        # Split on ", " or on a single space that is followed by
        # a 5-digit ZIP code or a 2-letter uppercase state code.
        print(re.split(r', |(?= (?:\d{5}|[A-Z]{2})) ', datum))

if __name__ == '__main__':
    demo_1()

The output is:

['Mountain View', 'CA', '94040']
['Sunnyvale', 'CA']
['Los Altos', '94023']
['Cupertino', '95014']
['Palo Alto', 'CA']
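To see why a lookahead is useful in this kind of pattern, here is a small comparison written for illustration (the variable names are not from the original article). Splitting on every space breaks a two-word city name apart, while a split that only accepts a space followed by a ZIP code or state abbreviation keeps it together.

```python
import re

address = 'Mountain View, CA 94040'

# Naive: split on every space (with an optional preceding comma).
naive = re.split(r',? ', address)

# Guarded: split on ", " or on a space that is followed by
# five digits or two uppercase letters (checked with a lookahead).
guarded = re.split(r', |(?= (?:\d{5}|[A-Z]{2})) ', address)

print(naive)
print(guarded)
```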

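Extension notations

Python regular expressions support a large number of extension notations. Let's look at some of them and then walk through a few useful examples.

With the (?iLmsux) family of options, you can specify one or more flags directly inside the regular expression instead of passing them to compile() or another re module function.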
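As a quick sketch of what that equivalence looks like in practice, the three calls below (written for illustration, with an arbitrary sample string) should all produce the same result: one uses the inline (?i) flag, one passes re.I to findall(), and one passes re.IGNORECASE to compile().

```python
import re

sample = 'yes? Yes. YES!!'

# Flag embedded in the pattern itself.
inline = re.findall(r'(?i)yes', sample)

# Same flag passed as an argument to the module-level function.
with_flag = re.findall(r'yes', sample, re.I)

# Same flag passed when compiling the pattern.
compiled = re.compile(r'yes', re.IGNORECASE).findall(sample)

print(inline, with_flag, compiled)
```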

Here are some examples that use re.I/IGNORECASE (the last one also turns on re.M/MULTILINE):

import re

def findall_1():
    demo1=re.findall(r'(?i)yes','yes? yes. YES!!')
    print(demo1)
    
    print(re.findall(r'(?i)th\w+','The quickest way is through this tunnel.'))
    
    demo2=re.findall(r'(?im)(^th[\w ]+)', """This line is the first,another line,it's the best""")
    print(demo2)

if __name__ == '__main__':
    findall_1()

The output is:

['yes', 'yes', 'YES']
['The', 'through', 'this']
['This line is the first']
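The effect of the m flag only shows up when the text spans several lines. The following sketch uses a made-up three-line sample: without (?m), ^ matches only at the very start of the string; with it, ^ also matches at the start of every line.

```python
import re

text = """This line is the first,
another line,
that line, it's the best"""

# Case-insensitive only: ^ anchors at the start of the whole string.
without_m = re.findall(r'(?i)^th[\w ]+', text)

# Case-insensitive and multiline: ^ anchors at the start of each line.
with_m = re.findall(r'(?im)^th[\w ]+', text)

print(without_m)
print(with_m)
```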

The next set demonstrates re.S/DOTALL. This flag makes the dot (.) match \n as well (normally it matches any character except \n):

import re

def findall_2():
    print(re.findall(r'th.+', '''
    The first line
    the second line
    the third line
    '''))

def findall_3():
    print(re.findall(r'(?s)th.+', '''
    The first line
    the second line
    the third line
    '''))

if __name__ == '__main__':
    findall_2()
    findall_3()
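
The difference is easier to see with the results side by side. This is a compact sketch using an unindented version of the same three lines (an assumption made here to keep the output short): without the flag each match stops at the end of its line, with (?s) a single match runs through the newlines to the end of the string.

```python
import re

text = 'The first line\nthe second line\nthe third line\n'

# Default: "." stops at a newline, so each line matches separately.
per_line = re.findall(r'th.+', text)

# DOTALL: "." also matches "\n", so the first match swallows the rest.
spanning = re.findall(r'(?s)th.+', text)

print(per_line)
print(spanning)
```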
