Regular expression summary (the search method returns the first match found)

欧阳通红

Article link: https://blog.csdn.net/kuba_kuki/article/details/104250582

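Steps for using a regular expression:
1. Import the re module;
2. Create a Regex object with re.compile();
3. Call the Regex object's search() method, which returns a Match object (or None if nothing matches);
4. Call the Match object's group() method to get the actual matched text.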
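
The four steps above can be sketched as a short script. The sample sentence and the variable names here are illustrative assumptions, not taken from the post.

```python
import re  # step 1: import the module

# Step 2: compile the pattern into a regex object.
phone_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

# Step 3: search() returns a Match object, or None if nothing matches.
mo = phone_regex.search('Call 415-555-4242 tomorrow.')

# Step 4: group() returns the text that actually matched.
matched_text = mo.group() if mo is not None else None
print(matched_text)
```

Checking for None before calling group() keeps the script from raising an AttributeError when the text contains no match.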

1. Grouping with parentheses

>>> import re
>>> phoneNumRegex=re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')# the two pairs of parentheses define two groups; a pattern captures as many substrings as it has () pairs
>>> mo=phoneNumRegex.search('My number is 415-555-4242.')
>>> mo.group(1)
'415'
>>> mo.group(2)
'555-4242'
>>> mo.group()
'415-555-4242'
>>> mo.group(0)
'415-555-4242'
>>> mo.groups()# note the plural: groups() returns all captured groups as a tuple, unlike group() above
('415', '555-4242')
>>> areaCode,mainNumber=mo.groups()# multiple assignment: each value goes to its own variable
>>> print(areaCode)
415
>>> print(mainNumber)
555-4242
>>> phoneNumRegex=re.compile(r'(\(\d\d\d\))(\d\d\d-\d\d\d\d)')# escaping with \: "\(" matches a literal "(" and "\)" matches a literal ")"
>>> mo=phoneNumRegex.search('My phone number is (415)555-4242.')
>>> mo.group(1)
'(415)'
>>> mo.groups()
('(415)', '555-4242')
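
To make the difference between grouping parentheses and escaped parentheses concrete, here is a small sketch that runs both kinds of pattern against the same text. The variable names are illustrative assumptions.

```python
import re

text = 'My phone number is (415)555-4242.'

# Unescaped parentheses only create groups; they match no literal character,
# so this pattern needs six digits in a row followed by a hyphen.
grouping = re.compile(r'(\d\d\d)(\d\d\d-\d\d\d\d)')

# Escaped parentheses \( and \) match the literal "(" and ")" characters.
literal = re.compile(r'(\(\d\d\d\))(\d\d\d-\d\d\d\d)')

no_match = grouping.search(text)  # the ")" in the text breaks the digit run
match = literal.search(text)

print(no_match)
print(match.groups())
```

The first pattern finds nothing because the text has a ")" between the two digit runs; the second accounts for it explicitly.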

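Many thanks to the blog post that cleared up a point of confusion for me; it also explains escape characters in regular expressions very clearly.
An aside: the variable that holds the result of re.compile() does not need "Regex" in its name; the code works the same without it: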

>>> import re
>>> phoneNum=re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
>>> mo=phoneNum.search('My number is 415-555-4242.')
>>> mo.group(1)
'415'
>>>

2. Matching multiple alternatives with the pipe (the "|" character is called a "pipe")

>>> heroRegex=re.compile(r'Batman|Tina Fey')# matches either one of the two
>>> mo1=heroRegex.search('Batman and Tina Fey.')# if both occur in the searched string, the one that occurs first is returned
>>> mo1.group()
'Batman'
>>> mo2=heroRegex.search('Tina Fey and Batman')# the first occurrence is returned
>>> mo2.group()
'Tina Fey'
>>> mo3=heroRegex.search('oTina Feys and fff')# a match is found anywhere in the string, even inside longer words
>>> mo3.group()
'Tina Fey'
>>> mo4=heroRegex.search('Batman and tins ang Batman')
>>> mo4.group()
'Batman'
>>>
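
search() scans the text from left to right, so the alternative that appears earliest in the text is returned, whatever order the alternatives have in the pattern. A minimal sketch, with illustrative variable names:

```python
import re

text = 'Tina Fey and Batman'

# The same two alternatives, written in opposite orders.
hero_regex = re.compile(r'Batman|Tina Fey')
reversed_regex = re.compile(r'Tina Fey|Batman')

first = hero_regex.search(text).group()
second = reversed_regex.search(text).group()

print(first)
print(second)
```

Both patterns return the name that comes first in the text, not the one that comes first in the pattern.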

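When several words share a prefix, parentheses let you write the prefix only once.
For example, to match 'Batman', 'Batmobile', 'Batcopter', or 'Batbat', write 'Bat' once and put the alternatives in parentheses: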

>>> barRegex=re.compile(r'Bat(man|mobile|copter|bat)')# one pair of parentheses, so there is one group
>>> mo=barRegex.search('Batmobile lost a wheel')
>>> mo.group()
'Batmobile'
>>> mo.groups()
('mobile',)
>>> mo.group(1)
'mobile'
>>> mo.group(2)# there is only one group, so group(2) raises an error
Traceback (most recent call last):
  File "<pyshell#37>", line 1, in <module>
    mo.group(2)
IndexError: no such group
>>> mo1=barRegex.search('Batmobile lost a wheel and Batbat ou andm Batman')# even with several matches, only the first one is returned
>>> mo1.group()
'Batmobile'
>>>
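
One way to avoid the IndexError shown above is to check how many groups the pattern defines before asking for one. The sketch below uses the compiled pattern's groups attribute for that; the helper which_suffix is a hypothetical name introduced for illustration.

```python
import re

bat_regex = re.compile(r'Bat(man|mobile|copter|bat)')

# Pattern.groups is the number of capturing groups in the pattern,
# which is also the highest index that group() accepts.
group_count = bat_regex.groups

def which_suffix(text):
    """Return the alternative that matched after 'Bat', or None if no match."""
    mo = bat_regex.search(text)
    return mo.group(1) if mo is not None else None

print(group_count)
print(which_suffix('Batcopter landed'))
print(which_suffix('nothing to see'))
```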

3. Optional matching with the question mark:
The ? matches the group that precedes it zero or one time.

>>> batRegex=re.compile(r'Bat(wo)?man')
>>> mo1=batRegex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'
>>> mo1.groups()# "Batwoman" was not found, so the (wo) group took no part in the match; the group still exists, and groups() reports it as None
(None,)
>>> mo2=batRegex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'
>>> mo2.groups()# (wo) was matched, so groups() returns the text captured by the parentheses
('wo',)
>>> mo3=batRegex.search('The Adventures of Batwoman and Batman')
>>> mo3.group()
'Batwoman'
>>> mo3.groups()
('wo',)
>>> mo4=batRegex.search('The Adventures of Batman and Batwoman')
>>> mo4.group()
'Batman'
>>> mo4.groups()
(None,)

## Adapting the phone number example to find numbers with or without an area code

>>> phoneRegex=re.compile(r'(\d\d\d-)?\d\d\d-\d\d\d\d')
>>> mo1=phoneRegex.search('My number is 415-555-4242')
>>> mo1.group()
'415-555-4242'
>>> mo1.groups()
('415-',)
>>> mo2=phoneRegex.search('My number is 555-4242')
>>> mo2.group()
'555-4242'
>>> mo2.groups()
(None,)
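
Because an optional group can come back as None, code that uses it usually has to handle three cases: no match at all, a match without the group, and a match with it. A minimal sketch, assuming a hypothetical helper named split_number:

```python
import re

phone_regex = re.compile(r'(\d\d\d-)?\d\d\d-\d\d\d\d')

def split_number(text):
    """Return (area_code or None, full_number), or None if no number is found."""
    mo = phone_regex.search(text)
    if mo is None:
        return None
    prefix = mo.group(1)  # None when the optional group did not take part
    area_code = prefix.rstrip('-') if prefix is not None else None
    return area_code, mo.group()

print(split_number('My number is 415-555-4242'))
print(split_number('My number is 555-4242'))
print(split_number('no digits here'))
```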

4. Compared with item 3: the asterisk (*) matches zero or more times

>>> batRegex=re.compile(r'Bat(wo)*man')
>>> mo1=batRegex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'
>>> mo2=batRegex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'
>>> mo3=batRegex.search('The Adventures of Batwowowowowowoman')
>>> mo3.group()
'Batwowowowowowoman'
>>> mo3.groups()
('wo',)
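
The groups() result above is ('wo',) even though "wo" was repeated many times, because a repeated group keeps only its last repetition. If the whole run is needed, one option is to wrap the repetition in an outer group. This is a sketch with illustrative variable names:

```python
import re

text = 'Batwowowoman'

# The inner group (wo) is repeated, so it holds only the last "wo".
mo = re.compile(r'Bat(wo)*man').search(text)
last_repetition = mo.group(1)

# An outer group around the repetition captures the whole run;
# group 1 is the outer pair, group 2 is the inner pair.
mo_outer = re.compile(r'Bat((wo)*)man').search(text)
whole_run = mo_outer.group(1)
repetitions = len(whole_run) // len('wo')

print(last_repetition)
print(whole_run)
print(repetitions)
```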

5. Compared with item 4: the plus sign (+) matches one or more times

>>> batRegex=re.compile(r'Bat(wo)+man')
>>> mo1=batRegex.search('The Adventures of Batman')
>>> mo1.group()# the group before + must match one or more times, that is, it must appear at least once
Traceback (most recent call last):
  File "<pyshell#44>", line 1, in <module>
    mo1.group()
AttributeError: 'NoneType' object has no attribute 'group'
>>> mo1==None# check: mo1 is indeed None
True
>>> mo2=batRegex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'
>>> mo3=batRegex.search('The Adventures of Batwowowowowowowowowoman')
>>> mo3.group()
'Batwowowowowowowowowoman'
>>> mo3.groups()# checking groups() out of habit
('wo',)
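
The AttributeError above happens because search() returned None and group() was called on it anyway. A minimal sketch of guarding against that, assuming a hypothetical helper named find_bat; the comparison uses "is None", the usual Python idiom for testing against None:

```python
import re

bat_regex = re.compile(r'Bat(wo)+man')

def find_bat(text):
    """Return the matched text, or None when the pattern does not occur."""
    mo = bat_regex.search(text)
    if mo is None:  # no match: do not call group() on None
        return None
    return mo.group()

print(find_bat('The Adventures of Batman'))
print(find_bat('The Adventures of Batwoman'))
```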

6. Matching a specific number of repetitions with braces (this avoids long "|" alternations and shortens the pattern)

>>> haRegex=re.compile(r'(Ha){3}')
>>> mo1=haRegex.search('HaHaHa')
>>> mo1.group()
'HaHaHa'
>>> mo2=haRegex.search('Ha')
>>> mo2==None
True
>>> mo2==none# on a whim, try lowercase none
Traceback (most recent call last):
  File "<pyshell#59>", line 1, in <module>
    mo2==none
NameError: name 'none' is not defined

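Some additional notes on braces:
a. (Ha){3,} matches 3 or more repetitions;
(Ha){,5} matches 0 to 5 repetitions.
b. Each pair of lines below matches the same numbers of repetitions:
(Ha){3}
(Ha)(Ha)(Ha)

(Ha){3,5}
((Ha)(Ha)(Ha)|(Ha)(Ha)(Ha)(Ha)|(Ha)(Ha)(Ha)(Ha)(Ha))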
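
The claim about the two forms can be checked with a short script. It uses fullmatch(), which does not appear elsewhere in this post and is used here only to require that the entire string matches; the variable names are illustrative assumptions.

```python
import re

short_form = re.compile(r'(Ha){3,5}')
long_form = re.compile(r'((Ha)(Ha)(Ha)|(Ha)(Ha)(Ha)(Ha)|(Ha)(Ha)(Ha)(Ha)(Ha))')

# For 0 to 7 repetitions of "Ha", record whether each form accepts the string.
accepted = {}
for n in range(8):
    text = 'Ha' * n
    accepted[n] = (short_form.fullmatch(text) is not None,
                   long_form.fullmatch(text) is not None)

# Open-ended ranges: {3,} has no upper bound, {,5} has a lower bound of zero.
at_least_three = re.fullmatch(r'(Ha){3,}', 'Ha' * 10) is not None
zero_allowed = re.fullmatch(r'(Ha){,5}', '') is not None

# With search(), the returned text can still differ between the two forms.
greedy_text = short_form.search('HaHaHaHaHa').group()
first_alternative = long_form.search('HaHaHaHaHa').group()

print(accepted)
print(at_least_three, zero_allowed)
print(greedy_text, first_alternative)
```

Both forms accept exactly 3, 4, or 5 repetitions. Under search(), however, the brace form is greedy and takes all five, while the alternation stops at its first alternative that matches, so the group numbering and the matched text are not identical.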

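7. Greedy and non-greedy matching:
Note: the ? has two unrelated meanings in regular expressions: marking a quantifier as non-greedy, and marking a group as optional.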

>>> import re
>>> greedyHaRegex=re.compile(r'(Ha){3,5}')
>>> mo1=greedyHaRegex.search('HaHaHaHaHa')
>>> mo1.group()
'HaHaHaHaHa'
>>> nongreedyHaRegex=re.compile(r'(Ha){3,5}?')# non-greedy: matches the fewest repetitions allowed, here 3
>>> mo2=nongreedyHaRegex.search('HaHaHaHaHa')
>>> mo2.group()
'HaHaHa'
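
The two meanings of ? can be placed side by side in one sketch. It calls the module-level re.search() for brevity instead of compiling first; the variable names are illustrative assumptions.

```python
import re

text = 'HaHaHaHaHa'

# No ? after the braces: greedy, takes as many repetitions as allowed.
greedy = re.search(r'(Ha){3,5}', text).group()

# ? directly after the braces: non-greedy, takes as few as allowed.
non_greedy = re.search(r'(Ha){3,5}?', text).group()

# ? directly after a group: the group is optional (zero or one time).
optional = re.search(r'Bat(wo)?man', 'The Adventures of Batman').group()

print(greedy)
print(non_greedy)
print(optional)
```

The position of the ? decides its meaning: after a quantifier such as {3,5} it switches that quantifier to non-greedy, and after a group it is itself the quantifier.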
