Regular Expression Usage Summary

布玛&

Last modified 2023-05-10 18:31:31

Category: python. Tag: regular expressions

First published 2023-04-21 17:34:56

Article link: https://blog.csdn.net/xiaozhiamy/article/details/130293566

(1)re.findall()

import re

mo=re.compile(r'(\d+) tests from (\d+) test cases ran\.')
m1=re.compile(r'\d+ tests from \d+ test cases ran\.')
temp="phyTest 84 tests from 9 test cases ran.tests"
temp1="phyTest 84 tests from 9 test cases ran.tests abc 66 tests from 8 test cases ran.abc"
# for i in mo.findall(temp):  # calling findall on the compiled pattern also works
for i in re.findall(mo,temp):
    print("i is:",i)

for j in re.findall(m1,temp):
    print("j is:",j)

print("k is:",re.findall(mo,temp1))

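Output:
i is: ('84', '9')
j is: 84 tests from 9 test cases ran.
k is: [('84', '9'), ('66', '8')]
Summary:
1 When the pattern contains a capturing group (parentheses), findall returns only the text captured by the group, not the text matched by the whole pattern. The whole pattern still has to match; only the captured part is returned.
2 When the pattern contains two or more groups, findall returns a list of tuples, one tuple per match, as shown above.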
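The effect of groups on the return value of findall can be checked side by side. The following is a minimal sketch; the sample string and the variable names are chosen for illustration only.

```python
import re

text = "abc 66 tests from 8 test cases ran."

# No group: each item is the text matched by the whole pattern.
no_group = re.findall(r'\d+ tests from \d+ test cases', text)

# One group: each item is the string captured by that group.
one_group = re.findall(r'(\d+) tests from \d+ test cases', text)

# Two groups: each item is a tuple with one entry per group.
two_groups = re.findall(r'(\d+) tests from (\d+) test cases', text)

# Non-capturing groups (?:...) do not capture, so the whole match is returned.
non_capturing = re.findall(r'(?:\d+) tests from (?:\d+) test cases', text)

print(no_group)       # ['66 tests from 8 test cases']
print(one_group)      # ['66']
print(two_groups)     # [('66', '8')]
print(non_capturing)  # ['66 tests from 8 test cases']
```

If parentheses are needed only for grouping, the non-capturing form (?:...) keeps findall returning the whole match.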

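(2) When to use re.compile()?
The re module provides compile(pattern, flags=0), which builds a pattern object from a string containing a regular expression.
A compiled pattern can make repeated matching more efficient. When search, match, or findall is called with a pattern string, Python first converts the string into a pattern object. After one call to compile, the resulting pattern object can be reused without converting again.
Note that the re module also caches recently compiled patterns internally, so calling re.search or re.match repeatedly with the same pattern string does not recompile it on every call; each call only pays for a cache lookup. Precompiling with re.compile therefore helps mostly when a pattern is reused many times, for example inside a loop, and it also gives the pattern a readable name.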
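As a small sketch of the compile-once pattern, the example below compiles a pattern outside a loop and compares the result with passing the pattern string each time. The sample lines and variable names are made up for illustration.

```python
import re

# Compile once, before the loop.
digits = re.compile(r'\d+')

lines = ["84 tests", "no numbers here", "9 test cases"]

# Reuse the compiled pattern object for every line.
compiled_results = [digits.findall(line) for line in lines]

# Passing the pattern string each time gives the same results.
string_results = [re.findall(r'\d+', line) for line in lines]

print(compiled_results)                    # [['84'], [], ['9']]
print(compiled_results == string_results)  # True
```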
(3)re.search()

import re
# 1. A successful match
res = re.search("python", "Life is short. I learn python!")
print(res)
# Output: <re.Match object; span=(23, 29), match='python'>
# (Python 3.6 and earlier print <_sre.SRE_Match object; ...> instead)
print(res.group())  # the matched text
# Output: python

# 2. No match
res = re.search("java", "Life is short. I learn python!")
print(res)
# Output: None

# 3. Case-sensitive vs. case-insensitive matching
res = re.search("Python", "Life is short. I learn python!")
print(res)
# Output: None
res = re.search("Python", "Life is short. I learn python!", re.I)
print(res)
# Output: <re.Match object; span=(23, 29), match='python'>
print(res.group())
# Output: python

# 4. Match position and the original string
res = re.search("python", "Life is short. I learn python!")
print(res.start())  # start index of the match
# Output: 23
print(res.end())  # end index of the match (exclusive)
# Output: 29
print(res.span())  # tuple of (start, end)
# Output: (23, 29)
print(res.string)  # the string that was searched
# Output: Life is short. I learn python!

# Calling search on a compiled pattern object
pat = re.compile("[0-9]")  # matches a single digit
res = pat.search("abcd23ef4gh!")
print(res)
# Output: <re.Match object; span=(4, 5), match='2'>
print(res.group())
# Output: 2

# Passing the compiled pattern to re.search gives the same result
res = re.search(pat, "abcd23ef4gh!")
print(res)  # <re.Match object; span=(4, 5), match='2'>
print(res.group(0))  # 2
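Because re.search returns None when nothing matches, calling .group() on the result without a check raises AttributeError. A minimal sketch of guarding against that follows; first_number is a hypothetical helper written for this example, not part of the re module.

```python
import re

def first_number(text):
    """Return the first run of digits in text, or None if there is none."""
    match = re.search(r'\d+', text)
    if match is None:
        # No match: avoid calling .group() on None.
        return None
    return match.group()

found = first_number("abcd23ef4gh!")
missing = first_number("no digits")

print(found)    # 23
print(missing)  # None
```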

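(4) Difference between re.findall and re.search
re.findall finds all non-overlapping substrings that match the pattern and returns them as a list.
re.search finds the first substring that matches the pattern and returns a match object, or None if nothing matches.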

import re
text = 'Hello, my name is John. I am 25 years old.'
ages = re.findall(r'(\d+)', text)
ages1 = re.findall(r'\d+', text)
print(ages)  # ['25']
print(ages1)  # ['25']

# The dots are escaped so that they match a literal '.' rather than any character
ages2 = re.findall(r'Hello, my name is John\. I am (\d+) years old\.', text)
ages3 = re.findall(r'Hello, my name is John\. I am \d+ years old\.', text)
print(ages2)  # ['25']
print(ages3)  # ['Hello, my name is John. I am 25 years old.']

text = 'Hello, my name is John. I am 85 years old.'
match = re.search(r'\d+', text)
if match:
    print(match.group())  # 85

match0 = re.search(r'(\d+)', text)
if match0:
    print(match0.group())  # 85

# group() with no argument returns the whole match, even when the pattern has a group
match1 = re.search(r'Hello, my name is John\. I am (\d+) years old\.', text)
if match1:
    print(match1.group())  # Hello, my name is John. I am 85 years old.

match2 = re.search(r'Hello, my name is John\. I am \d+ years old\.', text)
if match2:
    print(match2.group())  # Hello, my name is John. I am 85 years old.
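With re.search, the captured text is still available on the match object; it just has to be requested by group number. The sketch below contrasts group(0), group(1), and groups() with what findall returns for the same pattern. The shortened pattern and the variable names are illustrative.

```python
import re

text = 'Hello, my name is John. I am 85 years old.'
pattern = r'I am (\d+) years old'

match = re.search(pattern, text)  # known to match for this sample text

whole = match.group(0)   # the whole match
age = match.group(1)     # text captured by the first group
groups = match.groups()  # tuple of all captured groups

found = re.findall(pattern, text)  # only the captured text, for every match

print(whole)   # I am 85 years old
print(age)     # 85
print(groups)  # ('85',)
print(found)   # ['85']
```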

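(5) Difference between re.match and re.search
re.match and re.search are both matching functions in Python's re module.

re.match tries to match only at the beginning of the string and returns at most one match. It returns a match object on success and None on failure.

re.search scans the whole string and returns only the first match. It returns a match object on success and None on failure.

The difference is where a match may start: re.match matches only at the beginning of the string, while re.search can find a match anywhere in the string.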
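The difference can be seen by running both functions on the same string. This is a small sketch that reuses the sample sentence from section (3); the variable names are illustrative.

```python
import re

text = "Life is short. I learn python!"

at_start = re.match("Life", text)        # matches: "Life" is at position 0
not_at_start = re.match("python", text)  # None: "python" is not at the start
anywhere = re.search("python", text)     # matches: search scans the whole string
anchored = re.search("^python", text)    # None: ^ anchors search to the start

print(at_start.group())  # Life
print(not_at_start)      # None
print(anywhere.span())   # (23, 29)
print(anchored)          # None
```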

(6)re.split()

import re
some_text = 'a,b,,,,c d'
# Split on any run of commas and/or spaces
a = re.split('[, ]+', some_text)
print(a)  # Output: ['a', 'b', 'c', 'd']
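Two details of re.split are easy to miss: the maxsplit argument limits the number of splits, and a capturing group in the pattern keeps the delimiters in the result. A minimal sketch using the same sample string follows; the variable names are illustrative.

```python
import re

some_text = 'a,b,,,,c d'

# maxsplit=1: split only at the first delimiter run.
limited = re.split(r'[, ]+', some_text, maxsplit=1)

# A capturing group around the delimiter keeps the delimiters in the list.
with_delims = re.split(r'([, ]+)', some_text)

# A delimiter at the very start produces a leading empty string.
leading = re.split(r'[, ]+', ',a,b')

print(limited)      # ['a', 'b,,,,c d']
print(with_delims)  # ['a', ',', 'b', ',,,,', 'c', ' ', 'd']
print(leading)      # ['', 'a', 'b']
```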

(7)re.sub
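Search and replace with re.sub
Python's re module provides re.sub for replacing matches in a string.

Syntax:
re.sub(pattern, repl, string, count=0, flags=0)
Parameters:
pattern : the regular expression pattern.
repl : the replacement string; it can also be a function.
string : the original string to search and replace in.
count : the maximum number of replacements; the default 0 replaces all matches.
flags : optional matching flags such as re.I; the default 0 means no flags.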

import re

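phone = "2004-959-559 # This is a foreign phone number"

# Remove the Python-style comment from the string
num = re.sub(r'#.*$', "", phone)
print("Phone number: ", num)  # Output: Phone number:  2004-959-559 (with a trailing space)

# Remove every non-digit character (including the hyphens)
num = re.sub(r'\D', "", phone)
print("Phone number: ", num)  # Output: Phone number:  2004959559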
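The parameter list above notes that repl can be a function and that count limits the number of replacements. The sketch below shows both; the sample string and the helper name double are made up for illustration.

```python
import re

text = "A23G4HFD567"

def double(match):
    """Return the matched number multiplied by two, as a string."""
    return str(int(match.group()) * 2)

# repl as a function: it is called once per match with the match object.
all_doubled = re.sub(r'\d+', double, text)

# count=1: only the first match is replaced.
first_only = re.sub(r'\d+', double, text, count=1)

print(all_doubled)  # A46G8HFD1134
print(first_only)   # A46G4HFD567
```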

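(8) Greedy and lazy matching
The expression .* matches any character (except a newline by default) zero or more times and takes as many characters as possible. This is greedy matching.
The expression .*? matches the same characters but takes as few as possible while still letting the rest of the pattern match. This is lazy (non-greedy) matching.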
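A short sketch makes the difference visible: on a string with two bracketed tags, the greedy form swallows everything up to the last closing bracket, while the lazy form stops at the first one. The sample string is invented for illustration.

```python
import re

html = "<a><b>"

# Greedy: .* extends as far as it can, so one match covers both tags.
greedy = re.findall(r'<.*>', html)

# Lazy: .*? stops at the first '>' that lets the pattern succeed.
lazy = re.findall(r'<.*?>', html)

print(greedy)  # ['<a><b>']
print(lazy)    # ['<a>', '<b>']
```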

(9)replace

# Named text rather than str, so the built-in str type is not shadowed
text = "this is string example....wow!!! this is really string"

print(text.replace("is", "was"))  # thwas was string example....wow!!! thwas was really string

print(text.replace("is", "was", 3))  # thwas was string example....wow!!! thwas is really string

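Python's replace() method returns a copy of the string in which occurrences of old are replaced by new. If the optional third argument max is given, at most max occurrences are replaced.
Syntax
str.replace(old, new[, max])
Parameters
old – the substring to be replaced.
new – the new string that replaces old.
max – optional integer; replace at most max occurrences (the official Python documentation calls this parameter count).
Return value
A new string with old replaced by new; the original string is not modified.
The difference between replace and re.sub: replace only handles literal substrings, while re.sub accepts a regular expression, so it covers more use cases.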
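One case where the difference matters is replacing a whole word only. In the sketch below, str.replace also changes the "is" inside "this", while re.sub with the word boundary \b does not. The variable names are illustrative.

```python
import re

text = "this is string example....wow!!! this is really string"

# Literal replacement: every occurrence of "is", including the one inside "this".
literal = text.replace("is", "was")

# Regex replacement: \b restricts the match to the standalone word "is".
whole_word = re.sub(r'\bis\b', "was", text)

print(literal)     # thwas was string example....wow!!! thwas was really string
print(whole_word)  # this was string example....wow!!! this was really string
```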
