Regular Expressions

hjhcos

Original article: https://www.cnblogs.com/shenjianping/p/11647473.html

Python official website

Character classes `[]`

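# The r prefix (raw string) stops Python from processing backslash escapes
# Matches one digit or one letter a-f in either case; useful for validating hexadecimal characters
[0-9a-fA-F]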
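.  	# Matches any character except a newline; acts like a placeholder
\w 	# Matches a letter, digit, or underscore
\s 	# Matches any whitespace character
\d 	# Matches a digit
\n 	# Matches a newline
\t 	# Matches a tab
\b 	# Matches a word boundary (the start or the end of a word)
^ 	# Matches the start of the string
$ 	# Matches the end of the string
\W 	# Matches any character that is not a letter, digit, or underscore
\D 	# Matches a non-digit
\S 	# Matches a non-whitespace character
a|b # Matches the character a or the character b
() 	# Matches the expression inside the parentheses and forms a group
[...] 	# Matches any character in the character class
[^...] 	# Matches any character not in the character class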
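* 	# Repeat zero or more times (greedy)
+ 	# Repeat one or more times
? 	# Repeat zero or one time
{n} 	# Repeat exactly n times
{n,} 	# Repeat n or more times
{n,m} 	# Repeat n to m times
*? 	# Repeat any number of times, but as few as possible (lazy)
+? 	# Repeat one or more times, but as few as possible
?? 	# Repeat zero or one time, but as few as possible
{n,m}? 	# Repeat n to m times, but as few as possible
{n,}?  	# Repeat n or more times, but as few as possible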
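
The hexadecimal character class from the top of the table can be combined with a quantifier to validate a whole string. The following is a minimal sketch, not part of the original article: the helper name `is_hex` and the choice of `fullmatch` are assumptions made for illustration.

```python
import re

# Hypothetical helper for illustration; the name is not from the original article.
# The pattern is a raw string, so backslashes (if any) reach the regex engine unchanged.
HEX_PATTERN = re.compile(r'[0-9a-fA-F]+')

def is_hex(text):
    """Return True if text consists only of hexadecimal digits."""
    # fullmatch succeeds only when the entire string matches the pattern.
    return HEX_PATTERN.fullmatch(text) is not None

print(is_hex('1aF9'))   # True
print(is_hex('12g4'))   # False: 'g' is outside the character class
print(is_hex(''))       # False: + requires at least one character
```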

The re library

"""
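findall: finds every match of the pattern and returns the results as a list
search: finds the first match of the pattern and returns a Match object (or None if there is no match)
sub: replaces the text that matches the pattern and returns the new string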
"""
import re

code = 'xxIxxsfadfdxxlovexxsdfsasxxyouxxxxxxx'

# * repeats the preceding x zero or more times

c = re.findall('x*xx', code)
c		# ['xx', 'xx', 'xx', 'xx', 'xx', 'xxxxxxx']

# . matches any single character except a newline

c = re.findall('xx.xx', code)
c		# ['xxIxx', 'xxxxx']

# .* matches any number of arbitrary characters and is greedy, so it returns the longest match

c = re.findall('xx.*xx', code)
c		# ['xxIxxsfadfdxxlovexxsdfsasxxyouxxxxxxx']

# ? repeats the preceding x zero or one time, preferring one

c = re.findall('xx?xx', code)
c		# ['xxxx', 'xxx']

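# .*? matches zero or more arbitrary characters, as few as possible (lazy); findall returns each match as soon as it is found, then continues scanning from where that match ended until the string is exhausted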

c = re.findall('xx.*?xx', code)
c		# ['xxIxx', 'xxlovexx', 'xxyouxx', 'xxxx']

# (.*?) matches zero or more arbitrary characters lazily; findall returns only the text captured by the parentheses

c = re.findall('xx(.*?)xx', code)
c		# ['I', 'love', 'you', '']


code = '''xxI
xxsfadfd
xxlovexx
sdfsasxxyouxxxxxxx'''

# re.S lets the dot match newlines as well

c = re.findall('xx(.*?)xx', code, re.S)	
c		# ['I\n', 'love', 'you', '']


code = 'abcxxIxx123xxLovexx123xxyouxxxx'

# .group(num): num must not exceed the number of groups in the pattern; here the highest index is 2, and group(0) returns the entire match

c = re.search('xx(.*?)xx123xx(.*?)xx', code).group(2)	
c		# Love

# c is [('I', 'Love')], so c[0][1] == 'Love'

c = re.findall('xx(.*?)xx123xx(.*?)xx', code)
c[0][0]	# I

# % string formatting

code = 'abc=0'
for i in range(1, 21):
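	# The fourth positional argument of re.sub is count, not flags, so re.S is dropped here (the pattern contains no dot)
	c = re.sub(r'abc=\d+', 'abc=%d' % i, code)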
	print(c, end="\t")
	
# abc=1   abc=2   abc=3   ...   abc=19   abc=20

# Replace characters

c = re.sub('^a', 'b', code)
print(c)

# bbc=0
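
A common pitfall with `re.sub` is passing a flag such as `re.S` as the fourth positional argument, where it is read as `count` rather than `flags`. The sketch below, using an illustrative input string that is not from the original article, passes both by keyword so the intent is unambiguous.

```python
import re

# Illustrative input, not from the original article.
text = 'abc=0\nABC=5\nabc=7'

# count=1 replaces only the first match; flags=re.I ignores case.
first_only = re.sub(r'abc=\d+', 'abc=1', text, count=1, flags=re.I)

# With count left at its default of 0, every match is replaced.
all_matches = re.sub(r'abc=\d+', 'abc=1', text, flags=re.I)

print(repr(first_only))   # 'abc=1\nABC=5\nabc=7'
print(repr(all_matches))  # 'abc=1\nabc=1\nabc=1'
```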

The re library in detail

findall(pattern, string, flags=0)
    Return a list of all non-overlapping matches in the string.
    If one or more capturing groups are present in the pattern, return
    a list of groups; this will be a list of tuples if the pattern
    has more than one group.
    Empty matches are included in the result.
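
The shape of the list returned by `findall` depends on how many capturing groups the pattern contains. A small sketch of the three cases described above, using a made-up `key=value` string as input:

```python
import re

# Illustrative input, not from the original article.
text = 'k1=v1;k2=v2'

# No group: each item is the whole match.
no_group = re.findall(r'\w+=\w+', text)

# One group: each item is the text captured by that group.
one_group = re.findall(r'(\w+)=\w+', text)

# Two groups: each item is a tuple with one entry per group.
two_groups = re.findall(r'(\w+)=(\w+)', text)

print(no_group)    # ['k1=v1', 'k2=v2']
print(one_group)   # ['k1', 'k2']
print(two_groups)  # [('k1', 'v1'), ('k2', 'v2')]
```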

search(pattern, string, flags=0)
    Scan through string looking for a match to the pattern, returning
    a Match object, or None if no match was found.
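
Because `search` returns `None` when nothing matches, calling `.group()` directly on its result can raise an `AttributeError`. The following sketch checks for `None` first and also shows how group numbering works; the helper name `first_number` and the sample strings are assumptions for illustration.

```python
import re

def first_number(text):
    """Hypothetical helper: return the first run of digits in text, or None."""
    match = re.search(r'\d+', text)
    if match is None:
        return None
    return match.group(0)  # group(0) is the entire match

print(first_number('abc123def45'))  # 123
print(first_number('no digits'))    # None

# Groups are numbered from 1; group(0) is always the whole match.
m = re.search(r'(\w+)@(\w+)', 'user@example')
print(m.group(0))  # user@example
print(m.group(1))  # user
print(m.groups())  # ('user', 'example')
```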
    
sub(pattern, repl, string, count=0, flags=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.  repl can be either a string or a callable;
    if a string, backslash escapes in it are processed.  If it is
    a callable, it's passed the Match object and must return
    a replacement string to be used.
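
When `repl` is a callable, it is invoked once per match and its return value becomes the replacement text. A minimal sketch, with a hypothetical function `double` and an illustrative input string, that doubles every number in a string:

```python
import re

def double(match):
    """Replacement callable: receives a Match object and returns a string."""
    return str(int(match.group(0)) * 2)

result = re.sub(r'\d+', double, 'a=1, b=20, c=300')
print(result)  # a=2, b=40, c=600
```

A callable is useful when the replacement depends on the matched text itself, which a fixed replacement string cannot express.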