Regular Expressions

Latest recommended article published on 2020-10-14 09:30:27

lee发发发发发发


Category column: 聚沙成塔 · Article tags: regular expressions

Article link: https://blog.csdn.net/weixin_41011570/article/details/105018353


Python

Basics

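import re
a = '53253'  # re.findall needs a string, not a list
r = re.findall('5', a)  # find every '5' in a (returns a list): ['5', '5']

# The same membership check without a regex, two built-in ways:
print(a.find('5') > -1)  # str.find returns -1 when nothing is found
print('5' in a)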

Going further

Metacharacter list (Baidu Baike)

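Commonly used:
\d  a digit; can also be written [0-9]
\D  a non-digit; can also be written [^0-9]
^   negates a character set when it is the first character inside [ ]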

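Shorthand character classes
\w  a word character: letter, digit or underscore; for ASCII text the same as [A-Za-z0-9_]
\W  a non-word character such as '#', '¥', '%', '\n', '\r'
[\w\W]  matches any character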

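\s  a whitespace character: space, '\t', '\n', '\r'
\S  a non-whitespace character
[\s\S]  matches any character

.  matches any character except a newline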

re.findall(r'\d', a)
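As a quick check of the shorthand classes listed above, here is a minimal sketch that runs each one against a small string. The sample text and the variable names are made up for illustration.

```python
import re

# hypothetical sample: letter, digit, underscore, space, '#', newline
sample = 'a1_ #\n'

digits = re.findall(r'\d', sample)          # digits only
non_digits = re.findall(r'[^0-9]', sample)  # negated set, same idea as \D
word_chars = re.findall(r'\w', sample)      # letters, digits, underscore
non_word = re.findall(r'\W', sample)        # everything \w rejects
spaces = re.findall(r'\s', sample)          # the space and the newline
dot = re.findall(r'.', sample)              # every character except the newline
anything = re.findall(r'[\s\S]', sample)    # every character, newline included

print(digits, word_chars, non_word, spaces)
print(len(dot), len(anything))
```

The last two lines show the practical difference between '.' and [\s\S]: only the second one sees the newline.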

1. Character sets

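s = 'abc, acc, adc, aec, afc, ahc'  # sample string chosen so the patterns have something to match

# match 'a', then 'c' or 'f', then 'c'
r = re.findall('a[cf]c', s)   # ['acc', 'afc']
# match 'a', then any one character from c to f, then 'c'
r = re.findall('a[c-f]c', s)  # ['acc', 'adc', 'aec', 'afc']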
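Character sets can also be negated or combine several ranges. The following is a small sketch of both forms; the sample string is an assumption made for the example.

```python
import re

words = 'abc, acc, adc, aec, afc, ahc'  # hypothetical sample data

# [^cf] matches any single character except 'c' or 'f'
not_c_or_f = re.findall(r'a[^cf]c', words)
# a set may mix a range and single characters: b to d, or h
combined = re.findall(r'a[b-dh]c', words)

print(not_c_or_f)
print(combined)
```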

2. Quantifiers
2.1 Greedy and non-greedy matching

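a = 'python123java790php'
r = re.findall('[a-z]{3,6}', a)  # the bounds are separated by a comma, not a hyphen

['python', 'java', 'php']  # result

# {3,6} takes runs of 3 to 6 characters (6 included) from the set a-z

# The example above is greedy: it matches as much as possible.
# Within the 3 to 6 range it keeps consuming characters, so it returns
# 'python' rather than 'pyt'.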

But if we write

r = re.findall('[a-z]{3,6}?', a)
# the added '?' makes the quantifier non-greedy, and the output becomes
['pyt', 'hon', 'jav', 'php']

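In this example the output is the same as that of
r = re.findall('[a-z]{3}', a)
The pattern still declares an upper limit of 6, but a non-greedy quantifier stops at the minimum of 3 because nothing after it in the pattern forces it to take more.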
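The difference between greedy and non-greedy matching is easier to see when something follows the quantifier. Here is a minimal sketch; the HTML-like sample text is an assumption for illustration.

```python
import re

html = '<b>bold</b> and <i>italic</i>'  # hypothetical sample text

greedy = re.findall(r'<.*>', html)  # .* runs to the last '>' in the string
lazy = re.findall(r'<.*?>', html)   # .*? stops at the first '>' it can

# A non-greedy quantifier still grows when the rest of the pattern needs it:
# here it must reach a digit, so it takes more than the minimum of 3 letters.
a = 'python123java790php'
lazy_then_digit = re.findall(r'[a-z]{3,6}?\d', a)

print(greedy)
print(lazy)
print(lazy_then_digit)
```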

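2.2 The quantifiers *, + and ?
'*' matches the preceding item 0 or more times
'+' matches the preceding item 1 or more times
'?' matches the preceding item 0 or 1 times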

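a = 'pytho0python1pythonn2'
# the '*' quantifier
r = re.findall('python*', a)
# output
['pytho', 'python', 'pythonn']
# here '*' lets the letter 'n' match 0 or more times
# the '+' quantifier
r = re.findall('python+', a)
# output
['python', 'pythonn']
# here '+' lets the letter 'n' match 1 or more times
# the '?' quantifier
# note that the question mark here is a quantifier in its own right,
# while in 2.1 (greedy vs non-greedy) it follows another quantifier to make it non-greedy;
# the two uses of '?' mean different things
r = re.findall('python?', a)
# output
['pytho', 'python', 'python']
# here '?' lets the letter 'n' match 0 or 1 times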
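The three quantifiers above are shorthand for explicit {m,n} bounds. The short sketch below checks that equivalence on the same string, with variable names chosen only for this example.

```python
import re

a = 'pytho0python1pythonn2'

star = re.findall(r'python*', a)      # 'n' zero or more times
plus = re.findall(r'python+', a)      # 'n' one or more times
optional = re.findall(r'python?', a)  # 'n' zero or one time

# the same quantifiers spelled out with braces
star_braces = re.findall(r'python{0,}', a)
plus_braces = re.findall(r'python{1,}', a)
optional_braces = re.findall(r'python{0,1}', a)

print(star == star_braces, plus == plus_braces, optional == optional_braces)
```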

3. Boundary matching

qq = '123456789'
r = re.findall(r'\d{4,8}', qq)
# output
['12345678']
r = re.findall(r'^\d{4,8}$', qq)
# output
[]  # qq has 9 digits, so no run of 4 to 8 digits covers the whole string

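Here ^ anchors the match to the start of the string, that is, the position of the 1 in qq.
Here $ anchors the match to the end of the string, that is, the position of the 9 in qq.
Another example: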

qq = '100000001'
r = re.findall('000', qq)
# output
['000', '000']
r = re.findall('^000', qq)
# output
[]  # because the first character of qq is 1, not 0
r = re.findall('000$', qq)
# output
[]  # because the last character of qq is 1, not 0
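A common use of boundary matching is validating a whole string. The sketch below uses re.fullmatch, which behaves much like wrapping the pattern in ^ and $; the 4 to 8 digit rule comes from the example above, and is_valid_qq is a hypothetical helper name.

```python
import re

def is_valid_qq(number):
    """Return True when number is 4 to 8 digits and nothing else (illustrative rule)."""
    return re.fullmatch(r'\d{4,8}', number) is not None

print(is_valid_qq('12345'))      # within the range
print(is_valid_qq('123456789'))  # too long
print(is_valid_qq('12a45'))      # contains a non-digit
```

One design note: '$' also matches just before a trailing newline, so re.fullmatch is the stricter choice when the input may end with '\n'.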

4. Groups

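a = 'pythonpythonpythonpythonpython'
r = re.findall('(python){3}', a)  # the group 'python' must repeat 3 times; findall returns the group's text: ['python']
# [] expresses "or", () expresses "and"
# [abc] matches one character: a or b or c
# (abc) matches a, b and c together, in that order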
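When a pattern contains a capturing group, re.findall returns the group's text rather than the whole match. A non-capturing group (?:...) avoids that. A minimal sketch on the same string:

```python
import re

a = 'pythonpythonpythonpythonpython'

captured = re.findall(r'(python){3}', a)  # returns what the group captured
whole = re.findall(r'(?:python){3}', a)   # no capture, so the full match comes back

print(captured)
print(whole)
```

Only one match is found in either case, because the two copies of 'python' left over after the first three cannot form another run of three.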

5. Substitution with re.sub

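# signature: re.sub(pattern, repl, string, count=0, flags=0)
# count is the maximum number of replacements; 0 means no limit
language = 'pythonc#javac#phpc#'
r = re.sub('c#', 'go', language)  # 'pythongojavagophpgo'
# for a fixed string, str.replace does the same job
language = language.replace('c#', 'go')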
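To illustrate the count parameter described above, here is a short sketch that limits the number of replacements and also uses re.subn, which reports how many replacements were made. The variable names are chosen for this example only.

```python
import re

language = 'pythonc#javac#phpc#'

first_only = re.sub('c#', 'go', language, count=1)  # replace only the first match
all_replaced = re.sub('c#', 'go', language)         # count defaults to 0: no limit
new_text, how_many = re.subn('c#', 'go', language)  # also returns the number of replacements

print(first_only)
print(all_replaced)
print(new_text, how_many)
```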

6. Passing a function as the replacement

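s = 'A8c3721d86'
def convert(value):
    # value is a match object; group() returns the matched digit as a string
    matched = value.group()
    if int(matched) >= 6:
        return '9'
    else:
        return '0'
r = re.sub(r'\d', convert, s)  # 'A9c0900d99'
# digits greater than or equal to 6 become 9, digits below 6 become 0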
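The replacement function receives a match object and must return a string, so any computation on the matched text is possible. The sketch below doubles every number in a string; the sample text and the helper name double are assumptions for illustration.

```python
import re

def double(match):
    # match.group() is the matched text; the returned string replaces it
    return str(int(match.group()) * 2)

prices = 'apple 3, pear 12'  # hypothetical sample text
doubled = re.sub(r'\d+', double, prices)

# the same replacement written inline with a lambda
doubled_lambda = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), prices)

print(doubled)
```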

7. Additional notes
7.1 match and search

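s = 'A8689cfh12hj453'

# match only tries at the very start of the string
r = re.match(r'\d', s)
# the result is None:
# the first character is 'A', not a digit, so matching stops right there

# search scans the string until it finds a match
r1 = re.search(r'\d', s)
# there is a result: a match object is returned

r1.span()  # position in the original string: (1, 2); calling span() on r would fail because r is None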
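Because both re.match and re.search return None when nothing matches, it helps to check the result before calling methods on it. Here is a minimal sketch of that pattern; first_number is a hypothetical helper written for this example.

```python
import re

s = 'A8689cfh12hj453'

def first_number(text):
    """Return (matched text, span) for the first run of digits, or None."""
    found = re.search(r'\d+', text)
    if found is None:
        return None
    return found.group(), found.span()

at_start = re.match(r'\d+', s)  # None, because s starts with 'A'

print(at_start)
print(first_number(s))
print(first_number('no digits'))
```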

7.2 group

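# r.group() accepts a group number, so a pattern can contain several groups
# r.group(0) is the entire match
# r.groups() returns everything captured by the parentheses as a tuple
s = 'life is short, i use python'
r = re.search('life.*python', s)
print(r.group())
# output: life is short, i use python

r = re.search('life(.*)python', s)
print(r.group(1))
# output: ' is short, i use ' (the surrounding spaces are part of the group)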

findall works here as well

r = re.findall('life(.*)python', s)
print(r)
# output
[' is short, i use ']
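With more than one group, group() and groups() become more useful. The sketch below assumes a pattern with two numbered groups and one named group (the name lang is made up for this example) on the same sentence.

```python
import re

s = 'life is short, i use python'
m = re.search(r'(\w+) is (\w+), i use (?P<lang>\w+)', s)

whole = m.group(0)        # the entire match
first_two = m.group(1, 2) # several group numbers at once give a tuple
all_groups = m.groups()   # every group, in order
lang = m.group('lang')    # a named group can be read by name

print(whole)
print(first_two)
print(all_groups)
print(lang)
```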

Ending: a collection of commonly used regular expressions
