Python re module

xiaofengfeng20

Article link: https://blog.csdn.net/xiaofengfeng20/article/details/125938970

Hands-on demonstration

import re

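The re.match function

Prototype: match(pattern, string, flags=0)
pattern: the regular expression to match
string: the string to match against
flags: flag bits that control how the regular expression is matched
Purpose: tries to match the pattern at the very start of the string.
If the pattern matches there, a match object is returned; if it does not match at the start (even when it occurs later in the string), None is returned.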

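re.I  ignore case
re.S  make . match every character, including the newline
re.M  multi-line matching; affects ^ and $
The three flags above are the most commonly used.

re.L  locale-dependent matching (in Python 3 this flag is only valid with bytes patterns)
re.U  interpret characters according to the Unicode character set; affects \w \W \b \B (already the default for str patterns in Python 3)
re.X  verbose mode: lets you lay the pattern out more freely, with whitespace and comments, so it is easier to read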
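
As a rough sketch of how the common flags change matching (the sample string is made up for illustration and does not come from the original post):

```python
import re

text = 'Hello\nworld'

# Without re.S, . stops at the newline, so the pattern cannot span both lines.
no_dotall = re.match('Hello.world', text)
# With re.S, . also matches '\n'.
with_dotall = re.match('Hello.world', text, flags=re.S)

# With re.M, ^ matches at the start of every line; re.I ignores case.
line_starts = re.findall('^[a-z]+', text, flags=re.I | re.M)

print(no_dotall)             # None
print(with_dotall.group())   # the whole two-line string
print(line_starts)           # ['Hello', 'world']
```

Flags are combined with the | operator, as in the last call.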

text = '13517737986abcdefghijklmnopqretuvwsyz'

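print(re.match('www', 'www.baidu.com'))   # match object: 'www' is at the start
print(re.match('www', '.baidu.com.www'))  # None: 'www' is not at the start

print(re.match('www', 'Www.baidu.com'))   # None: matching is case-sensitive by default
print(re.match('www', 'wWw.baidu.com', flags=re.I))  # match object: re.I ignores case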

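The re.search() function

Prototype: search(pattern, string, flags=0)
pattern: the regular expression to match
string: the string to match against
flags: flag bits that control how the regular expression is matched
Purpose: scans the whole string and returns the first successful match (or None if there is none)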

print(re.search('python','hhh is aaa il  python,fjdklsa python '))
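
A minimal sketch of what can be done with the object that re.search returns, and how it differs from re.match on the same input. It reuses the string from the call above; group() and span() are standard match-object methods.

```python
import re

text = 'hhh is aaa il  python,fjdklsa python '

m = re.search('python', text)
if m is not None:
    print(m.group())  # the matched text: 'python'
    print(m.span())   # (start, end) indexes of the first occurrence

# re.match only tries position 0, so it finds nothing in this string.
print(re.match('python', text))  # None
```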

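The re.findall() function

Prototype: findall(pattern, string, flags=0)
pattern: the regular expression to match
string: the string to match against
flags: flag bits that control how the regular expression is matched
Purpose: scans the whole string and returns a list of all matches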

print(re.findall('python','hhh is a pyThon a pythOn a il  python,fjdklsa python ',re.I))
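
One detail worth knowing about the returned list: its items depend on how many groups the pattern contains. The following sketch uses a made-up sample string to show the three cases.

```python
import re

text = 'a python, a Python'

# No group: each item is the whole match.
whole = re.findall('python', text, re.I)
# One group: each item is only the text of that group.
one_group = re.findall('(py)thon', text, re.I)
# Two groups: each item is a tuple with one entry per group.
two_groups = re.findall('(py)(thon)', text, re.I)

print(whole)       # ['python', 'Python']
print(one_group)   # ['py', 'Py']
print(two_groups)  # [('py', 'thon'), ('Py', 'thon')]
```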

Regular expression metacharacters

print('------re.search()--------------matching single characters and digits----------------')

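r'''
.               matches any character except a newline
[0123456789]    [] is a character set; matches any single character listed inside the brackets
[python]        matches any one of the characters p, y, t, h, o, n
[a-z]           matches any lowercase letter
[A-Z]           matches any uppercase letter
[0-9]           matches any digit
[0-9a-zA-Z]     matches any digit or letter (the underscore is not included)

[^python]       matches any character other than p, y, t, h, o, n; a leading ^ (caret) negates the set
[^0-9]          matches any non-digit character
\d              matches a digit; same as [0-9]
\D              matches a non-digit character; same as [^0-9]
\w              matches digits, letters and the underscore; same as [0-9a-zA-Z_]
\W              matches anything that is not a digit, letter or underscore; same as [^0-9a-zA-Z_]
\s              matches any whitespace character (space, carriage return, newline, form feed, tab); same as [ \f\n\r\t\v]
\S              matches any non-whitespace character; same as [^ \f\n\r\t\v]
'''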
# print(re.findall(r'\d', 'hhh is a pyThon a pyt2hOn a il python '))  # would print ['2']
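
A quick check of a few of these classes, as a sketch. The first string is the one from the commented-out call above; the second is made up to show that \w also accepts the underscore while [0-9a-zA-Z] does not.

```python
import re

text = 'hhh is a pyThon a pyt2hOn a il python '

digits = re.findall(r'\d', text)         # every digit
uppercase = re.findall(r'[A-Z]', text)   # every uppercase letter
others = re.findall(r'[^a-z ]', text)    # everything except lowercase letters and spaces

print(digits)     # ['2']
print(uppercase)  # ['T', 'O']
print(others)     # ['T', '2', 'O']

# \w treats the underscore as a word character; the explicit set does not.
print(re.findall(r'\w+', 'a_b c-d'))           # ['a_b', 'c', 'd']
print(re.findall(r'[0-9a-zA-Z]+', 'a_b c-d'))  # ['a', 'b', 'c', 'd']
```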

print('--------------anchor characters (boundary characters)-------------------')


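r'''
^   matches at the start of a line; not the same as ^ inside []
$   matches at the end of a line

\A  matches the start of the string; unlike ^, \A only matches at the very beginning of the whole string,
    and even with re.M it does not match at the start of the other lines
\Z  matches the end of the string; unlike $, \Z only matches at the very end of the whole string,
    and even with re.M it does not match at the end of the other lines


\b  matches a word boundary, i.e. the position between a word character and a non-word character
    (or the start or end of the string)
\B  matches a position that is not a word boundary
'''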

print(re.search('^is', 'is python,is a good python'))      # match at span (0, 2)
print(re.search('python$', 'is python,is a good python'))  # match at the end of the string
print(re.findall('^is', 'is python,is a good python\n'
                        'is python,is a good python', re.M))        # ['is', 'is']
print(re.findall(r'\Ais', 'is python,is a good python\n'
                          'is python,is a good python', re.M))      # ['is']
print(re.findall('python$', 'is python,is a good python\n'
                            'is python,is a good python', re.M))    # ['python', 'python']
print(re.findall(r'python\Z', 'is python,is a good python\n'
                              'is python,is a good python', re.M))  # ['python']

print(re.search(r'er\b', 'never'))  # match: 'er' is at the end of the word

print(re.search(r'er\B', 'never'))  # None: 'er' is followed by a word boundary

print(re.search(r'er\B', 'nerve'))  # match: 'er' is inside the word
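
A common use of \b is matching whole words only. The sketch below, with a made-up sentence, compares a bare pattern with one wrapped in \b on both sides.

```python
import re

sentence = 'this is his island, is it'

# Without boundaries, 'is' is also found inside this, his and island.
anywhere = re.findall(r'is', sentence)
# With \b on both sides, only the standalone word 'is' is matched.
whole_words = re.findall(r'\bis\b', sentence)

print(len(anywhere))  # 5
print(whole_words)    # ['is', 'is']
```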

print('------------------matching multiple characters-------------')

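'''
Note: x, y and z below stand for ordinary characters; n and m (non-negative integers) are not regex metacharacters
(xyz)       matches xyz inside the parentheses (matched as a single unit)
x?          matches 0 or 1 x
x*          matches 0 or more x
x+          matches at least 1 x
x{n}        matches exactly n x (n is a non-negative integer)
x{n,}       matches at least n x
x{n,m}      matches at least n and at most m x; note: n<=m
x|y         | means "or"; matches x or y
'''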

Examples

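print(re.findall(r'(python)', 'python is my teacher teacher is a good python'))  # ['python', 'python']

print(re.findall(r'o?', 'python is my teacher teacher is a good python'))  # 'o' where there is one, otherwise an empty string

print(re.findall(r'a*', 'aaa'))   # greedy, matches as much as possible: ['aaa', '']
print(re.findall(r'n*', 'nnyytntoonn'))   # greedy; also yields '' wherever there is no n
print(re.findall(r'n+', 'nnyytntoonn'))   # greedy: ['nn', 'n', 'nn']
print(re.findall(r'n{2}', 'nnnnyytnntoonnn'))   # ['nn', 'nn', 'nn', 'nn']
print(re.findall(r'n{3,}', 'nnnnyytnntoonnn'))  # greedy: ['nnnn', 'nnn']
print(re.findall(r'n{4,5}', 'nnnnyytnntoonnn'))  # ['nnnn']
print(re.findall(r'((p|P)ython)', 'python-Python'))  # [('python', 'p'), ('Python', 'P')]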

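Requirement: extract each python…hello substring

text = 'python is a good hello!python is a box!python is a very good hello'

# .*? is non-greedy, so each match stops at the first hello that follows
print(re.findall(r'python.*?hello', text))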
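
To see why the ? after .* matters here, this sketch runs the greedy and the non-greedy pattern on the same string (the variable names are chosen for illustration):

```python
import re

text = 'python is a good hello!python is a box!python is a very good hello'

# Greedy: .* runs to the last hello, so the whole string is one match.
greedy = re.findall(r'python.*hello', text)
# Non-greedy: .*? stops at the first hello after each starting python.
lazy = re.findall(r'python.*?hello', text)

print(len(greedy))  # 1
print(lazy)
# ['python is a good hello', 'python is a box!python is a very good hello']
```

Note that the second non-greedy match still starts at the first available python, which is why it includes the "box" part: non-greedy only shortens where a match ends, not where it starts.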


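print("--------------special cases--------------")
'''
*?  +?  ??  minimal (non-greedy) matching; quantifiers normally match as much as possible, and adding ? after one makes it match as little as possible
(?:x)       like (xyz), but does not create a group
'''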
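# C-style comments:  /* part1 */  /* part2 */
# pattern:  /\*  .*  \*/   (each literal * has to be escaped)
print(re.findall(r'/\*.*\*/', '/* part1 */  /* part2 */'))   # greedy: one match spanning both comments
print(re.findall(r'/\*.*?\*/', '/* part1 */  /* part2 */'))  # non-greedy: ['/* part1 */', '/* part2 */']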
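
To see what (?:x) changes in practice, a small sketch comparing a capturing and a non-capturing version of the same alternation (the variable names are illustrative):

```python
import re

text = 'python-Python'

# Capturing groups: findall returns one tuple per match, one entry per group.
captured = re.findall(r'((p|P)ython)', text)
# Non-capturing group: the alternation still works, but findall returns whole matches.
non_capturing = re.findall(r'(?:p|P)ython', text)

print(captured)       # [('python', 'p'), ('Python', 'P')]
print(non_capturing)  # ['python', 'Python']
```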

Checking whether a string is a phone number

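def checkPhon(s):
    # must be exactly 11 characters long
    if len(s) != 11:
        return False
    # must start with 1
    elif s[0] != '1':
        return False
    # only the prefixes 130 and 135 are accepted
    elif s[1:3] != '30' and s[1:3] != '35':
        return False
    # every remaining character must be a digit
    for i in range(3, 11):
        if s[i] < '0' or s[i] > '9':
            return False
    return True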


import re


def checkPhone2(s):
    # pat=r'^1(([3578]\d)|(47))\d{8}$'
    pat = r'^1[3578]\d{9}$'  # 1, then one of 3/5/7/8, then nine more digits
    res = re.match(pat, s)
    print(res)


# print(checkPhon('135177337861'))
# print(checkPhon('135177337a6'))
# print(checkPhon('23517733786'))
# print(checkPhon('13517737986'))
print('************')
checkPhone2('13412345678')
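
A possible variant that returns True/False instead of printing the match object. This is only a sketch: is_phone and PHONE_RE are hypothetical names, not from the original post. It uses re.fullmatch, which requires the whole string to match; with re.match and $, a string with a trailing newline would still be accepted.

```python
import re

# Hypothetical helper; same pattern idea as checkPhone2 above.
PHONE_RE = re.compile(r'1[3578]\d{9}')


def is_phone(number):
    """Return True if number is 1, then 3/5/7/8, then nine more digits."""
    return PHONE_RE.fullmatch(number) is not None


print(is_phone('13412345678'))    # True
print(is_phone('135177337861'))   # False: 12 digits
print(is_phone('135177337a6'))    # False: contains a letter
print(is_phone('23517733786'))    # False: does not start with 1
print(is_phone('13412345678\n'))  # False: trailing newline is rejected
```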