Python Regular Expression Basics

东流-beyond the label

Article link: https://blog.csdn.net/qq_43402639/article/details/103775138

Code examples:

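The examples below cover the following (feel free to skip straight to the examples):
'^': matches the start of the string
'$': matches the end of the string
'.': matches any single character except a newline
'*': matches the preceding element zero or more times
'+': matches the preceding element one or more times
'?': matches the preceding element zero or one time
'\w': matches a letter, digit, or underscore
The r prefix that often appears before a pattern: r marks a raw string literal, so Python leaves backslashes as they are instead of interpreting them as escape sequences, and the regex engine receives them unchanged
For the meaning of more characters, see: https://www.runoob.com/python/python-reg-expressions.html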
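
As a quick check of the list above, here is a minimal sketch that runs each metacharacter against a small sample. The sample strings and the `cases` table are made up for illustration and are not part of the original examples.

```python
import re

# Each entry: (pattern, sample text, whether re.search should find a match).
# All sample strings are hypothetical.
cases = [
    (r"^ab", "abc", True),        # '^' anchors at the start
    (r"^ab", "cab", False),
    (r"bc$", "abc", True),        # '$' anchors at the end
    (r"a.c", "abc", True),        # '.' matches any single character...
    (r"a.c", "a\nc", False),      # ...except a newline
    (r"ab*c", "ac", True),        # '*' allows zero repetitions of 'b'
    (r"ab+c", "ac", False),       # '+' requires at least one 'b'
    (r"ab+c", "abbbc", True),
    (r"ab?c", "abbc", False),     # '?' allows at most one 'b'
    (r"^\w+$", "user_01", True),  # '\w' covers letters, digits, underscore
    (r"^\w+$", "user-01", False), # '-' is not a word character
]

for pattern, text, expected in cases:
    found = re.search(pattern, text) is not None
    print(f"{pattern!r:10} {text!r:10} -> {found}")
    assert found == expected

# The r prefix keeps the backslash: r"\n" is two characters, "\n" is one.
raw_len, cooked_len = len(r"\n"), len("\n")
print(raw_len, cooked_len)
```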

import re

ad = ["https://baidu/item/.html","https://baidu/items/wohao.org",
       "https://baidu/items/dajiahao.html","https://baidu/"]
for one in ad:
    # Example 1
    # The string must start with 'https://baidu/items/'.
    # '^' anchors the match at the start of the string,
    # so '^prefix' matches only strings that begin with that prefix.
    if re.search(r"^https://baidu/items/", one):
        print(one)
    else:
        print("wrong address!")

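for one in ad:
    # Example 2
    # The string must contain 'https://baidu/items/' followed by anything.
    # '.' matches any single character except a newline.
    # '*' repeats the preceding element zero or more times.
    # So 'prefix.*' matches the prefix plus whatever follows it.
    # Note: re.search scans the whole string, so without '^' the prefix could
    # appear anywhere. For the URLs in this list the result is the same as
    # anchoring at the start.
    if re.search(r"https://baidu/items/.*", one):
        print(one)
    else:
        print("wrong address!")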
    
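for one in ad:
    # Example 3
    # The string must end with '.html'.
    # '$' anchors the match at the end of the string,
    # so 'suffix$' matches only strings that end with that suffix.
    # '.' is a metacharacter, so (as in C) it must be escaped with a backslash
    # to match a literal dot: '\.'.
    # The pattern is written as the raw string r"\.html$" so that Python passes
    # the backslash to the regex engine unchanged. Without the backslash,
    # r".html$" would let '.' match any character.
    if re.search(r"\.html$", one):
        print(one)
    else:
        print("wrong address!")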
        
for one in ad:
    # Example 4
    # The string must contain 'items'.
    # To require specific literal text, just put it in the pattern.
    if re.search(r"items", one):
        print(one)
    else:
        print("wrong address!")
        
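ad1 = ['wang@163.com', '8765432@qq.com', 'new@.com', 'where@@.666.com']
for one in ad1:
    # Example 5
    # Match email addresses of the form ***@***.com:
    # one or more word characters, '@', one or more word characters, then '.com'.
    if re.search(r"\w+@\w+\.com", one):
        print(one)
    else:
        print("wrong address!")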
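
Because re.search looks for the pattern anywhere in the string, the email pattern above also accepts longer strings that merely contain an address. One possible way to require that the whole string is an address is re.fullmatch. The sketch below is only an illustration; the `samples` list is made up and is not from the original examples.

```python
import re

# Hypothetical inputs: two well-formed addresses, and one longer string
# that only contains an address somewhere inside it.
samples = ["wang@163.com", "8765432@qq.com", "mail me: wang@163.com.cn"]

pattern = r"\w+@\w+\.com"

results = []
for text in samples:
    found_anywhere = re.search(pattern, text) is not None
    whole_string = re.fullmatch(pattern, text) is not None
    results.append((found_anywhere, whole_string))
    print(f"{text!r}: search={found_anywhere}, fullmatch={whole_string}")
```

The same effect can be had with re.search by anchoring the pattern on both sides, for example r"^\w+@\w+\.com$".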

Four commonly used functions in the re module

re.match(), re.search(), re.findall(), re.compile()
What each one does:

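'''
    re.match(pattern, string, flags) returns a match object on success, otherwise None.
    re.search(pattern, string, flags) returns a match object on success, otherwise None.
    flags:
        re.I: ignore case
        re.M: multi-line mode; '^' and '$' also match at the start and end of each line
        re.S: make '.' match any character including a newline (by default '.' matches anything except a newline)
    Methods of the returned match object:
        group()  returns the entire text matched by the pattern
        groups() returns a tuple with the text captured by each parenthesized group
    Difference between re.match and re.search: re.match only tries to match at the beginning
    of the string and returns None if the pattern does not match there; re.search scans
    through the string and returns the first match it finds.

    re.findall(pattern, string, flags) returns a list of all non-overlapping matches.
    Unlike match and search, which return at most one match, findall returns every match.
    If the pattern contains groups, the list holds the group contents instead
    (tuples when there is more than one group).
'''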
res = re.match(r'(.*) am a (.*) .*','I am a code creator.')
print(res.group()) 
print(res.groups())
#I am a code creator.
#('I', 'code')

if re.match(r'code','I am a code creator.'):
    print('True')
else:
    print('False')
if re.search('code','I am a code creator.'):
    print('True')
else:
    print('False')
# False True

res = re.findall('(.*) am a (.*) .*','I am a code creator.')
print(res)
#[('I', 'code')]

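'''
re.compile() compiles a regular expression into a pattern object whose match(), search(),
and findall() methods take the place of the three functions above
'''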
pattern = re.compile('(.*) am a (.*) .*')
res = pattern.match('I am a code creator.')
print(res.groups())
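
The flags described earlier (re.I, re.M, re.S) are not exercised in the examples above, so here is a minimal sketch of their effect. The two-line `text` sample and the variable names are assumptions made for illustration.

```python
import re

text = "Hello\nworld"  # made-up two-line sample

# re.I: ignore case
no_flag_i = re.search(r"hello", text)          # no match: case differs
with_i = re.search(r"hello", text, re.I)       # matches 'Hello'
print(no_flag_i, with_i.group())

# re.M: '^' and '$' also match at the start and end of each line
no_flag_m = re.findall(r"^\w+$", text)
with_m = re.findall(r"^\w+$", text, re.M)
print(no_flag_m, with_m)

# re.S: '.' also matches the newline between the two words
no_flag_s = re.search(r"Hello.world", text)
with_s = re.search(r"Hello.world", text, re.S)
print(no_flag_s, with_s is not None)

# Flags combine with '|' and can also be given to re.compile
compiled = re.compile(r"^world$", re.I | re.M)
combined = compiled.search("Hello\nWORLD").group()
print(combined)
```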