Python Regular Expressions

qq_27758151

Article link: https://blog.csdn.net/qq_27758151/article/details/105734516

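What re.compile does
re.compile compiles a regular expression into a pattern object. The object can be reused, which saves re-parsing the pattern each time it is applied.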
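
As a minimal sketch of that reuse, the snippet below compiles one pattern and applies it to several strings. The pattern and the sample strings are assumptions chosen for illustration; they do not come from the original post.

```python
import re

# Hypothetical pattern: one or more digits (chosen only for illustration).
number_re = re.compile(r'\d+')

samples = ["order 66", "no digits here", "room 101, floor 3"]

# The same compiled object is reused for every string.
results = [number_re.findall(text) for text in samples]
print(results)  # [['66'], [], ['101', '3']]
```

The module-level functions such as re.search also keep a cache of recently used patterns, so the benefit of compiling explicitly is mostly seen when one pattern is applied many times, for example inside a loop.
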
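Differences between match, search, and findall
match only tries to match at the beginning of the string.
search scans forward from the beginning of the string and stops at the first successful match.
findall scans the whole string and returns every non-overlapping match as a list.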
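
The three behaviours can be compared side by side. This is a small sketch; the input string is an assumption made for illustration.

```python
import re

text = "abc123def456"  # hypothetical input
digits = r'\d+'

m = re.match(digits, text)    # None: the string does not start with a digit
s = re.search(digits, text)   # first match anywhere in the string
f = re.findall(digits, text)  # every non-overlapping match, as a list

print(m)          # None
print(s.group())  # 123
print(f)          # ['123', '456']
```
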

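**`(.*)` and `(.*?)`**

(.*) is greedy: it matches as many characters as possible while still letting the overall pattern succeed.
(.*?) is non-greedy (lazy): it matches as few characters as possible.
Example: how do (.*) and (.*?) differ when matching?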

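import re
b = '1xt1xe1356781--=091'
r1 = re.match(r'1(.*)1', b)   # greedy: runs to the last '1'
print(r1)
r2 = re.match(r'1(.*?)1', b)  # non-greedy: stops at the next '1'
print(r2)
# Output:
# <re.Match object; span=(0, 19), match='1xt1xe1356781--=091'>
# <re.Match object; span=(0, 4), match='1xt1'>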
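
The difference is also visible with findall, which keeps searching after each match. The sketch below reuses the same string and only looks at the captured group.

```python
import re

b = '1xt1xe1356781--=091'

# Greedy: a single match that swallows everything up to the last '1'.
greedy = re.findall(r'1(.*)1', b)

# Non-greedy: each match ends at the nearest following '1',
# so the scan can continue and find a second match.
lazy = re.findall(r'1(.*?)1', b)

print(greedy)  # ['xt1xe1356781--=09']
print(lazy)    # ['xt', '35678']
```
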

re.sub
Example: given a = "张明 98分", use re.sub to replace 98 with 100.

import re
a="张明 98分"
print(re.sub(r'98','100',a))
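
re.sub also accepts a pattern that is not a fixed literal, and a function as the replacement. The following is a sketch under the assumption that any run of digits in the string is the score; the "+ 2" adjustment is made up for illustration.

```python
import re

a = "张明 98分"

# Replace any run of digits instead of the literal '98'.
fixed = re.sub(r'\d+', '100', a)

# A callable replacement receives the match object; here it adds 2 to the score.
bumped = re.sub(r'\d+', lambda m: str(int(m.group()) + 2), a)

print(fixed)   # 张明 100分
print(bumped)  # 张明 100分
```
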

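re.split
Example: given s = "info:xiaoZhang 33 shandong", split the string with a regular expression to get ['info', 'xiaoZhang', '33', 'shandong'].

import re
s = "info:xiaoZhang 33 shandong"
# Split on a colon or any whitespace character
ss = re.split(r'[:\s]', s)
print(ss)  # ['info', 'xiaoZhang', '33', 'shandong']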
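
When separators can repeat, splitting on a single separator character leaves empty strings in the result. A short sketch, using a hypothetical input with a doubled separator:

```python
import re

messy = "info: xiaoZhang  33 shandong"  # hypothetical input with repeated separators

# One separator character per split: empty strings appear between adjacent separators.
single = re.split(r'[:\s]', messy)

# One or more separator characters per split: adjacent separators collapse.
collapsed = re.split(r'[:\s]+', messy)

print(single)     # ['info', '', 'xiaoZhang', '', '33', 'shandong']
print(collapsed)  # ['info', 'xiaoZhang', '33', 'shandong']
```
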

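Example: given <div class="nam">中国</div>, use a regular expression to extract the text inside the tag ("中国"); the class name is not known in advance.

import re
s = '<div class="nam">中国</div>'
# Method 1: match from the start of the string and read group 1
m = re.match(r'<div class="[^"]*">(.*?)</div>', s)
print(m.group(1))  # 中国
# Method 2: findall returns the captured group directly
m = re.findall(r'<div class="[^"]*">(.*?)</div>', s)
print(m)  # ['中国']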
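
Whether the pattern uses greedy `.*` or a narrower form matters once the input holds more than one tag. The sketch below uses a hypothetical two-div string to show this; a regular expression is only a quick tool here and is not a general HTML parser.

```python
import re

# Hypothetical input with two divs (not from the original post).
html = '<div class="a">中国</div><div class="b">日本</div>'

# Greedy '.*' runs across the first tag, so only the last content is captured.
greedy = re.findall(r'<div class=".*">(.*)</div>', html)

# '[^"]*' cannot leave the attribute value and '(.*?)' stops at the first '</div>'.
lazy = re.findall(r'<div class="[^"]*">(.*?)</div>', html)

print(greedy)  # ['日本']
print(lazy)    # ['中国', '日本']
```
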

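Example: the string a = "not 404 found 张三 99 深圳" has a space between every word; use a regular expression to filter out the English words and the numbers so that the final output is "张三深圳".

import re
a = "not 404 found 张三 99 深圳"
b = a.split(" ")
print(b)  # ['not', '404', 'found', '张三', '99', '深圳']
# Collect every run of digits or ASCII letters
L = re.findall(r'[0-9]+|[a-zA-Z]+', a)
print(L)  # ['not', '404', 'found', '99']
# Drop those tokens from the word list
for token in L:
    if token in b:
        b.remove(token)
s = "".join(b)  # join without spaces to get "张三深圳"
print(s)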
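
The same result can be reached in one step with re.sub. This is an alternative sketch, not the approach used above: it deletes every run of ASCII letters, digits, and whitespace.

```python
import re

a = "not 404 found 张三 99 深圳"

# Remove ASCII letters, digits, and whitespace in a single pass.
cleaned = re.sub(r'[A-Za-z0-9\s]+', '', a)

print(cleaned)  # 张三深圳
```
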

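Example: match the date 2018-03-20.

import re
# Full date: four digits, two digits, two digits
print(re.match(r'\d{4}-\d{2}-\d{2}', '2018-03-20'))
# match does not require the end of the string, so this matches only the prefix '2018-03'
print(re.match(r'\d{4}-\d{2}', '2018-03-20'))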
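
To require that the whole string is a date, and to pull out its parts, re.fullmatch with named groups is one option. A minimal sketch; the group names and the second test string are assumptions, and the pattern checks only the shape of the date, not whether the month or day is valid.

```python
import re

# Named groups make the parts available by name.
date_re = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')

m = date_re.fullmatch('2018-03-20')
parts = m.groupdict()
print(parts)  # {'year': '2018', 'month': '03', 'day': '20'}

# fullmatch rejects strings with anything after the date.
rejected = date_re.fullmatch('2018-03-20T00:00')
print(rejected)  # None
```
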

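Example: match email addresses that end in 163.com.

import re
# s = "someone@gmail.com"  # would not match
s = "someone@163.com"
# Escape the dot so that it matches a literal '.', not any character
ss = re.match(r'.*163\.com$', s)
print(ss)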
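
An unescaped `.` in a pattern matches any character, which makes a pattern such as `.*163.com$` looser than it looks. The sketch below compares the two forms on two hypothetical inputs.

```python
import re

loose = r'.*163.com$'    # unescaped '.' matches any character
strict = r'.*163\.com$'  # escaped '.' matches only a literal dot

# Hypothetical inputs: the second one has an 'x' where the dot should be.
candidates = ["someone@163.com", "someone@163xcom"]

results = {
    c: (bool(re.match(loose, c)), bool(re.match(strict, c)))
    for c in candidates
}
print(results)
# {'someone@163.com': (True, True), 'someone@163xcom': (True, False)}
```
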

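Example: match mobile numbers that do not end in 4 or 7.

import re
P = ["13532316157", "13532316154", "13532316153"]
for p in P:
    # '1', nine more digits, then a final digit other than 4 or 7
    r = re.match(r'1\d{9}[0-35689]$', p)
    if r:
        print("Mobile number does not end in 4 or 7")
    else:
        print("Mobile number ends in 4 or 7")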
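
The check can be wrapped in a small helper. This is a sketch: the names PHONE_RE and ends_without_4_or_7 are hypothetical, and the only rule assumed is "11 digits, starting with 1". The last lines show why separators should not be written inside a character class.

```python
import re

# Hypothetical helper; fullmatch is used, so no '$' is needed.
PHONE_RE = re.compile(r'1\d{9}[0-35689]')

def ends_without_4_or_7(number):
    """Return True for an 11-digit number starting with 1 whose last digit is not 4 or 7."""
    return PHONE_RE.fullmatch(number) is not None

numbers = ["13532316157", "13532316154", "13532316153"]
checked = {n: ends_without_4_or_7(n) for n in numbers}
print(checked)

# Inside [...] the characters ',' and '|' are literals, not separators,
# so a class written as [0-3,5-6|8-9] also accepts a trailing comma.
comma_ok = re.match(r'1\d{9}[0-3,5-6|8-9]$', "1353231615,") is not None
print(comma_ok)  # True
```
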

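Example: match the first URL.

import re
s = '<img data="https://cc.jpg" src="https://bj.jpg" style="hgvghv">'
# findall returns a list of strings; take the first element
r1 = re.findall(r'https://.*?\.jpg', s)[0]
print(r1)
# search returns a match object for the first match; call group() to get the text
r2 = re.search(r'https://.*?\.jpg', s)
print(r2.group())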

Example: match Chinese characters
The usual character range for matching Chinese characters is \u4E00-\u9FA5.

import re
title='你好，hello,世界'
res=re.findall(r'[\u4E00-\u9FA5]+',title)
print(res)

Example: use a regular expression to match <html><h1>http://www.itcast.cn</h1></html>

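import re
s = ["<html><h1>http://www.itcast.cn</h1></html>", "<html><h2>http://www.itcast.cn</h2></html>"]
for page in s:
    # Accept only strings wrapped in <html><h1> ... </h1></html>
    res = re.match(r'(^<html><h1>).*(</h1></html>)$', page)
    if res:
        print("Y")
    else:
        print("N")
# Alternative: re.match(r'<(\w*)><(\w*)>.*?</\2></\1>', page)
# The backreferences \2 and \1 must repeat the tag names captured by the two groups at the start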
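
The backreference idea from the comment can be run as follows. This is a sketch: the second input, with a mismatched closing tag, is an assumption added to show a rejection.

```python
import re

pages = [
    "<html><h1>http://www.itcast.cn</h1></html>",
    "<html><h1>http://www.itcast.cn</h2></html>",  # hypothetical: closing tag does not pair up
]

# Group 1 captures the outer tag name and group 2 the inner one;
# \2 and \1 require the closing tags to repeat those names.
paired = re.compile(r'<(\w+)><(\w+)>.*?</\2></\1>')

flags = [paired.fullmatch(p) is not None for p in pages]
print(flags)  # [True, False]
```

Unlike the fixed `<html><h1>` pattern, this form accepts any pair of tag names as long as they close in the right order.
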
