Python Regular Expressions Explained

Melody~M

Last modified 2023-09-20 14:37:55

Category: Python    Tags: regular expressions, python

First published 2022-08-25 11:44:29

Article link: https://blog.csdn.net/m1992222/article/details/126430051

3-2-1 Pattern.search(string,pos,endpos)

3-2-2 Pattern.match(string,pos,endpos)

3-2-3 Pattern.fullmatch(string,pos,endpos)

3-2-4 Pattern.findall(string,pos,endpos)

3-2-5 pattern.split(string, maxsplit=0)

3-2-6 pattern.sub(repl, string, count=0, flags=0)

3-3 Regular expression object attributes:

3-3-1 pattern.flags

3-4 Functions provided by the re module

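1. Introduction to regular expressions

-> "Regular expression" loosely means "an expression that follows rules": the expression is itself a rule, and the regular expression engine uses that rule to find every part of a string that conforms to it

-> A regular expression is a special sequence of characters that makes it easy to check whether a string matches a given pattern

-> Python has included the re module (part of the standard library) since version 1.5; Python supports regular expressions through the re module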

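2. Regular expression syntax:

2-1 Ordinary characters:

# Find matching substrings in a string (substring made of ordinary characters)

import re

print(re.findall("hello","hahwiehellowordhahah")) #['hello'], finds the substring 'hello'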

print(re.findall("hello","hahwiewordhahhah")) #[]

print(re.findall("hello","hahwiehellowordhahhelloah")) #['hello', 'hello']

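2-2 Special characters:

Regular expressions use the backslash character \ as the escape character: placed before a special character, it removes the special meaning so the character is matched literally

. matches any single character except a newline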

import re
print(re.findall("hellowo.d","hahwiehellowordhahah")) #['helloword']
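Because an unescaped . matches almost any character, a pattern that is meant to find a literal dot can match more than intended. The following is a minimal sketch of the difference; the sample strings and variable names are made up for illustration.

```python
import re

text = "v1.5 and v1x5"  # hypothetical sample text

# Unescaped dot: any single character may sit between 1 and 5
any_char = re.findall(r"1.5", text)

# Escaped dot: only a literal "." is accepted
literal_dot = re.findall(r"1\.5", text)

# By default the dot does not match a newline
across_newline = re.findall(r"a.b", "a\nb")

print(any_char)        # ['1.5', '1x5']
print(literal_dot)     # ['1.5']
print(across_newline)  # []
```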

\d matches a digit; for ASCII text it is equivalent to [0-9]

import re

print(re.findall(r"\d","aahhhjjj211hh1276833a")) #['2', '1', '1', '1', '2', '7', '6', '8', '3', '3'], finds each digit

print(re.findall(r"\d\d","aahhhjjj211hh1276833a")) #['21', '12', '76', '83']
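One caveat about the [0-9] equivalence: for str patterns in Python 3, \d matches any Unicode decimal digit by default, and it is restricted to [0-9] only when the re.ASCII flag is set. A small sketch, using a made-up sample string that contains one ASCII digit and one Arabic-Indic digit:

```python
import re

# Hypothetical sample: "1" is an ASCII digit, "\u0663" is ARABIC-INDIC DIGIT THREE
text = "a1\u0663b"

# Default behaviour for str patterns: every Unicode decimal digit matches
unicode_digits = re.findall(r"\d", text)

# With re.ASCII, \d is limited to the characters 0-9
ascii_digits = re.findall(r"\d", text, flags=re.ASCII)

print(len(unicode_digits))  # 2
print(ascii_digits)         # ['1']
```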

\D matches any character that is not a digit, equivalent to [^0-9]

import re

print(re.findall(r"\D","aahhhjjj211hh1276833a")) #['a', 'a', 'h', 'h', 'h', 'j', 'j', 'j', 'h', 'h', 'a']

print(re.findall(r"\D\D","aahhhjjj211hh1276833a")) #['aa', 'hh', 'hj', 'jj', 'hh']

print(re.findall(r"\D\d","aahhhjjj211hh1276833a")) #['j2', 'h1']

[...] matches any single character listed inside the brackets; [abc] matches a, b, or c

import re

print(re.findall("hel[abcd]oword","hahwiehelaowordhahah")) #['helaoword']

[^...] matches a single character that is not listed inside the brackets

import re

print(re.findall("hel[^abc]oword","hahwiehelbowordhahah")) #[]

print(re.findall("hel[^abc]oword","hahwieheldowordhahah")) #['heldoword']

[x-y] matches a single character within the given character range

import re

print(re.findall("hel[a-c]oword","hahwiehelbowordhahah")) #['helboword'], [a-c] matches a, b, or c
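Sets, ranges, and negation can be combined inside one pair of brackets. The sketch below illustrates a few combinations; the sample strings and variable names are invented for the example.

```python
import re

# Several ranges in one set: any hexadecimal digit
hex_chars = re.findall(r"[0-9a-fA-F]", "0x1fZg")

# A hyphen placed first in the set is a literal "-", not a range
with_hyphen = re.findall(r"[-ac]", "a-b-c")

# A leading ^ negates the whole set, ranges included
non_lower = re.findall(r"[^a-z]", "ab3-Cd")

print(hex_chars)    # ['0', '1', 'f']
print(with_hyphen)  # ['a', '-', '-', 'c']
print(non_lower)    # ['3', '-', 'C']
```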

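Note:

To match a single backslash, the regular expression must be \\ (two backslashes)

To match two backslashes, the regular expression must be \\\\ (four backslashes)

In a string literal prefixed with 'r' (a raw string), Python does not treat backslashes specially, so the pattern can be typed exactly as the regular expression engine should receive it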
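The backslash rules above are easier to see side by side. This is a minimal sketch with a made-up path string: in a normal string literal every backslash of the regular expression has to be doubled again for Python, while a raw string passes the pattern through unchanged.

```python
import re

# Hypothetical sample; the actual text is C:\temp\\logs
# (one backslash after "C:", two backslashes after "temp")
path = "C:\\temp\\\\logs"

# The regex \\ (match one backslash), typed two ways
single_plain = re.findall("\\\\", path)  # normal string: four backslashes typed
single_raw = re.findall(r"\\", path)     # raw string: two backslashes typed

# The regex \\\\ (match two consecutive backslashes)
double_raw = re.findall(r"\\\\", path)

print(len(single_plain), len(single_raw))  # 3 3
print(len(double_raw))                     # 1
```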

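2-3 Quantifiers

Character	Description
X?	Matches the preceding subexpression X zero or one time; to match a literal ? use \?
X*	Matches the preceding subexpression X zero or more times; to match a literal * use \*
X+	Matches the preceding subexpression X one or more times; to match a literal + use \+
X{n}	Matches the preceding subexpression X exactly n times
X{n,}	Matches the preceding subexpression X at least n times
X{n,m}	Matches the preceding subexpression X at least n times and at most m times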
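The brace forms in the table can be tried directly. A short sketch, assuming an invented sample string of digit groups:

```python
import re

text = "12 345 6789"  # hypothetical sample text

exactly_three = re.findall(r"\d{3}", text)   # X{n}: exactly 3 digits
at_least_two = re.findall(r"\d{2,}", text)   # X{n,}: 2 or more digits
two_to_three = re.findall(r"\d{2,3}", text)  # X{n,m}: between 2 and 3 digits

# An escaped quantifier character is matched literally
question_marks = re.findall(r"\?", "why? how?")

print(exactly_three)   # ['345', '678']
print(at_least_two)    # ['12', '345', '6789']
print(two_to_three)    # ['12', '345', '678']
print(question_marks)  # ['?', '?']
```

In the last two digit patterns the match takes as many digits as the upper bound allows, which is why 6789 yields 678 under {2,3} and the leftover 9 is too short to match on its own.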

* applies to the preceding character and matches it zero or more times

import re

print(re.findall("hel*oword","hahwieheowordhahah")) #['heoword'], matches zero l

print(re.findall("hel*oword","hahwiehelowordhahah")) #['heloword'], matches one l

print(re.findall("hel*oword","hahwiehellowordhahah")) #['helloword'], matches two l

+ applies to the preceding character and matches it one or more times

import re

print(re.findall("hel+oword","hahwieheowordhahah")) #[], zero l, so the match fails

print(re.findall("hel+oword","hahwiehelowordhahah")) #['heloword'], matches one l

print(re.findall("hel+oword","hahwiehellowordhahah")) #['helloword'], matches two l

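? applies to the preceding character and matches it zero or one time; when the character occurs several times in a row, only one occurrence is matched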
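The original examples for ? are not part of this excerpt. The following is a sketch written in the same style as the * and + examples, not the author's own code; the variable names are invented.

```python
import re

zero_l = re.findall("hel?oword", "hahwieheowordhahah")    # zero l is allowed
one_l = re.findall("hel?oword", "hahwiehelowordhahah")    # one l is allowed
two_l = re.findall("hel?oword", "hahwiehellowordhahah")   # two l is one too many

# On its own, l? consumes at most one l even when more follow
one_of_many = re.findall("el?", "hello")

print(zero_l)       # ['heoword']
print(one_l)        # ['heloword']
print(two_l)        # []
print(one_of_many)  # ['el']
```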
