Python: Getting Started with Regular Expressions in re, a Beginner-Friendly Walkthrough

小伙儿.

Last modified: 2024-08-19 21:07:30

Category: Python. Tags: regular expressions, python, programming languages

First published: 2024-08-19 21:03:33

Article link: https://blog.csdn.net/qq_62757859/article/details/141334301

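Regular expressions (regex or regexp for short) are a powerful tool for matching strings, widely used for searching, replacing, and parsing text. Python supports them through the re module. What follows is a beginner-oriented introduction to regular expressions in Python.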

1. Importing the `re` module

To use regular expressions in Python, first import the re module:

import re

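2. Basic matching

2.1 Matching a single character

. : matches any single character except a newline.

\d: matches any single digit (0-9; for str patterns in Python 3, other Unicode decimal digits also match unless re.ASCII is set).

\D: matches any single non-digit character.

\w: matches any single letter, digit, or underscore (a-z, A-Z, 0-9, _; for str patterns in Python 3, Unicode letters and digits also match unless re.ASCII is set).

\W: matches any single character that is not a letter, digit, or underscore.

\s: matches any single whitespace character (space, tab, newline, and so on).

\S: matches any single non-whitespace character.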

Example: match any single character except a newline.

import re

matches = re.findall(r'.', "a1 B\tC")
print("Matches:", matches)  # Output: Matches: ['a', '1', ' ', 'B', '\t', 'C']

Example: match any single digit (0-9).

import re

matches = re.findall(r'\d', "a1b2c3")
print("Matches:", matches)  # Output: Matches: ['1', '2', '3']

Example: match any single non-digit character.

import re

matches = re.findall(r'\D', "a1b2c3")
print("Matches:", matches)  # Output: Matches: ['a', 'b', 'c']

Example: match any single letter, digit, or underscore (a-z, A-Z, 0-9, _).

import re

matches = re.findall(r'\w', "a1_ B@C")
print("Matches:", matches)  # Output: Matches: ['a', '1', '_', 'B', 'C']

Example: match any single character that is not a letter, digit, or underscore.

import re

matches = re.findall(r'\W', "a1_ B@C")
print("Matches:", matches)  # Output: Matches: [' ', '@']

Example: match any single whitespace character (space, tab, newline, and so on).

import re

matches = re.findall(r'\s', "a1_ B\tC\nD")
print("Matches:", matches)  # Output: Matches: [' ', '\t', '\n']

Example: match any single non-whitespace character.

import re

matches = re.findall(r'\S', "a1_ B\tC\nD")
print("Matches:", matches)  # Output: Matches: ['a', '1', '_', 'B', 'C', 'D']
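The examples above do not show what "." does at a newline, or how far \w reaches beyond ASCII. Here is a small sketch of both, using the standard re.DOTALL and re.ASCII flags. Neither flag appears in the original article, so treat this as an optional aside rather than part of the core material.

```python
import re

text = "a\nb"

# By default "." does not match the newline, so only the letters are found.
default_matches = re.findall(r".", text)
print(default_matches)  # ['a', 'b']

# With re.DOTALL, "." matches the newline as well.
dotall_matches = re.findall(r".", text, re.DOTALL)
print(dotall_matches)  # ['a', '\n', 'b']

# For str patterns, \w also accepts non-ASCII letters such as "中".
unicode_words = re.findall(r"\w", "a中_")
print(unicode_words)  # ['a', '中', '_']

# re.ASCII restricts \w to a-z, A-Z, 0-9 and _.
ascii_words = re.findall(r"\w", "a中_", re.ASCII)
print(ascii_words)  # ['a', '_']
```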

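2.2 Matching multiple characters

*: matches the preceding character (or character class) zero or more times.

+: matches the preceding character (or character class) one or more times.

?: matches the preceding character (or character class) zero or one time.

{n}: matches the preceding character (or character class) exactly n times.

{n,}: matches the preceding character (or character class) at least n times.

{n,m}: matches the preceding character (or character class) at least n times, but no more than m times.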

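Example: match the preceding character zero or more times.

import re

matches = re.findall(r'a*', "aabaaac")
# The empty strings are zero-length matches at 'b', at 'c', and at the end of the string.
print("Matches:", matches)  # Output: Matches: ['aa', '', 'aaa', '', '']

Example: match the preceding character one or more times.

import re

matches = re.findall(r'a+', "aabaaac")
print("Matches:", matches)  # Output: Matches: ['aa', 'aaa']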

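Example: match the preceding character zero or one time.

import re

matches = re.findall(r'a?', "aabaaac")
# 'b' and 'c' are never matched; a? produces a zero-length match at those positions and at the end.
print("Matches:", matches)  # Output: Matches: ['a', 'a', '', 'a', 'a', 'a', '', '']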

Example: match the preceding character exactly n times.

import re

matches = re.findall(r'a{3}', "aabaaac")
print("Matches:", matches)  # Output: Matches: ['aaa']

Example: match the preceding character at least n times.

import re

matches = re.findall(r'a{2,}', "aabaaac")
print("Matches:", matches)  # Output: Matches: ['aa', 'aaa']

Example: match the preceding character at least n times, but no more than m times.

import re

matches = re.findall(r'a{2,3}', "aabaaac")
print("Matches:", matches)  # Output: Matches: ['aa', 'aaa']
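Quantifiers become more useful once they are attached to the character classes from section 2.1 instead of a literal letter. The sketch below uses a made-up sample string (it is not from the original article) to show the combination:

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "tel: 010-1234, ext 56"

# \d+ grabs each run of one or more consecutive digits.
digit_runs = re.findall(r"\d+", text)
print(digit_runs)  # ['010', '1234', '56']

# \d{4} only matches where four digits appear in a row.
four_digits = re.findall(r"\d{4}", text)
print(four_digits)  # ['1234']

# \w+ grabs runs of letters, digits, and underscores.
words = re.findall(r"\w+", text)
print(words)  # ['tel', '010', '1234', 'ext', '56']
```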

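3. Common functions

3.1 `re.search()`

re.search() scans through the whole string and returns a Match object for the first location where the pattern matches, or None if there is no match.

Example:

import re

result = re.search(r"abc", "abcabcdabcde")
print(result, result.start(), result.end())  # Output: <re.Match object; span=(0, 3), match='abc'> 0 3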
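Because re.search() returns None when nothing matches, calling .start() on the result directly would raise an AttributeError in that case. One possible way to guard against it is sketched below. The helper name find_first is hypothetical, introduced here only for illustration.

```python
import re


def find_first(pattern, text):
    """Return (matched_text, start, end) for the first match, or None."""
    match = re.search(pattern, text)
    if match is None:
        # No match anywhere in the string.
        return None
    return match.group(), match.start(), match.end()


found = find_first(r"abc", "xxabcabc")
missing = find_first(r"xyz", "xxabcabc")
print(found)    # ('abc', 2, 5)
print(missing)  # None
```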

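3.2 `re.findall()`

re.findall() returns all non-overlapping matches of the pattern in the string as a list. When the pattern contains no capture groups, each item is the matched substring. If nothing matches, it returns an empty list.

Example:

import re

result = re.findall(r"ab", "ababcAbcde", re.I)  # re.I makes the match case-insensitive
print(result)  # Output: ['ab', 'ab', 'Ab']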
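To make the effect of re.I and the "non-overlapping" wording more concrete, here is a short sketch that reuses the same input string and adds one extra input ("ababa") chosen for illustration:

```python
import re

text = "ababcAbcde"

# Without re.I, the capitalised "Ab" is not matched.
case_sensitive = re.findall(r"ab", text)
print(case_sensitive)  # ['ab', 'ab']

# With re.I, letter case is ignored.
case_insensitive = re.findall(r"ab", text, re.I)
print(case_insensitive)  # ['ab', 'ab', 'Ab']

# No match at all gives an empty list, not None.
no_match = re.findall(r"xyz", text)
print(no_match)  # []

# Matches never overlap: the second "aba" would reuse the middle "a".
non_overlapping = re.findall(r"aba", "ababa")
print(non_overlapping)  # ['aba']
```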

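3.3 `re.finditer()`

re.finditer() is similar to re.findall(), but it returns an iterator that yields a Match object for each match rather than a list of matched strings. This is especially useful on large texts, because the matches are produced one at a time instead of all being loaded into memory at once.

Example:

import re

result = re.finditer(r"ab", "ababcAbcde", re.I)  # re.I makes the match case-insensitive
for data in result:
    print(data)  # prints three Match objects: spans (0, 2), (2, 4), (5, 7)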
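Since each item from re.finditer() is a Match object, the matched text and its position are both available. A minimal sketch that collects them into a list, using the same input as above:

```python
import re

positions = []
for m in re.finditer(r"ab", "ababcAbcde", re.I):
    # group() is the matched text; start()/end() are its slice indices.
    positions.append((m.group(), m.start(), m.end()))

print(positions)  # [('ab', 0, 2), ('ab', 2, 4), ('Ab', 5, 7)]
```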

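3.4 `re.match()`

re.match() tries to match the pattern at the start of the string. If it succeeds, it returns a Match object; otherwise it returns None. It only checks the beginning of the string and does not scan the rest.

Example:

import re

result = re.match(r"ab", "ababcAbcde")
print(result)  # Output: <re.Match object; span=(0, 2), match='ab'>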
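The difference between re.match() and re.search() is easiest to see when the pattern occurs in the string but not at the start. A small sketch, with an input string chosen for illustration:

```python
import re

text = "xxab"

# match() only looks at position 0, where the string has "xx".
at_start = re.match(r"ab", text)
print(at_start)  # None

# search() keeps scanning and finds "ab" further along.
anywhere = re.search(r"ab", text)
print(anywhere.span())  # (2, 4)
```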

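3.5 `re.fullmatch()`

re.fullmatch() checks whether the entire string matches the pattern. If the whole string matches, it returns a Match object; otherwise it returns None.

Example:

import re

result = re.fullmatch(r"ab", "ab", re.I)
print(result)  # Output: <re.Match object; span=(0, 2), match='ab'>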
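A typical use of re.fullmatch() is validating that an input has exactly the expected shape. The sketch below checks for a four-digit code. The helper name is_four_digit_code and the sample inputs are hypothetical, not from the original article.

```python
import re


def is_four_digit_code(s):
    """Return True only if s consists of exactly four digits."""
    return re.fullmatch(r"\d{4}", s) is not None


print(is_four_digit_code("1234"))   # True
print(is_four_digit_code("12345"))  # False: one digit too many
print(is_four_digit_code("12a4"))   # False: contains a letter

# For comparison, match() accepts "12345" because it only checks the start.
prefix_only = re.match(r"\d{4}", "12345") is not None
print(prefix_only)  # True
```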

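3.6 `re.split()`

re.split() splits a string wherever the pattern matches and returns the pieces as a list. The optional maxsplit argument limits how many splits are made.

Example:

import re

result = re.split(r"ab", "ababcAbcde")
print(result)  # Output: ['', '', 'cAbcde']

result = re.split(r"ab", "ababcAbcde", maxsplit=1)
print(result)  # Output: ['', 'abcAbcde']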
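Splitting on a pattern rather than a fixed string pays off when the separator varies. A minimal sketch that splits on any run of whitespace, with a sample string made up for illustration:

```python
import re

text = "one  two\tthree\nfour"

# \s+ treats any run of spaces, tabs, or newlines as a single separator.
parts = re.split(r"\s+", text)
print(parts)  # ['one', 'two', 'three', 'four']

# maxsplit=2 stops after two splits and leaves the rest untouched.
limited = re.split(r"\s+", text, maxsplit=2)
print(limited)  # ['one', 'two', 'three\nfour']
```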

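3.7 `re.sub()`

re.sub() replaces every substring that matches the pattern with the given replacement and returns the resulting new string. The original string is not modified.

Example:

import re

result = re.sub(r"abc", "啦啦", "ababcAbcde")
print(result)  # Output: ab啦啦Abcde ("Abc" is left alone because matching is case-sensitive)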
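re.sub() also works with the character classes and quantifiers from section 2, and its optional count argument limits how many replacements are made. A short sketch with a made-up input string:

```python
import re

text = "a1b22c333"

# Replace every run of digits with "#".
masked = re.sub(r"\d+", "#", text)
print(masked)  # a#b#c#

# count=1 replaces only the first match.
first_only = re.sub(r"\d+", "#", text, count=1)
print(first_only)  # a#b22c333
```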

Regular expressions are a powerful tool, but it takes some study and practice to get comfortable with them.
