Python: Using Regular Expressions


国民老公六哥

Category: Python basics | Tag: regular expressions

Article link: https://blog.csdn.net/weixin_44941728/article/details/105641227

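Scenario: when you have to replace a lot of dynamic data, you end up repeating many if checks and replace calls.
Purpose: a regular expression does the work of those many steps at once, matching every string that follows a given rule with a single general pattern.
Regular expressions are a general-purpose string-matching technique. The core syntax is largely the same across programming languages, although details differ between engines.
Whenever you need to find strings that share some feature or follow some rule, a regular expression is worth trying.
Related parsing techniques: JSONPath and XPath.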
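
To make the scenario concrete, here is a minimal sketch of filling several placeholders in one pass instead of chaining replace calls. The `#name#` placeholder format, the `values` dictionary and the `fill` helper are all made up for illustration; `re.sub` is a standard function of the `re` module, though the rest of this article does not cover it.

```python
import re

# Hypothetical template with two dynamic fields wrapped in '#'
template = "user=#name#, phone=#phone#"
values = {"name": "alice", "phone": "18511111111"}

def fill(match):
    # group(1) is the text captured between the two '#' characters
    return values[match.group(1)]

# One pattern handles every placeholder, no if/replace chain needed
result = re.sub(r"#(\w+)#", fill, template)
print(result)  # user=alice, phone=18511111111
```

Adding a new field then only means adding a key to `values`; the pattern itself does not change.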

How do you match?
Python wraps regular expressions in its built-in re library, which offers three main ways to match:
- match
- search
- findall

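Syntax:

. matches any single character
{} repeats the preceding item a given number of times
* matches zero or more times
? matches zero or one time; placed after another quantifier, it switches to non-greedy mode
Quantifiers in Python are greedy by default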

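match: matches only at the start of the string

# Match the literal string "abc"
import re
re_pattern = r"abc"
# Check whether the string "abcdefabc" starts with the pattern re_pattern
res = re.match(re_pattern, "abcdefabc")
# <re.Match object; span=(0, 3), match='abc'>  the span excludes index 3; None is returned when there is no match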
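
Since `re.match` returns either a match object or `None`, it helps to see both outcomes side by side. The sketch below uses the same pattern; the input `"xabc"` is an extra example string chosen here to show the failing case.

```python
import re

hit = re.match(r"abc", "abcdefabc")   # pattern is at the start
miss = re.match(r"abc", "xabc")       # pattern exists, but not at the start

print(hit.group(), hit.span())  # abc (0, 3)
print(miss)                     # None

# Guard against None before calling methods on the result
if miss is None:
    print("no match at the start")
```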

search: scans the whole string

import re
re_pattern = r"abc"
res = re.match(re_pattern, "abcdefabc")
# <re.Match object; span=(0, 3), match='abc'>  only the first occurrence is returned
res = re.search(re_pattern, "abdefabc")
# <re.Match object; span=(5, 8), match='abc'>

findall: returns every match

import re
re_pattern = r"abc"
res = re.findall(re_pattern, "abcdefabc")
# ['abc', 'abc']  drawback: the positions are not reported
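
If the positions matter, one possible workaround is `re.finditer`, which is also part of the standard `re` module and yields match objects instead of plain strings. This goes slightly beyond the three functions listed above, so treat it as an optional sketch.

```python
import re

# Each item from finditer is a match object, so span() is available
positions = [(m.group(), m.span()) for m in re.finditer(r"abc", "abcdefabc")]
print(positions)  # [('abc', (0, 3)), ('abc', (6, 9))]
```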

[] - matches any one of the characters listed inside the brackets, e.g. [abc]

import re
re_pattern = r"[abc]"
res = re.findall(re_pattern, "abcdefabc")
# ['a', 'b', 'c', 'a', 'b', 'c']

Extension: [0-9] matches any single digit in the range 0-9; [a-z] and [A-Z] work the same way for letters

re_pattern = r"[0-9]"
res = re.findall(re_pattern, "123_abc0")
# ['1', '2', '3', '0']
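
The letter ranges behave the same way, and several ranges can share one pair of brackets. A small sketch, using a sample string picked for this illustration:

```python
import re

text = "123_abcXYZ0"

lower = re.findall(r"[a-z]", text)        # lowercase letters only
upper = re.findall(r"[A-Z]", text)        # uppercase letters only
alnum = re.findall(r"[a-zA-Z0-9]", text)  # three ranges combined

print(lower)  # ['a', 'b', 'c']
print(upper)  # ['X', 'Y', 'Z']
print(alnum)  # ['1', '2', '3', 'a', 'b', 'c', 'X', 'Y', 'Z', '0']
```

The underscore is skipped in the last result because it is in none of the three ranges.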

. matches any single character except \n

import re
re_pattern = r"."
res = re.findall(re_pattern, "abcdefabc\n")
# ['a', 'b', 'c', 'd', 'e', 'f', 'a', 'b', 'c']

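{} sets how many times the preceding item must repeat; it is used together with \w and \d further on

\d matches any single digit; for ASCII text it is equivalent to the range [0-9]
\D matches any single non-digit character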

import re
re_pattern = r"\d"
res = re.findall(re_pattern, "123_abc") 
# ['1', '2', '3']
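
For comparison, a short sketch of `\D` on the same input returns exactly the characters that `\d` skipped:

```python
import re

digits = re.findall(r"\d", "123_abc")
non_digits = re.findall(r"\D", "123_abc")

print(digits)      # ['1', '2', '3']
print(non_digits)  # ['_', 'a', 'b', 'c']
```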

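\w matches any single letter, digit or underscore; for ASCII text it is equivalent to [A-Za-z0-9_]
\W matches any single character that is not a letter, digit or underscore

import re
re_pattern = r"\w"
res = re.findall(re_pattern, "123_abcdefabc\n")
# ['1', '2', '3', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'a', 'b', 'c']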
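
The opposite class `\W` can be checked the same way. In this sketch the input string is an assumption, chosen so that it contains two non-word characters, `@` and the newline:

```python
import re

non_word = re.findall(r"\W", "123_abc@def\n")
print(non_word)  # ['@', '\n']
```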

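Combinations

\w{m} matches exactly m consecutive letters, digits or underscores
\d{m} matches exactly m consecutive digits

import re
re_pattern = r"\w{2}"
res = re.findall(re_pattern, "12@3@_abcdef@a")
# ['12', '_a', 'bc', 'de']  when a match fails, scanning resumes from the next character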
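
The `\d{m}` form works the same way. A minimal sketch with two sample inputs made up for this example:

```python
import re

# Seven digits: two full groups of three, the last digit is left over
groups = re.findall(r"\d{3}", "1234567")
print(groups)  # ['123', '456']

# "12" is too short, "6789" yields one group of three and a leftover digit
dashed = re.findall(r"\d{3}", "12-345-6789")
print(dashed)  # ['345', '678']
```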

**{2,} matches two or more times**

# Greedy mode: quantifiers in Python are greedy by default
re_pattern = r"\w{2,}"
res = re.findall(re_pattern, "aa#b##123dfs_")
# ['aa', '123dfs_']

**{,2} matches at most two times, including zero times**

# Greedy mode: quantifiers in Python are greedy by default
re_pattern = r"\w{,2}"
res = re.findall(re_pattern, "aa#b##123dfs_")
# ['aa', '', 'b', '', '', '12', '3d', 'fs', '_', '']

**{2,4} matches 2 to 4 times**

# Match 2 to 4 times
re_pattern = r"\w{2,4}"
res = re.findall(re_pattern, "aa#b##123dfs_")
print(res) # ['aa', '123d', 'fs_']

**\* matches zero or more times, loosely comparable to the wildcards seen in database queries and in discover patterns**

re_pattern = r"\d*"
res = re.findall(re_pattern, "aa#b#18511111111#123dfs_") 
print(res) # ['', '', '', '', '', '18511111111', '', '123', '', '', '', '', '']

**+ matches one or more times**

re_pattern = r"\d+"
res = re.findall(re_pattern, "a1a#b#18511111111#123dfs_") 
print(res) # ['1', '18511111111', '123']
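
The difference between zero-or-more and one-or-more is easiest to see on the same input. A short sketch, with a sample string chosen for this illustration:

```python
import re

text = "a ab abb"

star = re.findall(r"ab*", text)  # 'a' followed by zero or more 'b'
plus = re.findall(r"ab+", text)  # 'a' followed by at least one 'b'

print(star)  # ['a', 'ab', 'abb']
print(plus)  # ['ab', 'abb']
```

The lone `a` only appears in the first result, because `b*` is satisfied by zero repetitions while `b+` is not.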

**Combination: \d.**

re_pattern = r"\d."
res = re.findall(re_pattern, "a1a#b#18511111111#123dfs_")
print(res)  # ['1a', '18', '51', '11', '11', '11', '1#', '12', '3d']

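? after an item matches it zero or one time, as in the example here; placed after another quantifier (for example *?, +? or {m,n}?), it switches to non-greedy mode, which matches as few characters as possible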

re_pattern = r"\d?"
res = re.findall(re_pattern, "a1a18#12dfs_")
print(res)  # ['', '1', '', '1', '8', '', '1', '2', '', '', '', '', '']
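
The `\d?` example shows the zero-or-one meaning of `?`. The non-greedy meaning only appears when `?` follows another quantifier, which the sketch below illustrates with two sample inputs made up for this purpose:

```python
import re

# Greedy: + grabs as many characters as it can
greedy_tags = re.findall(r"<.+>", "<a><b>")
# Non-greedy: +? stops at the first place the rest of the pattern fits
lazy_tags = re.findall(r"<.+?>", "<a><b>")

print(greedy_tags)  # ['<a><b>']
print(lazy_tags)    # ['<a>', '<b>']

greedy_digits = re.findall(r"\d+", "2024")
lazy_digits = re.findall(r"\d+?", "2024")

print(greedy_digits)  # ['2024']
print(lazy_digits)    # ['2', '0', '2', '4']
```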

**^ anchors the match at the start of the string**
```python
re_pattern = r"^\d"
res = re.findall(re_pattern, "aa#b#18511111111#123dfs_")
print(res) # []  the string does not start with a digit, so nothing matches
```

$ anchors the match at the end of the string

re_pattern = r"\d*$"
res = re.findall(re_pattern, "aa#b#18511111111#123dfs_22")
print(res) # ['22', '']  because * also allows zero repetitions, an empty match at the very end is returned too
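
Using `+` instead of `*` avoids the empty match, and the two anchors can be combined to test a whole string. A minimal sketch, with sample inputs chosen for this illustration:

```python
import re

text = "185abc22"

at_start = re.findall(r"^\d+", text)     # digits at the very beginning
at_end = re.findall(r"\d+$", text)       # digits at the very end
whole = re.findall(r"^\d+$", text)       # the entire string must be digits
whole_ok = re.findall(r"^\d+$", "18522")

print(at_start)  # ['185']
print(at_end)    # ['22']
print(whole)     # []
print(whole_ok)  # ['18522']
```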

How to match a mobile phone number

import re

# 1, then one of 3/5/7/8/9, then nine more digits (11 digits in total)
re_pattern = r"1[35789]\d{9}"
res = re.findall(re_pattern, "aa#b#18511111111#123dfs_")
print(res) # ['18511111111']
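
The pattern above finds a number inside a longer text. To check that an entire input is a phone number, one option is to wrap the same pattern in ^ and $. The sketch below reuses the article's pattern unchanged; the `is_phone` helper name is hypothetical, and real numbering plans may allow prefixes this pattern rejects.

```python
import re

# Same pattern as in the article, anchored at both ends
PHONE = re.compile(r"^1[35789]\d{9}$")

def is_phone(text):
    """Return True when the whole string looks like an 11-digit mobile number."""
    return PHONE.search(text) is not None

print(is_phone("18511111111"))   # True
print(is_phone("185111111112"))  # False, 12 digits
print(is_phone("12511111111"))   # False, second digit is not in [35789]
```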

Email regex
