Common Regular Expression Matching Tasks

IT之一小佬

Last modified on 2023-06-03 17:04:26

First published on 2022-10-23 23:56:30

Article link: https://blog.csdn.net/weixin_44799217/article/details/127474189

1. Matching Chinese characters

Example code:

import re

s = '人生苦短，我学python！'

# Match Chinese characters (the commonly used range U+4E00 to U+9FA5)
# Method 1: compare each character against the range
s_ch = ''
for i in s:
    if '\u4e00' <= i <= '\u9fa5':
        s_ch += i
print(s_ch)

# Method 2: precompiled pattern
aa = re.compile('[\u4e00-\u9fa5]')
bb = aa.findall(s)
print(bb)
cc = ''.join(bb)
print(cc)

# Method 3: module-level re.findall
dd = re.findall('[\u4e00-\u9fa5]', s)
print(dd)
ff = ''.join(dd)
print(ff)

Output:

人生苦短我学
['人', '生', '苦', '短', '我', '学']
人生苦短我学
['人', '生', '苦', '短', '我', '学']
人生苦短我学
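The patterns above match one character at a time, which is why the results have to be joined afterwards. As a minimal sketch, adding a `+` quantifier returns each consecutive run of Chinese characters as a single string. The names `CHINESE_RUN` and `chinese_runs` are hypothetical, and the range is the same U+4E00 to U+9FA5 used above, so rarer ideographs outside it would not be matched.

```python
import re

# Same range as above; "+" groups adjacent Chinese characters into one match.
CHINESE_RUN = re.compile(r'[\u4e00-\u9fa5]+')


def chinese_runs(text):
    """Return each consecutive run of Chinese characters in text."""
    return CHINESE_RUN.findall(text)


print(chinese_runs('人生苦短，我学python！'))  # ['人生苦短', '我学']
```

Whether single characters or whole runs are more useful depends on what is done with the result; runs keep the original word grouping, while single characters are easier to count.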

2. Matching double-byte characters (including Chinese characters)

Example code:

import re

s = '人生苦短，我学python！'

# Match double-byte characters (including Chinese characters):
# [^\x00-\xff] matches any character outside the range U+0000 to U+00FF
# Method 1: precompiled pattern
aa = re.compile('[^\x00-\xff]')
bb = aa.findall(s)
print(bb)
cc = ''.join(bb)
print(cc)

# Method 2: module-level re.findall
dd = re.findall('[^\x00-\xff]', s)
print(dd)
ff = ''.join(dd)
print(ff)

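Output:

['人', '生', '苦', '短', '，', '我', '学', '！']
人生苦短，我学！
['人', '生', '苦', '短', '，', '我', '学', '！']
人生苦短，我学！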
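In practice, "double-byte" here means any character whose code point lies above U+00FF. The sketch below illustrates where that boundary falls: an accented Latin-1 letter is excluded, while a Greek letter and fullwidth punctuation are included along with the Chinese characters. The name `NON_LATIN1` and the sample string are made up for illustration.

```python
import re

# Matches any character whose code point is above U+00FF.
NON_LATIN1 = re.compile(r'[^\x00-\xff]')

# \u00e9 is "é" (inside Latin-1); "Ω" is U+03A9 and "！" is U+FF01 (both outside).
sample = 'caf\u00e9 Ω 中文！'
matches = NON_LATIN1.findall(sample)
print(matches)  # ['Ω', '中', '文', '！']
```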

3. Matching email addresses

Example code:

import re

s = "人生苦短，我学python！my email is:123456789@qq.com"

# Match email addresses (raw strings avoid invalid-escape warnings for \w and \.)
# Method 1: precompiled pattern
aa = re.compile(r"[\w!#$%&'*+/=?^_`{|}~-]+(?:\.[\w!#$%&'*+/=?^_`{|}~-]+)*@(?:[\w](?:[\w-]*[\w])?\.)+[\w](?:[\w-]*[\w])?")
bb = aa.findall(s)
print(bb)

# Method 2: module-level re.findall
cc = re.findall(r"[\w!#$%&'*+/=?^_`{|}~-]+(?:\.[\w!#$%&'*+/=?^_`{|}~-]+)*@(?:[\w](?:[\w-]*[\w])?\.)+[\w](?:[\w-]*[\w])?",
                s)
print(cc)

Output:

['123456789@qq.com']
['123456789@qq.com']
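One thing to watch for: with `str` patterns in Python 3, `\w` also matches Chinese characters, so the example above only works cleanly because a colon separates the address from the preceding text. The following sketch, using a made-up sentence with no separator, compares the default behaviour with the `re.ASCII` flag, which restricts `\w` to `[a-zA-Z0-9_]`. The names `EMAIL` and `text` are illustrative only.

```python
import re

# Same pattern as above, stored once for reuse.
EMAIL = r"[\w!#$%&'*+/=?^_`{|}~-]+(?:\.[\w!#$%&'*+/=?^_`{|}~-]+)*@(?:[\w](?:[\w-]*[\w])?\.)+[\w](?:[\w-]*[\w])?"

text = '我的邮箱是123456789@qq.com'

# Default (Unicode) mode: the Chinese characters count as \w and join the local part.
unicode_result = re.findall(EMAIL, text)
# re.ASCII: \w only matches ASCII letters, digits and underscore.
ascii_result = re.findall(EMAIL, text, re.ASCII)

print(unicode_result)  # ['我的邮箱是123456789@qq.com']
print(ascii_result)    # ['123456789@qq.com']
```

Using `re.ASCII` assumes the addresses being extracted are ASCII-only; internationalized addresses would need a different approach.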

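4. Matching URLs

Example code:

import re

s = "人生苦短，我学python！my website is:https://www.baidu.com"

# Match URLs: a scheme of ASCII letters, "://", then everything up to the next whitespace
# (the range is A-Z; A-z would also match the characters [ \ ] ^ _ `)
# Method 1: precompiled pattern
aa = re.compile(r"[a-zA-Z]+://[^\s]*")
bb = aa.findall(s)
print(bb)

# Method 2: module-level re.findall
cc = re.findall(r"[a-zA-Z]+://[^\s]*",
                s)
print(cc)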

Output:

['https://www.baidu.com']
['https://www.baidu.com']
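Because `[^\s]*` stops only at whitespace, a URL that is followed directly by Chinese text or fullwidth punctuation will swallow that text as well. A rough sketch of one way around this, assuming the URLs of interest contain printable ASCII only, is to restrict the tail to the range U+0021 to U+007E. The sample sentence and variable names are invented for illustration.

```python
import re

text = '访问https://www.baidu.com，谢谢'

# [^\s]* stops only at whitespace, so the trailing Chinese text is included.
greedy = re.findall(r'[a-zA-Z]+://[^\s]*', text)

# Assumption: the URL consists of printable ASCII characters only.
ascii_only = re.findall(r'[a-zA-Z]+://[\x21-\x7e]+', text)

print(greedy)      # ['https://www.baidu.com，谢谢']
print(ascii_only)  # ['https://www.baidu.com']
```

The ASCII-only version can still pick up trailing ASCII punctuation such as a closing parenthesis or full stop, so it is a simplification rather than a complete URL parser.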

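5. Matching a website's title

Example code:

import requests
import re

url = 'https://pz.wendu.com/'

# Fetch the page; the timeout keeps the request from hanging indefinitely
response = requests.get(url, timeout=10)
data = response.text

# Non-greedy capture of the text between <title> and </title>;
# re.S lets "." span line breaks, and a missing title yields None instead of an IndexError
match = re.search(r'<title>(.*?)</title>', data, re.S)
print(match.group(1) if match else None)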

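Output: the text inside the page's <title> element, which depends on the page content at the time of the request.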
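Real pages often write the tag in upper case, add attributes to it, or break the title across lines, and a plain `<title>(.*?)</title>` misses all three. The sketch below is one possible, more tolerant variant that works on an HTML string without any network access. The helper `extract_title` and the sample HTML are hypothetical, and for anything beyond quick scripts an HTML parser is usually the safer choice.

```python
import re

# Allow attributes on the tag, any letter case, and line breaks inside the title.
TITLE = re.compile(r'<title[^>]*>(.*?)</title>', re.I | re.S)


def extract_title(html):
    """Return the stripped <title> text, or None when no title is found."""
    match = TITLE.search(html)
    return match.group(1).strip() if match else None


sample_html = '<html><head><TITLE lang="zh">\n  Example page\n</TITLE></head></html>'
print(extract_title(sample_html))        # Example page
print(extract_title('<p>no title</p>'))  # None
```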

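6. Matching domestic (mainland China) landline numbers

Example code:

import re

s = "人生苦短，我学python！my phone is:0101-8758521"

# Match domestic landline numbers:
# a 3-digit area code with an 8-digit number, or a 4-digit area code with a 7- or 8-digit number
# Method 1: precompiled pattern
aa = re.compile(r"\d{3}-\d{8}|\d{4}-\d{7,8}")
bb = aa.findall(s)
print(bb)

# Method 2: module-level re.findall
cc = re.findall(r"\d{3}-\d{8}|\d{4}-\d{7,8}",
                s)
print(cc)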

Output:

['0101-8758521']
['0101-8758521']
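The pattern above has no boundaries, so it can also match part of a longer digit sequence. As a sketch of one way to tighten it, the version below wraps the same two alternatives in `(?<!\d)` and `(?!\d)` so that a match cannot start or end in the middle of a run of digits. The sample text and the names `loose` and `strict` are made up for illustration.

```python
import re

text = 'a:010-12345678 b:0755-1234567 c:12345-1234567'

# Original pattern: also matches "2345-1234567" inside the 5-digit prefix of c.
loose = re.findall(r'\d{3}-\d{8}|\d{4}-\d{7,8}', text)

# Negative lookarounds: no digit directly before or after the number.
strict = re.findall(r'(?<!\d)(?:\d{3}-\d{8}|\d{4}-\d{7,8})(?!\d)', text)

print(loose)   # ['010-12345678', '0755-1234567', '2345-1234567']
print(strict)  # ['010-12345678', '0755-1234567']
```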

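7. Matching mobile phone numbers

Example code:

import re

s1 = 'num:12345678900,name:dgw,phone:19876543210,age:25'
s2 = 'num:12345678900,name:dgw,phone:119876543210,age:25'

# (?<=\D) requires a non-digit immediately before the number
aa = re.compile(r'(?<=\D)1[3456789]\d{9}')
bb = aa.findall(s1)
print(bb)

# Same pattern on s2: nothing in the 12-digit run is preceded by a non-digit and valid
cc = re.compile(r'(?<=\D)1[3456789]\d{9}')
dd = cc.findall(s2)
print(dd)

# Without the lookbehind, 11 digits inside the 12-digit run are matched
ee = re.compile(r'1[3456789]\d{9}')
ff = ee.findall(s2)
print(ff)

# (?<=\d) requires a digit before the number, so it also matches inside the run
gg = re.compile(r'(?<=\d)1[3456789]\d{9}')
hh = gg.findall(s2)
print(hh)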

Output:

['19876543210']
[]
['19876543210']
['19876543210']
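Two edge cases are worth noting with `(?<=\D)`: it needs an actual preceding character, so a number at the very start of the string is missed, and nothing stops a match from ending in the middle of a longer digit run. A possible refinement, sketched below with invented sample data, uses the negative forms `(?<!\d)` and `(?!\d)` on both sides instead.

```python
import re

# Sample: number at string start, number followed by a letter,
# a 12-digit run starting with "11", and a 12-digit run starting with "19".
text = '19876543210,13900001111x,119876543210,198765432101'

# Positive lookbehind for a non-digit: misses the first number, accepts a prefix of the last.
with_lookbehind = re.findall(r'(?<=\D)1[3-9]\d{9}', text)

# Negative lookarounds: exactly 11 digits, not embedded in a longer run.
MOBILE = re.compile(r'(?<!\d)1[3-9]\d{9}(?!\d)')
exact = MOBILE.findall(text)

print(with_lookbehind)  # ['13900001111', '19876543210']
print(exact)            # ['19876543210', '13900001111']
```

Note that the two results contain the same strings for different reasons: in the first list `19876543210` is the first 11 digits of the final 12-digit run, while in the second list it is the standalone number at the start of the text.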

8. Checking whether a string contains a number

Example code:

import re


def has_number(string):
    pattern = re.compile(r'\d+')
    return bool(pattern.search(string))


# Tests
print(has_number('hello123'))  # True
print(has_number('hello'))  # False

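Here, \d matches any digit and + matches one or more of them. The search method returns a Match object for the first match, or None if there is no match; bool() then converts that result to True or False.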

Output:

True
False
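With `str` patterns in Python 3, `\d` matches any Unicode decimal digit, not just `0` to `9`, so fullwidth digits also count as a match. If only ASCII digits should qualify, the `re.ASCII` flag is one way to express that. The helper name `has_ascii_digit` below is hypothetical, and the sketch assumes that "number" means "contains at least one digit", as in the example above.

```python
import re


def has_ascii_digit(string):
    """True if string contains at least one of the digits 0-9."""
    return re.search(r'\d', string, re.ASCII) is not None


fullwidth = 'hello\uff11\uff12\uff13'  # fullwidth digits U+FF11 to U+FF13

print(bool(re.search(r'\d', fullwidth)))  # True: Unicode mode accepts them
print(has_ascii_digit(fullwidth))         # False
print(has_ascii_digit('hello123'))        # True
```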

9. Checking whether a string contains non-standard characters

Example code:

import re


def has_nonstandard_char(string):
    pattern = re.compile(r'[&@#]')
    return bool(pattern.search(string))


# Tests
print(has_nonstandard_char('hello#world'))  # True
print(has_nonstandard_char('hello, world'))  # False

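Here, [&@#] is a character set that matches any one of &, @, or #. The search method returns a Match object for the first match, or None if there is none, and bool() converts that result to True or False.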

Output:

True
False
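Which characters count as "non-standard" will differ from case to case, so a hard-coded `[&@#]` may need to change. As a rough sketch, the set can be passed in as a string and escaped with `re.escape`, which keeps characters such as `^`, `-` and `]` literal inside the brackets. The helper `make_char_checker` is a hypothetical name and assumes a non-empty string of characters.

```python
import re


def make_char_checker(chars):
    """Build a checker for any of the given characters (hypothetical helper)."""
    # re.escape keeps characters such as ^, - and ] literal inside the class.
    pattern = re.compile('[' + re.escape(chars) + ']')
    return lambda string: pattern.search(string) is not None


has_symbol = make_char_checker('&@#')
print(has_symbol('hello#world'))        # True
print(has_symbol('hello, world'))       # False
print(make_char_checker('^-]')('a-b'))  # True
```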