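A big collection of common Python re regular expressions (integers, decimals, emails, phone numbers, license plates, starts with x and ends with y) (worth bookmarking)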

Latest recommended article published 2024-07-20 17:12:48

一晌小贪欢

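Columns: Python100个库分享 (sharing 100 Python libraries), 自己的笔记 (personal notes). Tags: python, regular expressions, mysql, Python office automation, learning Python, fuzzy matching, re regex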

Article link: https://blog.csdn.net/weixin_42636075/article/details/139743863

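Column guide

🌸 Welcome to the Python office automation column: handle office tasks with Python and free up your hands

🏳️‍🌈 Blog home page: 一晌小贪欢's blog home page (follows welcome)

👍 Column for this series: the Python office automation column (subscriptions welcome)

🕷 There is also a web scraping column: the Python web scraping basics column (subscriptions welcome)

📕 And a Python basics column: the Python basics learning column (subscriptions welcome)

The author's skills and experience are limited; if you spot an error in this article, corrections are welcome 🙏

❤️ Follows are welcome! ❤️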

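About the library

re is the regular expression module in Python's standard library.

Installation

No installation needed.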

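1. Matching integers

\d+

import re

a = 'xxxx573DCxxxxx'
# Match every run of digits (raw string, so \d is not treated as a string escape)
b = re.findall(r'\d+', a)
print(b)  # ['573']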
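The pattern `\d+` only captures unsigned digit runs and returns them as strings. As a minimal sketch, assuming you also want an optional leading minus sign and real `int` values, something like the following may work. The helper name `extract_ints` is made up for illustration.

```python
import re

def extract_ints(text):
    """Return every integer found in text as an int (hypothetical helper)."""
    # -? allows an optional minus sign; \d+ is one or more digits
    return [int(m) for m in re.findall(r'-?\d+', text)]

print(extract_ints('xxxx573DCxxxxx'))  # [573]
print(extract_ints('a-12b34'))         # [-12, 34]
```

Note that this treats every hyphen directly before a digit as a sign, so a range such as `3-4` would yield 3 and -4.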

2. Matching a fixed number of digits

\d{10} matches 10 consecutive digits in a string

import re

res = re.findall(r'\d{10}', "hasdh8523697400ffsfsd")
print(res)  # ['8523697400']
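One thing to watch for: `\d{10}` also matches the first 10 digits of a longer digit run. If the goal is a run of exactly 10 digits, one possible approach is to add lookarounds that forbid a digit on either side. The helper `find_exact_digits` below is a hypothetical name, sketched under that assumption.

```python
import re

def find_exact_digits(text, n=10):
    """Find digit runs that are exactly n digits long (hypothetical helper)."""
    # (?<!\d) = no digit immediately before, (?!\d) = no digit immediately after
    pattern = rf'(?<!\d)\d{{{n}}}(?!\d)'
    return re.findall(pattern, text)

print(find_exact_digits("hasdh8523697400ffsfsd"))  # ['8523697400']
print(find_exact_digits("id 123456789012 end"))    # [] (the run has 12 digits)
```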

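3. Matching decimals

import re
# Optional sign, integer part, optional fractional part, optional exponent
res = re.findall(r'-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?', "dasda100.025dasda")
print(res)  # ['100.025']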
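To use the matched numbers in calculations they usually need converting to `float`. Here is a small sketch that uses a pattern with explicit optional groups for the fractional part and the exponent, so a stray `e` or `-` is never swallowed. The names `NUMBER` and `extract_floats` are illustrative assumptions, not part of the `re` module.

```python
import re

# Optional sign, digits, optional ".digits", optional exponent such as e-3
NUMBER = re.compile(r'-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?')

def extract_floats(text):
    """Return all numbers in text converted to float (hypothetical helper)."""
    return [float(m) for m in NUMBER.findall(text)]

print(extract_floats("dasda100.025dasda"))    # [100.025]
print(extract_floats("t=-3.5, k=2e-3, n=7"))  # [-3.5, 0.002, 7.0]
```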

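4. Matching phone numbers

Format 1: 11 digits

import re
def match_phone_number(text):
    # Chinese mobile number: 1, then 3-9, then 9 more digits.
    # The lookarounds stop it from matching inside a longer digit run.
    pattern = r"(?<!\d)1[3-9]\d{9}(?!\d)"

    match = re.search(pattern, text)
    if match:
        print("Mobile number found:", match.group())
    else:
        print("No mobile number found")

phone_number = "我的电话号码是18712341234"  # "My phone number is 18712341234"
match_phone_number(phone_number)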

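Format 2: 187-12341234 or 187-1234-1234

import re

def find_phone_numbers(text):

    # Match exactly these two formats of Chinese mobile numbers:
    # after the first hyphen comes either 8 digits, or 4 digits, a hyphen, and 4 more digits
    pattern = r"1[3-9]\d-(?:\d{8}|\d{4}-\d{4})"
    match = re.search(pattern, text)
    if match:
        return match.group()
    return None

# Sample texts covering both formats
text1 = "我的电话号码是187-12341234"  # "My phone number is 187-12341234"
text2 = "联系方式更新为187-1234-1234"  # "Contact details updated to 187-1234-1234"

for text in [text1, text2]:
    print('Matched mobile number:', find_phone_numbers(text))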
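Once numbers in several layouts have been matched, it is often useful to store them in one canonical form. As a rough sketch, assuming the hyphens are purely cosmetic, the hypothetical helper `normalize_mobiles` below accepts the plain 11-digit form and both hyphenated forms, then strips the hyphens.

```python
import re

# 1, then 3-9, then one digit, then two groups of 4 digits with optional hyphens
MOBILE = re.compile(r'(?<!\d)1[3-9]\d-?\d{4}-?\d{4}(?!\d)')

def normalize_mobiles(text):
    """Return every mobile number in text as 11 plain digits (hypothetical helper)."""
    return [m.replace('-', '') for m in MOBILE.findall(text)]

sample = "A: 18712341234, B: 187-1234-1234, C: 139-87654321"
print(normalize_mobiles(sample))
# ['18712341234', '18712341234', '13987654321']
```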

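Format 3: (123) 456-7890, or +86 123-456-7890

import re


def find_phone_numbers(text):
    # Tries to match several phone number formats.
    # (?<!\d) and (?!\d) are used instead of \b, because \b cannot match
    # between a space and "(" or "+", so such numbers would be missed.
    pattern = r'(?<!\d)(?:\+\d{1,3}[-. ]?)?(?:\(\d{3}\)|\d{3})[-. ]?\d{3}[-. ]?\d{4}(?!\d)'
    matches = re.findall(pattern, text)
    return matches

# "My phone number is (123) 456-7890, or +1 123-456-7890, what's yours?"
text = "我的电话号码是(123) 456-7890，或者+1 123-456-7890，你的呢？"
text2 = "我的电话号码是+86 187-123-1234"  # "My phone number is +86 187-123-1234"
print(find_phone_numbers(text))  # ['(123) 456-7890', '+1 123-456-7890']
print(find_phone_numbers(text2))  # ['+86 187-123-1234']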

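5. Matching email addresses

import re

# local part, "@", domain label, ".", rest of the domain ("-" is placed last in the class so it is a literal)
prog = re.compile(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+')
# The sample texts put Chinese characters directly next to the addresses
text = '你好123@163.com'
text2 = 'xiaoming@sf.com再见'
text3 = '123456789@qq.com哈哈'
text4 = '123456789@QQ.com哦哦'
for i in [text, text2, text3, text4]:
    res = prog.findall(i)
    if res:
        print(res)
    else:
        print('No match')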
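`findall` extracts addresses embedded in longer text. To check whether a whole string is an address, `fullmatch` is the closer fit. The sketch below assumes the same simplified pattern, which is far looser than the full email grammar; `is_email` is a hypothetical helper name.

```python
import re

# Simplified pattern; real-world address rules are considerably more complex
EMAIL = re.compile(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+')

def is_email(s):
    """True only if the entire string looks like an email address (hypothetical helper)."""
    return EMAIL.fullmatch(s) is not None

print(is_email('xiaoming@sf.com'))      # True
print(is_email('xiaoming@sf.com再见'))  # False, trailing text is not part of an address
print(is_email('no-at-sign.com'))       # False
```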

6. Matching license plates

import re

# Region character, one letter, four letters/digits, then one final letter/digit or special suffix character
car_search = r'[京津沪渝冀豫云辽黑湘皖鲁新苏浙赣鄂桂甘晋蒙陕吉闽贵粤青藏川宁台琼使领军北南成广沈济空海]{1}[A-Z]{1}[A-Z0-9]{4}[A-Z0-9挂领学警港澳]{1}(?!\d)'
# Text to extract from (a freight record that contains the plate 沪DQ4557)
text = "24年3月运费 起运地：上海 目的地：全国 车种车号：厢式冷藏车浙沪DQ4557 货物信息：食品"


def car_ID_extract(text):
    all_car_id = re.findall(car_search, text)
    # dict.fromkeys drops duplicates while keeping first-seen order
    car_id = list(dict.fromkeys(all_car_id))
    return ' '.join(car_id)  # plates joined into one space-separated string


print(car_ID_extract(text))  # 沪DQ4557
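If the individual parts of a plate are needed, capture groups can split a match into its region character, issuing letter, and serial. This is only a sketch that reuses the character sets from the pattern above; the names `PLATE` and `split_plate` are assumptions made for illustration.

```python
import re

# Group 1: region character, group 2: issuing letter, group 3: five-character serial
PLATE = re.compile(
    r'([京津沪渝冀豫云辽黑湘皖鲁新苏浙赣鄂桂甘晋蒙陕吉闽贵粤青藏川宁台琼使领军北南成广沈济空海])'
    r'([A-Z])'
    r'([A-Z0-9]{4}[A-Z0-9挂领学警港澳])(?!\d)'
)

def split_plate(text):
    """Return (region, letter, serial) for the first plate in text, or None (hypothetical helper)."""
    m = PLATE.search(text)
    return m.groups() if m else None

print(split_plate("车种车号：厢式冷藏车浙沪DQ4557"))  # ('沪', 'D', 'Q4557')
```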

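7. Text that starts with xx and ends with yy

The code below extracts the string between a start marker (起运地, 出发地, or 发出地, all meaning roughly "place of departure") and an end marker (目的地 "destination" or 到达地 "place of arrival").

import re

def start(text):

    # Non-capturing start marker, lazy capture of the middle, lookahead for the end marker
    pattern = r"(?:起运地|出发地|发出地)(.*?)(?=目的地|到达地)"

    match = re.search(pattern, text)

    if match:
        # group(1) is the text in between; group() would also include the start marker
        result = match.group(1)
        # print("Matched content:", result)
        return result
    else:
        # print("No matching content found.")
        return '：'

res = start("出发地：由上海出发到达江苏淮安涟水中专技术学院-目的地：")
print(res)  # ：由上海出发到达江苏淮安涟水中专技术学院-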
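The same idea can be generalized to arbitrary start and end markers. As a minimal sketch, the hypothetical helper `between` builds the pattern from two lists of literal markers; `re.escape` keeps markers that contain regex metacharacters from being misread, and `re.S` is an assumption that lets the middle part span line breaks.

```python
import re

def between(text, starts, ends):
    """Return the text between any start marker and the nearest end marker (hypothetical helper)."""
    start_alt = '|'.join(map(re.escape, starts))
    end_alt = '|'.join(map(re.escape, ends))
    # Lazy (.*?) stops at the first end marker; the lookahead leaves the marker unconsumed
    pattern = rf'(?:{start_alt})(.*?)(?={end_alt})'
    m = re.search(pattern, text, flags=re.S)
    return m.group(1) if m else None

s = "出发地：由上海出发到达江苏淮安涟水中专技术学院-目的地："
print(between(s, ['起运地', '出发地', '发出地'], ['目的地', '到达地']))
# ：由上海出发到达江苏淮安涟水中专技术学院-
```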

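8. Matching Chinese characters

import re

text = "舟山96L*4"
# \u4e00-\u9fff is the CJK Unified Ideographs block
chinese_characters = re.findall(r'[\u4e00-\u9fff]+', text)
print(chinese_characters)  # ['舟山']

9. Matching non-Chinese text (removing Chinese characters)

import re
def remove_chinese(text):
    # Same range as in the previous section, so both examples treat the same characters as Chinese
    pattern = re.compile(r'[\u4e00-\u9fff]')
    return pattern.sub('', text)


print(remove_chinese('哈哈哈123-145苹果'))  # 123-145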
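The two operations can also be combined to split a string into its Chinese and non-Chinese parts in one call. This sketch assumes that "Chinese" means the range `\u4e00-\u9fff` only; `split_chinese` is a hypothetical helper name.

```python
import re

CJK = r'\u4e00-\u9fff'  # CJK Unified Ideographs block (assumed definition of "Chinese")

def split_chinese(text):
    """Return (chinese_part, other_part) of text (hypothetical helper)."""
    chinese = ''.join(re.findall(f'[{CJK}]+', text))
    other = re.sub(f'[{CJK}]+', '', text)
    return chinese, other

print(split_chinese('哈哈哈123-145苹果'))  # ('哈哈哈苹果', '123-145')
print(split_chinese('舟山96L*4'))          # ('舟山', '96L*4')
```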