Regular Expressions

Desire..

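This lab uses Python's re module to practice regular expressions. It covers four tasks: (1) extracting the base part of each URL (scheme and host); (2) validating the format of an IP address; (3) checking whether an email address is valid; and (4) filtering the numbers out of a text file and saving them to another file. The tasks call for attention to both the flexibility and the precision of regular expressions.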

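I. Objectives
1. Master the use of metacharacters.
2. Understand the re module for regular expressions.
II. Environment
A computer with Python 3.x and PyCharm.
III. Content and Requirements
1. Matching URLs
Given a batch of URLs: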
http://www.interoem.com/messageinfo.asp?id=35
http://3995503.com/class/class09/news_show.asp?id=14
http://lib.wzmc.edu.cn/news/onews.asp?id=769
http://www.zy-ls.com/alfx.asp?newsid=377&id=6
http://www.fincm.com/newslist.asp?id=415

After applying the regular expression, the expected result is:
http://www.interoem.com/
http://3995503.com/
http://lib.wzmc.edu.cn/
http://www.zy-ls.com/
http://www.fincm.com/
Source code:

import re

# The URLs are joined into a single string by implicit string concatenation.
a = 'http://www.interoem.com/messageinfo.asp?id=35' \
    'http://3995503.com/class/class09/news_show.asp?id=14' \
    'http://lib.wzmc.edu.cn/news/onews.asp?id=769' \
    'http://www.zy-ls.com/alfx.asp?newsid=377&id=6http://www.fincm.com/newslist.asp?id=415'
# Match the scheme, then the host up to and including the first '/' after it.
pattern = r'https?://[^/]+/'
matches = re.findall(pattern, a)  # avoid the name `str`, which shadows the built-in
for m in matches:
    print(m)
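
As a cross-check on the regular expression, the same base URLs can be derived with urllib.parse from the standard library. The sketch below assumes the URLs are kept as separate strings in a list; the names urls, BASE_RE, base_url_regex, and base_url_split are illustrative and not part of the original exercise.

```python
import re
from urllib.parse import urlsplit

# Assumption: one URL per list element, taken from the task description.
urls = [
    'http://www.interoem.com/messageinfo.asp?id=35',
    'http://3995503.com/class/class09/news_show.asp?id=14',
    'http://lib.wzmc.edu.cn/news/onews.asp?id=769',
    'http://www.zy-ls.com/alfx.asp?newsid=377&id=6',
    'http://www.fincm.com/newslist.asp?id=415',
]

# Scheme, "://", then every character up to and including the first "/".
BASE_RE = re.compile(r'https?://[^/]+/')


def base_url_regex(url):
    """Return the scheme-and-host prefix using a regular expression."""
    m = BASE_RE.match(url)
    return m.group(0) if m else None


def base_url_split(url):
    """Return the same prefix using urllib.parse.urlsplit."""
    parts = urlsplit(url)
    return f'{parts.scheme}://{parts.netloc}/'


for u in urls:
    print(base_url_regex(u))
```

Both helpers should agree on these inputs. The regex version needs a "/" after the host to match at all, while urlsplit also handles a URL with no path.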

2. Matching valid IP addresses
(Format: pattern = 'regular expression'
example = input('Please enter an IP address')
print(re.findall(pattern, example))
)
Source code:

import re

# Each octet is 250-255, 200-249, or 0-199 (a leading zero such as "01" is also accepted).
# The first octet is followed by exactly three ".octet" groups.
pattern1 = r"^((2(5[0-5]|[0-4]\d))|[0-1]?\d{1,2})(\.((2(5[0-5]|[0-4]\d))|[0-1]?\d{1,2})){3}$"
example = input('Please enter an IP address: ')
# example = "255.255.1.1"
print(re.search(pattern1, example))
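
To make the octet logic easier to read and to test without typing input, the pattern can be built from a single octet sub-pattern. This is a minimal sketch, not the only reasonable definition of "valid": it assumes that leading zeros (for example "01") should be rejected, which is stricter than the pattern above. The names OCTET, IPV4_RE, and is_valid_ipv4 are hypothetical.

```python
import re

# One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
OCTET = r'(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)'
# An IPv4 address is one octet followed by three ".octet" groups.
IPV4_RE = re.compile(rf'{OCTET}(\.{OCTET}){{3}}')


def is_valid_ipv4(text):
    """Return True if the whole string is a dotted-quad IPv4 address."""
    return IPV4_RE.fullmatch(text) is not None


for candidate in ['255.255.1.1', '192.168.0.256', '1.2.3', '01.2.3.4']:
    print(candidate, is_valid_ipv4(candidate))
```

Because the pattern contains capturing groups, re.findall would return tuples of group contents rather than the whole address, which is one reason to use re.fullmatch or re.search for validation.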

3. Matching all valid email addresses (same format as above)

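import re

# Simplified rule: the local part is 1-19 letters, digits, or underscores;
# the domain is 1-13 letters or digits; the suffix must be com, cn, or net.
# Alternation (com|cn|net) is required here: [com,cn,net] would be a character class.
pattern = r"^[0-9a-zA-Z_]{1,19}@[0-9a-zA-Z]{1,13}\.(com|cn|net)$"
# mailbox = "1765211652@qq.com"
mailbox = input('Please enter an email address: ')
print(re.search(pattern, mailbox))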
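
A common pitfall in this task is writing the suffix as a character class. `[com,cn,net]{1,3}` matches one to three characters drawn from the set c, o, m, n, e, t and the comma, so it accepts suffixes such as "moc". The sketch below compares that form with an alternation group on a few made-up sample addresses; CLASS_RE, GROUP_RE, and samples are illustrative names.

```python
import re

# Character class: any 1-3 characters from {c, o, m, ',', n, e, t}.
CLASS_RE = re.compile(r'^[0-9a-zA-Z_]{1,19}@[0-9a-zA-Z]{1,13}\.[com,cn,net]{1,3}$')
# Alternation: exactly one of the three listed suffixes.
GROUP_RE = re.compile(r'^[0-9a-zA-Z_]{1,19}@[0-9a-zA-Z]{1,13}\.(?:com|cn|net)$')

# Hypothetical sample addresses, used only for illustration.
samples = ['1765211652@qq.com', 'user@example.cn', 'user@example.moc', 'user@example.,,']

for s in samples:
    # Columns: address, accepted by the character class, accepted by the alternation.
    print(s, bool(CLASS_RE.match(s)), bool(GROUP_RE.match(s)))
```

Only the alternation form rejects the last two samples.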

4. Open test.txt, use a regular expression to filter the numbers out of its text, and store them in test1.txt.
(The content stored in test1.txt should look like:
29384845
223444444422
323455111)
Source code:

import re

# pattern = r'(\d{1,10})[-,*,/](\d{1,11})'
pattern = r'\d{1,11}'
with open('test.txt', encoding='utf-8') as f:
    d = f.read()
print(d)
match = re.findall(pattern, d)
# Write two consecutive matches per line. This assumes each number in test.txt
# is split into two digit runs by a separator such as '-', '*', or '/'.
with open('test1.txt', 'w', encoding='utf-8') as f2:
    for i, out in enumerate(match, start=1):
        f2.write(out)
        if i % 2 == 0:
            f2.write('\n')
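
The contents of test.txt are not shown, so the following is only a sketch under a simpler assumption: each number appears in the file as one unbroken run of digits. In that case `\d+` captures every number whole and each match can be written on its own line. The sample text, the extract_numbers helper, and the use of a temporary directory are all hypothetical.

```python
import re
import tempfile
from pathlib import Path


def extract_numbers(src, dst):
    """Read src, collect every run of digits, and write one per line to dst."""
    text = Path(src).read_text(encoding='utf-8')
    numbers = re.findall(r'\d+', text)
    Path(dst).write_text('\n'.join(numbers) + '\n', encoding='utf-8')
    return numbers


# Work in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'test.txt'
    dst = Path(tmp) / 'test1.txt'
    # Made-up input that mixes letters, punctuation, and digits.
    src.write_text('abc29384845def 223444444422, xyz323455111!', encoding='utf-8')
    result = extract_numbers(src, dst)
    written = dst.read_text(encoding='utf-8')
    print(written)
```

Using `\d+` instead of `\d{1,11}` avoids splitting a number longer than eleven digits, such as 223444444422, into two matches.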