Python regular expressions: the re module

Latest recommended article published on 2024-04-07 17:11:26

老王笔记

Category: Python

Article link: https://blog.csdn.net/JSWANGCHANG/article/details/100038110

re module: Python code must import this module in order to match regular expressions

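Module functions:
compile(pattern, flags=0)
   Purpose: compile a regular expression into a pattern object
   Parameters: pattern: the regular expression
           flags: optional flags that modify matching (case-insensitivity, etc.); none are set by default
   Returns: a compiled pattern object representing the regular expression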
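
As a minimal sketch of compile() with a flag (the sample text and variable names are made up for illustration), the pattern is compiled once and then reused:

```python
import re

# Compile once, reuse many times; re.IGNORECASE makes matching case-insensitive.
word = re.compile(r"python", flags=re.IGNORECASE)

hits = word.findall("Python, PYTHON and python")
print(hits)
```

Compiling up front mainly pays off when the same pattern is applied to many strings.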

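obj.findall(string, pos, endpos)
   Purpose: find every non-overlapping match of the pattern in the string; obj is a compiled pattern object
   string: the target string
   pos: index at which the search starts (optional)
   endpos: index at which the search stops (optional)
   Returns: a list of matches; if the pattern contains capturing groups, the list holds the group contents rather than the whole match

   obj = re.compile(pattern=r"(?P<IP>(\d{1,3}\.){3}\d{1,3})")
   print(obj.findall(s))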
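
The effect of capturing groups and of the start position on findall can be sketched as follows. The sample text is invented for illustration, and the non-capturing variant (?:...) is an alternative spelling, not the pattern used in these notes:

```python
import re

text = "a=10.0.0.1 b=192.168.1.20"

# With capturing groups, findall returns one tuple per match (one item per group).
grouped = re.compile(r"((\d{1,3}\.){3}\d{1,3})")
grouped_hits = grouped.findall(text)

# With a non-capturing group (?:...), findall returns each whole match as a string.
plain = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")
plain_hits = plain.findall(text)

# The second argument is the index at which the search starts.
tail_hits = plain.findall(text, 11)

print(grouped_hits)
print(plain_hits)
print(tail_hits)
```

When only the full address is wanted, the non-capturing form avoids having to pick the first element out of each tuple.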

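obj.split(string)
   Purpose: split the target string wherever the pattern matches
   Parameters: the target string
   Returns: the pieces of the split string, as a list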
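
A small sketch of split on a pattern with several possible separators; the separator set and input are assumptions chosen for illustration, and maxsplit is the optional limit that Pattern.split accepts:

```python
import re

# Split on any run of commas, semicolons or whitespace.
sep = re.compile(r"[,;\s]+")

parts = sep.split("a, b;c   d")
limited = sep.split("a, b;c   d", maxsplit=1)  # stop after the first split

print(parts)
print(limited)
```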

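obj.sub(repl, string, count=0)
   Purpose: replace the text matched by the pattern
   Parameters: repl: the replacement text
       string: the target string
       count: maximum number of replacements; 0 (the default) replaces every match
   Returns: the string after replacement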

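obj.subn(repl, string, count=0)
   Purpose: replace the text matched by the pattern
   Parameters: repl: the replacement text
       string: the target string
       count: maximum number of replacements; 0 (the default) replaces every match
   Returns: a tuple of the string after replacement and the number of replacements made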
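
The difference between sub, sub with a count limit, and subn can be sketched like this (the input string is made up for illustration):

```python
import re

spaces = re.compile(r"\s+")
text = "one two  three   four"

all_replaced = spaces.sub("-", text)         # replace every run of whitespace
first_only = spaces.sub("-", text, count=1)  # replace only the first run
result, n = spaces.subn("-", text)           # also report how many were replaced

print(all_replaced)
print(first_only)
print(result, n)
```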

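obj.finditer(string)
   Purpose: find every match of the pattern in the target string
   Parameters: the target string
   Returns: an iterator that yields one match object per match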
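
Because finditer yields match objects rather than plain strings, the position of each match is available as well. A minimal sketch with an invented input string:

```python
import re

digits = re.compile(r"\d+")

# Each item is a match object, so text and position can be read together.
found = [(m.group(), m.start(), m.end()) for m in digits.finditer("v1 and v22")]
print(found)
```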

obj.match(string)
   Purpose: match the pattern at the beginning of a string
   Parameters: the target string
   Returns: a match object if the pattern matches
           None if it does not

obj.search(string)
   Purpose: scan the string and return only the first match
   Parameters: the target string
   Returns: a match object if the pattern matches
           None if it does not

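*note: match vs. search: match only matches at the very start of the string; search can match at any position, but still returns only the first match
       Both return None when nothing matches, so calling an attribute or method such as .group() on the result raises AttributeError; check the result for None first, or wrap the call in try/except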
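
A minimal sketch of the None check described in the note. The helper name first_match is hypothetical and exists only for this example:

```python
import re

pat = re.compile(r"this")
text = "This is this"

m_match = pat.match(text)    # None: the string starts with an uppercase 'T'
m_search = pat.search(text)  # finds the lowercase 'this' further along

def first_match(pattern, string):
    """Return the matched text, or None when nothing matches (hypothetical helper)."""
    m = pattern.search(string)
    return m.group() if m is not None else None

print(m_match)
print(m_search.start())
print(first_match(pat, "no hit here"))
```

Testing for None is usually clearer than catching AttributeError, since the exception could also come from an unrelated mistake.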

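Flags: S/DOTALL: lets . match \n as well
       I/IGNORECASE: ignore case
       M/MULTILINE: affects ^ and $, so that they match at the start and end of each line (delimited by \n) instead of only at the start and end of the whole string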

To use several flags together, join them with | (bitwise OR), e.g. re.X | re.I
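
Combining flags with | can be sketched as below. The sample strings are invented, and re.X (VERBOSE) is used here on the assumption that the reader wants whitespace and comments inside the pattern:

```python
import re

text = "Hello World\nhello kitty"

# re.I | re.M: ignore case, and let ^ match at the start of every line.
pat = re.compile(r"^hello", re.I | re.M)
hits = pat.findall(text)

# re.X lets the pattern span lines and carry comments; whitespace in it is ignored.
ip = re.compile(r"""
    (?:\d{1,3}\.){3}   # three octets, each followed by a dot
    \d{1,3}            # final octet
""", re.X | re.I)
ip_hits = ip.findall("IP: 172.20.18.13")

print(hits)
print(ip_hits)
```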

Homework:
Read the contents of a file and match every word in it that starts with an uppercase letter;

import re

s = """ \
    "0be1-0f2b-11e9-85c7-48d539409366.
2019-01-03 15:44:56 18075 [Note] Server hostname (bind-address): '172.20.18.13'; port: 3358
2019-01-03 15:44:56 18075 [Note]   - '172.20.18.13' resolves to '172.20.18.13';
2019-01-03 15:44:56 18075 [Note] Server socket created on IP: '172.20.18.13'.
2019-01-03 15:44:56 18075 [Warning] 'user' entry 'root@a02-r05-i18-13-a003335.jd.local' ignored in --skip-name-resolve mode.
2019-01-03 15:44:56 18075 [Warning] 'user' entry '@a02-r05-i18-13-a003335.jd.local' ignored in --skip-name-resolve mode.
2019-01-03 15:44:56 18075 [Warning] 'proxies_priv' entry '@ root@a02-r05-i18-13-a003335.jd.local' ignored in --skip-name-resolve mode.
"""

s2 = 'Beautiful is better than ugly.'

#findall
pattern_match = r"((\d{1,3}\.){3}\d{1,3})"
obj_IP = re.compile(pattern=pattern_match)
print(obj_IP.findall(s))
print('-'*30)


#split: split a string
pattern_split = r"\s+"
obj = re.compile(pattern=pattern_split)
print(obj.split(s2))


#obj.sub
pattern_sub = r'\s+'
obj = re.compile(pattern=pattern_sub)
print(obj.sub('#','This is china. It is a slept lion'))

#obj.subn
print(obj.subn('#','This is china.'))


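#obj.finditer(string)
pattern_iter = r'\d+'
obj = re.compile(pattern = pattern_iter)
it = obj.finditer('Changes unseen in a century, reform and opening up, dissolution of the Soviet Union 1991, Asian financial crisis 1997, 1997/1999 return of Hong Kong and Macau, 2001 China joins the WTO, 2002 SARS, 2008 financial crisis, 2012 China first in PPP, 2014 second in GDP, 2018 China-US trade war ')
for m in it:
    print(m.group())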


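#obj.match
pattern_match = r'this'
obj  = re.compile(pattern=pattern_match)
m = obj.match('this is this, is this right?')
if m:   # match() returns None when the string does not start with the pattern
    print(m.group())


#obj.search
pattern_match = r'this'
obj  = re.compile(pattern=pattern_match)
m = obj.search('This is this, is this right?')
if m:   # search() returns None when the pattern occurs nowhere in the string
    print(m.group())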

print('='*60)
obj = re.compile(pattern=r"((\d{1,3}\.){3}\d{1,3})")
obj_match = obj.search(s)
print(obj_match.group())
print('+'*60)

obj = re.compile(r'(?P<IP>ip_address)')
# attributes of a compiled pattern object
print('flags:',obj.flags)
print('pattern:',obj.pattern)
print('groupindex:',obj.groupindex)
print('groups:', obj.groups)



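# Homework: read a file and match every word in it that starts with an uppercase letter
try:
    # raw string, so the backslashes in the Windows path are not treated as escapes
    with open(r'D:\secureCRT_Log\mysql_run.err', 'r') as f:
        s = f.read()
except IOError:
    print('Failed to open the file')
    s = ''   # fall back to empty text so the code below still runs

# \b anchors the match at the start of a word
pattern1 = r'\b[A-Z]\w*'
obj = re.findall(pattern1, s)
print(obj)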

import re


s = '''Hello World
hello kitty
Hi China
Great china'''

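# . does not match \n by default; with re.S it does
print(re.findall('.+',s))
print(re.findall('.+',s,re.S))

# case-sensitive vs. re.IGNORECASE
print(re.findall(r'H\w+',s))
print(re.findall(r'H\w+',s, re.IGNORECASE))

# ^ matches only at the start of the whole string unless re.MULTILINE is set
print(re.findall('^hello',s))
print(re.findall('^hello',s,re.MULTILINE))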

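Homework 2: given a file of network interface information, take an interface name and a field inside it as input, and return the corresponding address; for example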

ens4f1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether a0:36:9f:e0:df:79 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0x90b00000-90bfffff

Input: ens4f1 ether, to get the MAC address
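
One possible approach to Homework 2, offered as a sketch rather than a reference solution. The function name get_field is hypothetical, the ens4f0 entry in SAMPLE is invented so that there are two interfaces to tell apart, and the code assumes every interface block starts with a line of the form "name: flags=...":

```python
import re

# Hypothetical sample data; only the ens4f1 lines come from the exercise text.
SAMPLE = """\
ens4f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether a0:36:9f:e0:df:78 txqueuelen 1000 (Ethernet)
ens4f1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether a0:36:9f:e0:df:79 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
"""

def get_field(text, iface, field):
    """Return the value that follows `field` inside the block of `iface`, or None."""
    # Capture from the interface header up to the next header or the end of the text.
    block = re.search(
        rf"^{re.escape(iface)}: flags=.*?(?=^\S+: flags=|\Z)",
        text,
        re.M | re.S,
    )
    if block is None:
        return None
    # Inside that block, take the first non-space token after the field name.
    m = re.search(rf"\b{re.escape(field)}\s+(\S+)", block.group())
    return m.group(1) if m else None

print(get_field(SAMPLE, "ens4f1", "ether"))
```

Restricting the second search to the interface's own block is what keeps the MAC address of one interface from being returned for another.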