Python Notes ⑧ 12.5

绵逸

Article link: https://blog.csdn.net/qq_45669465/article/details/103439443

I. Regular Expressions

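Regular expressions are typically used to search for and replace text that matches a given pattern (rule). They serve two purposes:

Testing whether a given string satisfies the filtering logic of the regular expression (this is called "matching");
Extracting the specific parts we want from a string.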

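list1 = ['hello', 'python', 'pyinfo', 'pygame', 'china', 'zero', 'apple', 'open']
# Without regular expressions: keep only the strings that start with 'py'
s = []
for i in list1:
    if i[0:2] == 'py':
        s.append(i)
print(s)

li = [1, 2, 3, 4, 5, 6]
# Map each number in li to the string at the same index in list1
p = {num: list1[idx] for idx, num in enumerate(li)}
print(p)

# Same mapping, but only for the strings that start with 'py'
p = {li[idx]: word for idx, word in enumerate(list1) if word[0:2] == 'py'}
print(p)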
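
The same prefix filter can also be written with the re module. This is a minimal sketch that reuses list1 from the snippet above; the name py_words is introduced here only for illustration.

```python
import re

list1 = ['hello', 'python', 'pyinfo', 'pygame', 'china', 'zero', 'apple', 'open']

# re.match checks the pattern at the start of each string only,
# so 'py' here behaves like a "starts with py" test.
py_words = [word for word in list1 if re.match('py', word)]
print(py_words)  # ['python', 'pyinfo', 'pygame']
```

For a fixed prefix like this, plain slicing works just as well; the regular expression starts to pay off once the rule becomes a pattern rather than a literal.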

II. The re Module

➢ In Python, the built-in re module provides the language's full regular-expression support.
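
As a quick orientation, the sketch below shows the basic shape of a re.match call. It returns a Match object on success and None on failure, so it is safest to check the result before calling group(). The sample strings are made up for illustration.

```python
import re

# re.match(pattern, string) tries to match only at the start of the string.
hit = re.match('www', 'www.example.com')
miss = re.match('www', 'example.www.com')

print(hit.group())  # www
print(miss)         # None

# Calling miss.group() would raise AttributeError, so guard with an if.
if miss is None:
    print('Match failed')
```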

III. Matching a Single Character

import re
# re.match only matches at the beginning of the string, so the text to find must come first
result = re.match('www', 'wwwdadaiowwwdk9oa')
print(result.group())
ret = re.match('.', 'M')
print(ret.group())
ret = re.match('t.o', 'two')  # . matches any single character except a newline
print(ret.group())
ret = re.match('t..o', 'tweo')
print(ret.group())
ret = re.match('[hH]', 'Hhello')  # [hH] matches either h or H
print(ret.group())
ret = re.match('[Hh]el', 'Hello')
print(ret.group())
t = re.findall(r'\d', '34vsdgt3423')  # \d matches a digit
print(t)
t = re.findall(r'\D', 'dsi34mianyi**')  # \D matches a non-digit
print(t)

pi = re.findall(r'\D', '21df75')
if pi:
    # Print the matched results
    print(pi)
else:
    print('Match failed')

# \s matches exactly one whitespace character; it does not match several in a row
ma = re.match(r'hello\sworld', 'hello world')
if ma:
    print(ma.group())
else:
    print('Match failed')

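a = "21uh]*42'3dc"  # the string contains a single quote, so wrap it in double quotes
ma = re.findall(r'\w', a)  # \w: word characters (letters, digits, underscore)
ma1 = re.findall(r'\W', a)  # \W: non-word characters
print(ma)
print(ma1)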
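
The \s example above notes that \s matches exactly one whitespace character. Here is a small sketch of what happens with several spaces in a row, and how the + quantifier from the next section handles that case. The test strings are invented for illustration.

```python
import re

# One \s consumes exactly one whitespace character (space, tab, newline, ...).
one_space = re.match(r'hello\sworld', 'hello world')
three_spaces = re.match(r'hello\sworld', 'hello   world')
print(one_space.group())  # hello world
print(three_spaces)       # None

# \s+ accepts one or more whitespace characters in a row.
flexible = re.match(r'hello\s+world', 'hello   world')
print(flexible.group())   # hello   world
```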

IV. Matching Multiple Characters, Start and End Anchors, and Special Cases

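# * matches the preceding character zero or more times
ret = re.match('[A-Z][a-z]*', 'MmnnQ')  # match returns a Match object; call group() to get the text
print(ret.group())
ret = re.findall('[A-Z][a-z]*', 'KMmnnQ')  # findall returns a list of strings, so no group() is needed
print(ret)

# t.*o matches 'to', but t.+o does not: + needs at least one character between t and o
ma = re.match('t.+o', 'to')
if ma:
    print(ma.group())
else:
    print('Match failed')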
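
To make the difference between * and + concrete, the sketch below runs both patterns against the same inputs. The dictionary name results is introduced only for this illustration.

```python
import re

results = {}
for text in ['to', 'two', 'tweo']:
    star = re.match('t.*o', text)  # zero or more characters between t and o
    plus = re.match('t.+o', text)  # at least one character between t and o
    results[text] = (bool(star), bool(plus))
    print(text, results[text])
```

Only 'to' separates the two patterns: t.*o accepts it, while t.+o rejects it because nothing sits between the t and the o.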

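ma = re.match('https?', 'http')  # ? applies to the preceding s: s may appear zero or one time
if ma:
    print(ma.group())
else:
    print('Match failed')

# {6} means exactly 6 repetitions; {2,5} means between 2 and 5
ret = re.findall('[a-zA-Z0-9_]{6}', '12da345fg678')
print(ret)
ret = re.match('[a-zA-Z0-9_]{2,5}', '12d:a345fg678_')
print(ret.group())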
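# ^ anchors the match at the start of the string
ma = re.findall(r'^\d.*', '334hello')  # starts with a digit, followed by any number of any characters
if ma:
    print(ma)
else:
    print('Match failed')

# $ anchors the match at the end of the string
ma = re.match(r'.*\d$', '334hello5')  # any number of any characters, ending with a digit
if ma:
    print(ma.group())
else:
    print('Match failed')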
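
Combining ^, $ and a {m,n} quantifier is a common way to validate a whole string rather than just its prefix. The sketch below assumes a made-up rule (4 to 20 letters, digits, or underscores); the function name is_valid_name is hypothetical.

```python
import re

def is_valid_name(text):
    # Hypothetical rule for illustration: 4 to 20 letters, digits, or underscores.
    return re.match(r'^[a-zA-Z0-9_]{4,20}$', text) is not None

print(is_valid_name('py_2020'))  # True
print(is_valid_name('ab'))       # False: too short
print(is_valid_name('py:2020'))  # False: ':' is not in the set

# Without $, re.match is satisfied by a matching prefix.
prefix_only = re.match(r'[a-zA-Z0-9_]{4,20}', 'py_2020:abc')
print(prefix_only.group())       # py_2020
```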


import re
# [^aeiou] matches any character except a, e, i, o, u
match_obj = re.findall('[^aeiou]', 'hello')
if match_obj:
    print(match_obj)
else:
    print('Match failed')

V. Matching Groups

Matching apple and pear

import re
fruit_list = ['apple', 'banana', 'pear', 'peach']
for i in fruit_list:
    # | matches either the expression on its left or the one on its right
    match_obj = re.match('apple|pear', i)
    if match_obj:
        print('%s is what I want' % match_obj.group())
    else:
        print('%s is not what I want' % i)

Matching 163 email addresses and QQ numbers

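# The domain alternatives need their own parentheses: | has the lowest precedence, so without
# them the pattern would mean "...@163" OR "qq" OR "126" OR "sina" OR "yahoo\.com"
match_obj = re.match(r'([a-zA-Z0-9_]{4,20})@(163|qq|126|sina|yahoo)\.com$', 'hello@163.com')
if match_obj:
    print(match_obj.group())   # the whole address
    print(match_obj.group(1))  # the user name
else:
    print('Match failed')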
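
Alternation with | has the lowest precedence in a pattern, so a list of alternatives usually needs its own parentheses. The sketch below contrasts an ungrouped and a grouped pattern; the names loose and strict and the shortened domain list are chosen only for this illustration.

```python
import re

loose = r'[a-zA-Z0-9_]{4,20}@163|qq|126\.com'
strict = r'[a-zA-Z0-9_]{4,20}@(163|qq|126)\.com$'

# Without parentheses the pattern means: "...@163" OR "qq" OR "126\.com".
print(re.match(loose, 'qq').group())             # qq
print(re.match(loose, 'hello@163.org').group())  # hello@163

# With the alternatives grouped, only complete addresses match.
print(re.match(strict, 'hello@163.org'))         # None
print(re.match(strict, 'hello@qq.com').group())  # hello@qq.com
```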

match_obj = re.match(r'(qq):([1-9]\d{4,10})', 'qq:386149176')
if match_obj:
    print(match_obj.group())
    print(match_obj.group(1))  # groups are numbered from 1, increasing from left to right
    print(match_obj.group(2))  # extract the data of the second group
else:
    print('Match failed')

Matching <html>hh</html>

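# Goal: match <html>hh</html>
# Without a backreference the closing tag is not tied to the opening tag, so <html>hh</div> also matches
match_obj = re.match('<[a-zA-Z1-6]+>.*</[a-zA-Z1-6]+>', '<html>hh</div>')
if match_obj:
    print(match_obj.group())
else:
    print('Match failed')
# \num refers back to the text matched by group number num
match_obj = re.match(r'<([a-zA-Z1-6]+)>.*</\1>', '<html>hh</html>')
if match_obj:
    print(match_obj.group())
else:
    print('Match failed')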

# (?P<name>...) gives a group a name; (?P=name) refers back to what that group matched
match_obj = re.match('<(?P<name1>[a-zA-Z1-6]+)><(?P<name2>[a-zA-Z1-6]+)>.*</(?P=name2)></(?P=name1)>',
                     '<html><h1>www.youyong.cn</h1></html>')
if match_obj:
    print(match_obj.group())
else:
    print('Match failed')
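
Named groups can also be read back by name, which is often easier to follow than numeric indexes. This is a minimal sketch on the same input string; the group names outer, inner and text are hypothetical labels chosen for this example.

```python
import re

pattern = (r'<(?P<outer>[a-zA-Z1-6]+)><(?P<inner>[a-zA-Z1-6]+)>'
           r'(?P<text>.*)</(?P=inner)></(?P=outer)>')
m = re.match(pattern, '<html><h1>www.youyong.cn</h1></html>')
if m:
    print(m.group('outer'))  # html
    print(m.group('inner'))  # h1
    print(m.group('text'))   # www.youyong.cn
    print(m.groupdict())     # all named groups as a dict
else:
    print('Match failed')
```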

VI. Advanced Usage of the re Module

1. search
2. findall

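# search scans the whole string but returns only the first match: extract the number of fruits
match_obj = re.search(r'\d+', 'There are 20 fruits, 10 of which are apples')
if match_obj:
    print(match_obj.group())
else:
    print('Match failed')
# findall returns every match: extract the counts of several kinds of fruit
result = re.findall(r'\d+', '10 apples, 5 pears, 15 fruits in total')
print(result)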

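3. finditer
➢ Finds every substring that the regular expression matches and returns them as an iterator, which yields each result (a Match object) in order.
4. sub: replaces the matched text with a replacement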
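
The notes give no code for finditer and sub, so here is a minimal sketch of both. The sample sentence and the variable names are invented for illustration.

```python
import re

text = '10 apples, 5 pears, 15 fruits in total'

# finditer yields Match objects one at a time, so positions are available too.
found = []
for m in re.finditer(r'\d+', text):
    found.append(m.group())
    print(m.group(), m.span())

# sub replaces every match; the replacement can be a plain string...
masked = re.sub(r'\d+', '#', text)
print(masked)   # # apples, # pears, # fruits in total

# ...or a function that receives each Match object and returns the new text.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), text)
print(doubled)  # 20 apples, 10 pears, 30 fruits in total
```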

VII. Greedy and Non-Greedy Matching

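Greedy: provided the whole expression still matches, match as much as possible.
Non-greedy: provided the whole expression still matches, match as little as possible.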
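
In Python, the quantifiers *, +, ? and {m,n} are greedy by default; appending ? to one of them makes it non-greedy. The sketch below shows the difference on two made-up inputs.

```python
import re

html = '<h1>title</h1><p>body</p>'

greedy = re.match(r'<.+>', html)  # .+ grabs as much as it can
lazy = re.match(r'<.+?>', html)   # .+? stops at the first possible '>'
print(greedy.group())  # <h1>title</h1><p>body</p>
print(lazy.group())    # <h1>

print(re.findall(r'\d+', 'abc12345'))   # ['12345']
print(re.findall(r'\d+?', 'abc12345'))  # ['1', '2', '3', '4', '5']
```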

Summary
