Regular Expression Grouping (|, (ab), \num, (?P<name>), (?P=name))

IT之一小佬

Last modified 2022-11-15 23:13:11

First published 2021-02-05 23:29:09

Article link: https://blog.csdn.net/weixin_44799217/article/details/113705525

Regular expression syntax for alternation and grouping

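Syntax	Function
`|`	Matches either the expression on its left or the one on its right
`(ab)`	Treats the characters inside the parentheses as one group
`\num`	References the string matched by group number num
`(?P<name>)`	Gives the group a name (note the uppercase P)
`(?P=name)`	References the string matched by the group named name (note the uppercase P)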

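Group numbering: groups are numbered from left to right by the position of their opening parenthesis. The leftmost group is group 1, the next is group 2, and so on.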
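To make the numbering rule concrete, here is a small sketch with nested groups. The date pattern and the sample string are made up for illustration and are not part of the examples in this article.

```python
import re

# Groups are numbered by where their opening parenthesis appears,
# so the outer group is 1 and the two groups nested inside it are 2 and 3.
date_match = re.match(r"((\d{4})-(\d{2}))-\d{2}", "2021-02-05")

# groups() lists the captured groups in group-number order.
numbered_groups = date_match.groups()
print(numbered_groups)
```

Group 0 is always the whole match, which is why match_obj.group() and match_obj.group(0) return the same string.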

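Note: do not add spaces to a regular expression for readability. By default a space in the pattern is a literal character that must appear in the text being matched.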
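A quick sketch of why the space matters, reusing the qq:10567 sample string from later in this article. The last line uses the re.VERBOSE flag, which is not covered in this article; it is shown only as the one standard case where whitespace in a pattern is ignored.

```python
import re

# The space after the colon is part of the pattern, but the text has no space there.
with_space = re.match(r"qq: \d+", "qq:10567")

# Without the stray space the same text matches.
without_space = re.match(r"qq:\d+", "qq:10567")

# re.VERBOSE tells the engine to ignore unescaped whitespace in the pattern.
verbose = re.match(r"qq : \d+", "qq:10567", re.VERBOSE)

print(with_space, without_space.group(), verbose.group())
```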

Example 1: |

Requirement: from the list ["apple", "banana", "orange", "pear"], match apple and pear

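import re

# List of fruits
fruit_list = ["apple", "banana", "orange", "pear"]

# Iterate over the data
for value in fruit_list:
    # | matches either the expression on its left or the one on its right
    match_obj = re.match("apple|pear", value)
    if match_obj:
        print("%s is what I want" % match_obj.group())
    else:
        print("%s is not what I want" % value)

Output:

apple is what I want
banana is not what I want
orange is not what I want
pear is what I want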
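One thing worth knowing about this example: re.match only anchors the pattern at the start of the string, so a longer word that merely begins with apple or pear would also be accepted. The sketch below uses a made-up word list to show the difference between re.match and re.fullmatch; whether you need the stricter check depends on your data.

```python
import re

words = ["apple", "pear", "pearl", "pineapple"]

# re.match only requires a match at the beginning, so "pearl" slips through.
prefix_hits = [w for w in words if re.match("apple|pear", w)]

# re.fullmatch requires the whole string to equal one of the alternatives.
exact_hits = [w for w in words if re.fullmatch("apple|pear", w)]

print(prefix_hits)
print(exact_hits)
```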

Example 2: ( )

Requirement: match email addresses from 163, 126, qq, and similar providers

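import re

# Raw string so that \. reaches the regex engine unchanged; \. matches a literal dot
match_obj = re.match(r"[a-zA-Z0-9_]{4,20}@(163|126|qq|sina|yahoo)\.com", "hello@163.com")
if match_obj:
    print(match_obj.group())  # same as match_obj.group(0): the whole match
    # Get the data captured by group 1
    print(match_obj.group(1))
else:
    print("Match failed")

Output:

hello@163.com
163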
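The parentheses in this pattern do two jobs: they capture the provider name, and they limit how far | reaches. As a rough illustration, the simplified patterns below (shortened from the one above, with \w+ standing in for the user name part) show what happens when the parentheses are left out.

```python
import re

# Without parentheses, | splits the entire pattern into two alternatives:
# "\w+@163" on the left and "126\.com" on the right.
ungrouped = re.match(r"\w+@163|126\.com", "hello@163.com")

# With parentheses, | only chooses between the provider names.
grouped = re.match(r"\w+@(163|126)\.com", "hello@163.com")

print(ungrouped.group())
print(grouped.group(), grouped.group(1))
```

The ungrouped pattern still reports a match, but only for the left alternative, so the .com part is never checked.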

Requirement: match data such as qq:10567 and extract both the text qq and the QQ number

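import re

# Raw string so that \d reaches the regex engine unchanged
match_obj = re.match(r"(qq):([1-9]\d{4,10})", "qq:10567")

if match_obj:
    print(match_obj.group())
    # Group numbers start at 1 and increase from left to right
    print(match_obj.group(1))
    # Extract the data captured by the second group
    print(match_obj.group(2))
else:
    print("Match failed")

Output:

qq:10567
qq
10567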
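When a pattern has several groups, calling group(1), group(2), and so on gets repetitive. A minimal sketch of two other match object methods, groups() and span(), applied to the same pattern (the variable names label, number, and number_span are just for illustration):

```python
import re

match_obj = re.match(r"(qq):([1-9]\d{4,10})", "qq:10567")

# groups() returns all captured groups as a tuple, ordered by group number.
label, number = match_obj.groups()

# span(2) is the (start, end) index range of group 2 within the input string.
number_span = match_obj.span(2)

print(label, number, number_span)
```

This sketch assumes the match succeeds; in real code, check that match_obj is not None first, as the example above does.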

Example 3: \num

Requirement: match <html>hh</html>

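import re

# Without a backreference the closing tag is not tied to the opening tag,
# so the mismatched input <html>hh</div> is wrongly accepted
match_obj = re.match("<[a-zA-Z1-6]+>.*</[a-zA-Z1-6]+>", "<html>hh</div>")

if match_obj:
    print(match_obj.group())
else:
    print("Match failed")

# \1 requires the closing tag to repeat exactly what group 1 matched
match_obj = re.match(r"<([a-zA-Z1-6]+)>.*</\1>", "<html>hh</html>")

if match_obj:
    print(match_obj.group())
else:
    print("Match failed")

Output:

<html>hh</div>
<html>hh</html>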
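Two points about \num are easy to miss, so here is a short sketch. First, in a normal Python string the backslash has to be doubled ("\\1"), while a raw string lets you write r"\1"; both produce the same pattern text. Second, a backreference matches the captured text itself, not the sub-pattern again. The doubled-word sentence is a made-up sample.

```python
import re

# A raw string and an escaped normal string spell the same pattern.
same_pattern = r"<(\w+)>.*</\1>" == "<(\\w+)>.*</\\1>"

# \1 must repeat the exact text group 1 captured, which finds the doubled word.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")

print(same_pattern)
print(doubled.group(), doubled.group(1))
```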

Requirement: match <html><h1>www.itcast.cn</h1></html>

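import re

# The backreferences must appear in the reverse order of the opening tags:
# the inner tag (group 2) closes first, then the outer tag (group 1)
match_obj = re.match(r"<([a-zA-Z1-6]+)><([a-zA-Z1-6]+)>.*</\2></\1>", "<html><h1>www.itcast.cn</h1></html>")

if match_obj:
    print(match_obj.group())
else:
    print("Match failed")

# Wrong order: this pattern expects </html></h1>, so the match fails
match_obj2 = re.match(r"<([a-zA-Z1-6]+)><([a-zA-Z1-6]+)>.*</\1></\2>", "<html><h1>www.itcast.cn</h1></html>")

if match_obj2:
    print(match_obj2.group())
else:
    print("Match failed")

Output:

<html><h1>www.itcast.cn</h1></html>
Match failed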

Example 4: `(?P<name>)` `(?P=name)`

Requirement: match <html><h1>www.itcast.cn</h1></html>

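import re

# Group names can be chosen freely, but the references must still close
# the tags in the right order: inner tag (name2) first, then outer tag (name1)
match_obj = re.match(r"<(?P<name1>[a-zA-Z1-6]+)><(?P<name2>[a-zA-Z1-6]+)>.*</(?P=name2)></(?P=name1)>", "<html><h1>www.itcast.cn</h1></html>")

if match_obj:
    print(match_obj.group())
else:
    print("Match failed")

# Wrong order: this pattern expects </html></h1>, so the match fails
match_obj2 = re.match(r"<(?P<name1>[a-zA-Z1-6]+)><(?P<name2>[a-zA-Z1-6]+)>.*</(?P=name1)></(?P=name2)>", "<html><h1>www.itcast.cn</h1></html>")

if match_obj2:
    print(match_obj2.group())
else:
    print("Match failed")

Output:

<html><h1>www.itcast.cn</h1></html>
Match failed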
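Named groups are also convenient when reading the captured data back. The sketch below reuses the same sample string; the group names outer, inner, and text are hypothetical names picked for this illustration, and the text group is an extra group added to capture the tag contents.

```python
import re

pattern = (
    r"<(?P<outer>[a-zA-Z1-6]+)><(?P<inner>[a-zA-Z1-6]+)>"
    r"(?P<text>.*)"
    r"</(?P=inner)></(?P=outer)>"
)
match_obj = re.match(pattern, "<html><h1>www.itcast.cn</h1></html>")

# A named group can be read by name...
inner_tag = match_obj.group("inner")

# ...and it still has a number, assigned left to right like any other group.
outer_tag = match_obj.group(1)

# groupdict() returns every named group as a dictionary.
captured = match_obj.groupdict()

print(inner_tag, outer_tag, captured)
```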