Python Regular Expression Basics


爱吃年糕的刺猬



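First, what is a regular expression?

A regular expression is like fishing a needle out of the sea: you define a rule (a pattern), and that rule pulls the content you want out of a string.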

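Special symbols in regular expressions:

\d: any digit; \D: any non-digit
\w: any letter, digit, or underscore; \W: any character that is not a letter, digit, or underscore
\s: any whitespace character (space, tab, newline); \S: any non-whitespace character
\A: matches the start of the string
\Z: matches the end of the string
.: any character except a newline (spaces, letters, digits, and so on); \.: a literal dot
{3}: exactly three repetitions; {1,5}: between one and five repetitions. Never put a space inside the braces: {1, 5} is treated as literal text, not as a range.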
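The symbols above can be tried out directly. The following is a minimal sketch; the sample strings are made up for illustration, and the last line reflects how CPython's re module treats braces that contain a space.

```python
import re

# \d matches digits, \D matches everything else
digits = re.findall(r'\d', 'a1b22')
non_digits = re.findall(r'\D', 'a1b22')
print(digits, non_digits)          # ['1', '2', '2'] ['a', 'b']

# \w+ grabs runs of letters, digits and underscores; \s finds whitespace
words = re.findall(r'\w+', 'hi_1 there')
spaces = re.findall(r'\s', 'a b\tc')
print(words, spaces)               # ['hi_1', 'there'] [' ', '\t']

# \A and \Z anchor the pattern to the start and end of the string
starts = re.search(r'\Ahello', 'hello world') is not None
ends = re.search(r'world\Z', 'hello world') is not None
print(starts, ends)                # True True

# . matches any character, \. only a literal dot
any_char = re.findall(r'a.c', 'abc a-c a.c')
literal_dot = re.findall(r'a\.c', 'abc a-c a.c')
print(any_char, literal_dot)       # ['abc', 'a-c', 'a.c'] ['a.c']

# {3} means exactly three, {1,5} means one to five (as many as possible)
exactly_three = re.findall(r'\d{3}', '12345678')
one_to_five = re.findall(r'\d{1,5}', '1234567')
print(exactly_three, one_to_five)  # ['123', '456'] ['12345', '67']

# With a space inside the braces the pattern is no longer a quantifier,
# so it only matches a digit followed by the literal text "{1, 5}"
with_space = re.findall(r'\d{1, 5}', '1234567')
print(with_space)                  # []
```

Writing patterns as raw strings (r'...') keeps Python from interpreting the backslashes before the re module sees them.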

To use regular expressions in Python, first import the re module:

import re

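Commonly used functions:

search(): scans the string and returns the first place where the pattern matches (or None if it never matches)

re.search returns a match object; read the result from it with group() and groups()

For example: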

str_data = 'hello xiaomu, this is a good day!'
# 'h', then one letter, then 's': the first match is the "his" inside "this"
result = re.search(r'h[a-zA-Z]s', str_data)
print(result.group())  # his
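The difference between group() and groups() only shows up once the pattern contains parentheses. A small sketch, using a pattern with two capture groups that is made up here for illustration:

```python
import re

str_data = 'hello xiaomu, this is a good day!'

# Two capture groups: the name after "hello" and the word before "day"
result = re.search(r'hello (\w+), this is a (\w+) day', str_data)

if result is not None:
    whole = result.group()     # the entire matched text
    first = result.group(1)    # only the first capture group
    both = result.groups()     # all capture groups as a tuple
    print(whole)               # hello xiaomu, this is a good day
    print(first)               # xiaomu
    print(both)                # ('xiaomu', 'good')

# search() returns None when nothing matches, so check before calling group()
missing = re.search(r'\d+', str_data)
print(missing)                 # None
```

Calling group() on a failed search raises AttributeError, which is why the None check comes first.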

findall(): returns every non-overlapping match of the pattern as a list

For example:

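str_math = ' Result: 10, 11, 12, 16 '
# The pattern has one capture group, so findall returns the group's text
math = re.findall(r'(\d+, \d+, \d+, \d+ )', str_math)
print(math)  # ['10, 11, 12, 16 '] (the trailing space is part of the pattern)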
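The pattern above returns the four numbers as a single string. When each number is wanted separately, a shorter pattern is usually enough. A sketch on a similar string; the second example, with its key=value text, is invented here to show what findall returns when there are several groups:

```python
import re

str_math = ' Result: 10, 11, 12, 16 '

# No capture group: findall returns each full match
numbers = re.findall(r'\d+', str_math)
print(numbers)              # ['10', '11', '12', '16']

# The matches are strings, so convert them before doing arithmetic
total = sum(int(n) for n in numbers)
print(total)                # 49

# Several capture groups: findall returns a list of tuples
pairs = re.findall(r'(\w+)=(\d+)', 'a=1, b=22')
print(pairs)                # [('a', '1'), ('b', '22')]
```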

Simple exercise: checking whether a URL is valid

import re


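def check_url(url):
    # Valid here means: 4-5 letter scheme, "://", then three dot-separated parts
    result = re.findall(r'[a-zA-Z]{4,5}://\w+\.\w+\.\w+', url)
    if len(result) != 0:
        return True
    else:
        return False


def get_url(url):
    # Capture the domain that follows "https://"
    result = re.findall(r'https://(\w*\.*\w+\.*\w+)', url)
    # findall returns a list; an empty list means no match
    # (comparing the list with 0 is always true and would raise IndexError below)
    if result:
        return result[0]
    else:
        return ''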


if __name__ == '__main__':
    result = check_url('https://www.imooc.com')
    print(result)
    result = get_url('https://www.imooc.com')
    print(result)
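Because findall looks for the pattern anywhere in the string, check_url also accepts text that merely contains a URL. If the whole string should be a URL, re.fullmatch is one option. The sketch below is an assumption on my part, not part of the exercise: URL_PATTERN and check_url_strict are hypothetical names, and the pattern is deliberately simple (http or https, then at least two dot-separated parts).

```python
import re

# Hypothetical, simplified pattern: scheme, then word parts joined by dots
URL_PATTERN = re.compile(r'https?://\w+(\.\w+)+')


def check_url_strict(url):
    # fullmatch succeeds only if the pattern covers the entire string
    return URL_PATTERN.fullmatch(url) is not None


plain = check_url_strict('https://www.imooc.com')
embedded = check_url_strict('see https://www.imooc.com now')
broken = check_url_strict('https//www.imooc.com')
print(plain, embedded, broken)  # True False False
```

Real URLs can also carry ports, paths, and query strings, which this pattern rejects on purpose to stay close to the exercise.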