Python Basics: Regular Expressions

Latest recommended article published on 2021-03-24 18:13:51

尘世中迷途小码农

Category: python. Tags: python

Article link: https://blog.csdn.net/funnyrand/article/details/108360713

This section covers the most basic uses of regular expressions in Python. Regular expression syntax itself is not covered in depth.

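Python's built-in regular expression module is re. Its most basic uses are checking whether a string matches a pattern, capturing groups, and finding every substring of a string that matches a pattern.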

Checking whether a string matches a pattern

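This can be done with the search() and match() functions. The difference is that match() only tries to match at the very beginning of the string, while search() scans the string and returns the first match found at any position. For example: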

search:

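import re

# Check whether a pattern occurs anywhere in the string.
# The variable is named `text` because `str` would shadow the built-in type.
text = "www.google.com"
match = re.search(r'g.*e', text)
print(match)
print(match.span())
print(text[match.start():match.end()])

print()
# No "ag" occurs in the string, so search() returns None
match = re.search(r'ag.*e', text)
if match:
    print("Match")
else:
    print("Not match")

print()
# Search with capture groups; the raw string keeps \. as a literal dot
match = re.search(r'(g.*e)\.(com)', text)
if match:
    print(match.group(1))
    print(match.group(2))
else:
    print("Not match.")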

Output:

D:\work\python_workspace\python_study\venv\Scripts\python.exe D:/work/python_workspace/python_study/basic_11/search.py
<re.Match object; span=(4, 10), match='google'>
(4, 10)
google

Not match

google
com

Process finished with exit code 0
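
The grouping step above can be looked at a little more closely. The following is a minimal sketch, reusing the same sample string and pattern, of how group(0), groups(), and span() relate to the capture groups. The variable names are chosen here for illustration only.

```python
import re

text = "www.google.com"
m = re.search(r"(g.*e)\.(com)", text)
assert m is not None  # the pattern is known to match this sample string

# group(0) is the whole match; group(1) and group(2) are the capture groups
whole = m.group(0)
# groups() returns all capture groups as a tuple
parts = m.groups()
# span(n) gives the (start, end) indices of capture group n
spans = (m.span(1), m.span(2))

print(whole)  # google.com
print(parts)  # ('google', 'com')
print(spans)  # ((4, 10), (11, 14))
```

Slicing text with a group's span returns the same substring as the corresponding group() call.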

match:

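import re

# `text` is used instead of `str` to avoid shadowing the built-in type
text = "www.google.com"
# "www" is at the start of the string, so match() succeeds
match = re.match("www", text)
print(match)

# "google" is not at the start of the string, so match() returns None
match = re.match("google", text)
print(match)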

Output:

D:\work\python_workspace\python_study\venv\Scripts\python.exe D:/work/python_workspace/python_study/basic_11/match.py
<re.Match object; span=(0, 3), match='www'>
None

Process finished with exit code 0
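
To put the two functions side by side, here is a small sketch that runs the same patterns through match(), search(), and also fullmatch(), which only succeeds when the whole string matches. The helper name describe is hypothetical and exists only for this illustration.

```python
import re

text = "www.google.com"


def describe(pattern, s):
    """Report which of match/search/fullmatch succeed for pattern on s."""
    return {
        "match": re.match(pattern, s) is not None,          # anchored at the start
        "search": re.search(pattern, s) is not None,        # anywhere in the string
        "fullmatch": re.fullmatch(pattern, s) is not None,  # must cover the whole string
    }


middle = describe(r"google", text)
prefix = describe(r"www", text)
entire = describe(r"www\.google\.com", text)

print(middle)  # {'match': False, 'search': True, 'fullmatch': False}
print(prefix)  # {'match': True, 'search': True, 'fullmatch': False}
print(entire)  # {'match': True, 'search': True, 'fullmatch': True}
```

Note that match() anchors only the start: re.match("www", text) succeeds even though the pattern does not cover the rest of the string.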

Finding all substrings that match a pattern

This is also a very useful operation, and it is widely used in web crawlers. For example:

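import re

# `text` is used instead of `str` to avoid shadowing the built-in type
text = "www.google.com, https://www.baidu.com/, www.qq.com, https://www.amazon.com"
# .*? is non-greedy, so each match stops at the first ".com" that follows "www."
result = re.findall(r'www\..*?\.com', text)
for r in result:
    print(r)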

Output:

D:\work\python_workspace\python_study\venv\Scripts\python.exe D:/work/python_workspace/python_study/basic_11/findall.py
www.google.com
www.baidu.com
www.qq.com
www.amazon.com

Process finished with exit code 0
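
Two related variations may be useful here. The sketch below, which assumes the same sample string, shows that findall() returns only the captured text when the pattern contains a capture group, and that finditer() yields match objects when positions are also needed.

```python
import re

text = "www.google.com, https://www.baidu.com/, www.qq.com, https://www.amazon.com"
pattern = r"www\.(.*?)\.com"

# With one capture group, findall() returns the captured part of each match
names = re.findall(pattern, text)
print(names)  # ['google', 'baidu', 'qq', 'amazon']

# finditer() yields match objects, so the start index of each match is available
positions = [(m.group(0), m.start()) for m in re.finditer(pattern, text)]
for host, start in positions:
    print(host, start)
```

If the pattern contains more than one capture group, findall() returns a list of tuples, one tuple per match.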

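Of course, the re module has other functions as well, such as split and sub. Since they are used less often, they are not covered in detail here.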
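
For reference only, here is a minimal sketch of what split and sub do, using a shortened version of the sample string from the findall example. The patterns are illustrative choices, not taken from the original article.

```python
import re

text = "www.google.com, https://www.baidu.com/, www.qq.com"

# split(): cut the string at every comma followed by optional whitespace
items = re.split(r",\s*", text)
print(items)  # ['www.google.com', 'https://www.baidu.com/', 'www.qq.com']

# sub(): replace every "http://" or "https://" prefix with an empty string
cleaned = re.sub(r"https?://", "", text)
print(cleaned)  # www.google.com, www.baidu.com/, www.qq.com
```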