Python Regular Expressions

警言

Category: Python. Tags: python, regular expressions

Article link: https://blog.csdn.net/xj1009420846/article/details/90484846


Learning resource:

http://www.runoob.com/regexp/regexp-syntax.html

import re

page_hero = '''<ul class="herolist clearfix"><li><a href="herodetail/194.shtml" target="_blank"><img src="http://game.gtimg.cn/images/yxzj/img201606/heroimg/194/194.jpg" width="91px" alt="苏烈">苏烈</a></li><li><a href="herodetail/195.shtml" target="_blank"><img src="http://game.gtimg.cn/images/yxzj/img201606/heroimg/195/195.jpg" width="91px" alt="百里玄策">百里玄策</a></li><li><a href="herodetail/196.shtml" target="_blank"><img src="http://game.gtimg.cn/images/yxzj/img201606/heroimg/196/196.jpg" width="91px" alt="百里守约">百里守约</a></li><li><a href="herodetail/193.shtml" target="_blank"><img src="http://game.gtimg.cn/images/yxzj/img201606/heroimg/193/193.jpg" width="91px" alt="铠">铠</a></li></ul>
'''

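# Capture each link target; the lazy group stops at the closing double quote,
# so the surrounding quotes are not part of the captured value.
href_pattern = r'\bhref="(.*?)"'
href_regex = re.compile(href_pattern, re.IGNORECASE)
for match in href_regex.finditer(page_hero):
    print("index=%s,href:%s\n" % (match.start(), match.group(1)))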

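# Capture the hero name, i.e. the word characters between ">" and "<".
# In Python 3, \w also matches Chinese characters in str patterns.
name_pattern = r'>(\w+)<'
name_regex = re.compile(name_pattern, re.IGNORECASE)
for match in name_regex.finditer(page_hero):
    # start(1) is the index of the name itself rather than of the whole match
    print("index=%s,name:%s\n" % (match.start(1), match.group(1)))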

# Capture each image URL from the src attribute, without the quotes.
jpg_pattern = r'\bsrc="(.*?)"'
jpg_regex = re.compile(jpg_pattern, re.IGNORECASE)
for match in jpg_regex.finditer(page_hero):
    print("index=%s,src:%s\n" % (match.start(), match.group(1)))
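The three loops above scan the page once per attribute. As a rough sketch, the same fields could be pulled out in a single pass with named groups. The `sample` string, `hero_regex`, and the group names `href`, `src`, and `name` are illustrative assumptions rather than part of the original code, and the pattern relies on the markup keeping exactly the layout shown in `page_hero`. For real-world HTML a proper parser is usually more robust than a regex.

```python
import re

# Shortened stand-in for page_hero (assumption: same layout, two entries only).
sample = (
    '<ul class="herolist clearfix">'
    '<li><a href="herodetail/194.shtml" target="_blank">'
    '<img src="http://game.gtimg.cn/images/yxzj/img201606/heroimg/194/194.jpg" '
    'width="91px" alt="苏烈">苏烈</a></li>'
    '<li><a href="herodetail/195.shtml" target="_blank">'
    '<img src="http://game.gtimg.cn/images/yxzj/img201606/heroimg/195/195.jpg" '
    'width="91px" alt="百里玄策">百里玄策</a></li>'
    '</ul>'
)

# One pattern, three named groups: link target, image URL, display name.
# [^"]* stays inside an attribute value; [^>]* skips the rest of a tag.
hero_regex = re.compile(
    r'<a href="(?P<href>[^"]*)"[^>]*>'
    r'<img src="(?P<src>[^"]*)"[^>]*>'
    r'(?P<name>\w+)</a>'
)

# groupdict() turns each match into a {group name: captured text} dictionary.
heroes = [m.groupdict() for m in hero_regex.finditer(sample)]

for hero in heroes:
    print("name=%s,href=%s,src=%s" % (hero["name"], hero["href"], hero["src"]))
```

Named groups keep the fields of one hero together, so there is no need to line up three separate result lists afterwards.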

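? matches the preceding subexpression zero or one time, or, when placed after another quantifier, makes that quantifier non-greedy. To match a literal ? character, use \?.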

+ matches the preceding subexpression one or more times. To match a literal + character, use \+.
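The two rules above can be tried out with a small sketch like the following. The input strings and variable names are made up for illustration only.

```python
import re

# "?" as a quantifier: the preceding "u" may appear zero or one time.
optional_u = re.findall(r'colou?r', 'color colour colouur')

# "+" as a quantifier: one or more digits in a row.
digit_runs = re.findall(r'\d+', '194.jpg and 195.jpg')

# "?" after another quantifier makes it non-greedy (match as little as possible).
tags = '<li>a</li><li>b</li>'
greedy = re.findall(r'<li>.+</li>', tags)
lazy = re.findall(r'<li>.+?</li>', tags)

# Escaped forms match the literal characters "+" and "?".
literal = re.findall(r'\d\+\d\?', 'is 1+1? or 2+2?')

print(optional_u)   # ['color', 'colour']
print(digit_runs)   # ['194', '195']
print(greedy)       # ['<li>a</li><li>b</li>']
print(lazy)         # ['<li>a</li>', '<li>b</li>']
print(literal)      # ['1+1?', '2+2?']
```

The greedy/lazy pair shows why the patterns earlier in this post use `.*?`: without the trailing `?`, a single match would run on to the last closing quote or tag in the page.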
