Let's go straight to the example code:
import re
strs='''</li>
<li class="new-search-works-item">
<a title="2021高考宣传海报" href="/show/34840284.html"
target="_blank" class="search-works-thumb relative">
<img src="http://pic3.ntimg.cn/pic/20210331/3680455_171255068085_4.jpg" alt="2021高考宣传海报">
</a>
<div class="search-works-info">
<a href="/show/34840284.html" class="search-works-name ellipsis" title="2021高考宣传海报">2021高考宣传海报</a>
<span class="search-works-price">非商售价:<span>38</span></span>
</div>
</li>
<li class="new-search-works-item">
<a title="高考加油校园励志海报" href="/show/34843187.html"
target="_blank" class="search-works-thumb relative">
<img src="http://pic3.ntimg.cn/pic/20210401/24611550_080435886108_4.jpg" alt="高考加油校园励志海报">
</a>
<div class="search-works-info">
<a href="/show/34843187.html" class="search-works-name ellipsis" title="高考加油校园励志海报">高考加油校园励志海报</a>
<span class="search-works-price">非商售价:<span>40</span></span>
</div>
</li>'''
# Capture whatever sits between img src=" and the next double quote.
links = re.findall(r'img src="(.*?)"', strs, re.S)
print(links)
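The HTML above was copied from an arbitrary image site. The goal is to pull the two .jpg image links out of the source, and re.findall handles that easily. The re.S flag (short for re.DOTALL) lets . match newline characters as well; each src attribute here sits on a single line, so the flag is not strictly required for this pattern, but it does no harm and becomes necessary as soon as a match has to span several lines. The result comes back as a list: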
['http://pic3.ntimg.cn/pic/20210331/3680455_171255068085_4.jpg', 'http://pic3.ntimg.cn/pic/20210401/24611550_080435886108_4.jpg']
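To see what re.S actually changes, here is a minimal sketch. The html string and the href/class pattern are made up for illustration (they are not part of the original example); the tag is split across two lines the same way the <a> tags in the sample are.

```python
import re

# Hypothetical snippet: one <a> tag broken across two lines.
html = '''<a title="poster" href="/show/34840284.html"
target="_blank" class="search-works-thumb relative">'''

# This pattern has to cross the line break between href="..." and class="...".
pattern = r'href="(.*?)".*?class="(.*?)"'

without_flag = re.findall(pattern, html)      # '.' cannot match '\n'
with_flag = re.findall(pattern, html, re.S)   # '.' matches '\n' too

print(without_flag)  # []
print(with_flag)     # [('/show/34840284.html', 'search-works-thumb relative')]
```

Without the flag the search fails because . stops at the end of the first line; with re.S the same pattern finds both values. With two capture groups, re.findall returns a list of tuples.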
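In the pattern, (.*?) is the capture group that matches the image link, img src=" is the text immediately before the link, and the trailing " is the character immediately after it. The ? makes .* non-greedy, so the match stops at the first closing quote instead of running on to a later one.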
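The role of the ? inside the group can be checked with a small comparison. This is only a sketch: the line string below is invented test data with two img tags on one line, not something from the original page.

```python
import re

# Hypothetical input: two <img> tags on the same line.
line = '<img src="a.jpg" alt="first"><img src="b.jpg" alt="second">'

# Non-greedy: the group ends at the first closing quote after src=".
lazy = re.findall(r'img src="(.*?)"', line)

# Greedy: the group runs to the last double quote on the line.
greedy = re.findall(r'img src="(.*)"', line)

print(lazy)    # ['a.jpg', 'b.jpg']
print(greedy)  # ['a.jpg" alt="first"><img src="b.jpg" alt="second']
```

The greedy version swallows everything up to the last quote and returns one useless match, which is why the non-greedy (.*?) form is the usual choice for pulling attribute values out of HTML with a regex.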