Python Web Crawler: Scraping Images from a Web Page

互联网极客

Article link: https://blog.csdn.net/jsqfengbao/article/details/44620449

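Today I wrote a small example that scrapes the images from a web page. To make sure it picks up only the images you actually want,

the images first need to follow a consistent pattern in the page source, one that a regular expression can match.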
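To make the idea of a "pattern" concrete, here is a minimal Python 3 sketch that applies the same regular expression used later in this post to a small HTML fragment. The fragment, the `example.com` URLs, and the names `SAMPLE_HTML`, `IMAGE_RULE` and `extract_image_urls` are made up for illustration; only the `BDE_Image` class and the `.jpg` rule come from the post itself.

```python
import re

# Hypothetical page fragment for illustration; real markup may differ.
SAMPLE_HTML = (
    '<img class="avatar" src="http://example.com/face.jpg">'
    '<img class="BDE_Image" src="http://example.com/a/1.jpg" width="560">'
    '<img class="BDE_Image" src="http://example.com/a/2.jpg">'
    '<img class="BDE_Image" src="http://example.com/a/3.png">'
)

# The rule: an img tag with class BDE_Image whose src ends in .jpg.
IMAGE_RULE = re.compile(r'class="BDE_Image" src="(.+?\.jpg)"')


def extract_image_urls(html):
    """Return the src URLs of all images that match the rule."""
    return IMAGE_RULE.findall(html)


if __name__ == "__main__":
    for url in extract_image_urls(SAMPLE_HTML):
        print(url)
```

Only the two `BDE_Image` tags ending in `.jpg` match; the avatar and the `.png` image are ignored. One caveat: `.+?` is not stopped by a closing quote, so if a non-`.jpg` tag appears before a `.jpg` one, the match can run across both tags. A stricter character class such as `[^"]+?` avoids that.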

# -*- coding: utf-8 -*-

import re
import urllib.request


def get_content(url):
    '''Fetch the page at url and return its HTML source as text.'''
    html = urllib.request.urlopen(url)
    # Decode to text so the regex can search it; undecodable bytes are dropped.
    content = html.read().decode('utf-8', errors='ignore')
    html.close()
    return content


def get_images(info):
    '''Download every image in info that matches the rule; return the count.

    The target tags on the page look like this:
    <img class="BDE_Image" src="http://imgsrc.baidu.com/forum/w%3D580/sign=1b143d447f899e51788e3a1c72a6d990/a65049086e061d952495d9817ff40ad163d9ca0d.jpg"
    '''
    # Regular expression that defines the image rule
    regex = r'class="BDE_Image" src="(.+?\.jpg)"'

    # Compile the pattern once so it can be reused
    pet = re.compile(regex)

    Image_code = re.findall(pet, info)

    l = 0
    for image_url in Image_code:
        print(image_url)

        # Save each image under a sequential name: 0.jpg, 1.jpg, ...
        urllib.request.urlretrieve(image_url, '%s.jpg' % l)
        l += 1
    return len(Image_code)


info = get_content("http://tieba.baidu.com/p/2772656630")
print(get_images(info))
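The download loop above stops at the first image that fails to download and writes everything into the current directory. The following is a minimal Python 3 sketch of one way to make that step more forgiving. It is not the author's code: the helper names `plan_downloads` and `download_all` and the `images` output directory are assumptions made for illustration.

```python
import os
import urllib.request


def plan_downloads(image_urls, out_dir="images"):
    """Pair each URL with a numbered target path: images/0.jpg, images/1.jpg, ..."""
    return [
        (url, os.path.join(out_dir, "%d.jpg" % index))
        for index, url in enumerate(image_urls)
    ]


def download_all(image_urls, out_dir="images"):
    """Download each image and return the list of paths that were saved."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url, path in plan_downloads(image_urls, out_dir):
        try:
            urllib.request.urlretrieve(url, path)
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            print("skipped %s: %s" % (url, exc))
            continue
        saved.append(path)
    return saved
```

Keeping the naming step in `plan_downloads` separate from the network call makes it possible to check the file names without touching the network. `download_all` would be called with the list of URLs found by the regular expression.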