Reposted from https://www.cnblogs.com/xuchao/p/6087676.html
1. Inspect the web page and locate the img tags.
2. Use requests and the BeautifulSoup (bs4) library to extract the img tags from the page.
3. Once the img tags are captured, pull out the src attribute from each one; with the URLs in hand, the images can be downloaded.
4. Use urllib's urllib.urlretrieve to download each image into a folder (the preparation beforehand is to get the current working directory and create a new folder under it).
5. If there are multiple images, repeat steps 3-4 for each one.
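The src-extraction in step 3 can be sketched in a small self-contained way. The post uses BeautifulSoup (`soup.find_all("img")`); the version below uses only the standard library's html.parser so it runs without third-party packages, and the sample HTML string is made up for illustration:

```python
# A minimal sketch of step 3: collect the src of every <img> tag.
# The blog post itself uses BeautifulSoup's soup.find_all("img");
# this stdlib version does the same job with html.parser (Python 3).
from html.parser import HTMLParser

class ImgSrcCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            # Keep only links containing "http", as the original script does.
            if src and "http" in src:
                self.links.append(src)

# Made-up HTML: one absolute link, one relative link.
sample = '<div><img src="http://example.com/a.jpg"><img src="/rel/b.png"></div>'
parser = ImgSrcCollector()
parser.feed(sample)
print(parser.links)  # the relative /rel/b.png is skipped
```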
I haven't written many crawlers, but after some debugging of my own I finally got it working.
Here is the code (Python 2):
# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup
import urllib
import os
import sys

reload(sys)
sys.setdefaultencoding("utf-8")

if __name__ == '__main__':
    url = 'http://www.qiushibaike.com/'
    res = requests.get(url)
    res.encoding = 'utf-8'
    soup = BeautifulSoup(res.text, 'html.parser')
    imgs = soup.find_all("img")

    # Prepare the download folder under the current working directory.
    _path = os.getcwd()
    new_path = os.path.join(_path, 'pictures')
    if not os.path.isdir(new_path):
        os.mkdir(new_path)

    try:
        x = 1
        if imgs == []:
            print "Done!"
        for img in imgs:
            link = img.get('src')
            # Some img tags have no src, or a relative one; skip those.
            if link and 'http' in link:
                print "It's downloading the %sth picture" % x
                # os.path.join builds the file path portably
                # (the original concatenated a stray backslash by hand).
                urllib.urlretrieve(link, os.path.join(new_path, '%s.jpg' % x))
                x += 1
    except Exception, e:
        print e
    finally:
        if x:
            print "It's Done!!!"
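The script above is Python 2 (print statements, `urllib.urlretrieve`, `sys.setdefaultencoding`). A rough Python 3 sketch of the download loop in steps 4-5 follows; the helper name `download_images` is my own, not from the post, and the `fetch` parameter is injectable so the loop can be exercised without hitting the network:

```python
import os
import urllib.request

def download_images(links, dest_dir, fetch=urllib.request.urlretrieve):
    # Create the folder if needed (the "get cwd, make a folder" preparation).
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for i, link in enumerate(links, start=1):
        # os.path.join instead of hand-concatenating a path separator.
        path = os.path.join(dest_dir, "%s.jpg" % i)
        fetch(link, path)  # urlretrieve(url, filename) saves the URL to disk
        saved.append(path)
    return saved
```

Wiring it to the scrape would look like `download_images([img.get("src") for img in soup.find_all("img") if img.get("src") and "http" in img.get("src")], "pictures")`, with `soup` built from requests and BeautifulSoup as in the post.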