Reposted from https://www.cnblogs.com/xuchao/p/6087676.html
1. Inspect the web page and locate the img tags.
2. Use requests and the BeautifulSoup (bs4) library to extract the img tags from the page.
3. Once the img tags are captured, pull out the src attribute from each one; with the URLs in hand, the images can be downloaded.
4. Use urllib's urllib.urlretrieve to download each image into a folder (the preparation beforehand is to get the current working directory and create a new folder under it).
5. If there are multiple images, repeat steps 3-4 for each one.
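The src-extraction in step 3 can be sketched in a small self-contained way. The post uses BeautifulSoup (`soup.find_all("img")`); the version below uses only the standard library's html.parser so it runs without third-party packages, and the sample HTML string is made up for illustration:

```python
# A minimal sketch of step 3: collect the src of every <img> tag.
# The blog post itself uses BeautifulSoup's soup.find_all("img");
# this stdlib version does the same job with html.parser (Python 3).
from html.parser import HTMLParser

class ImgSrcCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            # Keep only links containing "http", as the original script does.
            if src and "http" in src:
                self.links.append(src)

# Made-up HTML: one absolute link, one relative link.
sample = '<div><img src="http://example.com/a.jpg"><img src="/rel/b.png"></div>'
parser = ImgSrcCollector()
parser.feed(sample)
print(parser.links)  # the relative /rel/b.png is skipped
```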
I haven't written many crawlers, but after some debugging of my own I finally got it working.
Here is the code (Python 2):
# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup
import urllib
import os
import sys

reload(sys)
sys.setdefaultencoding("utf-8")

if __name__ == '__main__':
    url = 'http://www.qiushibaike.com/'
    res = requests.get(url)
    res.encoding = 'utf-8'
    soup = BeautifulSoup(res.text, 'html.parser')
    imgs = soup.find_all("img")

    # Prepare the download folder under the current working directory.
    _path = os.getcwd()
    new_path = os.path.join(_path, 'pictures')
    if not os.path.isdir(new_path):
        os.mkdir(new_path)

    try:
        x = 1
        if imgs == []:
            print "Done!"
        for img in imgs:
            link = img.get('src')
            # Some img tags have no src, or a relative one; skip those.
            if link and 'http' in link:
                print "It's downloading the %sth picture" % x
                # os.path.join builds the file path portably
                # (the original concatenated a stray backslash by hand).
                urllib.urlretrieve(link, os.path.join(new_path, '%s.jpg' % x))
                x += 1
    except Exception, e:
        print e
    finally:
        if x:
            print "It's Done!!!"
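The script above is Python 2 (print statements, `urllib.urlretrieve`, `sys.setdefaultencoding`). A rough Python 3 sketch of the download loop in steps 4-5 follows; the helper name `download_images` is my own, not from the post, and the `fetch` parameter is injectable so the loop can be exercised without hitting the network:

```python
import os
import urllib.request

def download_images(links, dest_dir, fetch=urllib.request.urlretrieve):
    # Create the folder if needed (the "get cwd, make a folder" preparation).
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for i, link in enumerate(links, start=1):
        # os.path.join instead of hand-concatenating a path separator.
        path = os.path.join(dest_dir, "%s.jpg" % i)
        fetch(link, path)  # urlretrieve(url, filename) saves the URL to disk
        saved.append(path)
    return saved
```

Wiring it to the scrape would look like `download_images([img.get("src") for img in soup.find_all("img") if img.get("src") and "http" in img.get("src")], "pictures")`, with `soup` built from requests and BeautifulSoup as in the post.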