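Today's topic is how to download the images on the Douban home page and save them to a local folder.
The Douban home page looks like this:
The scraping code is as follows: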
import os
import re
import urllib.request

# Directory where the downloaded images are stored
imagePath = '/Users/touna/Desktop/image'

# Build the local file path for an image URL
def saveFile(path):
    # Create the target directory if it does not exist yet
    if not os.path.isdir(imagePath):
        os.mkdir(imagePath)
    # rindex() returns the index of the last occurrence of the substring
    pos = path.rindex('/')
    print('---%s' % pos)
    p = os.path.join(imagePath, path[pos+1:])
    print('++++%s' % p)
    print('++++%s' % path[pos+1:])
    return p

url = 'https://www.douban.com/'
header = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'}
req = urllib.request.Request(url=url, headers=header)
res = urllib.request.urlopen(req)
# Decode the response bytes so the regex runs on text, not on a bytes repr
data = res.read().decode('utf-8', errors='ignore')

# Match https URLs that end in jpg, png, or gif
pattern = re.compile(r'(https:[^\s]*?(jpg|png|gif))')
# set() removes duplicate URLs before downloading
for imageUrl, t in set(re.findall(pattern, data)):
    print(imageUrl)
    # urlretrieve() downloads the remote resource straight to a local file
    urllib.request.urlretrieve(imageUrl, saveFile(imageUrl))
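The two core steps of the script above, finding image URLs with a regular expression and deriving a local file name from each URL, can be tried without any network access. The following is only a minimal sketch: the helper names `extract_image_urls` and `local_path_for`, the slightly stricter pattern, and the sample HTML string are all assumptions made for illustration, not part of the original script.

```python
import os
import re
from urllib.parse import urlparse

# Same idea as the pattern in the script: https URLs ending in an image extension.
# Quotes are excluded so a match cannot run past the end of an HTML attribute.
IMAGE_URL_PATTERN = re.compile(r'https:[^\s"\']*?\.(?:jpg|png|gif)')

def extract_image_urls(html):
    """Return the unique image URLs found in html, sorted for stable output."""
    return sorted(set(IMAGE_URL_PATTERN.findall(html)))

def local_path_for(image_url, directory):
    """Join directory with the last path segment of image_url."""
    filename = os.path.basename(urlparse(image_url).path)
    return os.path.join(directory, filename)

# Hypothetical HTML snippet standing in for the downloaded page
sample_html = (
    '<img src="https://img3.doubanio.com/icon/g83759-2.jpg">'
    '<img src="https://img3.doubanio.com/icon/g83759-2.jpg">'
    '<img src="https://img1.doubanio.com/icon/g37688-27.jpg">'
)

urls = extract_image_urls(sample_html)
for u in urls:
    print(u, '->', local_path_for(u, 'image'))
```

Using `os.path.basename` on the parsed URL path gives the same file name as slicing after the last `/`, and it also ignores any query string that might follow the file name.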
The printed log output is as follows:
https://img3.doubanio.com/view/photo/albumcover/public/p2497540936.jpg
---54
++++/Users/touna/Desktop/image/p2497540936.jpg
++++p2497540936.jpg
https://img3.doubanio.com/icon/g83759-2.jpg
---30
++++/Users/touna/Desktop/image/g83759-2.jpg
++++g83759-2.jpg
https://img3.doubanio.com/icon/g109498-1.jpg
---30
++++/Users/touna/Desktop/image/g109498-1.jpg
++++g109498-1.jpg
https://img1.doubanio.com/view/dianpu_product_item/medium/public/p1982227.jpg
---64
++++/Users/touna/Desktop/image/p1982227.jpg
++++p1982227.jpg
https://img1.doubanio.com/view/photo/albumcover/public/p2498359159.jpg
---54
++++/Users/touna/Desktop/image/p2498359159.jpg
++++p2498359159.jpg
https://img3.doubanio.com/view/dianpu_product_item/medium/public/p270364.jpg
---64
++++/Users/touna/Desktop/image/p270364.jpg
++++p270364.jpg
https://img3.doubanio.com/view/dianpu_product_item/medium/public/p458880.jpg
---64
++++/Users/touna/Desktop/image/p458880.jpg
++++p458880.jpg
https://img1.doubanio.com/view/dianpu_product_item/medium/public/p509169.jpg
---64
++++/Users/touna/Desktop/image/p509169.jpg
++++p509169.jpg
https://img1.doubanio.com/icon/g37688-27.jpg
---30
++++/Users/touna/Desktop/image/g37688-27.jpg
++++g37688-27.jpg
https://img3.doubanio.com/view/dianpu_product_item/medium/public/p377790.jpg
---64
++++/Users/touna/Desktop/image/p377790.jpg
++++p377790.jpg
https://img3.doubanio.com/view/ark_article_cover/large/public/20165020.jpg
The images saved to the local folder are shown below:
If anything here is wrong, corrections from more experienced readers are very welcome.