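I recently needed some posters of highly rated movies, so I wrote a small crawler in Python. Here is the source code.
Copy it to your machine and install the bs4 and requests modules, and it will batch-download the movie posters locally. The script passes "lxml" to BeautifulSoup as the parser, so lxml has to be installed as well.
pip install bs4
pip install requests
pip install lxml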
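import requests
from bs4 import BeautifulSoup

headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"}

# This scrapes the posters from the first six pages of the Douban Top 250 list.
# Change the numbers in range() as needed.
for i in range(0, 6):
    # Each page lists 25 movies, so the start offset is 0, 25, 50, ...
    page = i * 25
    url = "https://movie.douban.com/top250?start=" + str(page) + "&filter="
    response = requests.get(url, headers=headers)
    html = response.text
    soup = BeautifulSoup(html, "lxml")
    contents = soup.find_all(class_="pic")
    for content in contents:
        # Find the img tag
        imgContent = content.find(name="img")
        # Use .attrs to read the alt attribute (the movie title) and the src attribute (the thumbnail URL)
        imgName = imgContent.attrs["alt"]
        imgUrl = imgContent.attrs["src"]
        # To get the high-resolution image, compare its URL with the thumbnail URL
        # and replace the part of the string that differs
        imgUrlHd = imgUrl.replace("s_ratio_poster", "m")
        imgResponse = requests.get(imgUrlHd)
        img = imgResponse.content
        # Save each poster under its movie title (macOS path; change the directory to suit your machine)
        with open(f"/Users/{imgName}.jpg", "wb") as f:
        # On Windows the backslashes in the path must be escaped (a raw string does this). Using my own machine as an example:
        # with open(rf"D:\电影海报\爬取的图片\{imgName}.jpg", "wb") as f:
            f.write(img)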
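
When a poster is saved under the movie title, as in the `{imgName}.jpg` path above, the title may contain characters that are not allowed in file names (for example `/` or `:`). The following is a minimal sketch, not part of the original script, of how the page URLs and output paths could be built using only the standard library. The names `page_urls`, `safe_filename`, `poster_path` and the `posters` directory are hypothetical and chosen only for illustration.

```python
import re
from pathlib import Path

# Same listing URL as in the script above
BASE_URL = "https://movie.douban.com/top250"


def page_urls(pages, page_size=25):
    """Return the listing URLs for the first `pages` pages (25 movies per page)."""
    return [f"{BASE_URL}?start={i * page_size}&filter=" for i in range(pages)]


def safe_filename(title):
    """Replace characters that are invalid in file names on Windows or macOS."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", title).strip()
    # Fall back to a placeholder if nothing usable is left
    return cleaned or "untitled"


def poster_path(out_dir, title):
    """Build the output path for one poster (hypothetical helper)."""
    return Path(out_dir) / f"{safe_filename(title)}.jpg"


if __name__ == "__main__":
    # Assumed output directory, relative to the current working directory
    out_dir = Path("posters")
    out_dir.mkdir(parents=True, exist_ok=True)
    for url in page_urls(2):
        print(url)
    print(poster_path(out_dir, "a/b:c"))
```

Inside the download loop, `open(poster_path(out_dir, imgName), "wb")` could then stand in for the hard-coded path, so the same code would work on both macOS and Windows.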
The final result: