Introduction
This post is just a small crawler that downloads the images on a web page (it does not handle pagination, but you can add that yourself).
All of the images finish downloading in about 1-2 seconds.
The code
# Fetch and parse the search page
import requests
from bs4 import BeautifulSoup

url = "https://re.jd.com/search?keyword=%E7%94%B5%E8%84%91&enc=utf-8"
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36"}
res = requests.get(url, headers=headers)
html = res.text
soup = BeautifulSoup(html, "lxml")
content_all = soup.find_all(class_="item")
# Scrape the image from each result item
count = 1
for content in content_all:
    imgContent = content.find(name="img")
    imgUrl = imgContent.attrs["data-src"]
    # data-src is protocol-relative (it starts with //), so prepend the scheme
    imgfinal = "https:" + imgUrl
    response = requests.get(imgfinal)
    img = response.content
    # Save the image to disk
    with open(f"{count}.jpg", "wb") as f:
        f.write(img)
    count = count + 1
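String concatenation only works if every `data-src` value has the same shape. `urllib.parse.urljoin` handles protocol-relative, absolute, and relative paths uniformly — a minimal sketch, using a hypothetical image path for illustration:

```python
from urllib.parse import urljoin

page_url = "https://re.jd.com/search?keyword=%E7%94%B5%E8%84%91&enc=utf-8"
# A protocol-relative src of the kind found in data-src (hypothetical value)
img_src = "//img10.360buyimg.com/n1/example.jpg"

# urljoin borrows the scheme (https) from the page URL
img_final = urljoin(page_url, img_src)
print(img_final)  # https://img10.360buyimg.com/n1/example.jpg
```

The same call also resolves plain relative paths like "n1/example.jpg" against the page URL, so the loop body does not need to special-case the URL shape.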
Note that count must be initialized before the loop and incremented inside it; otherwise every download writes to the same filename and each image overwrites the previous one.
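The manual counter can also be replaced with enumerate, which numbers the items automatically and removes the risk of forgetting the increment — a small sketch using placeholder byte strings in place of the downloaded image data:

```python
import os
import tempfile

# Placeholder payloads standing in for response.content (hypothetical data)
images = [b"img-one", b"img-two", b"img-three"]

with tempfile.TemporaryDirectory() as tmp:
    # enumerate(..., start=1) supplies the running count
    for count, img in enumerate(images, start=1):
        path = os.path.join(tmp, f"{count}.jpg")
        with open(path, "wb") as f:
            f.write(img)
    saved = sorted(os.listdir(tmp))

print(saved)  # ['1.jpg', '2.jpg', '3.jpg']
```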