Scraping images from a web page with a crawler (with pagination)

Source code without pagination:

import requests
import re

url = 'https://www.qiushibaike.com/imgrank/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'}

# Fetch the page HTML
res = requests.get(url, headers=headers).text
# print(res)

# Capture the protocol-relative src of each image in the ranking list
urls = re.findall('<div class="thumb">.*?<img src="(.*?)" alt.*?>', res, re.S)
print(urls)

for url1 in urls:
    filename = url1.split('/')[-1]     # use the last path segment as the file name
    urll = 'https:' + url1             # the captured src starts with //, so prepend the scheme
    response = requests.get(urll, headers=headers)
    with open(filename, 'wb') as f:
        f.write(response.content)      # write the binary image data to disk

For pagination, set up a generic URL template and fill in the page number in a loop:

url = 'https://www.qiushibaike.com/imgrank/page/%d/'
for page in range(1, 4):
    newurl = url % page    # substitute the page number into the template
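As a quick sanity check, the template expands to one URL per page (the range 1–3 mirrors the loop used below):

url = 'https://www.qiushibaike.com/imgrank/page/%d/'
for page in range(1, 4):
    print(url % page)
# https://www.qiushibaike.com/imgrank/page/1/
# https://www.qiushibaike.com/imgrank/page/2/
# https://www.qiushibaike.com/imgrank/page/3/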

The full paginated version:

import requests
import re

url = 'https://www.qiushibaike.com/imgrank/page/%d/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'}

for page in range(1, 4):
    newurl = url % page                # build the URL for the current page
    res = requests.get(newurl, headers=headers).text
    # print(res)

    # Capture the protocol-relative src of each image in the ranking list
    urls = re.findall('<div class="thumb">.*?<img src="(.*?)" alt.*?>', res, re.S)
    print(urls)

    for url1 in urls:
        filename = url1.split('/')[-1]     # last path segment as the file name
        urll = 'https:' + url1             # prepend the scheme to the protocol-relative URL
        response = requests.get(urll, headers=headers)
        with open(filename, 'wb') as f:
            f.write(response.content)
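As a possible refinement (not part of the original post), the downloaded files could be collected into a dedicated folder and the loop could pause briefly between requests; the folder name and the half-second delay below are only illustrative assumptions:

import os
import re
import time

import requests

url = 'https://www.qiushibaike.com/imgrank/page/%d/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'}

save_dir = 'qiushibaike_imgs'          # hypothetical folder name
os.makedirs(save_dir, exist_ok=True)   # create the folder if it does not exist

for page in range(1, 4):
    res = requests.get(url % page, headers=headers).text
    for img_url in re.findall('<div class="thumb">.*?<img src="(.*?)" alt.*?>', res, re.S):
        full_url = 'https:' + img_url                              # the captured src is protocol-relative
        filename = os.path.join(save_dir, full_url.split('/')[-1]) # save inside the folder
        with open(filename, 'wb') as f:
            f.write(requests.get(full_url, headers=headers).content)
        time.sleep(0.5)                                            # short pause between downloads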
