Python Wallpaper Scraping Tutorial 01 -- wallhaven

Latest recommended article published on 2023-10-06 17:13:06

Plutoyer Blog

Category: Python. Tags: python

Article link: https://blog.csdn.net/qq_31446159/article/details/103888926

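I like to illustrate my blog posts with high-quality images, but I am too lazy to hunt for material online, so I wrote a crawler to download a batch of wallpapers to my machine. The site scraped is https://wallhaven.cc/, and the code is as follows: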

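# -*- coding: utf-8 -*-

import os

import requests
from lxml import etree

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
}

# Directory to save images into; a raw string keeps the backslashes literal
filepath = r"C:\Users\金少\Desktop\壁纸\wallhaven"
os.makedirs(filepath, exist_ok=True)

url = "https://wallhaven.cc/toplist"

for i in range(1, 20):  # pages to scrape: 1 through 19
    kv = {"page": i}
    try:
        r = requests.get(url, headers=headers, params=kv, timeout=20)
        r.raise_for_status()

        # Parse the listing page and collect the link to each wallpaper's detail page
        html = etree.HTML(r.text)
        page_links = html.xpath(".//li//a[@class='preview']/@href")

        for page_link in page_links:
            detail = requests.get(page_link, headers=headers, timeout=20)
            detail_html = etree.HTML(detail.text)
            # The detail page holds the full-size image in <img id="wallpaper">
            img_srcs = detail_html.xpath(".//img[@id='wallpaper']/@src")
            for img_src in img_srcs:
                filename = img_src.split('/')[-1]  # file name taken from the URL
                response = requests.get(img_src, headers=headers, timeout=20)
                response.raise_for_status()

                # Join directory and name; plain concatenation would drop the separator
                with open(os.path.join(filepath, filename), 'wb') as file:
                    file.write(response.content)
                print(filename)
                print("Succeed")

    except Exception as e:
        # Any failure skips the rest of this page and moves on to the next one
        print("Skipped page", i, e)
        continue
print("Triumph")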
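The save step above takes the file name from the image URL and writes the bytes to disk. Below is a minimal sketch of that step pulled out into two helpers, using only the standard library. The names `local_path_for` and `save_once` are hypothetical and not part of the original script, and skipping files that already exist is an added assumption that makes re-running the crawler cheaper.

```python
import os
from urllib.parse import urlsplit


def local_path_for(img_url, directory):
    """Build the local save path for an image URL (hypothetical helper)."""
    # Use only the URL's path component so a query string never ends up in the name
    filename = os.path.basename(urlsplit(img_url).path)
    if not filename:
        raise ValueError(f"no file name in URL: {img_url!r}")
    return os.path.join(directory, filename)


def save_once(content, path):
    """Write bytes to path unless the file already exists; return True if written."""
    if os.path.exists(path):
        return False
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as f:
        f.write(content)
    return True
```

Inside the download loop this could be called as `save_once(response.content, local_path_for(img_src, filepath))`, printing the file name only when the call returns `True`.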

The scraped images are shared on Baidu Cloud at the following link: HD Wallpapers, extraction code: 0z07

Below is a sample of the wallpapers:

[Image: wallpaper preview]
