Python Web Scraping Demo: Downloading Wallpapers from a Website

Latest recommended article published 2022-09-02 00:13:21

Author: 「已注销」 (account deactivated)

Category: Python Tutorials. Tags: python, url

Article link: https://blog.csdn.net/weixin_43862765/article/details/103981214

This is the example from day two, which scrapes a well-known HD wallpaper site:

import re

import requests

# Home page of the wallpaper site.
url = "https://wallhaven.cc/"
# Send a browser-like User-Agent with every request.
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36"
}
response = requests.get(url, headers=headers)
# Let requests guess the text encoding from the response body.
response.encoding = response.apparent_encoding
html = response.text


# Each thumbnail on the page is marked up like this:
# https://wallhaven.cc/w/j5k825
# <a href="https://wallhaven.cc/w/j5k825"><img src="https://th.wallhaven.cc/small/j5/j5k825.jpg" width="300px" alt="" /></a>
# The three capture groups are: detail page URL, thumbnail URL, width.
result = re.findall('<a href="(.*?)"><img src="(.*?)" width="(.*?)" alt="" /></a>', html)
for page_url, image_url, width in result:
    # findall returns tuples, so unpack them directly instead of using str() and eval().
    print(image_url)
    image_response = requests.get(image_url, headers=headers)
    # Use the last path segment (for example j5k825.jpg) as the local file name.
    filename = image_url.split('/')[-1]
    with open(filename, mode="wb") as f:
        f.write(image_response.content)
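
To see what the regular expression captures without touching the network, the pattern can be run against the sample tag quoted in the comment. This is only a minimal sketch: `sample_html` and `filename_from_url` are hypothetical names introduced here, and the sample markup is assumed to be representative of the real page.

```python
import re

# Same pattern as in the scraper above.
PATTERN = '<a href="(.*?)"><img src="(.*?)" width="(.*?)" alt="" /></a>'

# Sample markup copied from the comment in the scraper (assumed representative).
sample_html = (
    '<a href="https://wallhaven.cc/w/j5k825">'
    '<img src="https://th.wallhaven.cc/small/j5/j5k825.jpg" width="300px" alt="" /></a>'
)


def filename_from_url(image_url):
    """Return the last path segment of a URL, e.g. 'j5k825.jpg'."""
    return image_url.split("/")[-1]


# With three capture groups, findall returns a list of 3-tuples.
matches = re.findall(PATTERN, sample_html)
page_url, image_url, width = matches[0]

print(matches)
print(filename_from_url(image_url))
```

Because each match is already a tuple of strings, the thumbnail URL can be read by unpacking or by index, with no need to convert the tuple to a string and parse it back.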

As of this writing, the code above runs. Here is the output in PyCharm:
[Screenshot: PyCharm run output]
The downloaded images are:

That was the day-two example. As an exercise, you can extend it with pagination and similar features.
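
One possible way to add pagination is sketched below. It is not from the original post: the helper names (`build_page_url`, `extract_image_urls`, `fetch_html`, `collect_image_urls`) are hypothetical, and the `?page=N` query parameter is an assumption about how the site's listing pages are addressed. A listing page may also use different markup from the home page, in which case the pattern would need adjusting.

```python
import re

HEADERS = {"User-Agent": "Mozilla/5.0"}
PATTERN = '<a href="(.*?)"><img src="(.*?)" width="(.*?)" alt="" /></a>'


def build_page_url(base_url, page):
    # ASSUMPTION: the listing accepts a ?page=N query parameter.
    return f"{base_url}?page={page}"


def extract_image_urls(html):
    # Keep only the second capture group (the thumbnail URL) of each match.
    return [image_url for _page_url, image_url, _width in re.findall(PATTERN, html)]


def fetch_html(url):
    # Imported here so the parsing helpers work even without requests installed.
    import requests

    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    return response.text


def collect_image_urls(base_url, first_page, last_page, fetch=fetch_html):
    """Gather thumbnail URLs from pages first_page..last_page (inclusive)."""
    urls = []
    for page in range(first_page, last_page + 1):
        html = fetch(build_page_url(base_url, page))
        urls.extend(extract_image_urls(html))
    return urls
```

Calling `collect_image_urls(base_url, 1, 3)` with a real listing URL would request three pages in sequence; the returned URLs can then be downloaded with the same loop as in the script above. Passing a custom `fetch` function makes it possible to try the logic on saved HTML first.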