Scraping Honor of Kings (王者荣耀) hero images (saved to a single folder)

Here I simply use regular-expression matching to pull the image links out of the page.
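To show what the regex matching produces, here is a minimal sketch that runs the same pattern used in the script against a made-up HTML fragment. The sample markup, the example.com URLs, and the names "Hero A" and "Hero B" are assumptions for illustration only; the real page markup may differ.

```python
import re

# Same pattern as in the script: capture the src and alt of each <img> tag.
pattern = r'<img\s+src="([^"]+)"\s+[^>]*alt="([^"]+)"'

# Hypothetical fragment, not copied from the real page.
sample_html = '''
<ul>
  <li><a href="#"><img src="//example.com/img/105.jpg" width="91" height="91" alt="Hero A">Hero A</a></li>
  <li><a href="#"><img src="//example.com/img/106.jpg" width="91" height="91" alt="Hero B">Hero B</a></li>
</ul>
'''

# re.findall returns one (src, alt) tuple per matching <img> tag.
matches = re.findall(pattern, sample_html)
for img_url, alt_name in matches:
    print(alt_name, img_url)
```

Note that the pattern only matches tags where src is the first attribute and alt comes after it, so tags written in another attribute order would be skipped.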

import os
import re

import requests

headers = {
    "accept": "application/json, text/javascript, */*; q=0.01",
    "accept-language": "zh-CN,zh;q=0.9,oc;q=0.8",
    "cache-control": "no-cache",
    "content-type": "application/json; charset=utf-8",
    "pragma": "no-cache",
    "priority": "u=1, i",
    "referer": "https://pvp.qq.com/web201605/herolist.shtml",
    "sec-ch-ua": "\"Not(A:Brand\";v=\"99\", \"Google Chrome\";v=\"133\", \"Chromium\";v=\"133\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "empty",
    "sec-fetch-mode": "cors",
    "sec-fetch-site": "same-origin",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    "x-requested-with": "XMLHttpRequest"
}
url = "https://pvp.qq.com/web201605/herolist.shtml"
response = requests.get(url, headers=headers)
response.encoding = "gbk"
# response.text is assumed to contain the page HTML
html_content = response.text
print(html_content)
# Regular expression that captures each image's src link and alt attribute
pattern = r'<img\s+src="([^"]+)"\s+[^>]*alt="([^"]+)"'

# Use re.findall to extract all matches
matches = re.findall(pattern, html_content)
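# Save each image into the folder and print the extracted result
path = "D:/小说/王者荣耀壁纸"
os.makedirs(path, exist_ok=True)
for img_url, alt_name in matches:
    # The script assumes each src is protocol-relative (starts with "//"),
    # so the scheme is prepended before downloading.
    img_data = requests.get("https:" + img_url).content
    with open(f"{path}/{alt_name}.jpg", "wb") as f:
        f.write(img_data)
    print(f"Hero name: {alt_name}, image URL: {img_url}")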
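The loop above builds each download URL with "https:" + img_url and uses the alt text directly as the file name. As a possible hardening, here is a small sketch of two helpers; absolute_url and safe_filename are hypothetical names that are not part of the original script, and the example.com URL is made up. The first resolves any kind of src against the page URL, and the second replaces characters that Windows does not allow in file names.

```python
import re
from urllib.parse import urljoin

PAGE_URL = "https://pvp.qq.com/web201605/herolist.shtml"


def absolute_url(src, base=PAGE_URL):
    """Resolve a src attribute (protocol-relative, relative, or absolute) against the page URL."""
    return urljoin(base, src)


def safe_filename(name, ext=".jpg"):
    """Replace characters that are invalid in Windows file names and append the extension."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", name).strip()
    # Fall back to a placeholder if nothing usable is left.
    return (cleaned or "unnamed") + ext


# Hypothetical inputs for illustration.
print(absolute_url("//example.com/heroimg/105.jpg"))  # https://example.com/heroimg/105.jpg
print(safe_filename("a/b:c"))                         # a_b_c.jpg
```

In the loop, these could stand in for "https:" + img_url and f"{alt_name}.jpg" respectively, so that an unexpected src form or an odd character in a hero name does not break the download.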
    