2 Scraping Honor of Kings hero images with three parsing approaches

Link: Hero list page - Hero introductions - Honor of Kings official website - Tencent Games

For the analysis method, see: 1 Scraping user bookshelf information from 7K小说网 - CSDN blog

For a reason I have not yet identified, a few heroes cannot be scraped; I am setting that aside for now.
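One way to narrow down the missing heroes is to check whether their names appear at all in the raw HTML that `requests` receives. The sketch below is only a diagnostic aid, not part of the original script: `find_missing_names`, `sample_page`, and `seen_in_browser` are hypothetical names, and the sample data is made up for illustration.

```python
def find_missing_names(page_text, expected_names):
    """Return the expected names that never appear in the raw page text."""
    return [name for name in expected_names if name not in page_text]


# Hypothetical stand-in for the HTML text that requests receives.
sample_page = (
    '<ul class="herolist clearfix">'
    '<li><a><img src="//example.com/a.jpg">HeroA</a></li>'
    '</ul>'
)

# Hypothetical names copied by hand from the page as shown in a browser.
seen_in_browser = ['HeroA', 'HeroB']

print(find_missing_names(sample_page, seen_in_browser))
```

If a name is absent from the raw text, none of the three parsers can find it, so the cause lies before parsing. One possibility, which is an assumption and not something verified here, is that part of the list is filled in by JavaScript after the page loads. For the real page, decode the response bytes to text before passing them in.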

Code:

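# Scrape the Honor of Kings hero images and save them to a heroPhoto folder
# in the same directory, naming each image <hero name>.jpg
# https://pvp.qq.com/web201605/herolist.shtml

import os

import requests
from lxml import etree
from bs4 import BeautifulSoup
from pyquery import PyQuery as pq


url = 'https://pvp.qq.com/web201605/herolist.shtml'
response = requests.get(url)
content = response.content

# Create the output folder up front so the open() calls below do not fail
os.makedirs('heroPhoto', exist_ok=True)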


# XPath parsing
# html = etree.HTML(content)
# image_urls = html.xpath('//ul[@class="herolist clearfix"]/li/a/img/@src')
# print(image_urls)
# hero_list = html.xpath('//ul[@class="herolist clearfix"]/li/a/text()')
# print(hero_list)
# Can XPath not work on the nodes themselves? (It can: '//ul[...]/li/a' returns
# element nodes, and each one accepts a relative query such as a.xpath('./img/@src').)
# for i in range(len(image_urls)):
#     image_url = image_urls[i]
#     name = hero_list[i]
#     hero_url = f'https:{image_url}'
#     jpg_content = requests.get(hero_url).content
#     with open(f'heroPhoto/{name}.jpg', 'wb') as file:
#         file.write(jpg_content)
#     print(f'Image {i} saved')


# BeautifulSoup parsing
# soup = BeautifulSoup(content, 'lxml')
# image_a = soup.select('.herolist.clearfix li a')  # find the <a> nodes
# for a in image_a:
#     href = a.img.attrs['src']
#     hero_name = a.get_text()
#     hero_url = f'https:{href}'
#     jpg_content = requests.get(hero_url).content
#     with open(f'heroPhoto/{hero_name}.jpg', 'wb') as f:
#         f.write(jpg_content)
#     print(f'Image {hero_name} saved')


# pyquery parsing
doc = pq(content)
image_as = doc('.herolist.clearfix li a').items()  # generator of PyQuery objects, one per <a>
print(image_as, type(image_as))

for a in image_as:
    href = a.find('img').attr('src')
    hero_name = a.text()
    hero_url = f'https:{href}'
    jpg_content = requests.get(hero_url).content
    with open(f'heroPhoto/{hero_name}.jpg', 'wb') as f:
        f.write(jpg_content)
    print(f'Image {hero_name} saved')
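
The XPath version above asks whether XPath can work on the nodes themselves. It can: select the `<a>` elements first, then read the image and the text from each one, just as the BeautifulSoup and pyquery loops do. The sketch below shows the pattern with the standard library's `xml.etree.ElementTree`, which supports a subset of XPath, as a stand-in for lxml so that it runs without network access or extra packages. `SAMPLE` is hypothetical, well-formed markup that only imitates the page structure, and `extract_heroes` is a name made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the hero list markup (not the real page).
SAMPLE = """
<div>
  <ul class="herolist clearfix">
    <li><a href="a.shtml"><img src="//example.com/img/a.jpg" alt="HeroA" />HeroA</a></li>
    <li><a href="b.shtml"><img src="//example.com/img/b.jpg" alt="HeroB" />HeroB</a></li>
  </ul>
</div>
"""


def extract_heroes(markup):
    """Return (name, image_url) pairs, reading both values from the same <a> node."""
    root = ET.fromstring(markup)
    heroes = []
    # Select the <a> element nodes themselves, not their attributes or text.
    for a in root.findall(".//ul[@class='herolist clearfix']/li/a"):
        img = a.find("img")  # relative lookup inside this <a> only
        if img is None:
            continue  # skip links that carry no image
        name = "".join(a.itertext()).strip()
        heroes.append((name, "https:" + img.get("src")))
    return heroes


print(extract_heroes(SAMPLE))
```

Reading the name and the image from the same node keeps the two values paired even when one entry lacks an image or a text label, which two parallel lists indexed by position cannot guarantee. With lxml the same idea is written as `html.xpath('//ul[@class="herolist clearfix"]/li/a')` followed by `a.xpath('./img/@src')` on each result.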

The scraped images are shown below:

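That is the end of the article. I am a beginner, so corrections are welcome if you spot mistakes, and so is discussion if you have questions. If the article was useful to you, a like would be encouraging. Thanks, everyone, and let's keep at it together!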
