[Image Crawler] Scraping the Hero Skins of "王者农药" (Honor of Kings) and "英雄脸萌" (League of Legends) in 40 Lines of Python

A few words up front:

Why this post: I came across the blog of that "3万到30万" IT programmer (linked here) and found some of the content pretty interesting, but the code as a whole isn't tidy, and a beginner would probably still be missing a few pieces (honestly, I just wanted to play around with it).

Straight to the code.

王者农药 (Honor of Kings)
import requests
from fake_useragent import UserAgent

ua = UserAgent()
url = 'http://pvp.qq.com/web201605/js/herolist.json'

head = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36'}
response = requests.get(url, headers=head)
hero_list = response.json()

# Pull out each hero's display name (cname), numeric id (ename) and title.
hero_name = list(map(lambda x: x['cname'], hero_list))
hero_number = list(map(lambda x: x['ename'], hero_list))
hero_name_title = list(map(lambda x: x['title'], hero_list))

# Base URL of the big skin images.
h_l = 'http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/'


for n, i in enumerate(hero_number):
    headers = {
        "User-Agent": ua.random,
        "referer": "https://pvp.qq.com/web201605/herodetail/%s.shtml" % i
    }
    # Walk through the skins one by one; assume a hero has at most 15 skins.
    # Skin numbering starts at 1 (…-bigskin-1.jpg); starting at 0 would 404
    # on the first request and the break below would skip the whole hero.
    for sk_num in range(1, 16):
        hsl = h_l + str(i) + '/' + str(i) + '-bigskin-' + str(sk_num) + '.jpg'
        hl = requests.get(hsl, headers=headers)
        filepath = "./img/" + hero_name[n] + "_" + hero_name_title[n] + "_" + str(sk_num) + '.jpg'
        if hl.status_code == 200:
            with open(filepath, 'wb') as f:
                f.write(hl.content)
            print(hero_name[n] + " ok " + str(sk_num))
        else:
            break
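One pitfall worth flagging: this herolist.json has been reported to be served as GBK rather than UTF-8 in some snapshots, so the cname values may come back garbled through response.json(). If that happens to you, a minimal workaround (replacing the hero_list line above) is to decode the raw bytes yourself:

import json

# Assumption: the server sent GBK without declaring it; decode explicitly.
hero_list = json.loads(response.content.decode('gbk'))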

Most of the code above is copied from the guy mentioned earlier; no point dodging that, copied is copied.
I only did some light tidying. One note on the image save path: remember to create an img directory next to the Python file first, otherwise open() has nowhere to write (and you wouldn't want hundreds of skins dumped loose in your working directory anyway).
I added a check on the response status code, which makes the crawler more efficient: as soon as one skin number misses, we stop instead of wasting time on requests that can't succeed.
I also added a simple bit of anti-crawler evasion (a random User-Agent). It needs the fake_useragent package; one command installs it:

pip3 install fake_useragent
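If you'd rather not create the img directory by hand, a two-line tweak at the top of the script (my own addition, standard library only) creates it automatically:

import os

# Create ./img on startup; exist_ok=True makes this a no-op when it already exists.
os.makedirs('./img', exist_ok=True)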
I thought about it some more, and then it hit me:

If the 农药 (Honor of Kings) skins can be crawled, why not take a look at 脸萌 (League of Legends) too?
I hunted around for its JSON and found that the hero list is maintained inside the page's JS? And the heroes' Chinese names are stored as Unicode escapes, of all things? Whatever, none of that matters; what we're after is the images.
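A quick aside on those escaped names: they're ordinary \uXXXX JSON escapes, so the standard library already knows how to turn them back into readable Chinese. A tiny illustration (the escaped string below is just an example, Ahri's Chinese name):

import json

# '\u963f\u72f8' is 阿狸 (Ahri) written as Unicode escapes, as in the page JS.
print(json.loads('"\\u963f\\u72f8"'))  # -> 阿狸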

英雄脸萌 (League of Legends)
import requests
from fake_useragent import UserAgent

ua = UserAgent()

# Champion id -> English name, copied straight out of the page's JS.
info_list = {
        "266": "Aatrox",
        "103": "Ahri",
        "84": "Akali",
        "12": "Alistar",
        "32": "Amumu",
        "34": "Anivia",
        "1": "Annie",
        "22": "Ashe",
        "136": "AurelionSol",
        "268": "Azir",
        "432": "Bard",
        "53": "Blitzcrank",
        "63": "Brand",
        "201": "Braum",
        "51": "Caitlyn",
        "164": "Camille",
        "69": "Cassiopeia",
        "31": "Chogath",
        "42": "Corki",
        "122": "Darius",
        "131": "Diana",
        "119": "Draven",
        "36": "DrMundo",
        "245": "Ekko",
        "60": "Elise",
        "28": "Evelynn",
        "81": "Ezreal",
        "9": "Fiddlesticks",
        "114": "Fiora",
        "105": "Fizz",
        "3": "Galio",
        "41": "Gangplank",
        "86": "Garen",
        "150": "Gnar",
        "79": "Gragas",
        "104": "Graves",
        "120": "Hecarim",
        "74": "Heimerdinger",
        "420": "Illaoi",
        "39": "Irelia",
        "427": "Ivern",
        "40": "Janna",
        "59": "JarvanIV",
        "24": "Jax",
        "126": "Jayce",
        "202": "Jhin",
        "222": "Jinx",
        "145": "Kaisa",
        "429": "Kalista",
        "43": "Karma",
        "30": "Karthus",
        "38": "Kassadin",
        "55": "Katarina",
        "10": "Kayle",
        "141": "Kayn",
        "85": "Kennen",
        "121": "Khazix",
        "203": "Kindred",
        "240": "Kled",
        "96": "KogMaw",
        "7": "Leblanc",
        "64": "LeeSin",
        "89": "Leona",
        "127": "Lissandra",
        "236": "Lucian",
        "117": "Lulu",
        "99": "Lux",
        "54": "Malphite",
        "90": "Malzahar",
        "57": "Maokai",
        "11": "MasterYi",
        "21": "MissFortune",
        "62": "MonkeyKing",
        "82": "Mordekaiser",
        "25": "Morgana",
        "267": "Nami",
        "75": "Nasus",
        "111": "Nautilus",
        "518": "Neeko",
        "76": "Nidalee",
        "56": "Nocturne",
        "20": "Nunu",
        "2": "Olaf",
        "61": "Orianna",
        "516": "Ornn",
        "80": "Pantheon",
        "78": "Poppy",
        "555": "Pyke",
        "133": "Quinn",
        "497": "Rakan",
        "33": "Rammus",
        "421": "RekSai",
        "58": "Renekton",
        "107": "Rengar",
        "92": "Riven",
        "68": "Rumble",
        "13": "Ryze",
        "113": "Sejuani",
        "35": "Shaco",
        "98": "Shen",
        "102": "Shyvana",
        "27": "Singed",
        "14": "Sion",
        "15": "Sivir",
        "72": "Skarner",
        "37": "Sona",
        "16": "Soraka",
        "50": "Swain",
        "517": "Sylas",
        "134": "Syndra",
        "223": "TahmKench",
        "163": "Taliyah",
        "91": "Talon",
        "44": "Taric",
        "17": "Teemo",
        "412": "Thresh",
        "18": "Tristana",
        "48": "Trundle",
        "23": "Tryndamere",
        "4": "TwistedFate",
        "29": "Twitch",
        "77": "Udyr",
        "6": "Urgot",
        "110": "Varus",
        "67": "Vayne",
        "45": "Veigar",
        "161": "Velkoz",
        "254": "Vi",
        "112": "Viktor",
        "8": "Vladimir",
        "106": "Volibear",
        "19": "Warwick",
        "498": "Xayah",
        "101": "Xerath",
        "5": "XinZhao",
        "157": "Yasuo",
        "83": "Yorick",
        "350": "Yuumi",
        "154": "Zac",
        "238": "Zed",
        "115": "Ziggs",
        "26": "Zilean",
        "142": "Zoe",
        "143": "Zyra"
}
for hero_id, base_name in info_list.items():
    base_url = "https://ossweb-img.qq.com/images/lol/web201310/skin/big%s" % hero_id
    headers = {
        # "defail" is not my typo: the official detail-page URL really spells it that way.
        "Referer": "https://lol.qq.com/data/info-defail.shtml?id=%s" % base_name,
        "User-Agent": ua.random
    }
    # Skin numbers are zero-padded to three digits: big266000.jpg, big266001.jpg, ...
    for num in range(150):
        n = str(num).zfill(3)
        get_url = base_url + n + ".jpg"
        page = requests.get(get_url, headers=headers)
        if page.status_code == 200:
            filepath = "./yxlm/" + base_name + "_" + n + ".jpg"
            with open(filepath, 'wb') as f:
                f.write(page.content)
            print(base_name, "ok", n)
        else:
            break

That huge block in the middle is nothing but id-to-name mappings. Oh well, nothing to be done; I wasn't about to request a JS file and then parse it, since for a copy-paste job like this, the simpler the better.
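That said, if anyone does want to skip the giant copy-paste, a rough fetch-and-parse sketch could look like the following. The URL is a placeholder for whatever JS file you spot in devtools, and it assumes the embedded object literal is valid JSON (quoted keys); json.loads also undoes the \uXXXX escapes for free:

import json
import re
import requests

# Placeholder: substitute the actual JS URL you find in the browser devtools.
HERO_JS_URL = "https://lol.qq.com/path/to/hero-list.js"

js_text = requests.get(HERO_JS_URL).text
# Grab the outermost {...} literal from the JS and parse it as JSON.
match = re.search(r'\{.*\}', js_text, re.S)
hero_data = json.loads(match.group(0))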
The structure is pretty much the same as the Honor of Kings crawler above, and it downloads quite fast.
Remember to create the yxlm directory first.
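One last defensive note that applies to both scripts: relying on status_code == 200 alone assumes the CDN never serves an error page with a 200 status. It never bit me here, but if you want to be safe, a small helper (my own sketch, not from the original code) can also check the Content-Type before saving:

import requests

def is_image_response(resp):
    # Accept only a 200 response that the server labels as an image;
    # guards against HTML error pages returned with a 200 status.
    content_type = resp.headers.get("Content-Type", "")
    return resp.status_code == 200 and content_type.startswith("image/")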
