DOTA2利雅得大师赛利用api多线程对选手数据和战队数据爬取与分析

本文链接：https://blog.csdn.net/m0_63551999/article/details/125968909

本文详细分析了LGD战队在利雅得大师赛中的关键数据，包括选手KDA、参战率、英雄上场及胜率，以及战队治疗量等，揭示了比赛战术与选手表现

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

首先恭喜中国战队LGD夺下本次利雅得大师赛冠军！

数据的爬取

选手数据以及战队数据爬取

数据分析

数据（因为原数据的英雄都是id，这里进行连接将英雄id替换为英雄名）

选手KDA前十名

选手参战率，参葬率，战死率

选手场均十分钟补刀数前十,以及他们的GPM,XPM

计算先摧毁对方优势路，中路，劣势路一塔的胜率

最后看看LGD的使用英雄以及胜率

数据的爬取

数据来源于opendota，对opendota的api进行调用获取数据，想要找到利雅得的所有比赛id，那就要先找到这个比赛，首先咱们来看看opendota的api页面，网址为OpenDota API

打开到match页面会看到请求数据的api介绍以及对应的url里面的json数据

我们找一场利雅得大师赛的比赛，比如这场，比赛编号为6676393091

利用api提供的网站打开看看

一大串json数据，这样看不方便，按F12打开抓包工具在里面看

看得出这场比赛里面的数据有很多分组，对应的分组是什么意思都可以在api文档查看，其中league中的leagueid就是这次利雅得大师赛的大赛id为14391，这就是dota2不同的大赛之间的区别，然后通过leagueid获取到利雅得大师赛的所有比赛id，再进行数据爬取就行，下面我直接上代码

首先是导入需要的模块，爬取利雅得的所有比赛id，参数id就是利雅得大师赛id 14391

import os.path
import time
import pandas as pd
import requests
import threading


def pro_game(id):
    pro_match_id = []
    pro_url = f'https://api.opendota.com/api/leagues/{id}/matches'
    text = requests.get(url=pro_url,headers=headers).json()
    for item in text:
        pro_match_id.append(item['match_id'])
    print(pro_match_id)
    return pro_match_id

获取的比赛id之后我们需要对每个比赛的数据进行爬取，在爬取之前我们需要对比赛id进行post请求，因为如果进行post的话，opendota的内置功能就会把录像进一步解析，从而我们可以获得更多数据，比如推塔时间，团战爆发的次数，这些都是直接get得不到的，必须先post请求一下

def jiexi(match_id):
    post_url = f'https://api.opendota.com/api/request/{match_id}'
    pos = requests.post(url=post_url, headers=headers)
    match_url = f'https://api.opendota.com/api/matches/{match_id}'
    match_text = requests.get(url=match_url, headers=headers).json()
    return match_text

get得到每场比赛数据之后就可以选择你想要的数据进行爬取，选手数据爬取与存储如下：

本来我是想看看能不能定位选手是哪条路的，但是因为dota2打法很多样，选手每场分路也不一样，导致分路这个数据不是很准确，所以我在后面的数据分析也没用上他

def role(lr):
    if lr == 1:
        return '优势路'
    if lr == 2:
        return '中路'
    if lr == 3:
        return '劣势路'

根据选手所处的哪一方以及哪一方胜利来判断选手是否赢得这场比赛

选手数据以及战队数据爬取

def xuanshou(match_text):
    print('开始')
    for item in match_text['players']:
        row_dic = {}
        try:
            row_dic['match_id'] = item['match_id']
            row_dic['name'] = item['name']
            row_dic['ID'] = item['account_id']
            row_dic['英雄id'] = item['hero_id']
            l_role = item['lane_role']
            row_dic['位置'] = role(l_role)
            row_dic['10分钟补刀'] = item['benchmarks']['lhten']['raw']
            row_dic['杀人数'] = item['kills']
            row_dic['死亡数'] = item['deaths']
            row_dic['助攻'] = item['assists']
            row_dic['反眼数'] = item['observer_kills']
            row_dic['总金钱'] = item['total_gold']
            row_dic['治疗量'] = item['hero_healing']
            row_dic['GPM'] = item['benchmarks']['gold_per_min']['raw']
            row_dic['XPM'] = item['benchmarks']['xp_per_min']['raw']
            row_dic['每分钟补刀数'] = item['benchmarks']['last_hits_per_min']['raw']
            row_dic['每分钟伤害'] = item['benchmarks']['hero_damage_per_min']['raw']
            row_dic['总伤害'] = item['hero_damage']
            row_dic['天辉方战队'] = match_text['radiant_team']['name']
            row_dic['天辉杀人数'] = match_text['radiant_score']
            row_dic['夜宴方战队'] = match_text['dire_team']['name']
            row_dic['夜宴杀人数'] = match_text['dire_score']
            place = item['isRadiant']

            if place == True:
                row_dic['所在方'] = '天辉'
                if item['win'] == 0:
                    row_dic['输赢'] = '输'
                if item['win'] == 1:
                    row_dic['输赢'] = '赢'
            if place == False:
                row_dic['所在方'] = '夜宴'
                if item['win'] == 0:
                    row_dic['输赢'] = '输'
                if item['win'] == 1:
                    row_dic['输赢'] = '赢'
        except:
            print('出错')
        row_list.append(row_dic)


    time.sleep(0.5)

def paqu(match_text):
    dicq = {}
    dicq['比赛id'] = match_text['match_id']
    dicq['持续时间'] = match_text['duration']/60
    dicq['天辉方战队'] = match_text['radiant_team']['name']
    dicq['天辉杀人数'] = match_text['radiant_score']
    dicq['夜宴方战队'] = match_text['dire_team']['name']
    dicq['夜宴杀人数'] = match_text['dire_score']
    dicq['胜利方'] = '天辉' if match_text['radiant_win'] == True else '夜宴'
    dicq['团战次数'] = len(match_text['teamfights'])
    for item in match_text['objectives']:
        if item['type']=='building_kill' and item['key'] =='npc_dota_goodguys_tower1_mid':
            dicq['天辉中路一塔'] = item['time']/60
        if item['type']=='building_kill' and item['key'] =='npc_dota_goodguys_tower1_top':
            dicq['天辉劣势路一塔'] = item['time']/60
        if item['type']=='building_kill' and item['key'] =='npc_dota_goodguys_tower1_bot':
            dicq['天辉优势路一塔'] = item['time']/60
        if item['type'] == 'building_kill' and item['key'] =='npc_dota_badguys_tower1_mid':
            dicq['夜宴中路1塔'] = item['time'] / 60
        if item['type']=='building_kill' and item['key'] =='npc_dota_badguys_tower1_top':
            dicq['夜宴优势路一塔'] = item['time']/60
        if item['type']=='building_kill' and item['key'] =='npc_dota_badguys_tower1_bot':
            dicq['夜宴劣势路一塔'] = item['time']/60
    quanju_list.append(dicq)
    print(f"{dicq['天辉方战队']}vs{dicq['夜宴方战队']}完成")

主函数里面使用双线程进行数据的爬取

if __name__ =='__main__':
    # account_id = 1150728771
    # m = match(account_id,100)
    where = input('请输入文件存储地址:')
    name = 'DOTA2利雅得.xlsx'
    dizhi = os.path.join(where,name)
    id = pro_game(14391)
    print(id)
    row_list = []
    quanju_list = []
    for mid in id:
        shuju = jiexi(mid)
        thread1 = threading.Thread(target=xuanshou,args=(shuju,))
        thread2 = threading.Thread(target=paqu,args=(shuju,))
        thread1.start()
        thread2.start()
    zd_d = pd.DataFrame(quanju_list)
    r_d = pd.DataFrame(row_list)
    r_d.to_excel(dizhi,sheet_name='选手数据')
    with pd.ExcelWriter(dizhi,mode='a',engine='openpyxl')as writer:
        zd_d.to_excel(writer,sheet_name='战队数据')

运行程序后结果大概是这样