Python Dota 2 data, part 3: downloading win/loss data
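Goal: download the five heroes on the winning side and the five heroes on the losing side of each match, for later analysis.
The results returned by get_match_history() do not include the match outcome, so every match would also have to be looked up with get_match_details(), which is inefficient.
Instead, use a different match-query API, get_match_history_by_seq_num():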
{
    status
        1 - Success
        8 - Matches_requested must be greater than 0
    statusDetail - Message explaining a status that is not equal to 1
    [matches] - See get_match_details()
}
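This function queries matches starting from a given sequence_id, and what it returns includes the same per-match details as get_match_details(). (Note that sequence_id is not the same as match_id.)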
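As a rough sketch of how paging by sequence number can work, the snippet below walks through batches using a stand-in `fetch` callable in place of the real API call. The names `iter_matches`, `fetch`, `fake_fetch`, and `max_batches` are hypothetical and exist only for this illustration. It also assumes that matches come back in ascending sequence order and that each one carries a `match_seq_num` field.

```python
def iter_matches(fetch, start_seq_num, batch_size=50, max_batches=2):
    """Yield matches batch by batch, resuming after the last sequence number seen.

    `fetch` is a hypothetical stand-in for api.get_match_history_by_seq_num:
    it takes the same two keyword arguments and returns a dict with a
    'matches' list.
    """
    next_seq = start_seq_num
    for _ in range(max_batches):
        result = fetch(start_at_match_seq_num=next_seq,
                       matches_requested=batch_size)
        matches = result.get('matches', [])
        if not matches:
            break
        for match in matches:
            yield match
        # Resume just after the last match of this batch.
        next_seq = matches[-1]['match_seq_num'] + 1


def fake_fetch(start_at_match_seq_num, matches_requested):
    """Fake API for the demo: only even sequence numbers exist."""
    first = start_at_match_seq_num + (start_at_match_seq_num % 2)
    return {'status': 1,
            'matches': [{'match_seq_num': first + 2 * i,
                         'match_id': 9000 + first + 2 * i}
                        for i in range(matches_requested)]}


seq_nums = [m['match_seq_num']
            for m in iter_matches(fake_fetch, 100, batch_size=3, max_batches=2)]
print(seq_nums)
```

Passing the fetch function in as an argument keeps the paging logic testable without a network connection or an API key.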
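The starting sequence_id is set to 3100000000. A lookup shows that this match took place on November 16, 2017, a few days after patch 7.07 was released.
This number is stored in a file, latest.dat, which is read before each query and updated afterwards, so that the download can be interrupted and resumed.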
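The checkpoint idea can be sketched with two small helpers. This is only one possible approach, not necessarily how the script should do it: `read_checkpoint` and `write_checkpoint` are hypothetical names, falling back to a default when the file is missing is an added assumption, and writing through a temporary file is an extra safety measure.

```python
import os
import tempfile


def read_checkpoint(path, default):
    """Return the sequence number stored in `path`, or `default` if the file is missing."""
    try:
        with open(path, 'r') as f:
            return int(f.readline().strip())
    except FileNotFoundError:
        return default


def write_checkpoint(path, seq_num):
    """Write to a temporary file, then rename it over the real one, so an
    interrupted write does not leave a half-written checkpoint behind."""
    tmp_path = path + '.tmp'
    with open(tmp_path, 'w') as f:
        f.write(str(seq_num))
    os.replace(tmp_path, path)


# Demonstration in a throwaway directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'latest.dat')
first_read = read_checkpoint(demo_path, default=3100000000)
write_checkpoint(demo_path, 3100000050)
second_read = read_checkpoint(demo_path, default=3100000000)
print(first_read, second_read)
```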
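After the data is fetched, it is processed so that each line stores the five hero IDs of the winning side first, then the five hero IDs of the losing side, and finally the match's sequence_id. The lines are saved in this form in matches.dat.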
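To make the line format concrete, here is a minimal sketch that turns one match into one line of text. `format_match_line` and `fake_match` are hypothetical names, and the sample data is made up. Like the script, the sketch assumes that the first five entries in `players` are Radiant and the last five are Dire.

```python
def format_match_line(match):
    """Return one matches.dat line, or None if the match should be skipped."""
    players = match['players']
    # Skip matches that are not 5v5 (for example 1v1 solo games).
    if len(players) != 10:
        return None
    heroes = [p['hero_id'] for p in players]
    # A hero ID of 0 means the hero was not recorded correctly.
    if 0 in heroes:
        return None
    radiant = sorted(heroes[:5])
    dire = sorted(heroes[5:])
    winners, losers = (radiant, dire) if match['radiant_win'] else (dire, radiant)
    fields = winners + losers + [match['match_seq_num']]
    return ' '.join(str(x) for x in fields)


# Made-up match in which Dire (the last five players) wins.
fake_match = {
    'radiant_win': False,
    'match_seq_num': 3100000001,
    'players': [{'hero_id': h} for h in [5, 3, 9, 1, 7, 20, 14, 18, 12, 16]],
}
line = format_match_line(fake_match)
print(line)  # 12 14 16 18 20 1 3 5 7 9 3100000001
```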
import time

import dota2api

# A Steam API key is required: pass it to Initialise() or set the
# D2_API_KEY environment variable.
api = dota2api.Initialise()
counter = 0
while True:
    # Read the sequence_id of the last downloaded match and start from the next one
    with open('latest.dat', 'r') as f_latest:
        latest = int(f_latest.readline()) + 1
    # Call the API to fetch match data, retrying after a short pause on any error
    while True:
        try:
            data_got = api.get_match_history_by_seq_num(
                matches_requested=50, start_at_match_seq_num=latest)
            break
        except Exception:
            print('error, wait 3 seconds.******************************************')
            time.sleep(3)
    matches = data_got['matches']
    # Append the results to the data file
    with open('matches.dat', 'a') as f_data:
        # For each match returned
        for m in matches:
            # Match ID
            match_id = m['match_id']
            # True if Radiant won
            win = m['radiant_win']
            # All players
            players = m['players']
            # Skip matches that do not have 10 players, e.g. solo (1v1) games
            if len(players) != 10:
                print('player num error: ', len(players))
                continue
            # Collect the heroes played
            heroes = [p['hero_id'] for p in players]
            # Radiant heroes come first, Dire heroes second; sort each side by ID
            radiant_team = sorted(heroes[0:5])
            dire_team = sorted(heroes[5:10])
            # Skip matches with an invalid hero ID (0 sorts to the front)
            if radiant_team[0] == 0 or dire_team[0] == 0:
                print('player hero error: 0')
                continue
            # Arrange the hero IDs by winning side and losing side
            if win:
                win_team = radiant_team
                lose_team = dire_team
            else:
                win_team = dire_team
                lose_team = radiant_team
            # sequence_id of the current match
            latest = m['match_seq_num']
            counter += 1
            print(counter, match_id, win, win_team, lose_team, latest)
            # Write one line: winners, losers, sequence_id
            for h in win_team:
                f_data.write(str(h) + ' ')
            for h in lose_team:
                f_data.write(str(h) + ' ')
            f_data.write(str(latest))
            f_data.write('\n')
            # Flush the line before advancing the checkpoint
            f_data.flush()
            # Update latest.dat
            with open('latest.dat', 'w') as f_update:
                f_update.write(str(latest))
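For the later analysis, lines in this format can be read back with a few lines of code. The following is only a sketch: `parse_match_line` and `count_wins` are hypothetical helpers, and the two sample lines are made up rather than taken from real output.

```python
from collections import Counter


def parse_match_line(line):
    """Split one matches.dat line into (winners, losers, sequence_id)."""
    fields = [int(x) for x in line.split()]
    if len(fields) != 11:
        raise ValueError('expected 11 fields, got %d' % len(fields))
    return fields[:5], fields[5:10], fields[10]


def count_wins(lines):
    """Count how many times each hero ID appears on a winning side."""
    wins = Counter()
    for line in lines:
        winners, _losers, _seq = parse_match_line(line)
        wins.update(winners)
    return wins


# Made-up sample lines in the matches.dat format.
sample_lines = [
    '12 14 16 18 20 1 3 5 7 9 3100000001',
    '1 2 3 4 12 6 7 8 9 10 3100000002',
]
win_counts = count_wins(sample_lines)
print(win_counts[12], win_counts[1])  # 2 1
```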
Contents of matches.dat after a run: