Big Data Analysis of LOL Win Rates with Python

Predicting LOL Match Outcomes from Data


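Contents:

1. Background

a. About LOL:

League of Legends (LOL) is a MOBA (multiplayer online battle arena) in which two teams (blue and red) face off. The map has three lanes and a jungle, and each team fills five roles. The goal is to destroy the enemy Nexus to win the match.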


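  • Glossary:
  • Warding totem: an item a player can place on the map to reveal the surrounding area. Very useful for map and objective control.
  • Minions: NPCs that belong to each of the two teams. They give gold when killed by a player.
  • Jungle minions: NPCs that live in the jungle. They give gold and buffs when killed by a player.
  • Elite monsters: monsters with high health and damage that grant large rewards (gold/XP/stats) when killed by a team.
  • Dragons: elite monsters that grant a team-wide bonus when killed. The fourth dragon a team kills grants a large stat bonus, and the fifth (the Elder Dragon) gives the team a huge advantage.
  • Herald: an elite monster that grants a stat bonus to the player who kills it. It helps push a lane and destroy structures.
  • Towers: structures you must destroy to reach the enemy Nexus. They give gold.
  • Level: starts at 1 and caps at 18.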

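Game flow in brief:

Player -> kill minions and jungle monsters -> earn gold and buffs -> destroy enemy towers -> destroy the enemy Nexus (victory)

Along the way, players also kill enemy champions and place wards to gain a vision advantage. The gold they earn is spent on items that increase their strength.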

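b. Feasibility and purpose of the analysis:

As online games have grown popular among young people, esports has grown with them. LOL (League of Legends), one of the flagship esports titles, is drawing more and more attention from young players.

  • Feasibility: Many factors influence the outcome of a match. Players' mechanical skill and game sense are undeniably decisive, but that skill shows up in the in-game statistics. Because LOL is also a team game, statistics from later in a match reflect both individual skill and team coordination reasonably well, and can therefore be used to predict the winner with fair accuracy.

  • Purpose: Analyzing LOL data makes it possible not only to predict match outcomes but also to identify the factors that affect them most, which offers practical guidance for real matches.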

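2. Data overview and preview:

Overview:

  • Data source: the internet

  • **Dataset overview:** The dataset contains statistics from the first 10 minutes of roughly 10k ranked games at high ELO (Diamond I to Master). The players are at roughly the same skill level. For each team, 19 features are recorded at the 10-minute mark (38 in total), including kills, deaths, gold, experience, level, and so on.

Data preview: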

import pandas as pd

# index_col=0 treats the first column of the CSV file as the row index
data = pd.read_csv('high_diamond_ranked_10min.csv', index_col=0)
print(data.head())
          gameId  blueWins  blueWardsPlaced  blueWardsDestroyed  blueFirstBlood  \
0  4519157822         0               28                   2               1   
1  4523371949         0               12                   1               0   
2  4521474530         0               15                   0               0   
3  4524384067         0               43                   1               0   
4  4436033771         0               75                   4               0   
   blueKills  blueDeaths  blueAssists  blueEliteMonsters  blueDragons  \
0          9           6           11                  0            0   
1          5           5            5                  0            0   
2          7          11            4                  1            1   
3          4           5            5                  1            0   
4          6           6            6                  0            0   
   blueHeralds  blueTowersDestroyed  blueTotalGold  blueAvgLevel  \
0            0                    0          17210           6.6   
1            0                    0          14712           6.6   
2            0                    0          16113           6.4   
3            1                    0          15157           7.0   
4            0                    0          16400           7.0   
   blueTotalExperience  blueTotalMinionsKilled  blueTotalJungleMinionsKilled  \
0                17039                     195                            36   
1                16265                     174                            43   
2                16221                     186                            46   
3                17954                     201                            55   
4                18543                     210                            57   
   blueGoldDiff  blueExperienceDiff  blueCSPerMin  blueGoldPerMin  \
0           643                  -8          19.5          1721.0   
1         -2908               -1173          17.4          1471.2   
2         -1172               -1033          18.6          1611.3   
3         -1321                  -7          20.1          1515.7   
4         -1004                 230          21.0          1640.0   
   redWardsPlaced  redWardsDestroyed  redFirstBlood  redKills  redDeaths  \
0              15                  6              0         6          9   
1              12                  1              1         5          5   
2              15                  3              1        11          7   
3              15                  2              1         5          4   
4              17                  2              1         6          6   
   redAssists  redEliteMonsters  redDragons  redHeralds  redTowersDestroyed  \
0           8                 0           0           0                   0   
1           2                 2           1           1                   1   
2          14                 0           0           0                   0   
3          10                 0           0           0                   0   
4           7                 1           1           0                   0   
   redTotalGold  redAvgLevel  redTotalExperience  redTotalMinionsKilled  \
0         16567          6.8               17047                    197   
1         17620          6.8               17438                    240   
2         17285          6.8               17254                    203   
3         16478          7.0               17961                    235   
4         17404          7.0               18313                    225   
   redTotalJungleMinionsKilled  redGoldDiff  redExperienceDiff  redCSPerMin  \
0                           55         -643                  8         19.7   
1                           52         2908               1173         24.0   
2                           28         1172               1033         20.3   
3                           47         1321                  7         23.5   
4                           67         1004               -230         22.5   
   redGoldPerMin  
0         1656.7  
1         1762.0  
2         1728.5  
3         1647.8  
4         1740.4  

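Data structure:

print("Shape:", data.shape)
Shape: (9879, 40)    # 9,879 records, each with 40 columns
print("Summary:", data.describe())   # describe() must be called; print(data.describe) only prints the bound method
Summary:   blueWins  blueWardsPlaced  blueWardsDestroyed  blueFirstBlood  \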
count   9879.000000      9879.000000         9879.000000     9879.000000   
mean      0.499038        22.288288            2.824881        0.504808   
std       0.500024        18.019177            2.174998        0.500002   
min       0.000000         5.000000            0.000000        0.000000   
25%       0.000000        14.000000            1.000000        0.000000   
50%       0.000000        16.000000            3.000000        1.000000   
75%       1.000000        20.000000            4.000000        1.000000   
max       1.000000       250.000000           27.000000        1.000000   
         blueKills   blueDeaths  blueAssists  blueEliteMonsters  blueDragons  \
count  9879.000000  9879.000000  9879.000000        9879.000000  9879.000000   
mean      6.183925     6.137666     6.645106           0.549954     0.361980   
std       3.011028     2.933818     4.064520           0.625527     0.480597   
min       0.000000     0.000000     0.000000           0.000000     0.000000   
25%       4.000000     4.000000     4.000000           0.000000     0.000000   
50%       6.000000     6.000000     6.000000           0.000000     0.000000   
75%       8.000000     8.000000     9.000000           1.000000     1.000000   
max      22.000000    22.000000    29.000000           2.000000     1.000000   
       blueHeralds  blueTowersDestroyed  blueTotalGold  blueAvgLevel  \
count  9879.000000          9879.000000    9879.000000   9879.000000   
mean      0.187974             0.051422   16503.455512      6.916004   
std       0.390712             0.244369    1535.446636      0.305146   
min       0.000000             0.000000   10730.000000      4.600000   
25%       0.000000             0.000000   15415.500000      6.800000   
50%       0.000000             0.000000   16398.000000      7.000000   
75%       0.000000             0.000000   17459.000000      7.200000   
max       1.000000             4.000000   23701.000000      8.000000   
       blueTotalExperience  blueTotalMinionsKilled  \
count          9879.000000             9879.000000   
mean          17928.110133              216.699565   
std            1200.523764               21.858437   
min           10098.000000               90.000000   
25%           17168.000000              202.000000   
50%           17951.000000              218.000000   
75%           18724.000000              232.000000   
max           22224.000000              283.000000   
       blueTotalJungleMinionsKilled  blueGoldDiff  blueExperienceDiff  \
count                   9879.000000   9879.000000         9879.000000   
mean                      50.509667     14.414111          -33.620306   
std                        9.898282   2453.349179         1920.370438   
min                        0.000000 -10830.000000        -9333.000000   
25%                       44.000000  -1585.500000        -1290.500000   
50%                       50.000000     14.000000          -28.000000   
75%                       56.000000   1596.000000         1212.000000   
max                       92.000000  11467.000000         8348.000000   
       blueCSPerMin  blueGoldPerMin  redWardsPlaced  redWardsDestroyed  \
count   9879.000000     9879.000000     9879.000000        9879.000000   
mean      21.669956     1650.345551       22.367952           2.723150   
std        2.185844      153.544664       18.457427           2.138356   
min        9.000000     1073.000000        6.000000           0.000000   
25%       20.200000     1541.550000       14.000000           1.000000   
50%       21.800000     1639.800000       16.000000           2.000000   
75%       23.200000     1745.900000       20.000000           4.000000   
max       28.300000     2370.100000      276.000000          24.000000   
       redFirstBlood     redKills    redDeaths   redAssists  redEliteMonsters  \
count    9879.000000  9879.000000  9879.000000  9879.000000       9879.000000   
mean        0.495192     6.137666     6.183925     6.662112          0.573135   
std         0.500002     2.933818     3.011028     4.060612          0.626482   
min         0.000000     0.000000     0.000000     0.000000          0.000000   
25%         0.000000     4.000000     4.000000     4.000000          0.000000   
50%         0.000000     6.000000     6.000000     6.000000          0.000000   
75%         1.000000     8.000000     8.000000     9.000000          1.000000   
max         1.000000    22.000000    22.000000    28.000000          2.000000   
        redDragons   redHeralds  redTowersDestroyed  redTotalGold  \
count  9879.000000  9879.000000         9879.000000   9879.000000   
mean      0.413098     0.160036            0.043021  16489.041401   
std       0.492415     0.366658            0.216900   1490.888406   
min       0.000000     0.000000            0.000000  11212.000000   
25%       0.000000     0.000000            0.000000  15427.500000   
50%       0.000000     0.000000            0.000000  16378.000000   
75%       1.000000     0.000000            0.000000  17418.500000   
max       1.000000     1.000000            2.000000  22732.000000   
       redAvgLevel  redTotalExperience  redTotalMinionsKilled  \
count  9879.000000         9879.000000            9879.000000   
mean      6.925316        17961.730438             217.349226   
std       0.305311         1198.583912              21.911668   
min       4.800000        10465.000000             107.000000   
25%       6.800000        17209.500000             203.000000   
50%       7.000000        17974.000000             218.000000   
75%       7.200000        18764.500000             233.000000   
max       8.200000        22269.000000             289.000000   
       redTotalJungleMinionsKilled   redGoldDiff  redExperienceDiff  \
count                  9879.000000   9879.000000        9879.000000   
mean                     51.313088    -14.414111          33.620306   
std                      10.027885   2453.349179        1920.370438   
min                       4.000000 -11467.000000       -8348.000000   
25%                      44.000000  -1596.000000       -1212.000000   
50%                      51.000000    -14.000000          28.000000   
75%                      57.000000   1585.500000        1290.500000   
max                      92.000000  10830.000000        9333.000000   
       redCSPerMin  redGoldPerMin  
count  9879.000000    9879.000000  
mean     21.734923    1648.904140  
std       2.191167     149.088841  
min      10.700000    1121.200000  
25%      20.300000    1542.750000  
50%      21.800000    1637.800000  
75%      23.300000    1741.850000  
max      28.900000    2273.200000  

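Column reference:

pd.set_option('display.width', 10)      # maximum width of each console line; output wraps only once a line is full
print("Columns:",data.columns)
Index(['gameId',                         # unique ID of each game
       #--------------------------------------------------------------------
       'blueWins',                       # whether blue won  1: win  0: loss   ***** dependent variable *****
       #--------------------------------------------------------------------19 features
       'blueWardsPlaced',                # wards placed on the map by the blue team
       'blueWardsDestroyed',             # enemy wards destroyed by the blue team
       'blueFirstBlood',                 # whether blue took first blood (the first kill of the game)  1: yes  0: no
       'blueKills',                      # enemies killed by the blue team
       'blueDeaths',                     # deaths on the blue team
       'blueAssists',                    # kill assists by the blue team
       'blueEliteMonsters',              # elite monsters killed by the blue team (dragons and heralds)
       'blueDragons',                    # dragons killed by the blue team
       'blueHeralds',                    # heralds killed by the blue team
       'blueTowersDestroyed',            # towers destroyed by the blue team
       'blueTotalGold',                  # total gold of the blue team
       'blueAvgLevel',                   # average level of the blue team
       'blueTotalExperience',            # total experience of the blue team
       'blueTotalMinionsKilled',         # total minions killed by the blue team
       'blueTotalJungleMinionsKilled',   # total jungle monsters killed by the blue team
       'blueGoldDiff',                   # blue gold minus red gold
       'blueExperienceDiff',             # blue experience minus red experience
       'blueCSPerMin',                   # minions killed (creep score) per minute by the blue team
       'blueGoldPerMin',                 # gold earned per minute by the blue team
       # red columns mirror the blue ones ------------------------------------19 features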
       'redWardsPlaced',           
       'redWardsDestroyed',
       'redFirstBlood',
       'redKills',
       'redDeaths',
       'redAssists',
       'redEliteMonsters',
       'redDragons',
       'redHeralds',
       'redTowersDestroyed',
       'redTotalGold',
       'redAvgLevel',
       'redTotalExperience',
       'redTotalMinionsKilled',
       'redTotalJungleMinionsKilled',
       'redGoldDiff',
       'redExperienceDiff',
       'redCSPerMin',
       'redGoldPerMin'],
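
As a quick sanity check on the "19 features per team" figure, the column names listed above can be grouped by prefix. This is a minimal sketch that rebuilds the names from that list instead of reading the CSV; `suffixes`, `team_of`, and `counts` are illustrative names, not part of the original code.

```python
from collections import Counter

# Per-team feature suffixes, copied from the column list above.
suffixes = [
    "WardsPlaced", "WardsDestroyed", "FirstBlood", "Kills", "Deaths",
    "Assists", "EliteMonsters", "Dragons", "Heralds", "TowersDestroyed",
    "TotalGold", "AvgLevel", "TotalExperience", "TotalMinionsKilled",
    "TotalJungleMinionsKilled", "GoldDiff", "ExperienceDiff",
    "CSPerMin", "GoldPerMin",
]

columns = (["gameId", "blueWins"]
           + ["blue" + s for s in suffixes]
           + ["red" + s for s in suffixes])

def team_of(name):
    """Return 'blue', 'red', or 'other' for a column name."""
    if name in ("gameId", "blueWins"):
        return "other"  # identifier and target, not team features
    return "blue" if name.startswith("blue") else "red"

counts = Counter(team_of(c) for c in columns)
print(counts["blue"], counts["red"], len(columns))
```

The two non-feature columns (`gameId` and the target `blueWins`) plus 2 x 19 features account for the 40 columns reported by `data.shape`.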

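Data cleaning:

1. Check that the data is well-formed and has no missing values:

Drop rows that contain missing values:

data.dropna(axis=0, how='any', inplace=True)

Check that the data is well-formed:

data.info()   # info() prints the summary itself and returns None, so it is not wrapped in print()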
<class 'pandas.core.frame.DataFrame'>
Int64Index: 9879 entries, 0 to 9878
Data columns (total 40 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   gameId                        9879 non-null   int64  
 1   blueWins                      9879 non-null   int64  
 2   blueWardsPlaced               9879 non-null   int64  
 3   blueWardsDestroyed            9879 non-null   int64  
 4   blueFirstBlood                9879 non-null   int64  
 5   blueKills                     9879 non-null   int64  
 6   blueDeaths                    9879 non-null   int64  
 7   blueAssists                   9879 non-null   int64  
 8   blueEliteMonsters             9879 non-null   int64  
 9   blueDragons                   9879 non-null   int64  
 10  blueHeralds                   9879 non-null   int64  
 11  blueTowersDestroyed           9879 non-null   int64  
 12  blueTotalGold                 9879 non-null   int64  
 13  blueAvgLevel                  9879 non-null   float64
 14  blueTotalExperience           9879 non-null   int64  
 15  blueTotalMinionsKilled        9879 non-null   int64  
 16  blueTotalJungleMinionsKilled  9879 non-null   int64  
 17  blueGoldDiff                  9879 non-null   int64  
 18  blueExperienceDiff            9879 non-null   int64  
 19  blueCSPerMin                  9879 non-null   float64
 20  blueGoldPerMin                9879 non-null   float64
 21  redWardsPlaced                9879 non-null   int64  
 22  redWardsDestroyed             9879 non-null   int64  
 23  redFirstBlood                 9879 non-null   int64  
 24  redKills                      9879 non-null   int64  
 25  redDeaths                     9879 non-null   int64  
 26  redAssists                    9879 non-null   int64  
 27  redEliteMonsters              9879 non-null   int64  
 28  redDragons                    9879 non-null   int64  
 29  redHeralds                    9879 non-null   int64  
 30  redTowersDestroyed            9879 non-null   int64  
 31  redTotalGold                  9879 non-null   int64  
 32  redAvgLevel                   9879 non-null   float64
 33  redTotalExperience            9879 non-null   int64  
 34  redTotalMinionsKilled         9879 non-null   int64  
 35  redTotalJungleMinionsKilled   9879 non-null   int64  
 36  redGoldDiff                   9879 non-null   int64  
 37  redExperienceDiff             9879 non-null   int64  
 38  redCSPerMin                   9879 non-null   float64
 39  redGoldPerMin                 9879 non-null   float64
dtypes: float64(6), int64(34)
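
In the `info()` output above every column reports 9879 non-null values, so the `dropna` call removes nothing from this dataset. On data that does have gaps, it can help to count the missing values first and see how many rows would be lost. The sketch below uses a small made-up frame (`toy`), not the real data.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data (values are made up).
toy = pd.DataFrame({
    "blueKills": [9, 5, np.nan, 4],
    "blueDeaths": [6, 5, 11, 5],
})

missing_per_column = toy.isna().sum()      # number of NaN values in each column
rows_before = len(toy)
cleaned = toy.dropna(axis=0, how="any")    # same arguments as in the text, without inplace
rows_dropped = rows_before - len(cleaned)

print(missing_per_column)
print("rows dropped:", rows_dropped)
```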

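2. Processing the data:

Preprocessing:

gameId has no bearing on who wins, so drop it:

data=data.drop(['gameId'], axis=1)

Heatmap analysis:

The heatmap of the correlation matrix reveals highly correlated variables that describe the same thing. A column that carries the same information as another column does not help classification. Take redKills (kills by the red team) and blueDeaths (deaths on the blue team): red's kills are blue's deaths, so one of the two columns should be dropped. (In an actual game redKills and blueDeaths are not always equal, because a player can also be killed by jungle monsters or towers, but this data comes from high-ranked players, so that case can be ignored.)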

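import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(20,15))
# data.corr(): compute the pairwise correlation coefficients between columns and return the correlation matrix
# sns.heatmap(): draw the correlation matrix as a heatmap with seaborn
sns.heatmap(data.corr().round(1), cmap="coolwarm", annot=True, linewidths=.5)
plt.savefig('correlation_heatmap.jpg', bbox_inches='tight')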


Drop the highly correlated columns to simplify the analysis:

# Find pairs of columns whose absolute correlation reaches 1
# and return one column from each pair so that it can be dropped
def remove_redundancy(r):
    to_remove = []
    for i in range(len(r.columns)):
        for j in range(i):
            if (abs(r.iloc[i,j]) >= 1 and (r.columns[j] not in to_remove)):
                print("Correlation:",r.iloc[i,j], r.columns[j], r.columns[i])
                to_remove.append(r.columns[i])
    return to_remove

clean_data = data.drop(remove_redundancy(data.corr()), axis=1) # drop the redundant columns

The two columns in each of these pairs carry the same information, so one of them is dropped:

Correlation: 1.000000000000002 blueTotalMinionsKilled blueCSPerMin
Correlation: 1.0000000000000013 blueTotalGold blueGoldPerMin
Correlation: -1.0 blueFirstBlood redFirstBlood
Correlation: 1.0 blueDeaths redKills
Correlation: 1.0 blueKills redDeaths
Correlation: -1.0 blueGoldDiff redGoldDiff
Correlation: -1.0 blueExperienceDiff redExperienceDiff
Correlation: 1.0000000000000042 redTotalMinionsKilled redCSPerMin
Correlation: 1.0000000000000049 redTotalGold redGoldPerMin
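
The pairs printed above are redundant because one column is an exact linear function of the other: a per-minute rate is the total divided by the 10-minute window, and a red-side difference is the negated blue-side difference. The sketch below checks this on the five rows from the data preview. It also shows a tolerance-based comparison with `np.isclose`, offered here as a possible alternative to the `>= 1` test because floating-point rounding can leave a coefficient marginally below 1. The variable names are illustrative.

```python
import numpy as np

# Five blue-side rows taken from the data preview earlier in the article.
blue_minions = np.array([195, 174, 186, 201, 210], dtype=float)
blue_gold_diff = np.array([643, -2908, -1172, -1321, -1004], dtype=float)

# Derived columns: CS per minute over a 10-minute window,
# and the gold difference seen from the red side.
blue_cs_per_min = blue_minions / 10.0
red_gold_diff = -blue_gold_diff

r_cs = np.corrcoef(blue_minions, blue_cs_per_min)[0, 1]
r_gold = np.corrcoef(blue_gold_diff, red_gold_diff)[0, 1]
print(r_cs, r_gold)

# Tolerance check: treats |r| as 1 even if rounding leaves it slightly off.
is_redundant = [bool(np.isclose(abs(r), 1.0)) for r in (r_cs, r_gold)]
print(is_redundant)
```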

Data analysis and modeling (Logistic Regression):

1. Data analysis:

1. Data after initial processing:

print("Data after initial processing:",clean_data.columns)
Data after initial processing: Index(['blueWins',
       'blueWardsPlaced',
       'blueWardsDestroyed',
       'blueFirstBlood',
       'blueKills',
       'blueDeaths',
       'blueAssists',
       'blueEliteMonsters',
       'blueDragons',
       'blueHeralds',
       'blueTowersDestroyed',
       'blueTotalGold',
       'blueAvgLevel',
       'blueTotalExperience',
       'blueTotalMinionsKilled',
       'blueTotalJungleMinionsKilled',
       'blueGoldDiff',
       'blueExperienceDiff',
       #----------------------------------------------------------          
       'redWardsPlaced',
       'redWardsDestroyed',
       'redAssists',
       'redEliteMonsters',
       'redDragons',
       'redHeralds',
       'redTowersDestroyed',
       'redTotalGold',
       'redAvgLevel',
       'redTotalExperience',
       'redTotalMinionsKilled',
       'redTotalJungleMinionsKilled'],
      dtype='object')


2. General analysis:

  • Killing jungle monsters and killing minions both yield experience and gold, so the two kill counts are combined:
clean_data['blueMinionsTotales'] = clean_data['blueTotalMinionsKilled'] + clean_data['blueTotalJungleMinionsKilled']
clean_data['redMinionsTotales'] = clean_data['redTotalMinionsKilled'] + clean_data['redTotalJungleMinionsKilled']
clean_data = clean_data.drop(['blueTotalMinionsKilled', 'blueTotalJungleMinionsKilled',
                              'redTotalMinionsKilled', 'redTotalJungleMinionsKilled'], axis=1)
  • The heatmap shows that level and experience are highly correlated, so examine them:

    # Level versus experience:
    plt.figure(figsize=(12,12))
    plt.subplot(121)
    sns.scatterplot(x='blueAvgLevel', y='blueTotalExperience', hue='blueWins', data=clean_data)
    plt.title('blue')
    plt.xlabel('blueAvgLevel')
    plt.ylabel('blueTotalExperience')
    plt.grid(True)
    plt.subplot(122)
    sns.scatterplot(x='redAvgLevel', y='redTotalExperience', hue='blueWins', data=clean_data)
    plt.title('red')
    plt.xlabel('redAvgLevel')
    plt.ylabel('redTotalExperience')
    plt.grid(True)
    plt.savefig('level_vs_experience.jpg', bbox_inches='tight')
    


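Level and experience are linearly related and strongly correlated (see the heatmap), and the level values vary little between games, so the level columns are dropped:

# Drop the level columns
clean_data=clean_data.drop(['blueAvgLevel', 'redAvgLevel'], axis=1)
  • Visual analysis of the data: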

    sns.set(font_scale=1.5)
    plt.figure(figsize=(20,20))
    sns.set_style("whitegrid")
    
    # Scatter plot of kills versus deaths
    plt.subplot(321)
    sns.scatterplot(x='blueKills', y='blueDeaths', hue='blueWins', data=clean_data)
    plt.title('blueKills&&blueDeaths')
    plt.xlabel('blueKills')
    plt.ylabel('blueDeaths')
    plt.grid(True)
    
    # Scatter plot of assists
    plt.subplot(322)
    sns.scatterplot(x='blueAssists', y='redAssists', hue='blueWins', data=clean_data)
    plt.title('Assists')
    plt.xlabel('blueAssists')
    plt.ylabel('redAssists')
    plt.tight_layout(pad=1.5)
    plt.grid(True)
    
    # Scatter plot of total gold for both teams
    plt.subplot(323)
    sns.scatterplot(x='blueTotalGold', y='redTotalGold', hue='blueWins', data=clean_data)
    plt.title('TotalGold')
    plt.xlabel('blueTotalGold')
    plt.ylabel('redTotalGold')
    plt.tight_layout(pad=1.5)
    plt.grid(True)
    
    # Scatter plot of total experience for both teams
    plt.subplot(324)
    sns.scatterplot(x='blueTotalExperience', y='redTotalExperience', hue='blueWins', data=clean_data)
    plt.title('Experience')
    plt.xlabel('blueTotalExperience')
    plt.ylabel('redTotalExperience')
    plt.tight_layout(pad=1.5)
    plt.grid(True)
    
    # Scatter plot of wards placed by both teams
    plt.subplot(325)
    sns.scatterplot(x='blueWardsPlaced', y='redWardsPlaced', hue='blueWins', data=clean_data)
    plt.title('WardsPlaced')
    plt.xlabel('blueWardsPlaced')
    plt.ylabel('redWardsPlaced')
    plt.tight_layout(pad=1.5)
    plt.grid(True)
    
    
    # Scatter plot of combined minion and jungle monster kills
    plt.subplot(326)
    sns.scatterplot(x='blueMinionsTotales', y='redMinionsTotales', hue='blueWins', data=clean_data)
    plt.title('MinionsTotales')
    plt.xlabel('blueMinionsTotales')
    plt.ylabel('redMinionsTotales')
    plt.tight_layout(pad=1.5)
    plt.grid(True)
    plt.savefig('feature_scatterplots.jpg', bbox_inches='tight')
    


  • In-game values such as blueWardsPlaced/redWardsPlaced, blueWardsDestroyed/redWardsDestroyed, and blueEliteMonsters/redEliteMonsters are small, the outcome depends mostly on the difference between the two sides, and each pair describes both teams together. Each pair is therefore replaced by its difference (blue minus red), which also reduces the number of columns.
# Convert paired columns into differences (blue minus red):
clean_data['WardsPlacedDiff'] = clean_data['blueWardsPlaced'] - clean_data['redWardsPlaced']
clean_data['WardsDestroyedDiff'] = clean_data['blueWardsDestroyed'] - clean_data['redWardsDestroyed']
clean_data['AssistsDiff'] = clean_data['blueAssists'] - clean_data['redAssists']
clean_data['blueHeraldsDiff'] = clean_data['blueHeralds'] - clean_data['redHeralds']
clean_data['blueDragonsDiff'] = clean_data['blueDragons'] - clean_data['redDragons']
clean_data['blueTowersDestroyedDiff'] = clean_data['blueTowersDestroyed'] - clean_data['redTowersDestroyed']
clean_data['EliteMonstersDiff'] = clean_data['blueEliteMonsters'] - clean_data['redEliteMonsters']
# Drop the original paired columns now that the differences exist
clean_data = clean_data.drop([
    'blueWardsPlaced', 'redWardsPlaced',
    'blueWardsDestroyed', 'redWardsDestroyed',
    'blueAssists', 'redAssists',
    'blueHeralds', 'redHeralds',
    'blueTowersDestroyed', 'redTowersDestroyed',
    'blueDragons', 'redDragons',
    'blueEliteMonsters', 'redEliteMonsters',
], axis=1)
# Red's gold and experience can be recovered from blue's totals and the differences, so drop them
clean_data = clean_data.drop(['redTotalGold', 'redTotalExperience'], axis=1)
  • Analysis of blueFirstBlood, blueDragonsDiff, and EliteMonstersDiff:

    # First blood, dragons, and elite monsters
    sns.catplot(x="blueWins", y="blueGoldDiff", hue="blueFirstBlood", data=clean_data)
    plt.savefig('first_blood.jpg', bbox_inches='tight')
    sns.catplot(x="blueWins", y="blueGoldDiff", hue="blueDragonsDiff", data=clean_data)
    plt.savefig('dragons.jpg', bbox_inches='tight')
    sns.catplot(x="blueWins", y="blueGoldDiff", hue="EliteMonstersDiff", data=clean_data)
    plt.savefig('elite_monsters.jpg', bbox_inches='tight')
    
    

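Figure (first blood): blueGoldDiff by blueWins, colored by blueFirstBlood.

Figure (dragon kill difference): blueGoldDiff by blueWins, colored by blueDragonsDiff.

Figure (elite monster kill difference): blueGoldDiff by blueWins, colored by EliteMonstersDiff.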
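
The reason given earlier in this section for dropping redTotalGold (it can be recovered from blueTotalGold and blueGoldDiff) can be checked directly. The sketch below uses only the five rows shown in the data preview; `preview`, `reconstructed`, and `matches` are illustrative names.

```python
import pandas as pd

# First five rows of the relevant columns, copied from the data preview.
preview = pd.DataFrame({
    "blueTotalGold": [17210, 14712, 16113, 15157, 16400],
    "blueGoldDiff":  [643, -2908, -1172, -1321, -1004],
    "redTotalGold":  [16567, 17620, 17285, 16478, 17404],
})

# blueGoldDiff = blueTotalGold - redTotalGold, so the red total is
# the blue total minus the difference.
reconstructed = preview["blueTotalGold"] - preview["blueGoldDiff"]
matches = bool((reconstructed == preview["redTotalGold"]).all())
print(matches)
```

The same identity would apply to redTotalExperience with blueTotalExperience and blueExperienceDiff.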

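2. Modeling:

Processed data:

Final data: Index(['blueWins',       # whether blue won  1: win  0: loss   ***** dependent variable *****
       'blueFirstBlood',          # whether blue took first blood (the first kill of the game)  1: yes  0: no
       'blueKills',               # enemies killed by the blue team
       'blueDeaths',              # deaths on the blue team
       'blueTotalGold',           # total gold of the blue team
       'blueTotalExperience',     # total experience of the blue team
       'blueGoldDiff',            # blue gold minus red gold
       'blueExperienceDiff',      # blue experience minus red experience
       'blueMinionsTotales',      # minions and jungle monsters killed by the blue team
       'redMinionsTotales',       # minions and jungle monsters killed by the red team
       'WardsPlacedDiff',         # difference in wards placed on the map
       'WardsDestroyedDiff',      # difference in enemy wards destroyed
       'AssistsDiff',             # difference in assists
       'blueHeraldsDiff',         # difference in heralds killed
       'blueDragonsDiff',         # difference in dragons killed
       'blueTowersDestroyedDiff', # difference in towers destroyed
       'EliteMonstersDiff'],      # difference in elite monsters killed (dragons and heralds)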
      dtype='object')
  
      blueWins  blueFirstBlood  blueKills  blueDeaths  blueTotalGold  \
0            0               1          9           6          17210   
1            0               0          5           5          14712   
2            0               0          7          11          16113   
3            0               0          4           5          15157   
4            0               0          6           6          16400   
        ...             ...        ...         ...            ...   
9874         1               1          7           4          17765   
9875         1               0          6           4          16238   
9876         0               0          6           7          15903   
9877         0               1          2           3          14459   
9878         1               1          6           6          16266   
      blueTotalExperience  blueGoldDiff  blueExperienceDiff  \
0                   17039           643                  -8   
1                   16265         -2908               -1173   
2                   16221         -1172               -1033   
3                   17954         -1321                  -7   
4                   18543         -1004                 230   
                   ...           ...                 ...   
9874                18967          2519                2469   
9875                19255           782                 888   
9876                18032         -2416               -1877   
9877                17229          -839               -1085   
9878                17321           927                 -58   
      blueMinionsTotales  redMinionsTotales  WardsPlacedDiff  \
0                    231                252               13   
1                    217                292                0   
2                    232                231                0   
3                    256                282               28   
4                    267                292               58   
                  ...                ...              ...   
9874                 280                263              -29   
9875                 281                262               42   
9876                 255                321                9   
9877                 272                287              -52   
9878                 251                247                9   
      WardsDestroyedDiff  AssistsDiff  blueHeraldsDiff  blueDragonsDiff  \
0                     -4            3                0                0   
1                      0            3               -1               -1   
2                     -3          -10                0                1   
3                     -1           -5                1                0   
4                      2           -1                0               -1   
                  ...          ...              ...              ...   
9874                  -1           -2                0                1   
9875                 -21            5                0                1   
9876                   1           -6                0               -1   
9877                   0            2                0                1   
9878                  -2            1                0               -1   
      blueTowersDestroyedDiff  EliteMonstersDiff  
0                           0                  0  
1                          -1                 -2  
2                           0                  1  
3                           0                  1  
4                           0                 -1  
                       ...                ...  
9874                        0                  1  
9875                        0                  1  
9876                        0                 -1  
9877                        0                  1  
9878                        0                 -1  
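
The catplots earlier compare outcomes visually; a numeric complement is the win rate per group. The sketch below computes it with `groupby` on the ten rows displayed above only, so the resulting rates are illustrative and say nothing reliable about the full 9,879 games. The names `rows` and `win_rate` are illustrative.

```python
import pandas as pd

# (blueWins, blueFirstBlood) for the ten rows displayed above
# (indices 0-4 and 9874-9878).
rows = pd.DataFrame({
    "blueWins":       [0, 0, 0, 0, 0, 1, 1, 0, 0, 1],
    "blueFirstBlood": [1, 0, 0, 0, 0, 1, 0, 0, 1, 1],
})

# The mean of a 0/1 column within each group is that group's win rate.
win_rate = rows.groupby("blueFirstBlood")["blueWins"].mean()
print(win_rate)
```

Applied to the full `clean_data` frame, the same `groupby` call would give the win rate with and without first blood across all games.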

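Introduction to logistic regression:

Logistic regression, also called logistic regression analysis, is a generalized linear model widely used in data mining, automated disease diagnosis, economic forecasting, and similar fields. A typical use is to explore the risk factors behind a disease and to predict the probability of the disease from those factors. Take stomach cancer as an example: two groups are selected, one with stomach cancer and one without, and the two groups are expected to differ in physical signs and lifestyle. The dependent variable is whether a person has stomach cancer ("yes" or "no"), and the independent variables can be many things, such as age, sex, dietary habits, and Helicobacter pylori infection. Independent variables may be continuous or categorical. Logistic regression then yields a weight for each independent variable, which gives a rough picture of which factors are risk factors for stomach cancer. The same weights can be used to predict, from a person's risk factors, how likely that person is to develop the cancer.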
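
To make the "weights to probability" step concrete, here is a minimal sketch of the prediction side of logistic regression: a weighted sum of the features is passed through the logistic (sigmoid) function. The weights and bias are hypothetical values chosen for illustration and are not fitted to any data; `sigmoid` and `predict_proba` are illustrative helper names.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """P(y = 1 | x) for one sample under a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical weights for two standardized features (not fitted to any data).
weights = [1.2, -0.4]
bias = 0.0

p_even = predict_proba([0.0, 0.0], weights, bias)   # both features at their mean
p_ahead = predict_proba([1.0, 0.0], weights, bias)  # first feature one std above its mean
print(p_even, p_ahead)
```

A positive weight raises the predicted probability as its feature grows, and a negative weight lowers it, which is why fitted weights can be read as a rough indication of each factor's influence.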

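Analysis:

Predicting LOL match outcomes involves many independent variables but only one dependent variable, blueWins (whether the blue team won), whose value is "yes" or "no". A Logistic Regression model is therefore a suitable choice.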

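Standardizing the data:

Overview: Feature scaling rescales the data proportionally so that it falls within a small, specific range. It is common when comparing or combining indicators: removing the units turns each indicator into a dimensionless number, so indicators with different units or magnitudes can be compared and weighted. One typical variant is min-max normalization, which maps the data onto the [0, 1] interval. Another is standardization, used here, which rescales each feature to zero mean and unit variance. Standardization is a common requirement for many machine learning estimators: they may behave badly if the individual features do not look more or less like standard normally distributed data (for example, Gaussian with mean 0 and variance 1).

In short, standardization works column by column: subtract the column's mean from each value, then divide by the column's standard deviation. As a result, each column is centered on 0 and has a variance of 1.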
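
The following minimal sketch works through standardization by hand, assuming the usual z-score definition (subtract the mean, divide by the standard deviation). It first checks the zero-mean, unit-variance property on a small made-up list, then reproduces one value from this article: blueKills = 9 in the first row, using the mean and std printed in the data summary. `describe()` reports the sample standard deviation (ddof=1), while scikit-learn's `StandardScaler` uses the population form (ddof=0), so the summary value is converted first. `standardize` is an illustrative helper.

```python
import math

def standardize(values):
    """Return z-scores using the population standard deviation (ddof=0)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

# Generic check on a small made-up list: the result has mean 0 and variance 1.
z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
z_mean = sum(z) / len(z)
z_var = sum(v ** 2 for v in z) / len(z)

# Reproduce one value from the article: blueKills = 9 in the first row.
# n, mean, and sample_std are taken from the data summary shown earlier.
n, mean, sample_std = 9879, 6.183925, 3.011028
population_std = sample_std * math.sqrt((n - 1) / n)
z_first_row = (9 - mean) / population_std
print(round(z_mean, 12), round(z_var, 12), round(z_first_row, 6))
```

The last value agrees, up to rounding of the summary statistics, with the 0.935301 that appears for blueKills in the first row of the standardized output.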

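import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import StandardScaler

# Custom scaler class (standardization) that scales only the selected columns
class CustomScaler(BaseEstimator, TransformerMixin):

    # Store the settings and create the underlying scaler
    def __init__(self, columns, copy=True, with_mean=True, with_std=True):
        # scaler is a StandardScaler object; its options are passed by keyword
        self.scaler = StandardScaler(copy=copy, with_mean=with_mean, with_std=with_std)
        self.columns = columns
        self.copy = copy
        self.with_mean = with_mean
        self.with_std = with_std
        self.mean_ = None
        self.var_ = None

    # Fit method, delegating to StandardScaler

    def fit(self, X, y=None):
        self.scaler.fit(X[self.columns], y)
        # Per-column mean and population variance (ddof=0, as StandardScaler uses)
        self.mean_ = X[self.columns].mean()
        self.var_ = X[self.columns].var(ddof=0)
        return self

    # Transform method that performs the actual scaling

    def transform(self, X, y=None, copy=None):
        # Remember the original column order
        init_col_order = X.columns

        # Scale the columns selected when the instance was created, keeping the row index of X
        X_scaled = pd.DataFrame(self.scaler.transform(X[self.columns]),
                                columns=self.columns, index=X.index)

        # Collect all columns that are not scaled
        X_not_scaled = X.loc[:, ~X.columns.isin(self.columns)]

        # Return a data frame with both scaled and unscaled features in the original order
        return pd.concat([X_not_scaled, X_scaled], axis=1)[init_col_order]


# Assumption: the model inputs are every column except the target blueWins
unscaled_inputs = clean_data.drop(['blueWins'], axis=1)

# Columns to leave unscaled
columns_to_omit = ['blueFirstBlood']  # skip first blood because it is a categorical variable

# Build the list of columns to scale
columns_to_scale = [x for x in unscaled_inputs.columns.values if x not in columns_to_omit]
blue_scaler = CustomScaler(columns_to_scale)
blue_scaler.fit(unscaled_inputs)
scaled_inputs = blue_scaler.transform(unscaled_inputs)
pd.set_option('display.width', 80)  # maximum width of each console line; output wraps only once a line is full
print("Data after standardization:", scaled_inputs)
Data after standardization: blueFirstBlood  blueKills  blueDeaths  blueTotalGold  \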
0                  1   0.935301   -0.046926       0.460179   
1                  0  -0.393216   -0.387796      -1.166792   
2                  0   0.271042    1.657424      -0.254307   
3                  0  -0.725346   -0.387796      -0.876959   
4                  0  -0.061087   -0.046926      -0.067382   
              ...        ...         ...            ...   
9874               1   0.271042   -0.728666       0.821656   
9875               0  -0.061087   -0.728666      -0.172894   
9876               0  -0.061087    0.293944      -0.391082   
9877               1  -1.389604   -1.069536      -1.331573   
9878               1  -0.061087   -0.046926      -0.154657   
      blueTotalExperience  blueGoldDiff  blueExperienceDiff  \
0               -0.740639      0.256228            0.013342   
1               -1.385391     -1.191254           -0.593342   
2               -1.422043     -0.483614           -0.520436   
3                0.021567     -0.544350            0.013863   
4                0.512211     -0.415133            0.137283   
                   ...           ...                 ...   
9874             0.865408      1.020936            1.303263   
9875             1.105315      0.312888            0.479942   
9876             0.086541     -0.990702           -0.959957   
9877            -0.582367     -0.347874           -0.547516   
9878            -0.505730      0.371994           -0.012696   
      blueMinionsTotales  redMinionsTotales  WardsPlacedDiff  \
0              -1.419968          -0.651842         0.503853   
1              -1.968987           0.912988         0.003069   
2              -1.380753          -1.473378         0.003069   
3              -0.439577           0.521780         1.081682   
4              -0.008205           0.912988         2.237338   
                  ...                ...              ...   
9874            0.501598          -0.221514        -1.114066   
9875            0.540814          -0.260635         1.620988   
9876           -0.478793           2.047489         0.349766   
9877            0.187873           0.717384        -2.000069   
9878           -0.635655          -0.847446         0.349766   
      WardsDestroyedDiff  AssistsDiff  blueHeraldsDiff  blueDragonsDiff  \
0              -1.436801     0.523196        -0.047412         0.058162   
1              -0.035635     0.523196        -1.744448        -1.079624   
2              -1.086510    -1.731206        -0.047412         1.195948   
3              -0.385927    -0.864129         1.649624         0.058162   
4               0.664947    -0.170466        -0.047412        -1.079624   
                  ...          ...              ...              ...   
9874           -0.385927    -0.343882        -0.047412         1.195948   
9875           -7.391756     0.870027        -0.047412         1.195948   
9876            0.314656    -1.037544        -0.047412        -1.079624   
9877           -0.035635     0.349780        -0.047412         1.195948   
9878           -0.736218     0.176365        -0.047412        -1.079624   
      blueTowersDestroyedDiff  EliteMonstersDiff  
0                   -0.025866           0.021707  
1                   -3.104510          -1.851163  
2                   -0.025866           0.958142  
3                   -0.025866           0.958142  
4                   -0.025866          -0.914728  
                       ...                ...  
9874                -0.025866           0.958142  
9875                -0.025866           0.958142  
9876                -0.025866          -0.914728  
9877                -0.025866           0.958142  
9878                -0.025866          -0.914728  

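The FutureWarning emitted at this step does not affect the run and can be ignored. Whether it appears depends on the library versions installed; note that recent scikit-learn releases only accept the StandardScaler options as keyword arguments, for example StandardScaler(copy=copy, with_mean=with_mean, with_std=with_std).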
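As an alternative to a hand-written scaler class, scikit-learn's ColumnTransformer can scale some columns and pass the rest through. This is a sketch of that swapped-in technique, not the method used in this post; the small frame reuses the first four rows of three columns printed elsewhere in the post, and the variable names are assumptions for the example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Tiny frame with one binary column and two numeric columns.
demo = pd.DataFrame({
    "blueFirstBlood": [1, 0, 0, 0],
    "blueKills": [9, 5, 7, 4],
    "blueGoldDiff": [643, -2908, -1172, -1321],
})

binary_columns = ["blueFirstBlood"]
numeric_columns = [c for c in demo.columns if c not in binary_columns]

# Scale only the numeric columns; pass the binary column through untouched.
selective_scaler = ColumnTransformer(
    transformers=[("scale", StandardScaler(), numeric_columns)],
    remainder="passthrough",
)
scaled_values = selective_scaler.fit_transform(demo)

# ColumnTransformer emits the transformed columns first and the passthrough
# columns last, so rebuild the frame and restore the original column order.
scaled_demo = pd.DataFrame(
    scaled_values,
    columns=numeric_columns + binary_columns,
    index=demo.index,
)[demo.columns]
print(scaled_demo)
```

The trade-off is that the built-in transformer needs the column order restored by hand, while the custom class above keeps that logic inside transform.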

Splitting the data:
# Split the data into training and test sets
x_train, x_test, y_train, y_test = train_test_split(scaled_inputs, target, train_size=0.8, random_state=2)
print("Training data:", x_train.shape, y_train.shape, "Test data:", x_test.shape, y_test.shape)
Training data: (7903, 16) (7903, 1) Test data: (1976, 16) (1976, 1)
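In the code above the scaler is fitted on the full dataset before the split, so the test rows influence the means and standard deviations. A common alternative, sketched here on synthetic data rather than the LOL dataset, is to split first and let a Pipeline fit the scaler on the training rows only. All names in this block are assumptions for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the inputs and target (not the LOL data).
X_demo, y_demo = make_classification(n_samples=500, n_features=6, random_state=2)

# Split first, keeping the class balance the same in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, train_size=0.8, random_state=2, stratify=y_demo
)

# The pipeline learns means and standard deviations from X_tr only,
# then reuses them when scoring X_te.
leak_free_model = make_pipeline(StandardScaler(), LogisticRegression())
leak_free_model.fit(X_tr, y_tr)

train_accuracy = leak_free_model.score(X_tr, y_tr)
test_accuracy = leak_free_model.score(X_te, y_te)
print(train_accuracy, test_accuracy)
```

With roughly ten thousand games the difference is likely to be small, but the pipeline form keeps the test score an honest estimate.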
Model training and analysis:
# Train the model
reg = LogisticRegression()
reg.fit(x_train, y_train.values.ravel())  # pass the target as a 1-D array
# Build a summary table of the variables with their coefficients and odds ratios
variables = unscaled_inputs.columns.values
summary_table = pd.DataFrame(columns=['Variables'], data = variables)
summary_table['Coef'] = np.transpose(reg.coef_)
# Add the intercept at index 0
summary_table.index = summary_table.index + 1
summary_table.loc[0] = ['Intercept', reg.intercept_[0]]
# Calculate the odds ratio and add it to the table
summary_table['Odds Ratio'] = np.exp(summary_table.Coef)
summary_table.sort_values(by=['Odds Ratio'], ascending=False)

Variable summary:

Model variable summary:       Variables      Coef  Odds Ratio
6              blueGoldDiff  1.211278    3.357772
7        blueExperienceDiff  0.473859    1.606180
14          blueDragonsDiff  0.181283    1.198754
9         redMinionsTotales  0.156352    1.169237
16        EliteMonstersDiff  0.143389    1.154178
3                blueDeaths  0.071080    1.073667
4             blueTotalGold  0.064979    1.067136
1            blueFirstBlood  0.062815    1.064830
11       WardsDestroyedDiff  0.031517    1.032019
10          WardsPlacedDiff  0.002818    1.002822
5       blueTotalExperience -0.002409    0.997594
0                 Intercept -0.029890    0.970552
13          blueHeraldsDiff -0.045495    0.955524
8        blueMinionsTotales -0.079767    0.923332
12              AssistsDiff -0.092587    0.911570
15  blueTowersDestroyedDiff -0.110353    0.895518
2                 blueKills -0.114022    0.892239
Testing the model:
# Test the model
print("Training accuracy:", reg.score(x_train, y_train))
print("Test accuracy:", reg.score(x_test, y_test))
# Write the predicted win probabilities back into the original dataset
predicted_prob = reg.predict_proba(x_test)
data['predicted'] = reg.predict_proba(scaled_inputs)[:, 1]
print("Full dataset with predictions:", data)
# Compare the actual outcome with the predicted win probability
col_n = ['blueWins','predicted']
a = pd.DataFrame(data,columns = col_n)
print("Actual outcome vs. predicted win probability:", a)
Training accuracy: 0.7327597115019613
Test accuracy: 0.7358299595141701

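The training and test accuracies are very close (about 0.733 and 0.736), which suggests the model is not overfitting and fits reasonably well.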
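Accuracy is only one view of a classifier. The sketch below shows how a confusion matrix and ROC AUC could be computed with sklearn.metrics, using the ten (blueWins, predicted) rows printed in this post. Ten rows that include training data are far too few for a real evaluation; the block only illustrates the calls, and the 0.5 threshold is an assumption.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

# The ten rows of blueWins and predicted shown in the printed comparison table.
y_true = np.array([0, 0, 0, 0, 0, 1, 1, 0, 0, 1])
y_prob = np.array([0.549459, 0.170271, 0.387228, 0.393685, 0.356486,
                   0.892957, 0.610317, 0.178771, 0.433183, 0.511128])

# Turn probabilities into class labels with an assumed 0.5 threshold.
y_pred = (y_prob >= 0.5).astype(int)

accuracy = accuracy_score(y_true, y_pred)
matrix = confusion_matrix(y_true, y_pred)  # rows: actual class, columns: predicted class
auc = roc_auc_score(y_true, y_prob)        # threshold-free ranking quality

print(accuracy)
print(matrix)
print(auc)
```

On the real test split, the same three calls would take y_test and the second column of reg.predict_proba(x_test).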

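Result analysis:

In LOL and other MOBAs (multiplayer online battle arenas), the outcome of a game depends on a great many factors. These team games test the players' mechanics, game sense, and coordination. Because the participants are human, there are many unknowns and sources of uncertainty, such as a player's mood, form, or even network conditions.

Predictions for this kind of game therefore cannot reach 100% accuracy. Given only the first 10 minutes of data, an accuracy of about 73% is an acceptable fit.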

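Conclusions and patterns:

The analysis above shows that first blood, dragons, minions, and jungle monsters all contribute to gold to some degree. At the 10-minute mark, the factor with the greatest influence on the outcome is blueGoldDiff (the gold difference). With the other variables held fixed, each increase of one standardized unit (one standard deviation) in the gold difference multiplies the odds of a blue win by about 3.36, an increase of roughly 236% in the odds, not in the win rate itself. The experience difference blueExperienceDiff also has a large effect: one more standardized unit raises the odds of winning by about 60.6%.

A lead of one standardized unit in blueDragonsDiff (roughly one dragon) raises the odds of winning by about 20%, and a lead in elite monsters (EliteMonstersDiff) also helps noticeably, with an odds ratio of about 1.15. Some results look counterintuitive, though. The more minions and jungle monsters the opponent kills (redMinionsTotales) and the fewer we kill (blueMinionsTotales), the higher our odds of winning; my guess is that the team that goes on to win prefers early team fights as its way of building a gold lead. The kill count blueKills is also negatively associated with winning; my guess is that, since this is a tower-pushing game, teams with high kill counts may focus on kills and neglect towers. Both effects are estimated with blueGoldDiff and blueTotalGold already in the model, and kills and minions are themselves sources of gold, so these signs may partly reflect correlated inputs rather than a real disadvantage.

So during a game, focus on building a gold lead over the opponent: the larger the gold difference, the higher the chance of winning.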
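An odds ratio acts on the odds, so its effect on the win probability depends on where the game already stands. The sketch below converts the blueGoldDiff coefficient from the summary table into a probability change at a few baseline probabilities; the helper name and the chosen baselines are assumptions for illustration, and the other variables are taken as held fixed.

```python
import math


def probability_after_shift(baseline_probability, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


gold_coef = 1.211278                   # Coef of blueGoldDiff from the summary table
gold_odds_ratio = math.exp(gold_coef)  # about 3.36, matching the Odds Ratio column

# Effect of one extra standardized unit of gold lead at different starting points.
for baseline in (0.2, 0.5, 0.8):
    shifted = probability_after_shift(baseline, gold_odds_ratio)
    print(baseline, round(shifted, 3))
```

From an even game (50%), one standardized unit of gold lead moves the win probability to about 77%, which is a gain of about 27 percentage points rather than 236%.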

Original dataset with predicted win probabilities:

Full dataset with predictions:       blueWins  blueWardsPlaced  blueWardsDestroyed  blueFirstBlood  \
0            0               28                   2               1   
1            0               12                   1               0   
2            0               15                   0               0   
3            0               43                   1               0   
4            0               75                   4               0   
        ...              ...                 ...             ...   
9874         1               17                   2               1   
9875         1               54                   0               0   
9876         0               23                   1               0   
9877         0               14                   4               1   
9878         1               18                   0               1   
      blueKills  blueDeaths  blueAssists  blueEliteMonsters  blueDragons  \
0             9           6           11                  0            0   
1             5           5            5                  0            0   
2             7          11            4                  1            1   
3             4           5            5                  1            0   
4             6           6            6                  0            0   
         ...         ...          ...                ...          ...   
9874          7           4            5                  1            1   
9875          6           4            8                  1            1   
9876          6           7            5                  0            0   
9877          2           3            3                  1            1   
9878          6           6            5                  0            0   
      blueHeralds  blueTowersDestroyed  blueTotalGold  blueAvgLevel  \
0               0                    0          17210           6.6   
1               0                    0          14712           6.6   
2               0                    0          16113           6.4   
3               1                    0          15157           7.0   
4               0                    0          16400           7.0   
           ...                  ...            ...           ...   
9874            0                    0          17765           7.2   
9875            0                    0          16238           7.2   
9876            0                    0          15903           7.0   
9877            0                    0          14459           6.6   
9878            0                    0          16266           7.0   
      blueTotalExperience  blueTotalMinionsKilled  \
0                   17039                     195   
1                   16265                     174   
2                   16221                     186   
3                   17954                     201   
4                   18543                     210   
                   ...                     ...   
9874                18967                     211   
9875                19255                     233   
9876                18032                     210   
9877                17229                     224   
9878                17321                     207   
      blueTotalJungleMinionsKilled  blueGoldDiff  blueExperienceDiff  \
0                               36           643                  -8   
1                               43         -2908               -1173   
2                               46         -1172               -1033   
3                               55         -1321                  -7   
4                               57         -1004                 230   
                            ...           ...                 ...   
9874                            69          2519                2469   
9875                            48           782                 888   
9876                            45         -2416               -1877   
9877                            48          -839               -1085   
9878                            44           927                 -58   
      blueCSPerMin  blueGoldPerMin  redWardsPlaced  redWardsDestroyed  \
0             19.5          1721.0              15                  6   
1             17.4          1471.2              12                  1   
2             18.6          1611.3              15                  3   
3             20.1          1515.7              15                  2   
4             21.0          1640.0              17                  2   
            ...             ...             ...                ...   
9874          21.1          1776.5              46                  3   
9875          23.3          1623.8              12                 21   
9876          21.0          1590.3              14                  0   
9877          22.4          1445.9              66                  4   
9878          20.7          1626.6               9                  2   
      redFirstBlood  redKills  redDeaths  redAssists  redEliteMonsters  \
0                 0         6          9           8                 0   
1                 1         5          5           2                 2   
2                 1        11          7          14                 0   
3                 1         5          4          10                 0   
4                 1         6          6           7                 1   
             ...       ...        ...         ...               ...   
9874              0         4          7           7                 0   
9875              1         4          6           3                 0   
9876              1         7          6          11                 1   
9877              0         3          2           1                 0   
9878              0         6          6           4                 1   
      redDragons  redHeralds  redTowersDestroyed  redTotalGold  redAvgLevel  \
0              0           0                   0         16567          6.8   
1              1           1                   1         17620          6.8   
2              0           0                   0         17285          6.8   
3              0           0                   0         16478          7.0   
4              1           0                   0         17404          7.0   
          ...         ...                 ...           ...          ...   
9874           0           0                   0         15246          6.8   
9875           0           0                   0         15456          7.0   
9876           1           0                   0         18319          7.4   
9877           0           0                   0         15298          7.2   
9878           1           0                   0         15339          6.8   
      redTotalExperience  redTotalMinionsKilled  redTotalJungleMinionsKilled  \
0                  17047                    197                           55   
1                  17438                    240                           52   
2                  17254                    203                           28   
3                  17961                    235                           47   
4                  18313                    225                           67   
                  ...                    ...                          ...   
9874               16498                    229                           34   
9875               18367                    206                           56   
9876               19909                    261                           60   
9877               18314                    247                           40   
9878               17379                    201                           46   
      redGoldDiff  redExperienceDiff  redCSPerMin  redGoldPerMin  predicted  
0            -643                  8         19.7         1656.7   0.549459  
1            2908               1173         24.0         1762.0   0.170271  
2            1172               1033         20.3         1728.5   0.387228  
3            1321                  7         23.5         1647.8   0.393685  
4            1004               -230         22.5         1740.4   0.356486  
           ...                ...          ...            ...        ...  
9874        -2519              -2469         22.9         1524.6   0.892957  
9875         -782               -888         20.6         1545.6   0.610317  
9876         2416               1877         26.1         1831.9   0.178771  
9877          839               1085         24.7         1529.8   0.433183  
9878         -927                 58         20.1         1533.9   0.511128  

Actual outcome vs. predicted win probability:

       blueWins  predicted
0            0   0.549459
1            0   0.170271
2            0   0.387228
3            0   0.393685
4            0   0.356486
        ...        ...
9874         1   0.892957
9875         1   0.610317
9876         0   0.178771
9877         0   0.433183
9878         1   0.511128

Source code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.preprocessing import StandardScaler
from sklearn.base import BaseEstimator, TransformerMixin

plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

pd.set_option('display.max_columns', None)  # show all columns
pd.set_option('mode.chained_assignment', None)  # turn off chained-assignment warnings
# pd.set_option('display.width', 100)        # max console width per line; output wraps only once a line is full

data = pd.read_csv('high_diamond_ranked_10min.csv')

print("First five rows:", data.head())
print("Data shape:", data.shape)
pd.set_option('display.width', 10)  # max console width per line; output wraps only once a line is full
print("Column names:", data.columns)
pd.set_option('display.width', 80)  # max console width per line
print("Summary statistics:", data.describe())
print("Data info:")
data.info()  # info() prints its own report and returns None, so it is not wrapped in print()

data.dropna(axis=0, how='any', inplace=True)  # drop rows with missing values
data = data.drop(['gameId'], axis=1)  # drop gameId

plt.figure(figsize=(18, 15))
sns.heatmap(round(data.corr(), 1), cmap="coolwarm", annot=True, linewidths=.5)  # correlations lie in [-1, 1]
plt.savefig('热力图相关性分析.jpg', bbox_inches='tight')


# data.corr(): computes the correlation coefficient between every pair of columns and returns the correlation matrix
# sns.heatmap(): uses seaborn to draw a heatmap of the correlations between variables

# Find pairs of perfectly correlated columns (|r| >= 1) in the correlation matrix
# and return one column from each pair
def remove_redundancy(r):
    to_remove = []
    for i in range(len(r.columns)):
        for j in range(i):
            if abs(r.iloc[i, j]) >= 1 and (r.columns[j] not in to_remove):
                print("Correlation:", r.iloc[i, j], r.columns[j], r.columns[i])
                to_remove.append(r.columns[i])
    return to_remove


clean_data = data.drop(remove_redundancy(data.corr()), axis=1)  # drop the redundant columns

pd.set_option('display.width', 10)  # max console width per line
print("Columns after initial cleaning:", clean_data.columns)

# Merge lane-minion and jungle-monster kills into one total per team:
clean_data['blueMinionsTotales'] = clean_data['blueTotalMinionsKilled'] + clean_data['blueTotalJungleMinionsKilled']
clean_data['redMinionsTotales'] = clean_data['redTotalMinionsKilled'] + clean_data['redTotalJungleMinionsKilled']
clean_data = clean_data.drop([
    'blueTotalMinionsKilled', 'blueTotalJungleMinionsKilled',
    'redTotalMinionsKilled', 'redTotalJungleMinionsKilled'], axis=1)

# Level vs. experience analysis:
plt.figure(figsize=(12, 12))
plt.subplot(121)
sns.scatterplot(x='blueAvgLevel', y='blueTotalExperience', hue='blueWins', data=clean_data)
plt.title('blue')
plt.xlabel('blueAvgLevel')
plt.ylabel('blueTotalExperience')
plt.grid(True)
plt.subplot(122)
sns.scatterplot(x='redAvgLevel', y='redTotalExperience', hue='blueWins', data=clean_data)
plt.title('red')
plt.xlabel('redAvgLevel')
plt.ylabel('redTotalExperience')
plt.grid(True)
plt.savefig('等级和经验分析.jpg', bbox_inches='tight')

# Drop the level columns
clean_data = clean_data.drop(['blueAvgLevel', 'redAvgLevel'], axis=1)

sns.set(font_scale=1.5)
plt.figure(figsize=(20, 20))
sns.set_style("whitegrid")

# Scatter plot of kills vs. deaths
plt.subplot(321)
sns.scatterplot(x='blueKills', y='blueDeaths', hue='blueWins', data=clean_data)
plt.title('blueKills&&blueDeaths')
plt.xlabel('blueKills')
plt.ylabel('blueDeaths')
plt.grid(True)

# Scatter plot of assists
plt.subplot(322)
sns.scatterplot(x='blueAssists', y='redAssists', hue='blueWins', data=clean_data)
plt.title('Assists')
plt.xlabel('blueAssists')
plt.ylabel('redAssists')
plt.tight_layout(pad=1.5)
plt.grid(True)

# Scatter plot of both teams' total gold
plt.subplot(323)
sns.scatterplot(x='blueTotalGold', y='redTotalGold', hue='blueWins', data=clean_data)
plt.title('TotalGold')
plt.xlabel('blueTotalGold')
plt.ylabel('redTotalGold')
plt.tight_layout(pad=1.5)
plt.grid(True)

# Scatter plot of both teams' experience
plt.subplot(324)
sns.scatterplot(x='blueTotalExperience', y='redTotalExperience', hue='blueWins', data=clean_data)
plt.title('Experience')
plt.xlabel('blueTotalExperience')
plt.ylabel('redTotalExperience')
plt.tight_layout(pad=1.5)
plt.grid(True)

# Scatter plot of wards placed by both teams
plt.subplot(325)
sns.scatterplot(x='blueWardsPlaced', y='redWardsPlaced', hue='blueWins', data=clean_data)
plt.title('WardsPlaced')
plt.xlabel('blueWardsPlaced')
plt.ylabel('redWardsPlaced')
plt.tight_layout(pad=1.5)
plt.grid(True)

# Scatter plot of total minions and jungle monsters killed
plt.subplot(326)
sns.scatterplot(x='blueMinionsTotales', y='redMinionsTotales', hue='blueWins', data=clean_data)
plt.title('MinionsTotales')
plt.xlabel('blueMinionsTotales')
plt.ylabel('redMinionsTotales')
plt.tight_layout(pad=1.5)
plt.grid(True)
plt.savefig('数据分析.jpg', bbox_inches='tight')

# Convert some paired blue/red columns into their differences:
clean_data['WardsPlacedDiff'] = clean_data['blueWardsPlaced'] - clean_data['redWardsPlaced']
clean_data['WardsDestroyedDiff'] = clean_data['blueWardsDestroyed'] - clean_data['redWardsDestroyed']
clean_data['AssistsDiff'] = clean_data['blueAssists'] - clean_data['redAssists']
clean_data['blueHeraldsDiff'] = clean_data['blueHeralds'] - clean_data['redHeralds']
clean_data['blueDragonsDiff'] = clean_data['blueDragons'] - clean_data['redDragons']
clean_data['blueTowersDestroyedDiff'] = clean_data['blueTowersDestroyed'] - clean_data['redTowersDestroyed']
clean_data['EliteMonstersDiff'] = clean_data['blueEliteMonsters'] - clean_data['redEliteMonsters']
# Drop the original paired columns now that the differences exist
clean_data = clean_data.drop([
    'blueWardsPlaced', 'redWardsPlaced',
    'blueWardsDestroyed', 'redWardsDestroyed',
    'blueAssists', 'redAssists',
    'blueHeralds', 'redHeralds',
    'blueTowersDestroyed', 'redTowersDestroyed',
    'blueDragons', 'redDragons',
    'blueEliteMonsters', 'redEliteMonsters'], axis=1)
clean_data = clean_data.drop(['redTotalGold'], axis=1)  # red gold follows from blue gold and the difference, so drop it
clean_data = clean_data.drop(['redTotalExperience'], axis=1)  # red experience follows from blue experience and the difference, so drop it

# First blood, dragon, and elite monster analysis
sns.catplot(x="blueWins", y="blueGoldDiff", hue="blueFirstBlood", data=clean_data)
plt.savefig('一血.jpg', bbox_inches='tight')
sns.catplot(x="blueWins", y="blueGoldDiff", hue="blueDragonsDiff", data=clean_data)
plt.savefig('龙.jpg', bbox_inches='tight')
sns.catplot(x="blueWins", y="blueGoldDiff", hue="EliteMonstersDiff", data=clean_data)
plt.savefig('精英怪物.jpg', bbox_inches='tight')

print("Final columns:", clean_data.columns)

# Standardization: scale the non-categorical inputs
unscaled_inputs = clean_data.filter([
    'blueFirstBlood',
    'blueKills',
    'blueDeaths',
    'blueTotalGold',
    'blueTotalExperience',
    'blueGoldDiff',
    'blueExperienceDiff',
    'blueMinionsTotales',
    'redMinionsTotales',
    'WardsPlacedDiff',
    'WardsDestroyedDiff',
    'AssistsDiff',
    'blueHeraldsDiff',
    'blueDragonsDiff',
    'blueTowersDestroyedDiff',
    'EliteMonstersDiff'], axis=1)
target = clean_data.filter(['blueWins'])


# Custom scaler class (standardizes only the selected columns)
class CustomScaler(BaseEstimator, TransformerMixin):

    # Store the columns to scale and build the underlying scaler
    def __init__(self, columns, copy=True, with_mean=True, with_std=True):
        # scaler is a StandardScaler object; recent scikit-learn versions require keyword arguments here
        self.scaler = StandardScaler(copy=copy, with_mean=with_mean, with_std=with_std)
        self.columns = columns
        self.mean_ = None
        self.var_ = None

    # Fit method based on StandardScaler
    def fit(self, X, y=None):
        self.scaler.fit(X[self.columns], y)
        self.mean_ = X[self.columns].mean()  # per-column mean
        self.var_ = X[self.columns].var(ddof=0)  # per-column population variance
        return self

    # Transform method that does the actual scaling
    def transform(self, X, y=None, copy=None):
        # Remember the original column order
        init_col_order = X.columns

        # Scale every column selected when the instance was created (keep the row index aligned)
        X_scaled = pd.DataFrame(self.scaler.transform(X[self.columns]), columns=self.columns, index=X.index)

        # Collect the columns that are left unscaled
        X_not_scaled = X.loc[:, ~X.columns.isin(self.columns)]

        # Return a data frame with both scaled and unscaled columns in the original order
        return pd.concat([X_not_scaled, X_scaled], axis=1)[init_col_order]


# Columns to leave unscaled
# First blood is skipped because it is categorical; this listing also leaves blueDragonsDiff unscaled
columns_to_omit = ['blueFirstBlood', 'blueDragonsDiff']

# Build the list of columns to scale
columns_to_scale = [x for x in unscaled_inputs.columns.values if x not in columns_to_omit]
blue_scaler = CustomScaler(columns_to_scale)
blue_scaler.fit(unscaled_inputs)
scaled_inputs = blue_scaler.transform(unscaled_inputs)
pd.set_option('display.width', 80)  # max console width per line; output wraps only once a line is full
print("Standardized data:", scaled_inputs)

# Split the data into training and test sets
x_train, x_test, y_train, y_test = train_test_split(scaled_inputs, target, train_size=0.8, random_state=2)
print("Training data:", x_train.shape, y_train.shape, "Test data:", x_test.shape, y_test.shape)

# Train the model
reg = LogisticRegression()
reg.fit(x_train, y_train.values.ravel())  # pass the target as a 1-D array
# Build a summary table of the variables with their coefficients and odds ratios
variables = unscaled_inputs.columns.values
intercept = reg.intercept_  # intercept
summary_table = pd.DataFrame(columns=['Variables'], data=variables)
summary_table['Coef'] = np.transpose(reg.coef_)
summary_table.index = summary_table.index + 1  # free index 0 for the intercept row
summary_table.loc[0] = ['Intercept', reg.intercept_[0]]
summary_table['Odds Ratio'] = np.exp(summary_table.Coef)
print("Model variable summary:", summary_table.sort_values(by=['Odds Ratio'], ascending=False))

# Test the model
print("Training accuracy:", reg.score(x_train, y_train))
print("Test accuracy:", reg.score(x_test, y_test))
# Write the predicted win probabilities back into the original dataset
predicted_prob = reg.predict_proba(x_test)
data['predicted'] = reg.predict_proba(scaled_inputs)[:, 1]
print("Full dataset with predictions:", data)

# Compare the actual outcome with the predicted win probability
col_n = ['blueWins', 'predicted']
a = pd.DataFrame(data, columns=col_n)
print("Actual outcome vs. predicted win probability:", a)

Data download:

Link: https://pan.baidu.com/s/1PDG8DruKsZWex8xoGROZ2Q
Extraction code: h3qa
