The data comes from the FIFA 19 complete player dataset on Kaggle.
Inspect the data
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = pd.read_csv('C:/Users/Administrator/Desktop/data.csv', encoding='utf-8')
print(data.head())
View descriptive statistics for each field
data.describe()
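One thing to keep in mind: by default, describe() only summarizes the numeric columns, so text fields are left out of the result. Here is a minimal sketch of the difference, using a tiny made-up table (the `demo` frame and its values are my own illustration, not rows from the Kaggle file):

```python
import pandas as pd

# Toy stand-in for the player table; names and values are invented.
demo = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Club": ["X", "X", "Y"],
    "Age": [20, 31, 27],
})

# Default behaviour: only numeric columns are summarized.
numeric_summary = demo.describe()

# include="all" adds the non-numeric columns (count, unique, top, freq).
full_summary = demo.describe(include="all")

print(numeric_summary)
print(full_summary)
```

On the real data, data.describe(include="all") should give the same kind of combined summary, although with this many fields the output gets wide.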
See which fields are available
data.columns
data.info()
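There are too many fields to fit in one view, so the output is split across two screenshots. We can see that only 13 fields, such as ID and Name, have no missing values; all the other fields contain missing values.
Next, pick some fields of interest for analysis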
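Instead of reading the counts off the info() output, the missing values can also be tallied directly. This is a rough sketch on a small made-up table (the `demo` frame and its values are assumptions for illustration only); the same calls should work on `data`:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the FIFA 19 data; the values are made up.
demo = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["A", "B", "C", "D"],
    "Club": ["X", None, "Y", "Z"],
    "Wage": [np.nan, np.nan, "€10K", "€20K"],
})

# Number of missing values per column, largest first.
missing_counts = demo.isnull().sum().sort_values(ascending=False)

# Columns that are fully populated.
complete_columns = sorted(missing_counts[missing_counts == 0].index)

print(missing_counts)
print(len(complete_columns), "columns have no missing values:", complete_columns)
```

Run against the real data, len(complete_columns) would be the number to compare with the 13 complete fields mentioned above.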
# Select some columns of interest for analysis
columns = ['Name', 'Age', 'Nationality', 'Overall', 'Potential', 'Club', 'Value', 'Wage', 'Preferred Foot', 'International Reputation', 'Weak Foot', 'Skill Moves', 'Work Rate', 'Body Type', 'Position', 'Jersey Number', 'Height', 'Weight', 'Crossing', 'Finishing', 'HeadingAccuracy', 'ShortPassing', 'Volleys', 'Dribbling', 'Curve', 'FKAccuracy', 'LongPassing', 'BallControl', 'Acceleration', 'SprintSpeed', 'Agility', 'Reactions', 'Balance', 'ShotPower', 'Jumping', 'Stamina', 'Strength', 'LongShots', 'Aggression', 'Interceptions', 'Positioning', 'Vision', 'Penalties', 'Composure', 'Marking', 'StandingTackle', 'SlidingTackle', 'GKDiving', 'GKHandling', 'GKKicking', 'GKPositioning', 'GKReflexes']
df = pd.DataFrame(data, columns=columns)
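A caveat about selecting columns this way: as far as I know, passing an existing DataFrame to pd.DataFrame with a columns list reindexes it, so a misspelled name does not raise an error and instead produces a column full of NaN. Indexing with data[columns] raises a KeyError in that case. The sketch below shows both behaviours on a toy frame; `demo`, `wanted` and the misspelled 'Wegiht' are hypothetical names used only for illustration:

```python
import pandas as pd

# Toy stand-in for the player table; values are invented.
demo = pd.DataFrame({"Name": ["A", "B"], "Age": [20, 31], "Weight": [70, 85]})

# 'Wegiht' is a deliberate typo to show what happens with an unknown name.
wanted = ["Name", "Age", "Wegiht"]

# Check up front which requested columns do not exist.
absent = [c for c in wanted if c not in demo.columns]

# Constructor-style selection: the unknown column is created and filled with NaN.
via_constructor = pd.DataFrame(demo, columns=wanted)
all_nan = bool(via_constructor["Wegiht"].isnull().all())

# Bracket-style selection: the unknown column raises KeyError.
try:
    demo[wanted]
    raised = False
except KeyError:
    raised = True

print("absent:", absent)
print("NaN-filled column created:", all_nan)
print("KeyError from demo[wanted]:", raised)
```

So it may be worth checking the list against data.columns first, or using data[columns], so that a typo in one of the 52 names is caught early.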