Describing the statistics of the data
- Prepare the data
Prepare the data file "Training_Master.csv": place Training_Master.csv in the /course/DataAnalyze/data directory on the local Linux file system.
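Once the file is in place, it can be read into a DataFrame. The following is a minimal sketch, not the course's own loading code: the helper name `load_master` is hypothetical, and `encoding='gbk'` is an assumption (the file is expected to contain Chinese text); switch to `'utf-8'` if decoding fails.

```python
import os
import pandas as pd

# Path taken from the text above; adjust it if your layout differs.
DATA_PATH = '/course/DataAnalyze/data/Training_Master.csv'


def load_master(path=DATA_PATH, encoding='gbk'):
    """Read the master table into a DataFrame.

    `load_master` is a hypothetical helper, and encoding='gbk' is an
    assumption, not something stated in the text.
    """
    return pd.read_csv(path, encoding=encoding)


# Only load when the file is actually present, so the sketch also runs
# on machines that do not have the course data.
if os.path.exists(DATA_PATH):
    master = load_master()
    print(master.shape)  # (number of rows, number of columns)
```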
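- Descriptive statistics
The describe function computes, in a single call, the non-null count, mean, standard deviation, minimum, quartiles, and maximum of every numeric feature in a DataFrame. The code and its output are shown in Code 42.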
In[4]:

    print('Descriptive statistics of the P2P online loan master table:\n', master.describe())

Out[4]:

    Descriptive statistics of the P2P online loan master table:
                    Idx    UserInfo_1    UserInfo_3  WeblogInfo_1  WeblogInfo_2  \
    count  30000.000000  29994.000000  29993.000000    970.000000  28342.000000
    mean   46318.673267      3.219911      4.694329      2.201031      0.131466
    std    26640.397805      1.827684      1.321458      7.831679      0.358486
    min        3.000000      0.000000      0.000000      1.000000      0.000000
    25%    22924.250000      1.000000      4.000000      1.000000      0.000000
    50%    46849.500000      3.000000      5.000000      1.000000      0.000000
    75%    69447.250000      5.000000      5.000000      1.000000      0.000000
    max    91703.000000      7.000000      7.000000    133.000000      4.000000

           WeblogInfo_3  WeblogInfo_4  WeblogInfo_5  WeblogInfo_6  WeblogInfo_7  \
    count    970.000000  28349.000000  28349.000000  28349.000000  30000.000000
    mean       1.308247      3.025962      1.816960      2.948711     10.632800
    std        7.866457      3.772421      1.701177      3.770300     16.097588
    min        0.000000      1.000000      1.000000      1.000000      0.000000
    25%        0.000000      1.000000      1.000000      1.000000      2.000000
    50%        0.000000      2.000000      1.000000      2.000000      6.000000
    75%        1.000000      3.000000      2.000000      3.000000     13.000000
    max      133.000000    165.000000     73.000000    165.000000    722.000000

           ...  SocialNetwork_9  SocialNetwork_10  SocialNetwork_11  \
    count  ...     30000.000000      30000.000000      30000.000000
    mean   ...        35.516167         75.211233         -0.999267
    std    ...       135.954587        742.978305          0.052911
    min    ...        -1.000000         -1.000000         -1.000000
    25%    ...        -1.000000         -1.000000         -1.000000
    50%    ...        -1.000000         -1.000000         -1.000000
    75%    ...        -1.000000         -1.000000         -1.000000
    max    ...      3242.000000      71253.000000          6.000000

           SocialNetwork_12  SocialNetwork_13  SocialNetwork_14  SocialNetwork_15  \
    count      30000.000000      30000.000000      30000.000000      30000.000000
    mean          -0.745033          0.221167          0.062033          0.027967
    std            0.441473          0.420545          0.242598          0.164880
    min           -1.000000          0.000000          0.000000          0.000000
    25%           -1.000000          0.000000          0.000000          0.000000
    50%           -1.000000          0.000000          0.000000          0.000000
    75%            0.000000          0.000000          0.000000          0.000000
    max            1.000000          2.000000          3.000000          1.000000

           SocialNetwork_16  SocialNetwork_17        target
    count      30000.000000      30000.000000  30000.000000
    mean           0.016633          0.253467      0.073267
    std            0.127895          0.437296      0.260578
    min            0.000000          0.000000      0.000000
    25%            0.000000          0.000000      0.000000
    50%            0.000000          0.000000      0.000000
    75%            0.000000          1.000000      0.000000
    max            1.000000          3.000000      1.000000
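One way to read this output is through the count row: because describe counts only non-null values, a count below the number of rows (for example 970 against 30000) points to missing values in that feature. The sketch below illustrates this on a small made-up DataFrame; `toy` and its values are hypothetical stand-ins for `master`, not real data from Training_Master.csv.

```python
import numpy as np
import pandas as pd

# Toy stand-in for `master`; the values (and the 'city' column) are made up.
toy = pd.DataFrame({
    'UserInfo_1': [1.0, 3.0, np.nan, 5.0],
    'WeblogInfo_1': [np.nan, np.nan, 1.0, 133.0],
    'city': ['a', 'b', 'b', None],  # non-numeric, skipped by describe() by default
    'target': [0, 0, 1, 0],
})

# Rows of the result: count, mean, std, min, 25%, 50%, 75%, max.
stats = toy.describe()
print(stats)

# count is the number of non-null values, so subtracting it from the
# row count gives the number of missing values per numeric column.
missing = len(toy) - stats.loc['count']
print(missing)
```

Sorting `missing` in descending order is a quick way to shortlist features that may need imputation or removal before modeling.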