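All of the datasets used in the exercises below ship with the seaborn library and are fetched over the network. If the download is slow or fails, search GitHub for seaborn-data, download the data, and pass the local path when loading it.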

    from matplotlib import pyplot as plt
    import seaborn as sns
    import numpy as np

Analysis of survival in the Titanic disaster

    # data_home points at the author's local copy of seaborn-data; adjust it to your own path
    data = sns.load_dataset("titanic", data_home='/Volumes/Code/notebooks/seaborn-data')
    data.head()

![3accc40d2986d6ec2ad4b453eede9d8e.png](https://i-blog.csdnimg.cn/blog_migrate/3786408f3d6e0a312f25129d05b45e9a.jpeg)
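
If the data was downloaded by hand, it can also be read straight from disk with pandas. The following is a minimal sketch, assuming the downloaded folder contains one CSV file per dataset (for example `titanic.csv`); the helper name `load_local_dataset` is hypothetical and not part of seaborn. Note that `sns.load_dataset` may apply some dataset-specific dtype adjustments that a plain `read_csv` does not, so column dtypes can differ slightly.

```python
from pathlib import Path

import pandas as pd


def load_local_dataset(name, data_dir):
    """Read <data_dir>/<name>.csv from a manually downloaded copy of seaborn-data.

    Hypothetical helper for illustration; not a seaborn API.
    """
    csv_path = Path(data_dir) / f"{name}.csv"
    if not csv_path.exists():
        raise FileNotFoundError(f"{csv_path} not found; download it from seaborn-data first")
    return pd.read_csv(csv_path)
```

Calling `load_local_dataset("titanic", "/Volumes/Code/notebooks/seaborn-data")` would then return the same table as a plain DataFrame.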
Share of survivors and victims in each passenger class

    pclasses = []
    surviveds = [[], []]
    # Group by passenger class, then count deaths (0) and survivors (1) in each group.
    # The column name is passed as a plain string: with by=['class'], recent pandas
    # versions yield 1-tuples as group keys, which breaks the bar positions below.
    for pclass, items in data.groupby('class', observed=True):
        pclasses.append(pclass)
        count0 = items[items['survived'] == 0]['survived'].count()
        count1 = items[items['survived'] == 1]['survived'].count()
        surviveds[0].append(count0)
        surviveds[1].append(count1)
    # Draw stacked bars: deaths at the bottom, survivors on top
    plt.bar(pclasses, surviveds[0], color='r', width=0.3)
    plt.bar(pclasses, surviveds[1], bottom=surviveds[0], color='g', width=0.3)
    # Label each segment with its percentage of the class total
    for i, pclass in enumerate(pclasses):
        total = surviveds[0][i] + surviveds[1][i]
        plt.text(pclass, surviveds[0][i] // 2, '%.2f%%' % ((surviveds[0][i] / total) * 100), ha='center')
        plt.text(pclass, surviveds[0][i] + surviveds[1][i] // 2, '%.2f%%' % ((surviveds[1][i] / total) * 100), ha='center')
    plt.xticks(pclasses, pclasses)
    plt.ylim([0, 600])
    plt.legend(['die', 'survive'], loc='upper right')
    plt.grid(axis='y', color='gray', linestyle=':', linewidth=2)
    plt.show()

![7467923111a2151f892f7d320c0b5505.png](https://i-blog.csdnimg.cn/blog_migrate/718d345246493225742c7cd25ca5d2ee.jpeg)
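The figure shows that third class, the lowest class, had the most deaths, both in absolute numbers and as a share of the class.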
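
The percentages printed on the bars can be cross-checked without an explicit loop. The sketch below is one possible way to do it, using `pd.crosstab` with `normalize="index"` so that each row (class) sums to 100%. The function name `survival_share_by_class` is made up for this example, and the small `toy` frame is illustrative only; it does not contain the real Titanic numbers.

```python
import pandas as pd


def survival_share_by_class(df):
    """Return, for each class, the percentage of passengers who died and who survived."""
    # normalize="index" divides every row by its own total, giving per-class shares
    share = pd.crosstab(df["class"], df["survived"], normalize="index") * 100
    return share.rename(columns={0: "died", 1: "survived"}).round(2)


# Toy data for illustration only; these are not the real Titanic counts.
toy = pd.DataFrame({
    "class": ["First", "First", "Third", "Third", "Third", "Third"],
    "survived": [1, 0, 0, 0, 0, 1],
})
print(survival_share_by_class(toy))
```

Passing the `data` frame loaded earlier instead of `toy` should give the same percentages as the labels in the chart.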