These Ten Exercise Sets Teach You How to Do Data Analysis with Pandas (07)


Exercise 7 - Visualization

Exploring the Titanic disaster data

Step 1 Import the necessary libraries

Run the following code

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

%matplotlib inline

Step 2 Import the data from the following path

Run the following code

path7 = '…/input/pandas_exercise/pandas_exercise/exercise_data/train.csv'  # train.csv

Step 3 Name the DataFrame titanic

Run the following code

titanic = pd.read_csv(path7)
titanic.head()
   PassengerId  Survived  Pclass  Name                                             Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
0  1            0         3       Braund, Mr. Owen Harris                          male    22.0  1      0      A/5 21171         7.2500   NaN    S
1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th…  female  38.0  1      0      PC 17599          71.2833  C85    C
2  3            1         3       Heikkinen, Miss. Laina                           female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)     female  35.0  1      0      113803            53.1000  C123   S
4  5            0         3       Allen, Mr. William Henry                         male    35.0  0      0      373450            8.0500   NaN    S
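Step 4 Set PassengerId as the index

Run the following code

# set_index returns a new DataFrame; titanic itself is unchanged unless the result is assigned back
titanic.set_index('PassengerId').head()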
             Survived  Pclass  Name                                             Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
PassengerId
1            0         3       Braund, Mr. Owen Harris                          male    22.0  1      0      A/5 21171         7.2500   NaN    S
2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th…  female  38.0  1      0      PC 17599          71.2833  C85    C
3            1         3       Heikkinen, Miss. Laina                           female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)     female  35.0  1      0      113803            53.1000  C123   S
5            0         3       Allen, Mr. William Henry                         male    35.0  0      0      373450            8.0500   NaN    S
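
Because the call in Step 4 is not assigned to anything, it only displays an indexed copy. The sketch below illustrates that behavior on a small hand-built frame; the name `toy` and its three rows are stand-ins chosen for illustration, not the real file.

```python
import pandas as pd

# Toy stand-in for the Titanic data (three rows typed in by hand for illustration)
toy = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Survived": [0, 1, 1],
    "Sex": ["male", "female", "female"],
})

# set_index returns a new DataFrame; the original keeps its default RangeIndex
indexed = toy.set_index("PassengerId")

print(toy.index.name)      # None
print(indexed.index.name)  # PassengerId

# Rows can now be looked up by passenger id through .loc
print(indexed.loc[2, "Sex"])  # female
```

To keep the new index for later steps, the result would have to be assigned back, for example `titanic = titanic.set_index('PassengerId')`.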
Step 5 Draw a pie chart showing the proportion of male and female passengers

Run the following code

# sum the instances of males and females
males = (titanic['Sex'] == 'male').sum()
females = (titanic['Sex'] == 'female').sum()

# put them into a list called proportions
proportions = [males, females]

# create a pie chart
plt.pie(
    # using proportions
    proportions,

    # with the labels being the two sexes
    labels = ['Males', 'Females'],

    # with no shadows
    shadow = False,

    # with colors
    colors = ['blue','red'],

    # with one slice exploded out
    explode = (0.15 , 0),

    # with the start angle at 90 degrees
    startangle = 90,

    # with the percentages shown to one decimal place
    autopct = '%1.1f%%'
)

# use equal axis scaling so the pie is drawn as a circle
plt.axis('equal')

# set the title
plt.title("Sex Proportion")

# view the plot
plt.tight_layout()
plt.show()
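
The two counts passed to the pie chart can also be obtained in a single call with `value_counts`. The following is a minimal sketch on a hand-made `Sex` column (the four values are illustrative, not taken from the file); the `Agg` backend line is only there so the sketch runs without a display and is not needed in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch only
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative stand-in for titanic['Sex']
sex = pd.Series(["male", "female", "male", "male"], name="Sex")

# Absolute counts per category, sorted from most to least frequent
counts = sex.value_counts()

# The same counts expressed as fractions of the total
shares = sex.value_counts(normalize=True)

print(counts["male"], counts["female"])  # 3 1
print(shares["male"])                    # 0.75

# Labels come from the index, so they always match the order of the counts
plt.pie(counts.to_numpy(), labels=list(counts.index), autopct="%1.1f%%", startangle=90)
plt.axis("equal")
plt.close()
```

Taking the labels from `counts.index` avoids pairing a hand-typed label with the wrong slice if the category order ever changes.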

Step 6 Draw a scatter plot of Fare against passenger age, colored by sex

Run the following code

# create the plot using seaborn's lmplot, without a regression line
lm = sns.lmplot(x = 'Age', y = 'Fare', data = titanic, hue = 'Sex', fit_reg=False)

# set title
lm.set(title = 'Fare x Age')

# get the axes object and tweak it
axes = lm.axes
axes[0,0].set_ylim(-5,)
axes[0,0].set_xlim(-5,85)
(-5, 85)
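
With `fit_reg=False`, the figure above is a scatter plot with one color per sex. As a rough illustration of what that amounts to, the sketch below draws a comparable plot with plain matplotlib by looping over the groups. This is a swapped-in technique, not what seaborn does internally; the frame `toy` holds only the five rows shown earlier, and the `Agg` backend line is only for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch only
import matplotlib.pyplot as plt
import pandas as pd

# The five rows displayed by titanic.head(), typed in by hand
toy = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
    "Sex": ["male", "female", "female", "female", "male"],
})

fig, ax = plt.subplots()

# One scatter call per sex, so each group gets its own color and legend entry
for sex, group in toy.groupby("Sex"):
    ax.scatter(group["Age"], group["Fare"], label=sex)

ax.set_xlabel("Age")
ax.set_ylabel("Fare")
ax.set_title("Fare x Age")
ax.set_xlim(-5, 85)
ax.set_ylim(bottom=-5)  # only the lower limit is fixed, as in set_ylim(-5,)
ax.legend(title="Sex")
```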

Step 7 How many people survived?

Run the following code

titanic.Survived.sum()
342
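
Summing works here because `Survived` is coded as 0 and 1, so the sum counts the ones. A minimal sketch of the same idea, plus the survival rate and a per-outcome breakdown, on the five `Survived` values shown in the table earlier (not the full data set):

```python
import pandas as pd

# Survived values of the first five rows shown above
survived = pd.Series([0, 1, 1, 1, 0], name="Survived")

# Sum of a 0/1 column = number of ones
n_survived = survived.sum()

# Mean of a 0/1 column = fraction of ones
rate = survived.mean()

# Count of each outcome, indexed by the outcome value
by_outcome = survived.value_counts()

print(n_survived)  # 3
print(rate)        # 0.6
```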
Step 8 Draw a histogram of the ticket fares

Run the following code

# sort the fares from the highest to the lowest value
df = titanic.Fare.sort_values(ascending = False)
df

# create the bin edges using numpy
binsVal = np.arange(0,600,10)
binsVal

# create the plot
plt.hist(df, bins = binsVal)

# set the title and labels
plt.xlabel('Fare')
plt.ylabel('Frequency')
plt.title('Fare Paid Histogram')

# show the plot
plt.show()
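
To see what the bin edges do without drawing anything, the counts can be computed directly with `np.histogram`, which bins values the same way. The sketch below uses only the five fares visible in the table earlier; the names `fares` and `bins_val` are illustrative.

```python
import numpy as np

# The five fares shown by titanic.head()
fares = np.array([7.25, 71.2833, 7.925, 53.1, 8.05])

# Same edges as in Step 8: 0, 10, ..., 590 (the stop value 600 is excluded)
bins_val = np.arange(0, 600, 10)

# counts[i] is the number of fares in [edges[i], edges[i + 1])
counts, edges = np.histogram(fares, bins=bins_val)

print(len(bins_val), len(counts))       # 60 59
print(counts[0], counts[5], counts[7])  # 3 1 1
```

Note that 60 edges give 59 bins, and any value above the last edge (590 here) would be left out of the counts, so the edges should cover the largest fare.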
