Data Processing and Visualisation with Python
Python Exercise 27
The Data
We will be working with the famous Titanic data set for these exercises. Later, in the Machine Learning course (if we are able to have that course), we will revisit this data and use it to predict the survival of passengers. For now, we’ll just focus on visualizing the data with seaborn.
Import numpy, pandas, matplotlib.pyplot and seaborn for later use
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Set the style of seaborn to whitegrid
sns.set_style('whitegrid')
Load dataset titanic
If the connection drops while the dataset is downloading, simply try again. Be patient! It is not your fault.
# titanic = sns.load_dataset('titanic')
titanic = pd.read_csv("titanic.csv",index_col=0)
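The advice to try again when the connection drops can also be automated. The following is only a minimal sketch: load_with_retry is a hypothetical helper (not part of seaborn or pandas), and the number of attempts and the delay are arbitrary assumptions.

```python
import time


def load_with_retry(loader, attempts=3, delay=1.0):
    """Call loader() until it succeeds or the attempts run out.

    loader is any zero-argument callable that returns the data.
    This helper is a hypothetical illustration, not a library function.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return loader()
        except OSError as err:
            # Network failures such as ConnectionError and
            # urllib.error.URLError are subclasses of OSError.
            last_error = err
            print(f"Attempt {attempt} failed: {err}")
            if attempt < attempts:
                time.sleep(delay)
    # Every attempt failed, so surface the last error to the caller.
    raise last_error


# Possible usage in this notebook (needs seaborn and a network connection):
# titanic = load_with_retry(lambda: sns.load_dataset('titanic'))
```

If every attempt fails, reading the local titanic.csv file as in the cell above remains the fallback.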
Check the head of the dataset
titanic.head()
Plots
Recreate the plots below using the titanic dataset. There are very few instructions, since most of the plots can be done with just one or two lines of code and an instruction or a hint would basically give away the solution. Pay careful attention to the x and y labels for hints.
Do not worry about palette, cmap or color. Choose any you like if you cannot match the ones in the output.
sns.jointplot(x = 'fare',y = 'age',data=titanic);
sns.histplot(titanic['fare'],kde=False,bins=30);
# Bar width: bins
sns.boxplot(x='class',y='age',data=titanic,palette = 'rainbow');
# Order of the categories
sns.violinplot(x = 'class',y='age',data=titanic,palette = 'rainbow');
sns.countplot(x='sex',data=titanic);
sns.heatmap(titanic.corr(numeric_only=True),cmap='coolwarm');
plt.title('titanic.corr()');
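A correlation matrix is only defined for numeric columns, while the titanic data also contains text columns such as sex and class. As far as we know, pandas 2.0 and later no longer drops such columns silently, so it helps to request numeric columns explicitly. The sketch below uses a small made-up frame (the values are illustrative assumptions, not the real data) to show what numeric_only=True keeps.

```python
import pandas as pd

# Hypothetical miniature frame that mimics the mix of column types
# in the titanic data; the values are made up for illustration.
sample = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0],
    "fare": [7.25, 71.28, 7.93, 53.10],
    "sex": ["male", "female", "female", "female"],
})

# numeric_only=True restricts the correlation to numeric columns,
# so the text column 'sex' is left out of the result.
numeric_corr = sample.corr(numeric_only=True)
print(numeric_corr.columns.tolist())  # ['age', 'fare']
```

The resulting square matrix is what sns.heatmap receives, which is why only numeric column names appear on the heatmap axes.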
g = sns.FacetGrid(titanic,col='sex')
g.map(plt.hist,'age');