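1 What is Seaborn
Seaborn is a Python data visualization package built on matplotlib. It provides a high-level interface that makes it easy to draw attractive statistical graphics.
Seaborn wraps matplotlib in a higher-level API, which makes plotting easier. In most cases seaborn alone produces attractive figures, while dropping down to matplotlib allows more customized ones. Seaborn should be viewed as a complement to matplotlib, not a replacement. It also works well with numpy and pandas data structures and with the statistical routines in scipy and statsmodels.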
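As a rough illustration of the "complement, not replacement" point, the sketch below lets seaborn draw a bar plot from a pandas DataFrame and then adjusts the same Axes with plain matplotlib calls. The data, column names, and labels are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data, only for illustration
df = pd.DataFrame({"group": ["a", "b", "c"], "value": [3, 7, 5]})

fig, ax = plt.subplots(figsize=(4, 3))

# seaborn draws onto the matplotlib Axes it is given and returns that Axes
returned_ax = sns.barplot(x="group", y="value", data=df, ax=ax)

# From here on, ordinary matplotlib methods customize the seaborn plot
ax.set_title("Seaborn plot, matplotlib styling")
ax.set_ylabel("value (arbitrary units)")
ax.axhline(5, color="gray", linestyle="--")

# plt.show()  # uncomment in an interactive session (and drop the Agg line)
```

Because the object returned by seaborn is a regular matplotlib Axes, anything matplotlib can do to an Axes remains available after the seaborn call.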
2 Bar plot
API:
seaborn.barplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, estimator=<function mean>, ci=95, n_boot=1000, units=None, orient=None, color=None, palette=None, saturation=0.75, errcolor='.26', errwidth=None, capsize=None, dodge=True, ax=None, **kwargs)
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Build a small DataFrame with one y value per x position
x = np.arange(8)
y = np.array([1, 5, 3, 6, 8, 9, 4, 7])
df = pd.DataFrame({"X-axis": x, "Y-axis": y})
print(df)

# Pass x and y by keyword; recent seaborn releases no longer accept them positionally
sns.barplot(x="X-axis", y="Y-axis", palette="muted", data=df)
plt.xticks(rotation=45)
plt.show()
Result:
The palette argument changes the color scheme, for example palette="RdBu_r".
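The barplot signature listed earlier includes an estimator parameter whose default is the mean. A minimal sketch, using made-up data with several observations per category, of how swapping in np.median changes the bar heights (the column names here are hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: three observations for each of two categories
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "score": [1, 2, 9, 4, 6, 8],
})

fig, (ax_mean, ax_median) = plt.subplots(1, 2, figsize=(8, 3))

# Default estimator: each bar shows the mean of its category
sns.barplot(x="group", y="score", data=df, ax=ax_mean)

# Custom estimator: each bar shows the median instead
sns.barplot(x="group", y="score", data=df, estimator=np.median, ax=ax_median)

mean_heights = [p.get_height() for p in ax_mean.patches]
median_heights = [p.get_height() for p in ax_median.patches]
print("mean bars:  ", mean_heights)
print("median bars:", median_heights)

# plt.show()  # uncomment in an interactive session (and drop the Agg line)
```

The single large value in group A pulls its mean well above its median, which is the kind of difference the estimator choice makes visible.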
2. Correlation heatmap of attributes
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def correlation_map(df, columns, figsize=(15, 10)):
    # Pairwise correlation (pandas default: Pearson) of the selected columns
    correlation = df.loc[:, columns].corr()
    print(correlation)
    fig, ax = plt.subplots(figsize=figsize)
    # annot=True writes each coefficient inside its cell
    sns.heatmap(correlation, annot=True, ax=ax)
    plt.show()


x = np.arange(4)
y = np.array([1, 5, 3, 6])
z = np.array([1, 1, 1, 10])
D = np.array([1, 10, 110, 10])
df = pd.DataFrame({"R": x, "P": y, "F1": z, "D": D})
columns = ["R", "P", "F1", "D"]
correlation_map(df, columns)
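To make the numbers in the heatmap less of a black box, here is a small sketch that recomputes one cell by hand. It assumes the pandas default for DataFrame.corr(), which is the Pearson coefficient, and reuses the R and P columns from the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"R": [0, 1, 2, 3], "P": [1, 5, 3, 6]})

# What the heatmap displays: the pairwise correlation matrix
corr = df.corr()

# The same R-P coefficient from the Pearson definition:
# sum of products of deviations, divided by the root of the
# product of the summed squared deviations
r_dev = df["R"] - df["R"].mean()
p_dev = df["P"] - df["P"].mean()
manual = (r_dev * p_dev).sum() / np.sqrt((r_dev ** 2).sum() * (p_dev ** 2).sum())

print("pandas:", corr.loc["R", "P"])
print("manual:", manual)
```

The matrix is symmetric with ones on the diagonal, so the heatmap's upper and lower triangles carry the same information.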
3. Counting how often each target label occurs and plotting the counts as a bar chart
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

y = np.array([1, 5, 3, 6, 3, 6, 9, 10, 6, 5, 1, 1, 1])
df = pd.DataFrame({"P": y})
# One bar per distinct value of P; the bar height is the number of occurrences
sns.countplot(x="P", data=df)
plt.show()
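The counts that countplot draws can be cross-checked against pandas. The sketch below, using the same label array, compares the bar heights with Series.value_counts(); reading heights from ax.patches is an assumption about how the bars are stored, which holds for the seaborn versions I am aware of.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

y = np.array([1, 5, 3, 6, 3, 6, 9, 10, 6, 5, 1, 1, 1])
df = pd.DataFrame({"P": y})

# Tabular version of the plot: occurrences per distinct label
counts = df["P"].value_counts().sort_index()
print(counts)

fig, ax = plt.subplots()
sns.countplot(x="P", data=df, ax=ax)

# Each bar is a rectangle patch whose height is the count for one label
heights = [int(round(p.get_height())) for p in ax.patches]
print(heights)

# plt.show()  # uncomment in an interactive session (and drop the Agg line)
```

When only the numbers are needed, value_counts() alone is enough; countplot adds the picture.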
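4. Box plot
A box plot (also called a box-and-whisker plot) is a statistical chart that shows how a set of data is spread out. It is named for its box-like shape. Box plots are used in many fields, commonly in quality control, and make it quick to spot outliers.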
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style="ticks")
f, ax = plt.subplots(figsize=(7, 6))
# ax.set_xscale("log")

distance = np.array([10, 1, 2, 1, 3, 2, 1, 3, 4, 10])
method = np.array(["WA", "WB", "WA", "WA", "WA", "WA", "WA", "WA", "WA", "WB"])
planets = pd.DataFrame({"distance": distance, "method": method})

# Plot distance for each method with horizontal boxes.
# whis=(0, 100) stretches the whiskers to the minimum and maximum; it replaces
# the older whis="range" spelling, which newer matplotlib releases no longer accept.
sns.boxplot(x="distance", y="method", data=planets,
            whis=(0, 100), palette="vlag")
plt.show()
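The description of box plots above mentions spotting outliers quickly. With the default whis=1.5 (rather than whiskers stretched to the extremes), a point is drawn as an outlier when it lies more than 1.5 times the interquartile range beyond the box. A minimal sketch of that rule with numpy, applied to the pooled distance values rather than per method, so it is an illustration and not a reproduction of the plot:

```python
import numpy as np

distance = np.array([10, 1, 2, 1, 3, 2, 1, 3, 4, 10])

# Box edges: first and third quartiles
q1, q3 = np.percentile(distance, [25, 75])
iqr = q3 - q1

# Fences: 1.5 * IQR beyond the box on either side
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences would be drawn as individual outlier points
outliers = distance[(distance < lower_fence) | (distance > upper_fence)]

print("Q1, Q3, IQR:", q1, q3, iqr)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

Setting whis to a larger number, or to (0, 100) as in the example above, widens the whiskers so that fewer or no points are flagged.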