Introduction
The goal of this article is to introduce you to the most common categorical plots using Seaborn’s `catplot()` function.
While doing Exploratory or Explanatory data analysis, you will have to choose from a wide range of plot types. Choosing one which depicts the relationships in your data accurately can be tricky.
If you are working with data that involves categorical variables, such as survey responses, categorical plots are your best tools for visualizing and comparing different features of your data. Fortunately, the data visualization library Seaborn wraps several types of categorical plots in a single function: `catplot()`.
The Seaborn library offers several advantages over other plotting libraries:
1. It is easy to use and requires less code.
2. Works really well with `pandas` data structures, which is just what you need as a data scientist.
3. It is built on top of Matplotlib, another vast and deep data visualization library.
BTW, my golden rule for Data Visualization is “Do it in Seaborn if you can do it in Seaborn”.
The SB documentation (I will abbreviate Seaborn as SB from now on) states that the `catplot()` function covers 8 different types of categorical plots. In this guide, I will cover the three most common ones: count plots, bar plots, and box plots.
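As a rough preview of how one function can produce all three plot types, here is a minimal sketch that switches the `kind` argument of `catplot()`. The `survey` DataFrame and its `answer` and `age` columns are hypothetical toy data invented for illustration; they are not the article's sample data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook

import pandas as pd
import seaborn as sns

# Hypothetical toy survey data, invented purely for illustration
survey = pd.DataFrame({
    "answer": ["yes", "no", "yes", "maybe", "yes", "no", "maybe", "yes"],
    "age": [23, 35, 41, 29, 52, 38, 27, 45],
})

# Count plot: number of rows in each category
count_grid = sns.catplot(data=survey, x="answer", kind="count")

# Bar plot: an aggregate (the mean by default) of a numeric column per category
bar_grid = sns.catplot(data=survey, x="answer", y="age", kind="bar")

# Box plot: distribution of a numeric column per category
box_grid = sns.catplot(data=survey, x="answer", y="age", kind="box")
```

Each call returns a `FacetGrid` object rather than a plain Matplotlib axes, which is why the results are stored in names ending in `_grid`.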
Overview
I. Introduction
II. Setup
III. Seaborn Count Plot
    1. Changing the order of categories
IV. Seaborn Bar Plot
    1. Confidence intervals in a bar plot
    2. Changing the orientation in bar plots
V. Seaborn Box Plot
    1. Overall understanding
    2. Working with outliers
    3. Working with whiskers
VI. Conclusion
You can get the sample data and the notebook for this article on this GitHub repo.
Setup
If you do not have SB installed yet, you can install it with `pip`, along with the other libraries we will be using:
pip install numpy pandas seaborn matplotlib
# Load necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
# Render sharper figures and avoid blurry images (IPython/Jupyter only)
%config InlineBackend.figure_format = 'retina'
# Use the 'notebook' plotting context (Seaborn's default scaling)
sns.set_context('notebook')
# Ignore warnings
import warnings
warnings.filterwarnings('ignore')
# Enable multiple cell outputs
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'
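To confirm that the installation worked, one option is to print the version of each library. This is only a small sanity-check sketch; the `versions` dictionary is a name introduced here for illustration, and the version strings you see will depend on your environment.

```python
import matplotlib
import numpy as np
import pandas as pd
import seaborn as sns

# Collect the installed version of each library used in this guide
versions = {
    "numpy": np.__version__,
    "pandas": pd.__version__,
    "matplotlib": matplotlib.__version__,
    "seaborn": sns.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```

If any of the imports fails with an `ImportError`, re-running the `pip install` command above for the missing package is usually the first thing to try.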
If you are wondering why we don’t alias Seaborn as `sb` like a normal person, that's because the initials `sns` were named after a fictional character Samuel Norm