Part 2 of the Guide to Statistics Using NumPy series
Introduction:
When we first look at a dataset, we want to be able to quickly understand certain things about it:
- Do some values occur more often than others?
- What is the range of the dataset (i.e., the min and the max values)?
- Are there a lot of outliers?
We can visualize this information using a chart called a histogram.
For instance, suppose that we have the following dataset:
d = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 5]
A simple histogram might show us how many 1’s, 2’s, 3’s, etc. we have in this dataset.
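As a quick check before drawing anything, here is a minimal sketch of how those counts, along with the min and max, could be computed with NumPy. It assumes the 15-element dataset `d` shown above; the variable names `values` and `counts` are just illustrative.

```python
import numpy as np

# The example dataset from the text (15 values)
d = np.array([1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 5])

# np.unique with return_counts=True gives each distinct value
# and how many times it occurs
values, counts = np.unique(d, return_counts=True)
for value, count in zip(values, counts):
    print(f"{value} occurs {count} time(s)")

# The range of the dataset
print("min:", d.min(), "max:", d.max())
```

For this dataset, 2 is the most frequent value and 5 the least frequent, which is what the tallest and shortest bars of a histogram would show.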
When graphed, our histogram would look like this:
![Histogram of the dataset d](https://i-blog.csdnimg.cn/blog_migrate/32f5de37f095eb641e0c4e5df72b3577.png)
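The bar heights in a chart like this can be computed with `np.histogram`. The sketch below is one possible way to do it, not necessarily how the figure was produced: the half-integer bin edges are an assumption chosen so that each bin holds exactly one integer value, and the text rendering stands in for a plotting library.

```python
import numpy as np

# The example dataset from the text
d = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 5]

# Assumed bin edges at half-integers (0.5, 1.5, ..., 5.5),
# so each of the 5 bins is centered on one integer value
edges = np.arange(0.5, 6.5, 1.0)
heights, _ = np.histogram(d, bins=edges)

# Rough text rendering of the histogram, one row per bin
for center, height in zip(range(1, 6), heights):
    print(f"{center} | {'#' * int(height)}")
```

Passing explicit edges avoids relying on the default of 10 equal-width bins, which would leave empty gaps between the integer values.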