Data visualization in Python: How to visualize the Central Limit Theorem in Python

Original article: https://www.freecodecamp.org/news/how-to-visualize-the-central-limit-theorem-in-python-b619f5b00168/

by Rohan Joseph

How to visualize the Central Limit Theorem in Python

The Central Limit Theorem states that the sampling distribution of the sample means approaches a normal distribution as the sample size gets larger.

The distribution of the sample means will converge to a normal distribution regardless of the shape of the population, provided the population has a finite variance. That is, the population can be positively or negatively skewed, normal or non-normal.

The Central Limit Theorem is closely related to the Law of Large Numbers, which states that:

as a sample size grows, the sample mean gets closer to the population mean.

So, how are these two related?

The CLT states that as the sample size tends to infinity, the distribution of the sample means takes on a bell shape (a normal distribution). The center of this distribution lies very close to the population mean, which is essentially what the Law of Large Numbers says.

Let’s illustrate this in Python with the classic die roll. Before we simulate, let’s calculate the expected value from a die roll.

An expected value is the average result of an experiment after a large number of trials.

For an experiment with 6 outcomes x_1, ..., x_6 and associated probabilities p_1, ..., p_6, the general formula for the expected value is E(X) = x_1·p_1 + x_2·p_2 + ... + x_6·p_6.

So, now let’s calculate the expected value of a die roll. Each of the six faces has probability 1/6, so E(X) = (1 + 2 + 3 + 4 + 5 + 6) × 1/6 = 3.5.
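As a quick check, the expected value of a fair six-sided die can be computed in a few lines of Python. This is a minimal sketch; the variable names are illustrative and are not taken from the original notebook. Exact fractions are used so that no rounding error creeps in.

```python
from fractions import Fraction

# Outcomes of a fair six-sided die, each with probability 1/6.
outcomes = range(1, 7)
probability = Fraction(1, 6)

# Expected value: sum of (outcome * probability) over all outcomes.
expected_value = sum(x * probability for x in outcomes)

print(float(expected_value))  # 3.5
```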

Even though it is impossible to get a 3.5 on a single roll of a die, the average of the rolls gets close to 3.5 as the number of rolls increases.

1. To visualize this in Python, first import the necessary libraries: numpy, matplotlib, and wand. Make sure ImageMagick is installed so that the plots can be saved as a GIF.

2. Now, create 1000 simulations of 10 die rolls each, and in each simulation, find the average of the die outcomes.

This is what the first 10 simulated averages of the die rolls would look like:
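The simulation step can be sketched as follows. This is one possible version, not necessarily the code from the original notebook: it assumes NumPy's `default_rng` generator with a fixed seed (chosen here only for reproducibility), and the names `n_simulations`, `rolls_per_simulation`, and `sample_means` are illustrative.

```python
import numpy as np

# Assumption: a seeded generator so that reruns give the same numbers.
rng = np.random.default_rng(seed=0)

n_simulations = 1000        # number of simulated experiments
rolls_per_simulation = 10   # die rolls in each experiment

# Each row holds one simulation of 10 rolls; integers(1, 7) draws from 1..6.
rolls = rng.integers(1, 7, size=(n_simulations, rolls_per_simulation))

# Average of the 10 rolls in each simulation: one sample mean per row.
sample_means = rolls.mean(axis=1)

# Preview the first 10 simulated averages.
print(sample_means[:10])
```

Drawing all rolls as one 2-D array and averaging along `axis=1` avoids an explicit Python loop, but a loop that appends one average per simulation would give an equivalent result.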

3. Write a function to plot a histogram of the values generated above. Using matplotlib's animation module, we can also visualize how the histogram gradually comes to resemble a normal distribution.
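A histogram animation along these lines could look like the sketch below. It is written under a few assumptions that do not come from the original article: the function name `draw_histogram`, the fixed bin edges, the frame step of 10 simulations, and the seeded data are all illustrative choices. `FuncAnimation` calls the drawing function once per frame with the next value from `frames`.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation

# Regenerate the simulated averages so this block runs on its own.
rng = np.random.default_rng(seed=0)
sample_means = rng.integers(1, 7, size=(1000, 10)).mean(axis=1)

fig, ax = plt.subplots()
frame_sizes = list(range(10, 1001, 10))  # 10, 20, ..., 1000 simulations
bins = np.linspace(1, 6, 26)             # 25 fixed bins between 1 and 6


def draw_histogram(n_samples):
    """Redraw the histogram using only the first n_samples averages."""
    ax.clear()
    ax.hist(sample_means[:n_samples], bins=bins)
    ax.set_xlim(1, 6)
    ax.set_title(f"Average of 10 die rolls, first {n_samples} simulations")
    ax.set_xlabel("Average from die roll")
    ax.set_ylabel("Frequency")


# Keep a reference to the animation object so it is not garbage collected.
anim = FuncAnimation(fig, draw_histogram, frames=frame_sizes,
                     interval=50, repeat=False)
```

Calling `plt.show()` afterwards displays the animation in an interactive session. Fixing the bin edges keeps the bars from jumping around between frames, which makes the gradual emergence of the bell shape easier to see.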

Output: an animated histogram of the simulated averages (the image is not reproduced here).

4. You can save the animation as a GIF using the following piece of code.
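Saving works through the animation's `save` method. The sketch below swaps in matplotlib's `PillowWriter` so that it runs without ImageMagick; with ImageMagick installed, as the article suggests, passing `writer="imagemagick"` to `save` is the corresponding option. The output file name, frame step, and frame rate are assumptions made for illustration.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display is needed to save a file
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation, PillowWriter

rng = np.random.default_rng(seed=0)
sample_means = rng.integers(1, 7, size=(1000, 10)).mean(axis=1)

fig, ax = plt.subplots()
bins = np.linspace(1, 6, 26)


def draw_histogram(n_samples):
    """Redraw the histogram using only the first n_samples averages."""
    ax.clear()
    ax.hist(sample_means[:n_samples], bins=bins)
    ax.set_xlim(1, 6)
    ax.set_title(f"First {n_samples} simulations")


# Fewer frames than a live animation would use, to keep the file small.
anim = FuncAnimation(fig, draw_histogram, frames=range(50, 1001, 50))

# Hypothetical output path; change it to wherever the GIF should go.
output_path = Path("clt_animation.gif")
anim.save(str(output_path), writer=PillowWriter(fps=10))
plt.close(fig)
```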

From this experiment, we can observe:

With a small number of samples, the histogram is scattered and shows no definite pattern.
However, as the number of samples increases, the sampling distribution starts to resemble a normal distribution. This is the Central Limit Theorem.
Also, as the number of samples increases, the most frequent value of “average from die roll” settles around 3.5, which is the expected value of a die roll. This demonstrates the Law of Large Numbers.
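These observations can also be checked numerically. The sketch below is an addition to the article's experiment rather than part of it: it varies the number of rolls per simulation (2, 10, and 50, chosen arbitrarily) and compares the spread of the simulated averages with the value the Central Limit Theorem predicts, namely the population standard deviation divided by the square root of the sample size. For a fair die the population variance is 35/12.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

population_mean = 3.5
population_std = np.sqrt(35 / 12)  # standard deviation of one fair die roll

results = {}
for rolls_per_simulation in (2, 10, 50):
    # 10,000 simulated averages for this sample size.
    means = rng.integers(1, 7, size=(10_000, rolls_per_simulation)).mean(axis=1)
    predicted_std = population_std / np.sqrt(rolls_per_simulation)
    results[rolls_per_simulation] = (means.mean(), means.std(), predicted_std)
    print(f"n={rolls_per_simulation:>2}  mean={means.mean():.3f}  "
          f"std={means.std():.3f}  predicted std={predicted_std:.3f}")
```

The mean of the averages should stay near 3.5 for every sample size, while their spread should shrink as the sample size grows.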

So, how is the Central Limit Theorem used?

It enables us to test the hypothesis of whether our sample represents a population distinct from the known population. We can take the mean of a sample and compare it with the sampling distribution to estimate the probability that the sample comes from the known population.
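One common way to make this comparison is a two-sided z-test, sketched below. The article does not name a specific test, so this is an assumed illustration: the function `z_test_p_value` and the example numbers (50 rolls of a suspect die averaging 4.1) are hypothetical. The test relies on the CLT to treat the sample mean as approximately normal with standard deviation equal to the population standard deviation divided by the square root of the sample size.

```python
import math


def z_test_p_value(sample_mean, population_mean, population_std, sample_size):
    """Return the z-score and two-sided p-value for a sample mean."""
    standard_error = population_std / math.sqrt(sample_size)
    z = (sample_mean - population_mean) / standard_error
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical example: 50 rolls of a suspect die average 4.1.
fair_die_std = math.sqrt(35 / 12)
z, p_value = z_test_p_value(4.1, 3.5, fair_die_std, 50)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests that an average this far from 3.5 would be unlikely if the die were fair.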

Connect on LinkedIn, and check out GitHub (below) for the complete notebook.

rohanjoseph93/Central-Limit-Theorem: Visualize CLT in Python (github.com)

Translated from: https://www.freecodecamp.org/news/how-to-visualize-the-central-limit-theorem-in-python-b619f5b00168/
