The 68-95-99 Rule: Normal Distribution Explained in Plain English

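This article introduces the normal distribution, uses the 68-95-99 rule to describe how the data is spread, and works through example calculations. Normal distributions are often used to describe traits such as height and weight, and knowing the rule helps with understanding data distributions and computing probabilities.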

Meet Mason. He's an average American 40-year-old: 5 foot 10 inches tall and earning $47,000 per year before tax.

How often would you expect to meet someone who earns 10x as much as Mason?

And now, how often would you expect to meet someone who is 10x as tall as Mason?

Your answers to the two questions above are different because the two quantities are distributed differently. In some cases, a value 10x above average is common, while in others it's not common at all.

So what are normal distributions?

Today, we're interested in normal distributions. They are represented by a bell curve: a peak in the middle that tapers off towards each edge. Many things approximately follow this distribution, such as height, weight, and IQ.

This distribution is exciting because it's symmetric – which makes it easy to work with. You can reduce lots of complicated mathematics down to a few rules of thumb, because you don't need to worry about weird edge cases.

For example, the peak always divides the distribution in half. There's equal mass before and after the peak.

Another important property is that we don't need a lot of information to describe a normal distribution.

Indeed, we only need two things:

  1. The mean. Most people just call this "the average." It's what you get if you add up the values of all your observations, then divide that sum by the number of observations. For example, the mean of the three numbers 1, 2, and 3 is (1 + 2 + 3) / 3 = 2.

  2. The standard deviation. This tells you how spread out the observations are around the mean, and therefore how rare a given observation would be. Most observations fall within one standard deviation of the mean. Fewer fall between one and two standard deviations from the mean, and even fewer are three or more standard deviations away.

Together, the mean and the standard deviation tell you everything you need to know about a normal distribution.
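
As a rough illustration of these two numbers, the sketch below computes them with Python's standard `statistics` module. The `heights` sample is made up for this example and is not data from the article.

```python
import statistics

# Hypothetical sample of heights in inches (invented for illustration).
heights = [66, 68, 70, 70, 72, 74]

# Mean: add up all observations, then divide by how many there are.
mean = statistics.mean(heights)

# Population standard deviation: typical distance of an observation from the mean.
std_dev = statistics.pstdev(heights)

print("mean of 1, 2, 3:", statistics.mean([1, 2, 3]))
print("mean height:", mean)
print("standard deviation:", round(std_dev, 2))
```

`pstdev` treats the list as the whole population; `statistics.stdev` would be the usual choice if the list were only a sample drawn from a larger group.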

The 68-95-99 rule

The 68-95-99 rule is based on the mean and standard deviation. It says:

68% of the population is within 1 standard deviation of the mean.

95% of the population is within 2 standard deviations of the mean.

99.7% of the population is within 3 standard deviations of the mean.
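
The percentages in the rule are rounded. A minimal sketch of where they come from, assuming Python 3.8+ so that `statistics.NormalDist` is available (the function name `share_within` is just an illustrative choice):

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    standard = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Area under the bell curve between -k and +k.
    return standard.cdf(k) - standard.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.2%}")
```

Because every normal distribution is a shifted and stretched copy of the standard one, these shares do not depend on the particular mean or standard deviation.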

How to calculate normal distributions

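To continue our example, the average American male height is 5 feet 10 inches (70 inches), with a standard deviation of 4 inches. This means:

68% of American men are between 66 and 74 inches tall.

95% of American men are between 62 and 78 inches tall.

99.7% of American men are between 58 and 82 inches tall.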
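
The height ranges implied by a mean of 70 inches and a standard deviation of 4 inches can be worked out mechanically. The helper `rule_ranges` below is a hypothetical name used only for this sketch:

```python
def rule_ranges(mean, sd):
    """Map each share in the rule to its (low, high) range around the mean."""
    shares = {1: "68%", 2: "95%", 3: "99.7%"}
    # k standard deviations below and above the mean, for k = 1, 2, 3.
    return {label: (mean - k * sd, mean + k * sd) for k, label in shares.items()}


for label, (low, high) in rule_ranges(mean=70, sd=4).items():
    print(f"{label} of heights fall between {low} and {high} inches")
```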

Now for the fun part: Let's apply what we've just learned.

What's the chance of seeing someone with a height between 5 feet 10 inches and 6 feet 2 inches? (That is, between 70 and 74 inches.)

It's 34%! We use two properties here. First, 68% of heights fall within one standard deviation of the mean, that is, between 66 and 74 inches. Second, the distribution is symmetric, so the chances for 66 to 70 inches and for 70 to 74 inches are both 68/2 = 34%.

Let's try a tougher one. What's the chance of seeing someone with a height between 62 and 66 inches?

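It's (95 - 68)/2 = 13.5%. The 95% band (62 to 78 inches) minus the 68% band (66 to 74 inches) leaves 27% for the two outer strips, 62 to 66 and 74 to 78 inches, and by symmetry each strip holds half of that.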
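
The rule-of-thumb answers of 34% and 13.5% can be compared against the exact areas under the curve. This is a sketch assuming heights are exactly normal with mean 70 and standard deviation 4; `chance_between` is an illustrative helper name:

```python
from statistics import NormalDist

# Assumed model: male heights in inches, mean 70, standard deviation 4.
heights = NormalDist(mu=70, sigma=4)


def chance_between(dist, low, high):
    """Probability that a value drawn from dist lands between low and high."""
    return dist.cdf(high) - dist.cdf(low)


print(f"70 to 74 inches: {chance_between(heights, 70, 74):.2%}")
print(f"62 to 66 inches: {chance_between(heights, 62, 66):.2%}")
```

The exact values differ from the rule of thumb only in the decimals, since 68 and 95 are themselves rounded figures.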

And now your final (and hardest) test: What's the chance of seeing someone with a height greater than 82 inches?

Here, we also use the final property: everything must sum to 100%. So the outer edges (that is, heights below 58 inches and heights above 82 inches) together make up 100% - 99.7% = 0.3%. By symmetry, half of that lies above 82 inches, so the chance is about 0.15%.
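
For the upper tail, a short sketch under the assumption that heights are exactly normal with mean 70 and standard deviation 4. It compares the rule-of-thumb split of the 0.3% with the exact area above 82 inches:

```python
from statistics import NormalDist

# Assumed model: male heights in inches, mean 70, standard deviation 4.
heights = NormalDist(mu=70, sigma=4)

# Rule of thumb: 0.3% lies outside 3 standard deviations, half of it on each side.
rule_of_thumb_tail = (1 - 0.997) / 2

# Exact area above 82 inches: everything sums to 1, so subtract the area below 82.
exact_tail = 1 - heights.cdf(82)

print(f"rule of thumb: {rule_of_thumb_tail:.3%}")
print(f"exact:         {exact_tail:.3%}")
```

The small gap between the two numbers comes from 99.7% being a rounded figure.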

Remember, you can apply this to any normal distribution. Try doing the same for female heights: the mean is 65 inches, and the standard deviation is 3.5 inches.

So, the chance of seeing someone with a height between 65 and 68.5 inches would be: ___.

...

34%! It's exactly the same as our first example, because the range runs from the mean to one standard deviation above it.
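
One way to see why the two examples match is to convert each height into a z-score, the number of standard deviations it sits from its own mean. The helper name `z_score` is an illustrative choice for this sketch:

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


male_z = z_score(74, mean=70, sd=4)        # upper end of the male example
female_z = z_score(68.5, mean=65, sd=3.5)  # upper end of the female example

# Both ranges start at the mean (z = 0), so the chance is the area from 0 to z.
standard = NormalDist()
chance = standard.cdf(male_z) - standard.cdf(0)

print(male_z, female_z)
print(f"{chance:.1%}")
```

Both upper bounds land at a z-score of 1, so both questions ask for the same slice of the bell curve.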

Conclusion

Knowing this rule makes it very easy to calibrate your senses. Since all we need to describe any normal distribution is the mean and standard deviation, this rule holds for every normal distribution in the world!

The challenging part, indeed, is figuring out whether the distribution is normal or not.
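
A rough, informal way to probe this is to measure what share of a dataset actually falls within 1, 2, and 3 standard deviations and compare it with 68, 95, and 99.7. The sketch below uses simulated data only: a normal sample, and an exponential sample as a hypothetical stand-in for skewed data. It is a quick sanity check under those assumptions, not a formal normality test.

```python
import random
import statistics


def share_within_k_sd(samples, k):
    """Fraction of samples within k standard deviations of the sample mean."""
    mean = statistics.fmean(samples)
    sd = statistics.pstdev(samples)
    inside = sum(1 for x in samples if abs(x - mean) <= k * sd)
    return inside / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated data: one bell-shaped sample and one skewed sample.
normal_like = [rng.gauss(70, 4) for _ in range(20_000)]
skewed = [rng.expovariate(1.0) for _ in range(20_000)]

for name, data in (("normal", normal_like), ("skewed", skewed)):
    shares = [round(share_within_k_sd(data, k), 3) for k in (1, 2, 3)]
    print(name, shares)
```

If the first share is far from 0.68, as it is for the skewed sample, the 68-95-99 rule should not be trusted for that data.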

Want to learn more about calibrating your senses and thinking critically? Check out Bayes Theorem: A Framework for Critical Thinking.

Translated from: https://www.freecodecamp.org/news/normal-distribution-explained/
