分散式计算是什么_什么是分散及其措施

最新推荐文章于 2024-04-16 09:56:43 发布

weixin_26729763

最新推荐文章于 2024-04-16 09:56:43 发布

阅读量1.6k

点赞数

文章标签： python leetcode

原文链接：https://medium.com/analytics-vidhya/what-is-dispersion-and-its-measures-888fedddd40a

版权

分散式计算是一种计算模式，它涉及多个独立的计算资源协同工作来处理任务。这种计算方式允许数据和应用程序在不同的节点上分布式处理，提高了效率和容错性。文章介绍了分散式计算的基本概念，并探讨了其在提升系统性能和解决大规模问题中的作用。

摘要由CSDN通过智能技术生成

分散式计算是什么

Dispersion is basically the spread of the data, the extent to which a distribution is stretched or squeezed. Central tendencies are not enough to fully understand the nature of the entire dataset, hence we need dispersion measures. With both central tendencies and dispersion measures together paint a good picture of the dataset.

分散基本上是数据的分散，即分布被拉伸或压缩的程度。中心趋势不足以完全理解整个数据集的性质，因此我们需要分散度量。通过集中趋势和分散度量，可以很好地描绘数据集。

Following are the most common dispersion measures:

以下是最常见的分散措施：

范围： (Range:)

Range is nothing but the difference between maximum value and minimum value. It tells us within what range the whole dataset lies. Range is very easy and straightforward to calculate but at the same time, it is too sensitive to outliers

范围不过是最大值和最小值之间的差异。它告诉我们整个数据集位于什么范围内。范围非常容易直接计算，但同时又对异常值过于敏感

R = max(dataset) -min(dataset)

R = max(数据集)-min(数据集)

方差： (Variance:)

Variance is the measure of spread of data, it is the expectation of squared deviation of a random variable from its mean.

方差是数据传播的量度，是对随机变量与其均值平方差的期望。

Now, let us try to understand by intuition, without going into mathematics too much, why we divide with “n-1” instead of “n”.Ideal way to calculate standard deviation from sample dataset would be

现在，让我们尝试通过直觉来理解，而不必过多地研究数学，为什么我们用“ n-1”而不是“ n”进行除法。从样本数据集计算标准差的理想方法是

But we do not know ‘mu’, hence we use ‘x-bar’ instead. ‘x-bar’ is calculated using only sample dataset, so in real world

但是我们不知道'mu'，因此我们改用'x-bar'。 “ x-bar”仅使用样本数据集进行计算，因此在现实世界中

hence,

因此，

Hence we use “n-1” in the denominator to calculate the unbiased standard deviation from sample.

因此，我们在分母中使用“ n-1”来计算样本的无偏标准偏差。

标准偏差： (Standard Deviation:)

As the unit of variance is squared the unit of data(as obvious from the formula above), the value of variance is not very intuitive, hence we take its square root which is defined as standard deviation.

由于方差单位是数据单位的平方(从上面的公式显而易见)，方差的值不是很直观，因此我们将其平方根定义为标准偏差。

If the standard deviation is small, the data has little spread (i.e., the majority of points fall very near the mean).
如果标准偏差很小，则数据的传播范围很小(即，大多数点都非常接近均值)。
If standard deviation = 0, there is no spread. This only happens when all data items are the same value.
如果标准偏差= 0，则不会扩展。仅当所有数据项具有相同的值时，才会发生这种情况。
The standard deviation is significantly affected by outliers and skewed distributions.
离群值和偏斜分布会严重影响标准偏差。

Thanks for reading the article! Wanna connect with me?Here is link to my LinkedIn Profile

感谢您阅读本文！想与我联系吗？这里是我的LinkedIn个人资料的链接

翻译自: https://medium.com/analytics-vidhya/what-is-dispersion-and-its-measures-888fedddd40a

分散式计算是什么

weixin_26729763

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
分散式计算是什么_什么是分散及其措施

分散式计算是什么Dispersion is basically the spread of the data, the extent to which a distribution is stretched or squeezed. Central tendencies are not enough to fully understand the nature of the entire data...
复制链接

扫一扫