Covariance and Correlation: Math and Python Code

Statistical concepts can be confusing when you first study them. This article looks at covariance and correlation in detail.

The first useful thing you can learn from data is how two variables are associated with each other. In layman's terms: how is one variable related to another?

Covariance and correlation are two widely used terms in statistics and probability theory. Both measure the relationship and dependency between two variables.

Covariance

Covariance indicates the direction of the linear relationship between two variables. Covariance values are not standardized: they can be any real number, and their magnitude depends on the units of the data.

The sample covariance is given by the formula:

$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
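As a rough illustration of the sample covariance (the sum of products of deviations from the means, divided by n − 1), here is a minimal sketch in plain Python. The function name `sample_covariance` and the three small lists are hypothetical and are not the data used in this article.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of deviation products divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical data, chosen only to show the sign of the result.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases as hours increases
falling = [10, 8, 6, 4, 2]   # decreases as hours increases

cov_rising = sample_covariance(hours, rising)
cov_falling = sample_covariance(hours, falling)
print(cov_rising)   # positive: the variables move in the same direction
print(cov_falling)  # negative: the variables move in opposite directions
```

The sign tells you the direction of the relationship; the magnitude on its own is hard to interpret because it depends on the units of both variables.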

Correlation

Correlation measures both the strength and the direction of the linear relationship between two variables. Correlation values are standardized.

The correlation coefficient is given by the formula:

$$r = \frac{\operatorname{cov}(X, Y)}{S_x \, S_y}$$

where S_x and S_y are the sample standard deviations of X and Y.

Correlation lets us quantify the strength of the relationship. A weak correlation has a coefficient close to 0, whereas a strong correlation has a coefficient close to -1 or +1. You obtain the correlation coefficient of two variables by dividing their covariance by the product of their standard deviations. This division scales the value into the fixed range of -1 to +1. Correlation therefore describes the relationship itself and is not sensitive to the scale of the data.
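To make the scale argument concrete, the sketch below computes covariance and correlation for the same measurements expressed in two different units. The helper names (`sample_covariance`, `pearson_r`) and the height and weight values are hypothetical, chosen only for illustration.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson_r(xs, ys):
    """Correlation = covariance / (std of xs * std of ys)."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # cov(X, X) is the variance of X
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical measurements: the same heights in metres and in centimetres.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
heights_cm = [h * 100 for h in heights_m]
weights_kg = [52, 58, 67, 74, 85]

cov_m = sample_covariance(heights_m, weights_kg)
cov_cm = sample_covariance(heights_cm, weights_kg)
r_m = pearson_r(heights_m, weights_kg)
r_cm = pearson_r(heights_cm, weights_kg)

print(cov_m, cov_cm)  # the covariance grows by a factor of 100 with the unit change
print(r_m, r_cm)      # the correlation is the same (up to rounding) in both units
```

Changing the unit of one variable rescales the covariance by the same factor, while the standard deviation in the denominator absorbs that factor and leaves the correlation unchanged.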

Now let us look at a few differences between covariance and correlation.

[Figure: Difference between covariance and correlation]

Now let us work through an example so that the ideas stick.

Let us take a simple example: X and Y are two variables.

[Figure: sample data for X and Y]

Substituting into the covariance equation, we get:

[Figure: worked covariance calculation]

The sample covariance is 8.

Let us verify this in Python:

[Figure: Python code and its output]

Our calculation and the Python calculation match.

Now let us calculate the correlation.

[Figure: sample standard deviation formula]

Substituting into the equation above, we get S_x = 1.58 and S_y = 5.21.
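The standard deviations used here are sample standard deviations (n − 1 in the denominator). A minimal sketch of that calculation, cross-checked against `statistics.stdev` from the Python standard library, might look like this. The function name `sample_std` and the `readings` list are hypothetical and are not the article's data.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation: sqrt of squared deviations over n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


# Hypothetical readings, used only to compare the two calculations.
readings = [4, 8, 6, 5, 7]

manual = sample_std(readings)
library = statistics.stdev(readings)  # also uses the n - 1 denominator

print(round(manual, 4), round(library, 4))
```

Note that `statistics.pstdev` divides by n instead and would give a slightly smaller value, so it matters which of the two you compare a hand calculation against.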

Substituting into the correlation equation, we get:

[Figure: worked correlation calculation]

Correlation coefficient (r) = 0.9701

Let us verify this in Python:

[Figure: Python code and its output]

The manual calculation and the result of the Python function match.
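The code shown in the original post was an image, so it cannot be reproduced here. As a stand-in, the sketch below runs the same kind of check on hypothetical data (not the article's X and Y): it computes r from the covariance and the standard deviations, then confirms the value through a second route, the averaged product of z-scores.

```python
import statistics

# Hypothetical data; the article's own X and Y values are not available.
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 8, 9, 14]

n = len(xs)
mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)  # sample std (n - 1)

# Route 1: covariance divided by the product of the standard deviations.
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
r_from_cov = cov_xy / (s_x * s_y)

# Route 2: sum of products of z-scores, divided by n - 1.
r_from_z = sum(
    ((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys)
) / (n - 1)

print("covariance:", cov_xy)
print("r (from covariance):", round(r_from_cov, 4))
print("r (from z-scores):  ", round(r_from_z, 4))
```

Both routes are algebraically the same formula, so they should agree up to floating-point rounding. If NumPy is available, `numpy.cov` and `numpy.corrcoef` return the corresponding 2×2 matrices and can serve as a further check.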

Conclusion

We looked at covariance and correlation separately, worked through an example by hand, and then compared the results with Python code.

We got a covariance of 8, which is a positive number (covariance is unbounded, so it can be arbitrarily large). The correlation coefficient came out to 0.97, which indicates a strong positive correlation between X and Y.

Translated from: https://medium.com/analytics-vidhya/covariance-and-correlation-math-and-python-code-7cbef556baed
