Covariance in Python
Statistical concepts are easy to confuse when you are first learning them. This article looks at covariance and correlation in detail.
The first useful piece of information you can gather from data is how two variables are associated with each other. In layman's terms: how is one variable related to another?
Covariance and correlation are two widely used terms in statistics and probability theory. Both measure the relationship and the dependency between two variables.
Covariance
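Covariance indicates the direction of the linear relationship between two variables: a positive value means they tend to move together, and a negative value means they tend to move in opposite directions. Covariance values are not standardized; they can be any real number, and their size depends on the units of the data.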
Covariance is given by the formula:
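For reference, the sample covariance of n paired observations is usually written as below. This assumes the sample form with an n - 1 denominator; the population form divides by n instead. The symbols x_i, y_i, \bar{x}, and \bar{y} (the observations and their means) are notation chosen here for illustration.

```latex
\operatorname{cov}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
```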
Correlation
Correlation measures both the strength and the direction of the linear relationship between two variables. Correlation values are standardized.
Correlation is given by the formula:
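For reference, the Pearson correlation coefficient is usually written as the covariance divided by the product of the two standard deviations. The symbols s_X and s_Y (the sample standard deviations of X and Y) are notation chosen here for illustration, assuming the same n - 1 convention for both the covariance and the standard deviations.

```latex
r_{XY} = \frac{\operatorname{cov}(X, Y)}{s_X \, s_Y},
\qquad
s_X = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
s_Y = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (y_i - \bar{y})^2}
```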
We can quantify the strength of the relationship with correlation. A weak correlation has a coefficient close to 0, whereas a strong correlation has a coefficient close to -1 or +1. You obtain the correlation coefficient of two variables by dividing their covariance by the product of their standard deviations. That division scales the value into the fixed range of -1 to +1. Because of this, correlation describes the relationship itself and is not sensitive to the scale (units) of the data.
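The scale-insensitivity claim can be checked with a minimal sketch. The data and the names `height_m`, `weight_kg`, `sample_cov`, and `pearson_r` are hypothetical and made up for illustration; the sketch assumes the sample (n - 1) convention. Changing the height unit from metres to centimetres should multiply the covariance by 100 and leave the correlation unchanged.

```python
from statistics import mean, stdev


def sample_cov(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pearson_r(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    return sample_cov(xs, ys) / (stdev(xs) * stdev(ys))


# Hypothetical measurements, invented for this illustration.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weight_kg = [52.0, 58.0, 67.0, 74.0, 85.0]

# Same heights expressed in centimetres.
height_cm = [h * 100 for h in height_m]

cov_m = sample_cov(height_m, weight_kg)
cov_cm = sample_cov(height_cm, weight_kg)
r_m = pearson_r(height_m, weight_kg)
r_cm = pearson_r(height_cm, weight_kg)

print("covariance (m, kg): ", cov_m)
print("covariance (cm, kg):", cov_cm)   # about 100 times larger
print("correlation (m, kg): ", r_m)
print("correlation (cm, kg):", r_cm)    # same as above, up to rounding
```

The covariance changes with the unit while the correlation does not, which is why correlation is the easier of the two to compare across datasets.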
Now let us look at a few differences between covariance and correlation:
Now let us work through an example so that the idea sticks.
Let us take a simple example in which X and Y are two variables.
Substituting into the covariance equation, we get:
Let us verify this in Python:
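A minimal sketch of such a check follows. The article's own X and Y values are not reproduced here, so the lists `x` and `y` below are hypothetical stand-ins and the printed number will differ from the article's result. The sketch uses only the standard library and the sample (n - 1) form; with NumPy installed, `np.cov(x, y)[0, 1]` should give the same number, since `np.cov` uses the n - 1 denominator by default.

```python
from statistics import mean

# Hypothetical stand-in data, NOT the article's original X and Y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = mean(x)  # 3
y_bar = mean(y)  # 4

# Sum of products of deviations from the means, divided by n - 1.
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

print(cov_xy)  # 1.5
```

The positive sign says that x and y tend to increase together; the magnitude alone is hard to interpret because it depends on the units of both variables.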
Now let us calculate the correlation.
Substituting into the correlation equation, we get:
Let us verify this in Python:
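A minimal sketch of the correlation check follows, again with hypothetical stand-in lists `x` and `y` rather than the article's original values, so the result will not match the article's figure. It divides the sample covariance by the product of the sample standard deviations; with NumPy installed, `np.corrcoef(x, y)[0, 1]` should agree.

```python
from statistics import mean, stdev

# Hypothetical stand-in data, NOT the article's original X and Y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = mean(x), mean(y)

# Sample covariance (n - 1 denominator).
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Pearson correlation: covariance scaled by both sample standard deviations.
r_xy = cov_xy / (stdev(x) * stdev(y))

print(round(r_xy, 4))  # 0.7746
```

Unlike the covariance, this value always lies between -1 and +1, so it can be read directly as the strength of the linear relationship.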
Conclusion
We looked at covariance and correlation separately, worked through an example by hand, and then compared the results with Python code.
We obtained a covariance of 8, which is positive and therefore indicates that X and Y move in the same direction (covariance is unbounded, so a positive covariance can be arbitrarily large). We obtained a correlation coefficient of 0.97, which indicates a strong positive correlation between X and Y.
Translated from: https://medium.com/analytics-vidhya/covariance-and-correlation-math-and-python-code-7cbef556baed