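Covariance (numpy.cov) and the Pearson correlation coefficient

Covariance and the correlation coefficient

As long as you do not mix up these two concepts, they are quite simple, so I will not go into much detail.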

Covariance: numpy.cov parameters from the official documentation

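Numerically, a positive covariance means the two variables tend to move in the same direction, and the larger the value, the stronger that tendency. A negative covariance means the opposite: the two variables tend to move in opposite directions.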
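As a rough illustration of the sign rule above, the sketch below computes the sample covariance directly from its definition and compares it with `np.cov`. The helper `sample_cov` and the data values are made up for this example; they are not from the NumPy documentation.

```python
import numpy as np

def sample_cov(u, v):
    """Sample covariance: sum of products of deviations divided by (N - 1)."""
    return np.sum((u - u.mean()) * (v - v.mean())) / (len(u) - 1)

# Hypothetical data: y_up rises with x, y_down falls with x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y_up = np.array([2.0, 4.0, 5.0, 9.0])
y_down = y_up[::-1]

cov_up = sample_cov(x, y_up)      # positive: same direction
cov_down = sample_cov(x, y_down)  # negative: opposite direction

# np.cov returns a 2x2 matrix; the off-diagonal entry is cov(x, y).
print(cov_up, np.cov(x, y_up)[0, 1])
print(cov_down, np.cov(x, y_down)[0, 1])
```

Both pairs of printed values should agree, since `np.cov` normalizes by N - 1 by default.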

numpy.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None)[source]

Parameters:

m : array_like

A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables. Also see rowvar below.

y : array_like, optional

An additional set of variables and observations. y has the same form as that of m.

rowvar : bool, optional

This parameter matters: it decides whether the variables are read from the rows or from the columns.

If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.

bias : bool, optional

Default normalization (False) is by (N - 1), where N is the number of observations given (unbiased estimate). If bias is True, then normalization is by N. These values can be overridden by using the keyword ddof in numpy versions >= 1.5.

ddof : int, optional

If not None the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average. See the notes for the details. The default value is None.

New in version 1.5.

fweights : array_like, int, optional

1-D array of integer frequency weights; the number of times each observation vector should be repeated.

New in version 1.10.

aweights : array_like, optional

1-D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0 the array of weights can be used to assign probabilities to observation vectors.

New in version 1.10.
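To make the parameters above more concrete, here is a small sketch that exercises `rowvar`, `bias`/`ddof`, and `fweights` on the same array. The data and the weight vector are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical data: 4 observations (rows) of 2 variables (columns).
data = np.array([[1.0, 2.0],
                 [2.0, 4.0],
                 [3.0, 5.0],
                 [4.0, 9.0]])
n = data.shape[0]

# rowvar=True (default): each ROW is a variable, so the result is 4x4 here.
cov_rows = np.cov(data)
# rowvar=False: each COLUMN is a variable, so the result is 2x2.
cov_cols = np.cov(data, rowvar=False)

# bias / ddof choose the divisor: N - 1 by default, N with bias=True or ddof=0.
cov_biased = np.cov(data, rowvar=False, bias=True)
cov_ddof0 = np.cov(data, rowvar=False, ddof=0)

# fweights says how many times each observation should be counted.
fw = np.array([1, 2, 1, 3])
cov_weighted = np.cov(data, rowvar=False, fweights=fw)
# Same thing, done by physically repeating the rows.
cov_repeated = np.cov(np.repeat(data, fw, axis=0), rowvar=False)

print(cov_rows.shape, cov_cols.shape)
print(np.allclose(cov_biased, cov_cols * (n - 1) / n))
print(np.allclose(cov_weighted, cov_repeated))
```

With data stored one observation per row, which is the usual layout for tabular data, forgetting `rowvar=False` silently produces a matrix of the wrong shape rather than an error.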

>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> b = np.array([4, 3, 4])
>>> np.cov(a, b)  # 2x2 covariance matrix of a and b
array([[ 1.        ,  0.        ],
       [ 0.        ,  0.33333333]])
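To read the matrix above: the diagonal holds the variance of each input and the off-diagonal entries hold the covariance between them. The following sketch checks this for the same `a` and `b`, using `np.var` with `ddof=1` so that it matches the default N - 1 normalization of `np.cov`.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 3, 4])
c = np.cov(a, b)

# Diagonal entries are the sample variances of a and b.
var_a = np.var(a, ddof=1)
var_b = np.var(b, ddof=1)

print(c[0, 0], var_a)
print(c[1, 1], var_b)
# The matrix is symmetric: c[0, 1] and c[1, 0] are both cov(a, b).
print(c[0, 1], c[1, 0])
```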

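Correlation coefficient: numpy.corrcoef parameters from the official documentation

The correlation coefficient is a statistic that reflects how closely two variables are related. It can also be viewed as a covariance: a special, standardized covariance from which the units of the two variables have been removed. It eliminates the effect of how widely each variable varies and reflects only how similarly the two variables change per unit of change.

The formula for the correlation coefficient is:

$$\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$

In words: divide the covariance of X and Y by the product of the standard deviation of X and the standard deviation of Y.

So the correlation coefficient is simply a unit-free, standardized covariance.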
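The relationship between the two quantities can be checked numerically. The sketch below, on made-up data, divides the covariance by the two standard deviations and compares the result with `np.corrcoef`; it also rescales one variable to show that the covariance changes while the correlation does not.

```python
import numpy as np

# Hypothetical data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 5.0, 9.0])

# Covariance divided by both standard deviations (ddof=1 to match np.cov).
cov_xy = np.cov(x, y)[0, 1]
r_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))
r_numpy = np.corrcoef(x, y)[0, 1]

# Changing the unit of x (say, metres to centimetres) scales the covariance...
cov_scaled = np.cov(100 * x, y)[0, 1]
# ...but leaves the correlation coefficient unchanged.
r_scaled = np.corrcoef(100 * x, y)[0, 1]

print(r_manual, r_numpy)
print(cov_xy, cov_scaled)
print(r_numpy, r_scaled)
```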

numpy.corrcoef(x, y=None, rowvar=True, bias=<no value>, ddof=<no value>)[source]

Parameters:

x : array_like

A 1-D or 2-D array containing multiple variables and observations. Each row of x represents a variable, and each column a single observation of all those variables. Also see rowvar below.

y : array_like, optional

An additional set of variables and observations. y has the same shape as x.

rowvar : bool, optional

If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.

bias : _NoValue, optional

Has no effect, do not use.

Deprecated since version 1.10.0.

ddof : _NoValue, optional

Has no effect, do not use.

Deprecated since version 1.10.0.

>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 5, 8])
>>> np.corrcoef(a, b)  # 2x2 correlation matrix of a and b
array([[ 1.,  1.],
       [ 1.,  1.]])
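Every entry is 1 in the output above because `b` happens to be an exact linear function of `a` (here `b = 3 * a - 1`) with a positive slope. A short sketch of this, and of what happens when the slope is negative:

```python
import numpy as np

a = np.array([1, 2, 3])
b = 3 * a - 1  # reproduces [2, 5, 8] from the example above

# Exact linear relation with positive slope: correlation is 1.
r_pos = np.corrcoef(a, b)[0, 1]
# Flipping the sign of b gives a negative slope: correlation is -1.
r_neg = np.corrcoef(a, -b)[0, 1]

print(r_pos, r_neg)
```

The printed values may differ from exactly 1 and -1 by floating-point rounding.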

 
