Computing Correlation and Covariance
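1. Covariance
If COV(X, Y) = 0, the two variables have no linear relationship;
if it is positive, they are positively correlated;
if it is negative, they are negatively correlated.
However, covariance depends on the units of the variables, so its magnitude cannot measure how strong the relationship is.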
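To illustrate why the magnitude of covariance says little about strength, here is a minimal sketch that computes the sample covariance from its definition (dividing by n - 1). The helper name `sample_cov` and the data are made up for illustration. Rescaling one variable by 100 multiplies the covariance by 100, even though the relationship itself is unchanged.

```python
import numpy as np

def sample_cov(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)

# Hypothetical data for illustration
x = np.array([1.0, 4.0, 7.0])
y = np.array([2.0, 5.0, 8.0])

cov_xy = sample_cov(x, y)            # 9.0
cov_scaled = sample_cov(100 * x, y)  # 900.0: same relationship, different units

print(cov_xy, cov_scaled)
```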
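2. Correlation (correlation coefficient)
The correlation coefficient captures only linear relationships: it is a unitless quantity between -1 and 1 that measures the strength of the linear relationship between two variables.
ρ(X,Y) = COV(X,Y) / (σX · σY)
ρ = ±1: perfect positive/negative linear correlation
ρ = 0: uncorrelated (no linear relationship)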
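The formula can be checked by hand with a short sketch. The data below are hypothetical, chosen so the numbers come out cleanly; both the covariance and the standard deviations use the n - 1 (sample) convention so that they are consistent with each other.

```python
import numpy as np

# Hypothetical data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

# COV(X, Y) with the n - 1 denominator
cov_xy = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)

# Sample standard deviations (ddof=1 matches the n - 1 denominator above)
sigma_x = x.std(ddof=1)
sigma_y = y.std(ddof=1)

rho = cov_xy / (sigma_x * sigma_y)   # approximately 0.8

# Rescaling x changes the covariance but not the correlation
x_scaled = 100 * x
cov_scaled = ((x_scaled - x_scaled.mean()) * (y - y.mean())).sum() / (len(x) - 1)
rho_scaled = cov_scaled / (x_scaled.std(ddof=1) * sigma_y)

print(rho, rho_scaled)
```

Dividing by σX · σY is what removes the units, which is why ρ always stays within [-1, 1].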
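import numpy as np
import pandas as pd
# Build a 3x3 array holding 1..9 and wrap it in a labeled DataFrame
a = np.arange(1, 10).reshape(3, 3)
data2 = pd.DataFrame(a, index=['a', 'b', 'c'], columns=['one', 'two', 'three'])
data2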
   one  two  three
a    1    2      3
b    4    5      6
c    7    8      9
Computing correlation
# Correlation coefficient between the first and second columns
data2['one'].corr(data2['two'])
1.0
# Return the correlation matrix
data2.corr()
       one  two  three
one    1.0  1.0    1.0
two    1.0  1.0    1.0
three  1.0  1.0    1.0
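Every entry above is 1.0 because each column of data2 is an exact linear function of the others (two = one + 1, three = one + 2). As a contrast, the following sketch uses hypothetical data in which y is completely determined by x, but not linearly. The correlation coefficient comes out at essentially zero, which shows that "uncorrelated" only rules out a linear relationship.

```python
import pandas as pd

# Hypothetical data: y depends on x exactly, but the dependence is quadratic
x = pd.Series([-2, -1, 0, 1, 2])
y = x ** 2

rho = x.corr(y)  # Pearson correlation by default
print(rho)       # essentially zero
```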
Computing covariance
# Covariance between the first and second columns
data2['one'].cov(data2['two'])
9.0
# Covariance matrix
data2.cov()
       one  two  three
one    9.0  9.0    9.0
two    9.0  9.0    9.0
three  9.0  9.0    9.0
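The covariance and correlation matrices are linked by the formula ρ = COV / (σX · σY) applied entry by entry. The sketch below, using a hypothetical DataFrame `df`, rebuilds the correlation matrix from the output of `cov()` and compares it with `corr()`. It relies on pandas using the n - 1 denominator by default for both `cov()` and `std()`.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [2, 1, 4, 3, 5]})

cov = df.cov()             # covariance matrix (n - 1 denominator)
std = df.std().to_numpy()  # per-column sample standard deviations

# Divide each covariance entry by the product of the two standard deviations
corr_from_cov = cov / np.outer(std, std)

print(corr_from_cov)
print(np.allclose(corr_from_cov.to_numpy(), df.corr().to_numpy()))
```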