Covariance Matrix

2. Why We Need Covariance

$var\left(X\right)=\frac{\sum _{i=1}^{n}\left({X}_{i}-\overline{X}\right)\left({X}_{i}-\overline{X}\right)}{n-1}$

$cov\left(X,Y\right)=\frac{\sum _{i=1}^{n}\left({X}_{i}-\overline{X}\right)\left({Y}_{i}-\overline{Y}\right)}{n-1}$

$cov\left(X,X\right)=var\left(X\right)$
$cov\left(X,Y\right)=cov\left(Y,X\right)$
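The formulas above can be checked by hand on a tiny data set. The following is a minimal sketch; the vectors x and y are made-up illustration data, not values from this post.

```matlab
% Illustrative data (an assumption, not from the post): y is exactly 2*x
x = [1 2 3 4 5];
y = [2 4 6 8 10];
n = numel(x);

% Variance: squared deviations from the mean, divided by n-1
var_x  = sum((x - mean(x)) .* (x - mean(x))) / (n - 1);   % 2.5

% Covariance: the same formula with the second factor taken from y
cov_xy = sum((x - mean(x)) .* (y - mean(y))) / (n - 1);   % 5
cov_yx = sum((y - mean(y)) .* (x - mean(x))) / (n - 1);   % 5, symmetric

% cov(X,X) reduces to var(X)
cov_xx = sum((x - mean(x)) .* (x - mean(x))) / (n - 1);   % 2.5
```

The positive covariance reflects that x and y rise together; swapping the two arguments changes nothing because the product inside the sum commutes.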

3. The Covariance Matrix

${C}_{n\times n}=\left({C}_{i,j}\right),\quad {C}_{i,j}=cov\left({\mathrm{Dim}}_{i},{\mathrm{Dim}}_{j}\right)$

$C=\left(\begin{array}{ccc}cov\left(x,x\right)& cov\left(x,y\right)& cov\left(x,z\right)\\ cov\left(y,x\right)& cov\left(y,y\right)& cov\left(y,z\right)\\ cov\left(z,x\right)& cov\left(z,y\right)& cov\left(z,z\right)\end{array}\right)$
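The definition translates directly into a double loop over dimensions. The sketch below assumes rows are samples and columns are dimensions, and reuses the sample matrix printed later in this post; the loop variables and the comparison against the built-in cov are illustrative additions.

```matlab
% Sample matrix copied from this post: 10 samples (rows), 3 dimensions (columns)
MySample = [32 33 31; 39 26  1; 11 41  6; 42 13 34; 45  8 25; ...
            14 21 27; 36 23 40; 16 39 39; 13  6 25;  8  2 13];

[n, d] = size(MySample);
C = zeros(d, d);
for i = 1:d
    for j = 1:d
        % Deviations of dimensions i and j from their own means
        di = MySample(:, i) - mean(MySample(:, i));
        dj = MySample(:, j) - mean(MySample(:, j));
        C(i, j) = sum(di .* dj) / (n - 1);
    end
end

disp(C)
% Largest absolute difference from the built-in result; expected to be at rounding-error level
disp(max(max(abs(C - cov(MySample)))))
```

The diagonal of C holds the variance of each dimension, and the matrix is symmetric because cov(X,Y) = cov(Y,X).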

4. Covariance in Practice with Matlab

MySample = fix(rand(10,3)*50)    % 10 samples (rows) x 3 dimensions (columns), random integers in 0..49
MySample =

32    33    31
39    26     1
11    41     6
42    13    34
45     8    25
14    21    27
36    23    40
16    39    39
13     6    25
8     2    13

% Extract each dimension (column) of the sample matrix
dim1 = MySample(:,1);
dim2 = MySample(:,2);
dim3 = MySample(:,3);

% Off-diagonal entries: cov(dim1,dim2), cov(dim1,dim3), cov(dim2,dim3)
sum( (dim1-mean(dim1)) .* (dim2-mean(dim2)) ) / ( size(MySample,1)-1 )
sum( (dim1-mean(dim1)) .* (dim3-mean(dim3)) ) / ( size(MySample,1)-1 )
sum( (dim2-mean(dim2)) .* (dim3-mean(dim3)) ) / ( size(MySample,1)-1 )


% Diagonal entries: the variance of each dimension (squared standard deviation)
std(dim1)^2
std(dim2)^2
std(dim3)^2


% Built-in function: returns the full 3x3 covariance matrix in one call
cov(MySample)


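Update: I realized today that the covariance matrix can also be computed another way. First center the sample matrix by subtracting each dimension's mean from that dimension, so that every dimension has zero mean. Then multiply the transpose of the centered matrix by the centered matrix itself and divide the result by (N-1). This method follows directly from the formulas above. It is less intuitive, but it comes up often in abstract derivations. Here is the Matlab implementation: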

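X = MySample - repmat(mean(MySample), size(MySample,1), 1);    % center the sample matrix so each dimension has zero mean
C = (X'*X)./(size(X,1)-1);                                     % transpose times centered matrix, divided by N-1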
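
The link between this product and the earlier formula can be checked entry by entry. As a sketch, let $\tilde{X}$ denote the centered $n\times d$ sample matrix (the variable X in the code above, with rows as samples), and let ${x}_{k,i}$ be the value of dimension $i$ in sample $k$. These symbols are introduced here only for illustration.

$\tilde{X}_{k,i}={x}_{k,i}-\overline{x}_{i}$

$\left(\tilde{X}^{T}\tilde{X}\right)_{i,j}=\sum _{k=1}^{n}\tilde{X}_{k,i}\tilde{X}_{k,j}=\sum _{k=1}^{n}\left({x}_{k,i}-\overline{x}_{i}\right)\left({x}_{k,j}-\overline{x}_{j}\right)=\left(n-1\right)cov\left({\mathrm{Dim}}_{i},{\mathrm{Dim}}_{j}\right)$

Dividing by $n-1$ therefore gives the $(i,j)$ entry of the covariance matrix. The order of the product matters under this layout: $\tilde{X}^{T}\tilde{X}$ is $d\times d$, one entry per pair of dimensions, whereas $\tilde{X}\tilde{X}^{T}$ would be $n\times n$, one entry per pair of samples.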