Covariance and Correlation
Demystifying the terms
Covariance indicates the direction of the linear relationship between variables.
Correlation, on the other hand, measures both the strength and the direction of the linear relationship between two variables.
Correlation is a function of the covariance. What sets them apart is that correlation values are standardized, whereas covariance values are not.
Defining the terms mathematically
Covariance
$$
\begin{aligned}
cov(x,y) &= E[(x - \mu_x)(y - \mu_y)] \\
&= E[xy] - E[x]E[y]
\end{aligned}
$$
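The second equality follows from linearity of expectation, using $\mu_x = E[x]$ and $\mu_y = E[y]$. Spelled out step by step:

$$
\begin{aligned}
E[(x - \mu_x)(y - \mu_y)] &= E[xy - \mu_y x - \mu_x y + \mu_x \mu_y] \\
&= E[xy] - \mu_y E[x] - \mu_x E[y] + \mu_x \mu_y \\
&= E[xy] - \mu_x \mu_y - \mu_x \mu_y + \mu_x \mu_y \\
&= E[xy] - E[x]E[y]
\end{aligned}
$$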
If we have only a single variable $x$, then
$$
\begin{aligned}
cov(x, x) &= E[(x - \mu_x)(x - \mu_x)] \\
&= E[(x - \mu_x)^2] \\
&= var(x) = \sigma^2(x) = \sigma^2_x
\end{aligned}
$$
We write $s^2$ for the sample variance, that is, the estimate of $var(x)$ computed from sampled data.
Expanding these for a sample of $n$ observations, we get
$$
\begin{aligned}
s^2 = cov(x, x) &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \\
cov(x,y) &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}
\end{aligned}
$$
The numerator of the first equation is called the sum of squared deviations, and the numerator of the second is called the sum of cross products.
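As a quick numerical check of the sample formulas, the following is a minimal sketch using NumPy. The arrays `x` and `y` are made-up illustration data, not values from this article.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])
n = len(x)

# Numerators: sum of squared deviations and sum of cross products.
ss_x = np.sum((x - x.mean()) ** 2)
sp_xy = np.sum((x - x.mean()) * (y - y.mean()))

# Dividing by n - 1 gives the sample variance and the sample covariance.
var_x = ss_x / (n - 1)
cov_xy = sp_xy / (n - 1)

# NumPy's built-ins should agree: np.var needs ddof=1 for the n - 1
# denominator, while np.cov uses n - 1 by default.
print(var_x, np.var(x, ddof=1))
print(cov_xy, np.cov(x, y)[0, 1])
```

Note that `np.var` divides by $n$ unless `ddof=1` is passed, so the two conventions are easy to mix up.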
Correlation
$$
corr(x,y) = \frac{cov(x,y)}{\sigma_x \sigma_y} = \frac{E[(x - \mu_x)(y - \mu_y)]}{\sigma_x \sigma_y}
$$
For sampled data, the population standard deviations $\sigma_x$ and $\sigma_y$ are replaced by the sample standard deviations $s_x$ and $s_y$, and $cov(x,y)$ by the sample covariance:
$$
r_{xy} = \frac{cov(x,y)}{s_x s_y}
$$
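By the Cauchy-Schwarz inequality, $|cov(x,y)| \le \sigma_x \sigma_y$, so the correlation coefficient always lies in $[-1, 1]$. Its sign gives the direction of the relationship: a positive value means that as one variable increases, the other tends to increase as well, while a negative value means that one tends to decrease as the other increases. Its magnitude gives the strength of the linear relationship.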
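The sketch below computes a sample correlation directly from the covariance and the standard deviations, then compares it with `np.corrcoef`. The data are hypothetical, and the sample ($n - 1$) convention is assumed throughout.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])

# Sample covariance and sample standard deviations (n - 1 denominator).
cov_xy = np.cov(x, y)[0, 1]
s_x = np.std(x, ddof=1)
s_y = np.std(y, ddof=1)

# Correlation is the covariance standardized by both standard deviations.
r_xy = cov_xy / (s_x * s_y)

# Flipping the sign of y flips the direction but not the strength.
r_neg = np.cov(x, -y)[0, 1] / (s_x * np.std(-y, ddof=1))

print(r_xy, np.corrcoef(x, y)[0, 1])
print(r_neg)
```

The $n - 1$ factors cancel in the ratio, so the same value results if $n$ is used consistently in both the covariance and the standard deviations.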
Data-matrix representation of covariance and correlation
$$
X = \begin{bmatrix}
x_{11} & \cdots & x_{1n} \\
\vdots & \ddots & \vdots \\
x_{m1} & \cdots & x_{mn}
\end{bmatrix}
= \begin{bmatrix} \mathbf{x}_1 & \cdots & \mathbf{x}_n \end{bmatrix}
$$
The order of $X$ is $m \times n$.
We call each row an item (or subject) and each column a variable.
Now we can calculate the sample mean of the $j$th variable:
$$
\bar{x}_j = \frac{1}{m}\sum_{i=1}^{m} x_{ij}
$$
Similarly, the mean of the $i$th row is
$$
\bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij}
$$
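In NumPy terms, the two means correspond to averaging along different axes of the data matrix. This is a small sketch, and the matrix `X` is hypothetical data with $m = 4$ items and $n = 3$ variables.

```python
import numpy as np

# Hypothetical data matrix: m = 4 items (rows), n = 3 variables (columns).
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 1.0],
    [3.0, 6.0, 5.0],
    [4.0, 7.0, 7.0],
])
m, n = X.shape

# Mean of each variable: average over the rows (axis=0), length n.
col_means = X.mean(axis=0)

# Mean of each item: average over the columns (axis=1), length m.
row_means = X.mean(axis=1)

print(col_means)
print(row_means)
```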
We can then define the covariance matrix:
$$
\begin{aligned}
S = \frac{1}{m}
\begin{bmatrix}
(\mathbf{x}_1 - \bar{\mathbf{x}}_1)^T \\
\vdots \\
(\mathbf{x}_n - \bar{\mathbf{x}}_n)^T
\end{bmatrix}
\begin{bmatrix}
\mathbf{x}_1 - \bar{\mathbf{x}}_1 & \cdots & \mathbf{x}_n - \bar{\mathbf{x}}_n
\end{bmatrix}
&= \begin{bmatrix}
s_{1}^2 & \cdots & s_{1n} \\
\vdots & \ddots & \vdots \\
s_{n1} & \cdots & s_{n}^2
\end{bmatrix} \\
\text{where } s_j^2 &= \frac{1}{m}\sum_{i=1}^{m}(x_{ij} - \bar{x}_j)^2 \hspace{1cm} \text{variance of the $j$th variable} \\
s_{jk} &= \frac{1}{m}\sum_{i=1}^{m}(x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k) \hspace{1cm} \text{covariance between the $j$th and $k$th variables} \\
\bar{x}_j &= \frac{1}{m}\sum_{i=1}^{m} x_{ij} \hspace{1cm} \text{mean of the $j$th variable}
\end{aligned}
$$
Here $\bar{\mathbf{x}}_j = \bar{x}_j \mathbf{1}_m$ is the length-$m$ vector whose entries all equal the mean of the $j$th variable, so that $\mathbf{x}_j - \bar{\mathbf{x}}_j$ is the centered $j$th column.
We can see that the covariance matrix is an $n \times n$ symmetric matrix. Note that this definition divides by $m$. Dividing by $m - 1$ instead gives the unbiased sample estimate, matching the $n - 1$ denominator in the scalar formulas above (where $n$ denoted the number of observations).
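The matrix product can be checked numerically with a few lines of NumPy. This is a sketch under the $1/m$ convention used in the definition. `X` is hypothetical data, and `np.cov` is called with `bias=True` so that it also divides by $m$.

```python
import numpy as np

# Hypothetical data matrix: m = 4 items (rows), n = 3 variables (columns).
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 1.0],
    [3.0, 6.0, 5.0],
    [4.0, 7.0, 7.0],
])
m, n = X.shape

# Center each column by subtracting its mean (broadcast over rows).
Xc = X - X.mean(axis=0)

# Covariance matrix with the 1/m normalization: an n x n matrix.
S = Xc.T @ Xc / m

# rowvar=False tells np.cov that columns are variables;
# bias=True switches its denominator from m - 1 to m.
S_np = np.cov(X, rowvar=False, bias=True)

print(S)
print(np.allclose(S, S_np))
```

The diagonal of `S` holds the variances $s_j^2$ and the off-diagonal entries hold the covariances $s_{jk}$.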
Then we can define the correlation matrix:
$$
\begin{aligned}
R &= \frac{1}{m}
\begin{bmatrix}
(\mathbf{x}_1 - \bar{\mathbf{x}}_1)^T / s_1 \\
\vdots \\
(\mathbf{x}_n - \bar{\mathbf{x}}_n)^T / s_n
\end{bmatrix}
\begin{bmatrix}
(\mathbf{x}_1 - \bar{\mathbf{x}}_1) / s_1 & \cdots & (\mathbf{x}_n - \bar{\mathbf{x}}_n) / s_n
\end{bmatrix} \\
&= \begin{bmatrix}
1 & r_{12} & \cdots & r_{1n} \\
\vdots & \vdots & \ddots & \vdots \\
r_{n1} & r_{n2} & \cdots & 1
\end{bmatrix}
\end{aligned}
$$
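The same construction works for the correlation matrix: standardize each column, then take the scaled cross product. This is a minimal sketch on hypothetical data, assuming the standard deviations use the same $1/m$ normalization as the leading factor so that the diagonal comes out as exactly 1.

```python
import numpy as np

# Hypothetical data matrix: m = 4 items (rows), n = 3 variables (columns).
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 1.0],
    [3.0, 6.0, 5.0],
    [4.0, 7.0, 7.0],
])
m, n = X.shape

# Standardize each column: subtract the mean, divide by the std.
# X.std uses ddof=0 by default, i.e. the 1/m normalization.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Correlation matrix: n x n, symmetric, ones on the diagonal.
R = Z.T @ Z / m

# np.corrcoef should give the same matrix (the normalization cancels).
R_np = np.corrcoef(X, rowvar=False)

print(R)
print(np.allclose(R, R_np))
```

A column with zero variance would make the division undefined, so this sketch assumes every variable actually varies.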
Covariance versus Correlation
- Covariance has units: the product of the units of the two variables. Correlation is dimensionless.
- Covariance can take any value in $(-\infty, +\infty)$. Correlation lies in $[-1, 1]$.
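Both points can be seen by changing the units of one variable. In the sketch below, the heights and weights are made-up numbers used only for illustration. Converting heights from metres to centimetres multiplies the covariance by 100 and leaves the correlation unchanged.

```python
import numpy as np

# Hypothetical measurements: heights in metres, weights in kilograms.
height_m = np.array([1.60, 1.70, 1.75, 1.82, 1.90])
weight_kg = np.array([55.0, 68.0, 70.0, 80.0, 88.0])

# The same heights expressed in centimetres.
height_cm = height_m * 100.0

# Covariance carries units (m*kg vs cm*kg), so it scales with the data.
cov_m = np.cov(height_m, weight_kg)[0, 1]
cov_cm = np.cov(height_cm, weight_kg)[0, 1]

# Correlation is dimensionless, so rescaling does not change it.
r_m = np.corrcoef(height_m, weight_kg)[0, 1]
r_cm = np.corrcoef(height_cm, weight_kg)[0, 1]

print(cov_m, cov_cm)
print(r_m, r_cm)
```

This is why a covariance value on its own says little about how strong a relationship is, whereas correlations can be compared across variable pairs.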