1. Covariance and Correlation
Let $x_{i}$ and $x_{j}$ be two real random variables in a random vector $x=[x_{1},\cdots,x_{N}]^{T}$.
The mean and variance of a variable $x_{i}$, and the covariance and correlation coefficient (normalized correlation) between two variables $x_{i}$ and $x_{j}$, are defined below:
- Mean of $x_{i}$: $\mu_{i}=E(x_{i})$
- Variance of $x_{i}$: $\sigma_{i}^{2}=E[(x_{i}-\mu_{i})^{2}]=E(x_{i}^{2})-\mu_{i}^{2}$
- Covariance of $x_{i}$ and $x_{j}$: $\sigma_{ij}^{2}=E[(x_{i}-\mu_{i})(x_{j}-\mu_{j})]=E(x_{i}x_{j})-\mu_{i}\mu_{j}$
- Correlation coefficient between $x_{i}$ and $x_{j}$: $r_{ij}=\dfrac{\sigma_{ij}^{2}}{\sqrt{\sigma_{i}^{2}\sigma_{j}^{2}}}=\dfrac{\sigma_{ij}^{2}}{\sigma_{i}\sigma_{j}}$
Note that the correlation coefficient $r_{ij}$ can be considered as the normalized covariance $\sigma_{ij}^{2}$.
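A minimal numerical sketch of these definitions, using made-up sample values (not data from this text) and NumPy's `np.cov`/`np.corrcoef` as a cross-check; note that `bias=True` selects the same $1/N$ normalization used here:

```python
import numpy as np

# Hypothetical paired samples of x_i and x_j (illustrative values only).
xi = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
xj = np.array([1.1, 1.9, 3.2, 3.9, 5.1])

# Sample versions of the definitions above (1/N normalization).
mu_i, mu_j = xi.mean(), xj.mean()                 # means
var_i = np.mean((xi - mu_i) ** 2)                 # variance of x_i
var_j = np.mean((xj - mu_j) ** 2)                 # variance of x_j
cov_ij = np.mean((xi - mu_i) * (xj - mu_j))       # covariance
r_ij = cov_ij / np.sqrt(var_i * var_j)            # correlation coefficient

print(f"cov = {cov_ij:.4f}, r = {r_ij:.4f}")

# Cross-check against NumPy (bias=True uses the same 1/N normalization).
print(np.cov(xi, xj, bias=True)[0, 1], np.corrcoef(xi, xj)[0, 1])
```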
To obtain these parameters as expectations of first- and second-order functions of the random variables, the joint probability density function $p(x_{1},\cdots,x_{N})$ is required.
However, when it is not available, the parameters can still be estimated by averaging the outcomes of a random experiment involving these variables repeated $K$ times:
$$\hat{\mu}_{i}=\frac{1}{K}\sum_{k=1}^{K}x_{i}^{(k)},\qquad \hat{\sigma}_{ij}^{2}=\frac{1}{K}\sum_{k=1}^{K}\big(x_{i}^{(k)}-\hat{\mu}_{i}\big)\big(x_{j}^{(k)}-\hat{\mu}_{j}\big),$$
where $x_{i}^{(k)}$ denotes the outcome of $x_{i}$ in the $k$-th trial.
1.1 Examples
Assume the experiment concerning $x_{i}$ and $x_{j}$ is repeated $K=3$ times with the following outcomes:
From these outcomes we can compute the estimates $\hat{\mu}_{i}$, $\hat{\mu}_{j}$, $\hat{\sigma}_{ij}^{2}$, and $r_{ij}$ defined above (see the sketch below). We see that $x_{i}$ and $x_{j}$ are highly correlated.
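A minimal sketch of this computation, using hypothetical outcomes (not the original data) for the $K=3$ trials:

```python
import numpy as np

# Hypothetical outcomes of K = 3 repetitions (illustrative values only).
K = 3
xi = np.array([1.0, 2.0, 3.0])   # outcomes of x_i
xj = np.array([2.1, 3.9, 6.0])   # outcomes of x_j

# Estimate the moments by averaging over the K trials.
mu_i, mu_j = xi.sum() / K, xj.sum() / K
cov_ij = np.sum((xi - mu_i) * (xj - mu_j)) / K
var_i = np.sum((xi - mu_i) ** 2) / K
var_j = np.sum((xj - mu_j) ** 2) / K
r_ij = cov_ij / np.sqrt(var_i * var_j)

print(f"r_ij = {r_ij:.3f}")   # close to 1: x_i and x_j are highly correlated
```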
2. Unbiased Estimate
Definition: Let $X$ be a population whose distribution depends on an unknown parameter $\theta \in \Theta$ to be estimated, and let $X_{1},X_{2},\cdots,X_{n}$ be a sample from $X$. If the expectation of the estimator $\hat{\theta}=\hat{\theta}(X_{1},X_{2},\cdots,X_{n})$ exists and the equation $E(\hat{\theta})=\theta$ holds for every $\theta \in \Theta$, then $\hat{\theta}$ is called an unbiased estimator of $\theta$.
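For example, the sample mean $\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_{i}$ is an unbiased estimator of the population mean $\mu$, since
$$E(\bar{X})=\frac{1}{n}\sum_{i=1}^{n}E(X_{i})=\frac{1}{n}\cdot n\mu=\mu.$$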
Example 1: Let $\mu$ and $\sigma^{2}$ be the mean and variance of $X$; both are unknown. Then the estimator of $\sigma^{2}$
$$\hat{\sigma}^{2}=\frac{1}{n}\sum_{i=1}^{n}(X_{i}-\bar{X})^{2}$$
is a biased estimator.
Proof: Since
$$\hat{\sigma}^{2}=\frac{1}{n}\sum_{i=1}^{n}(X_{i}-\bar{X})^{2}=\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2}-2\bar{X}\cdot\frac{1}{n}\sum_{i=1}^{n}X_{i}+\bar{X}^{2}=\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2}-\bar{X}^{2},$$
we have
$$E(\hat{\sigma}^{2})=E\Big(\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2}\Big)-E(\bar{X}^{2})=\frac{1}{n}\sum_{i=1}^{n}E(X_{i}^{2})-E(\bar{X}^{2}),$$
and
$$E(X_{i}^{2})=\mathrm{var}(X_{i})+[E(X_{i})]^{2}=\sigma^{2}+\mu^{2},$$
$$E(\bar{X}^{2})=\mathrm{var}(\bar{X})+[E(\bar{X})]^{2}=\frac{\sigma^{2}}{n}+\mu^{2}$$
(using $\mathrm{var}(\bar{X})=\sigma^{2}/n$, since $\bar{X}$ is the average of $n$ independent observations). Then
$$E(\hat{\sigma}^{2})=\sigma^{2}+\mu^{2}-\Big(\frac{\sigma^{2}}{n}+\mu^{2}\Big)=\frac{n-1}{n}\sigma^{2}\neq\sigma^{2}.$$
So, $\hat{\sigma}^{2}$ is a biased estimator. If we use $\hat{\sigma}^{2}$ to estimate $\sigma^{2}$, the estimate will on average be less than the true value. (However, as the sample size $n \rightarrow \infty$, $\lim_{n\rightarrow\infty}[E(\hat{\sigma}^{2})-\sigma^{2}]=0$, so $\hat{\sigma}^{2}$ is called an asymptotically unbiased estimator.)
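To make the vanishing bias concrete, note that $E(\hat{\sigma}^{2})-\sigma^{2}=-\sigma^{2}/n$, which shrinks like $1/n$; a trivial check with an arbitrary illustrative value $\sigma^{2}=4$:

```python
# The bias E(sigma-hat^2) - sigma^2 = -sigma^2/n shrinks like 1/n
# (sigma^2 = 4 is an arbitrary illustrative value).
sigma2 = 4.0
for n in (5, 10, 100, 1000):
    print(n, (n - 1) / n * sigma2 - sigma2)  # -0.8, -0.4, -0.04, -0.004
```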
For the sample variance,
$$S^{2}=\frac{1}{n-1}\sum_{i=1}^{n}(X_{i}-\bar{X})^{2}=\frac{n}{n-1}\hat{\sigma}^{2},$$
$$E(S^{2})=\frac{n}{n-1}E(\hat{\sigma}^{2})=\frac{n}{n-1}\cdot\frac{n-1}{n}\sigma^{2}=\sigma^{2}.$$
That is to say, the sample variance $S^{2}$ is an unbiased estimator of $\sigma^{2}$. Thus we usually use $S^{2}$ as the estimator of $\sigma^{2}$.
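A Monte Carlo sketch can confirm this numerically; the normal distribution, seed, and sizes below are arbitrary illustrative choices. Averaging $\hat{\sigma}^{2}$ over many samples approaches $\frac{n-1}{n}\sigma^{2}$, while averaging $S^{2}$ approaches $\sigma^{2}$:

```python
import numpy as np

# Monte Carlo sketch: average both estimators over many samples and compare
# with the theory above. Distribution, seed, and sizes are arbitrary choices.
rng = np.random.default_rng(0)
sigma2, n, trials = 4.0, 10, 200_000

samples = rng.normal(loc=1.0, scale=np.sqrt(sigma2), size=(trials, n))
biased = samples.var(axis=1, ddof=0)    # sigma-hat^2: 1/n normalization
unbiased = samples.var(axis=1, ddof=1)  # S^2: 1/(n-1) normalization

print(biased.mean())    # close to (n-1)/n * sigma^2 = 3.6
print(unbiased.mean())  # close to sigma^2 = 4.0
```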