title: Why the sample variance divides by n-1
mathjax: true
categories: ML
Main blog site: https://fainke.com
**1.** Let the sample mean be $\overline{X}$, the sample variance be $S^2$, the population mean be $\mu$, and the population variance be $\sigma^{2}$. The sample variance $S^2$ is then defined as:

$$S^{2}=\frac{1}{n-1} \sum_{i=1}^{n}\left(x_{i}-\overline{X}\right)^{2}$$
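As a small illustration of the definition above, the following sketch computes the sample variance by hand and compares it with Python's standard library. The helper name `sample_variance` and the data values are made up for this example; `statistics.variance` divides by $n-1$ and `statistics.pvariance` divides by $n$.

```python
import statistics


def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Illustrative data only (not from the original post).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(sample_variance(data))       # hand-written, denominator n - 1
print(statistics.variance(data))   # standard library, denominator n - 1
print(statistics.pvariance(data))  # standard library, denominator n
```

For this data the sum of squared deviations is 32, so the two $n-1$ versions agree at $32/7$ while the $n$ version gives $32/8 = 4$.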
2. Background
(1) Why does the variance of the sample mean equal the population variance divided by the sample size $n$?

Answer: Let $X$ be a random variable, let $X_1, X_2, \ldots, X_n$ be $n$ independent samples of it, and let $D(X)$ denote its variance. By the properties of variance, $D(X+Y)=D(X)+D(Y)$ when $X$ and $Y$ are independent, and $D(kX)=k^{2} D(X)$ when $k$ is a constant. Since every $X_i$ has the same distribution as $X$, it follows that

$$D\left(\frac{\sum_{i=1}^{n} X_{i}}{n}\right)=D\left(\sum_{i=1}^{n} \frac{X_{i}}{n}\right)=\frac{\sum_{i=1}^{n} D\left(X_{i}\right)}{n^{2}}=\frac{1}{n} D(X)$$
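The $\frac{1}{n} D(X)$ result can be checked numerically. The sketch below is only a rough Monte Carlo check under assumptions of my own choosing: samples are drawn from a normal distribution with $\sigma = 2$, and the function name `variance_of_sample_mean`, the seed, and the trial count are all hypothetical.

```python
import random
import statistics


def variance_of_sample_mean(n, trials, sigma=2.0, seed=0):
    """Estimate D(sample mean) by drawing many samples of size n."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    means = []
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(statistics.fmean(xs))
    # Spread of the sample means across trials.
    return statistics.pvariance(means)


sigma = 2.0
for n in (1, 4, 16):
    estimate = variance_of_sample_mean(n, trials=20000, sigma=sigma)
    # The second number should be close to the third, sigma^2 / n.
    print(n, round(estimate, 3), sigma ** 2 / n)
```

The estimates fluctuate from run to run if the seed is changed, but they should stay close to $\sigma^2/n$.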
3. Proof of the formula
Suppose instead that the sample variance were defined with denominator $n$: $S_{1}^{2}=\frac{1}{n} \sum_{i=1}^{n}\left(X_{i}-\overline{X}\right)^{2}$. Then:

$$E\left(S_{1}^{2}\right)=\frac{1}{n} \sum_{i=1}^{n} E\left(\left(X_{i}-\overline{X}\right)^{2}\right)=\frac{1}{n} E\left(\sum_{i=1}^{n}\left(X_{i}-\mu+\mu-\overline{X}\right)^{2}\right)$$

$$=\frac{1}{n} E\left(\sum_{i=1}^{n}\left(\left(X_{i}-\mu\right)^{2}-2\left(X_{i}-\mu\right)(\overline{X}-\mu)+(\overline{X}-\mu)^{2}\right)\right)$$

$$=\frac{1}{n} E\left(\sum_{i=1}^{n}\left(X_{i}-\mu\right)^{2}-2 \sum_{i=1}^{n}\left(X_{i}-\mu\right)(\overline{X}-\mu)+n(\overline{X}-\mu)^{2}\right)$$

Because $\sum_{i=1}^{n}\left(X_{i}-\mu\right)=n(\overline{X}-\mu)$, the middle term equals $-2n(\overline{X}-\mu)^{2}$:

$$=\frac{1}{n} E\left(\sum_{i=1}^{n}\left(X_{i}-\mu\right)^{2}-2n(\overline{X}-\mu)(\overline{X}-\mu)+n(\overline{X}-\mu)^{2}\right)$$

$$=\frac{1}{n} E\left(\sum_{i=1}^{n}\left(X_{i}-\mu\right)^{2}-n(\overline{X}-\mu)^{2}\right)$$

$$=\frac{1}{n}\left(\sum_{i=1}^{n} E\left(\left(X_{i}-\mu\right)^{2}\right)-n E\left((\overline{X}-\mu)^{2}\right)\right)$$

Here $E\left(\left(X_{i}-\mu\right)^{2}\right)=\operatorname{Var}(X)$, and since $E(\overline{X})=\mu$, $E\left((\overline{X}-\mu)^{2}\right)=\operatorname{Var}(\overline{X})=\frac{\sigma^{2}}{n}$ by the result in section 2:

$$=\frac{1}{n}(n \operatorname{Var}(X)-n \operatorname{Var}(\overline{X}))$$

$$=\operatorname{Var}(X)-\operatorname{Var}(\overline{X})=\sigma^{2}-\frac{\sigma^{2}}{n}=\frac{n-1}{n} \sigma^{2}$$
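The estimator $S_1^2$ with denominator $n$ is biased because the deviations are measured from the sample mean $\overline{X}$ rather than from the true mean $\mu$, and $\overline{X}$ itself varies around $\mu$ with variance $\sigma^{2}/n$. As a result, the expectation of $S_1^2$ falls short of the population variance by $\sigma^{2}/n$. Multiplying by $\frac{n}{n-1}$, that is, using $n-1$ as the denominator, removes the bias: $E\left(S^{2}\right)=\frac{n}{n-1} E\left(S_{1}^{2}\right)=\sigma^{2}$.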
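The bias factor $\frac{n-1}{n}$ can also be seen in a simulation. The following is a minimal sketch, assuming standard normal data with $\sigma = 1$ and samples of size $n = 5$; the function name `mean_of_variance_estimates`, the seed, and the trial count are illustrative choices, not part of the derivation.

```python
import random


def mean_of_variance_estimates(n, trials, sigma=1.0, seed=0):
    """Average the n-denominator and (n-1)-denominator estimators over many samples."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total_biased = 0.0
    total_unbiased = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations
        total_biased += ss / n          # denominator n
        total_unbiased += ss / (n - 1)  # denominator n - 1
    return total_biased / trials, total_unbiased / trials


n, sigma = 5, 1.0
biased, unbiased = mean_of_variance_estimates(n, trials=100000, sigma=sigma)
print(biased, (n - 1) / n * sigma ** 2)  # average of the n version vs (n-1)/n * sigma^2
print(unbiased, sigma ** 2)              # average of the n-1 version vs sigma^2
```

With these settings the average of the $n$-denominator estimator should land near $0.8\,\sigma^2$, while the $n-1$ version should land near $\sigma^2$.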