【Statistical inference problem】
Let the population $X$ follow the uniform distribution on $[0, \theta]$, where $\theta \in (0, +\infty)$ is an unknown parameter, and let $X_1, X_2, \cdots, X_n$ be a simple random sample from $X$. Define
$$X_{(n)} = \max\{X_1, X_2, \cdots, X_n\}, \quad T_c = cX_{(n)}.$$
(1) Find $c$ such that $T_c$ is an unbiased estimator of $\theta$;
(2) Let $h(c) = E(T_c - \theta)^2$. Find the value of $c$ that minimizes $h(c)$.
—————— Solution —————
【Solution 1】
First, since the $X_i \sim \operatorname{U}(0, \theta)$ are independent, let $Y = X_{(n)}$ denote the sample maximum. Its cumulative distribution function is
$$F_Y(y) = P(X_1 \le y, \cdots, X_n \le y) = \left(\frac{y}{\theta}\right)^n, \quad 0\le y\le\theta,$$
so its probability density function is
$$f_Y(y) = \frac{d}{dy}F_Y(y) = \frac{n}{\theta^n}y^{n-1}, \quad 0\le y\le\theta.$$
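The closed-form CDF of the sample maximum can be sanity-checked by simulation. The following is a minimal sketch using only the Python standard library; the function names, the seed, and the trial count are illustrative choices and not part of the original solution.

```python
import random

def empirical_cdf_of_max(n, theta, y, trials=20000, seed=0):
    """Estimate P(max(X_1..X_n) <= y) for X_i ~ U(0, theta) by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample_max = max(rng.uniform(0.0, theta) for _ in range(n))
        if sample_max <= y:
            hits += 1
    return hits / trials

def theoretical_cdf_of_max(n, theta, y):
    """Closed form (y / theta)^n, valid for 0 <= y <= theta."""
    return (y / theta) ** n

if __name__ == "__main__":
    n, theta, y = 5, 2.0, 1.8
    print(empirical_cdf_of_max(n, theta, y), theoretical_cdf_of_max(n, theta, y))
```

With a large number of trials the two printed values should agree to roughly two decimal places; the agreement is statistical, not exact.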
【(1) Finding the unbiased estimator】
Compute the mean of $Y$:
$$E(Y) = \int_0^\theta y \frac{n}{\theta^n}y^{n-1}\,dy = \frac{n}{\theta^n}\int_0^\theta y^n\,dy = \frac{n}{\theta^n} \cdot \frac{\theta^{n+1}}{n+1} = \frac{n}{n+1}\theta.$$
For the estimator $T_c = cY$ to be unbiased, it must satisfy
$$E(T_c)= cE(Y)= c\frac{n}{n+1}\theta = \theta,$$
which gives
$$c = \frac{n+1}{n}.$$
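The unbiasedness condition can be checked with exact rational arithmetic. This sketch relies on the moment formula $E(Y^k) = \frac{n}{n+k}\theta^k$, which follows from the same integral as above; the helper names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def max_moment_coeff(n, k):
    """Coefficient a in E(Y^k) = a * theta^k for Y = max of n U(0, theta) draws."""
    return Fraction(n, n + k)

def unbiased_scale(n):
    """Scale c solving c * E(Y) = theta, i.e. c = (n + 1) / n."""
    return 1 / max_moment_coeff(n, 1)

if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        c = unbiased_scale(n)
        # c * E(Y) / theta should be exactly 1 for every n.
        print(n, c, c * max_moment_coeff(n, 1))
```

Because $\theta$ cancels from both sides of the condition, the check needs only the coefficient $\frac{n}{n+1}$ and not a specific value of $\theta$.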
【(2) Minimizing the mean squared error】
Expand the mean squared error:
$$h(c) = E[(cY-\theta)^2] = c^2E(Y^2) - 2c\theta E(Y) + \theta^2.$$
In the same way,
$$E(Y^2) = \int_0^\theta y^2\frac{n}{\theta^n}y^{n-1}\,dy = \frac{n}{\theta^n}\int_0^\theta y^{n+1}\,dy = \frac{n}{\theta^n} \cdot \frac{\theta^{n+2}}{n+2} = \frac{n}{n+2}\theta^2.$$
Substituting into the expression above gives
$$h(c) = c^2\frac{n}{n+2}\theta^2 - 2c\frac{n}{n+1}\theta^2 + \theta^2 = \theta^2\left(c^2\frac{n}{n+2} - 2c\frac{n}{n+1} + 1\right).$$
To find the extremum in $c$, differentiate with respect to $c$ and set the derivative to zero:
$$\frac{d}{dc} \left(c^2\frac{n}{n+2} - 2c\frac{n}{n+1} + 1\right) = 2c\frac{n}{n+2} - 2\frac{n}{n+1} = 0.$$
Solving gives
$$c = \frac{n+2}{n+1}.$$
Since the coefficient of $c^2$, namely $\frac{n}{n+2}$, is positive, this stationary point is the unique minimum of $h(c)$.
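To see how much the minimizing choice of $c$ gains over the unbiased one, the bracketed factor $h(c)/\theta^2$ can be evaluated exactly at both values. This is a small sketch with hypothetical helper names; it only re-evaluates the formula derived above and is not taken from the original solution.

```python
from fractions import Fraction

def mse_factor(n, c):
    """Return h(c) / theta^2 = c^2 * n/(n+2) - 2c * n/(n+1) + 1 exactly."""
    c = Fraction(c)
    return c * c * Fraction(n, n + 2) - 2 * c * Fraction(n, n + 1) + 1

def compare_estimators(n):
    """Return (h/theta^2 at the unbiased c, h/theta^2 at the MSE-minimizing c)."""
    c_unbiased = Fraction(n + 1, n)
    c_min_mse = Fraction(n + 2, n + 1)
    return mse_factor(n, c_unbiased), mse_factor(n, c_min_mse)

if __name__ == "__main__":
    for n in (1, 3, 10):
        unbiased, best = compare_estimators(n)
        print(n, unbiased, best)
```

Substituting by hand gives $h = \frac{\theta^2}{n(n+2)}$ at $c = \frac{n+1}{n}$ and $h = \frac{\theta^2}{(n+1)^2}$ at $c = \frac{n+2}{n+1}$, so the slightly biased estimator always has the smaller mean squared error, although the gap shrinks as $n$ grows.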
—————— Conclusion —————
(1) When $c=\frac{n+1}{n}$, $T_c$ is an unbiased estimator of $\theta$;
(2) When $c=\frac{n+2}{n+1}$, the mean squared error $h(c)$ attains its minimum.
【Solution 2】
Note that the sample maximum $Y$ has probability density function
$$f_Y(y)= \frac{n}{\theta^n}y^{n-1},\quad 0\le y\le \theta,$$
so $Y$ can be written as
$$Y=\theta Z,$$
where $Z\sim \mathrm{Beta}(n,1)$ has density $nz^{n-1}$ on $[0,1]$. Therefore
$$E(Z)=\frac{n}{n+1},\quad E(Z^2)=\frac{n}{n+2}.$$
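The two Beta moments can be checked numerically with the standard library's `random.betavariate`. This is a rough Monte Carlo sketch; the function name, seed, and trial count are assumptions made for illustration, and the agreement is only approximate.

```python
import random

def beta_moments_by_simulation(n, trials=20000, seed=1):
    """Estimate E(Z) and E(Z^2) for Z ~ Beta(n, 1) from simulated draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    total_sq = 0.0
    for _ in range(trials):
        z = rng.betavariate(n, 1)
        total += z
        total_sq += z * z
    return total / trials, total_sq / trials

if __name__ == "__main__":
    n = 4
    mean, second = beta_moments_by_simulation(n)
    # Compare against the closed forms n/(n+1) and n/(n+2).
    print(mean, n / (n + 1))
    print(second, n / (n + 2))
```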
【(1) Finding the unbiased estimator】
Write the estimator as $T_c=cY=c\theta Z$. Its expectation is
$$E(T_c)=c\theta E(Z)=c\theta\frac{n}{n+1}.$$
Setting $E(T_c)=\theta$ requires
$$c\frac{n}{n+1}=1\quad\Longrightarrow\quad c=\frac{n+1}{n}.$$
【(2) Minimizing the mean squared error】
The goal is to minimize the mean squared error
$$h(c)=E[(cY-\theta)^2]=\theta^2E[(cZ-1)^2].$$
Expanding the square gives
$$h(c)=\theta^2\left(c^2E(Z^2)-2cE(Z)+1\right) =\theta^2\left(c^2\frac{n}{n+2}-2c\frac{n}{n+1}+1\right).$$
We now complete the square in the bracketed expression. Let
$$Q(c)=c^2\frac{n}{n+2}-2c\frac{n}{n+1}+1.$$
First, factor out the coefficient of the quadratic term:
$$Q(c)=\frac{n}{n+2}\left[c^2-2c\left(\frac{n+2}{n+1}\right)\right]+1.$$
To complete the square, add and subtract $\left(\frac{n+2}{n+1}\right)^2$ inside the brackets:
$$c^2-2c\left(\frac{n+2}{n+1}\right) =\left(c-\frac{n+2}{n+1}\right)^2-\left(\frac{n+2}{n+1}\right)^2.$$
Therefore,
$$Q(c)=\frac{n}{n+2}\left[\left(c-\frac{n+2}{n+1}\right)^2-\left(\frac{n+2}{n+1}\right)^2\right]+1,$$
that is,
$$Q(c)=\frac{n}{n+2}\left(c-\frac{n+2}{n+1}\right)^2 +\left[1-\frac{n}{n+2}\left(\frac{n+2}{n+1}\right)^2\right].$$
The first term is nonnegative and vanishes only at
$$c=\frac{n+2}{n+1},$$
so $Q(c)$ attains its minimum if and only if $c$ takes this value, and the mean squared error $h(c)=\theta^2 Q(c)$ is minimized there as well.
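The completed-square form can be verified against the expanded form with exact fractions. The sketch below uses hypothetical function names and simply evaluates both expressions for $Q(c)$ at the same inputs; it is a consistency check rather than a proof.

```python
from fractions import Fraction

def q_expanded(n, c):
    """Q(c) = c^2 * n/(n+2) - 2c * n/(n+1) + 1, evaluated exactly."""
    c = Fraction(c)
    return c * c * Fraction(n, n + 2) - 2 * c * Fraction(n, n + 1) + 1

def q_completed(n, c):
    """Same Q(c) written as a * (c - vertex)^2 + constant."""
    c = Fraction(c)
    a = Fraction(n, n + 2)
    vertex = Fraction(n + 2, n + 1)
    constant = 1 - a * vertex ** 2
    return a * (c - vertex) ** 2 + constant

if __name__ == "__main__":
    n = 3
    for c in (Fraction(1), Fraction(5, 4), Fraction(4, 3), Fraction(2)):
        # The two columns should match for every c.
        print(c, q_expanded(n, c), q_completed(n, c))
```

The constant term simplifies to $1-\frac{n(n+2)}{(n+1)^2}=\frac{1}{(n+1)^2}$, which is the minimum value of $Q(c)$.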
—————— Conclusion —————
(1) When $c=\frac{n+1}{n}$, $T_c=cY$ is an unbiased estimator of $\theta$;
(2) When $c=\frac{n+2}{n+1}$, $T_c=cY$ attains the minimum mean squared error.