These are reading notes for *Introduction to Probability*.
Convergence in Probability
- We can interpret the weak law of large numbers as stating that "$M_n$ converges to $\mu$". However, since $M_1, M_2, \ldots$ is a sequence of random variables, not a sequence of numbers, the meaning of convergence has to be made precise.
- The definition is as follows: a sequence $Y_n$ (not necessarily independent) converges to a number $a$ in probability if, for every $\epsilon>0$, we have $\lim_{n\rightarrow\infty}P(|Y_n-a|\geq\epsilon)=0$.
- Given this definition, the weak law of large numbers simply states that the sample mean converges in probability to the true mean $\mu$.
- More generally, the Chebyshev inequality implies that if all $Y_n$ have the same mean $\mu$, and $\mathrm{var}(Y_n)$ converges to 0, then $Y_n$ converges to $\mu$ in probability.
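- As a quick numerical sketch (my own, not from the book): for the sample mean $M_n$ of $n$ uniform$(0,1)$ samples, $\mathrm{var}(M_n)=(1/12)/n\rightarrow0$, so by the above the tail probability $P(|M_n-\mu|\geq\epsilon)$ should shrink as $n$ grows, and it should stay below the Chebyshev bound $\mathrm{var}(M_n)/\epsilon^2$.

```python
import random

# Sketch (not from the book): Monte Carlo estimate of P(|M_n - mu| >= eps)
# for the sample mean M_n of n uniform(0, 1) samples, versus the Chebyshev
# bound var(M_n) / eps^2 = (1/12) / (n * eps^2).
def tail_probability(n, eps, trials=5_000, seed=0):
    rng = random.Random(seed)
    mu = 0.5
    hits = sum(
        1 for _ in range(trials)
        if abs(sum(rng.random() for _ in range(n)) / n - mu) >= eps
    )
    return hits / trials

if __name__ == "__main__":
    for n in (10, 100, 1000):
        bound = (1 / 12) / (n * 0.05 ** 2)
        print(n, tail_probability(n, 0.05), min(bound, 1.0))
```

For small $n$ the Chebyshev bound exceeds 1 and is uninformative, but the estimated tail itself visibly decays toward 0 as $n$ grows.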
- If the random variables $Y_1, Y_2, \ldots$ have a PMF or a PDF and converge in probability to $a$, then according to the above definition, almost all of the PMF or PDF of $Y_n$ is concentrated within $\epsilon$ of $a$ for large values of $n$. It is also instructive to rephrase the above definition as follows: for every $\epsilon > 0$, and for every $\delta > 0$, there exists some $n_0$ such that
$$P(|Y_n-a|\geq\epsilon)\leq\delta\qquad\text{for all } n\geq n_0$$
- If we refer to $\epsilon$ as the accuracy level, and $\delta$ as the confidence level, the definition takes the following intuitive form: for any given levels of accuracy and confidence, $Y_n$ will be equal to $a$, within these levels of accuracy and confidence, provided that $n$ is large enough.
Example 5.6.
- Consider a sequence of independent random variables $X_n$ that are uniformly distributed in the interval $[0, 1]$, and let
$$Y_n=\min\{X_1,\ldots,X_n\}$$
- The event $\{|Y_n-0|\geq\epsilon\}$ occurs exactly when every one of $X_1,\ldots,X_n$ is at least $\epsilon$, so $P(|Y_n-0|\geq\epsilon)=(1-\epsilon)^n$. In particular,
$$\lim_{n\rightarrow\infty}P(|Y_n-0|\geq\epsilon)=\lim_{n\rightarrow\infty}(1-\epsilon)^n=0$$
Since this is true for every $\epsilon > 0$, we conclude that $Y_n$ converges to zero in probability.
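- A small simulation (my own sketch) to check the formula: the empirical frequency of $\{Y_n\geq\epsilon\}$ should track $(1-\epsilon)^n$.

```python
import random

# Sketch (not from the book): Y_n = min(X_1, ..., X_n) with X_i ~ uniform[0, 1].
# The tail P(|Y_n - 0| >= eps) equals (1 - eps)^n exactly, so the Monte Carlo
# frequency should match that formula.
def min_tail(n, eps, trials=20_000, seed=1):
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials)
        if min(rng.random() for _ in range(n)) >= eps
    )
    return hits / trials

if __name__ == "__main__":
    for n in (5, 20, 80):
        print(n, min_tail(n, 0.1), (1 - 0.1) ** n)
```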
- One might be tempted to believe that if a sequence $Y_n$ converges to a number $a$, then $E[Y_n]$ must converge to $a$.
- The following example shows that this need not be the case, and illustrates some of the limitations of the notion of convergence in probability.
Example 5.8.
- Consider a sequence of discrete random variables $Y_n$ with the following distribution:
$$P(Y_n=y)=\begin{cases}1-\dfrac{1}{n}, & \text{if }y=0,\\[4pt] \dfrac{1}{n}, & \text{if }y=n^2,\\[4pt] 0, & \text{otherwise.}\end{cases}$$
- For every $\epsilon > 0$, we have
$$\lim_{n\rightarrow\infty}P(|Y_n-0|\geq\epsilon)=\lim_{n\rightarrow\infty}\frac{1}{n}=0$$
and $Y_n$ converges to zero in probability.
- On the other hand,
$$E[Y_n] = n^2/n = n$$
which goes to infinity as $n$ increases.
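- Both limits can be checked exactly (a sketch of mine, using the distribution above): the tail probability is $1/n$ for any $0<\epsilon\leq n^2$, while the mean is $n$.

```python
from fractions import Fraction

# Sketch: exact tail and mean for Example 5.8, where
# P(Y_n = n^2) = 1/n and P(Y_n = 0) = 1 - 1/n.
def tail_and_mean(n):
    p_big = Fraction(1, n)        # probability of the large value n^2
    tail = p_big                  # P(|Y_n - 0| >= eps) for any 0 < eps <= n^2
    mean = p_big * n ** 2         # E[Y_n] = n^2 / n = n
    return tail, mean

if __name__ == "__main__":
    for n in (10, 100, 1000):
        tail, mean = tail_and_mean(n)
        print(n, float(tail), float(mean))  # tail -> 0 while mean grows
```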
Problem 5.
Let $X_1, X_2, \ldots$ be independent random variables that are uniformly distributed over $[-1, 1]$. Show that the sequence $Y_1, Y_2, \ldots$ converges in probability to some limit, and identify the limit, where
$$Y_n=X_1\cdot X_2\cdots X_n$$
SOLUTION
- We have
$$E[Y_n] = E[X_1]\cdots E[X_n] = 0$$
Also,
$$\mathrm{var}(Y_n) = E[Y_n^2]= E[X_1^2]\cdots E[X_n^2]= \mathrm{var}(X_1)^n =\left(\frac{4}{12}\right)^n$$
so $\mathrm{var}(Y_n)\rightarrow 0$. Since all $Y_n$ have 0 as a common mean, from Chebyshev's inequality it follows that $Y_n$ converges to 0 in probability.
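- A quick simulation (my own sketch) of the product sequence: since $\mathrm{var}(Y_n)=(1/3)^n\rightarrow0$, the empirical tail $P(|Y_n|\geq\epsilon)$ should collapse to 0 rapidly, and it should stay below the Chebyshev bound $(1/3)^n/\epsilon^2$.

```python
import random

# Sketch (not from the book): Y_n = X_1 * ... * X_n with X_i ~ uniform[-1, 1].
# E[Y_n] = 0 and var(Y_n) = (1/3)^n, so Y_n should concentrate near 0.
def product_tail(n, eps, trials=20_000, seed=2):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        y = 1.0
        for _ in range(n):
            y *= rng.uniform(-1.0, 1.0)
        if abs(y) >= eps:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    for n in (2, 5, 15):
        chebyshev = (1 / 3) ** n / 0.1 ** 2   # var(Y_n) / eps^2
        print(n, product_tail(n, 0.1), min(chebyshev, 1.0))
```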
Problem 6.
Consider two sequences of random variables $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$, which converge in probability to some constants. Let $c$ be another constant. Show that $cX_n$, $X_n + Y_n$, $\max\{0, X_n\}$, $|X_n|$, and $X_nY_n$ all converge in probability to corresponding limits.
SOLUTION
- Let $x$ and $y$ be the limits of $X_n$ and $Y_n$, respectively. Fix some $\epsilon > 0$ and a constant $c$. If $c = 0$, then $cX_n$ equals zero for all $n$, and convergence trivially holds. If $c\neq0$, we observe that $P(|cX_n-cx|\geq \epsilon)=P(|X_n-x|\geq \epsilon/|c|)$, which converges to zero, thus establishing convergence in probability of $cX_n$.
- We note that the event $\{|X_n+Y_n-x-y|\geq\epsilon\}$ implies $\{|X_n-x|\geq\epsilon/2\}$ or $\{|Y_n-y|\geq\epsilon/2\}$, so
$$P(|X_n + Y_n-x-y|\geq\epsilon)\leq P(|X_n-x|\geq \epsilon/2)+P(|Y_n-y|\geq \epsilon/2)$$
Therefore,
$$\lim_{n\rightarrow\infty}P(|X_n + Y_n-x-y|\geq\epsilon)\leq \lim_{n\rightarrow\infty}P(|X_n-x|\geq \epsilon/2)+\lim_{n\rightarrow\infty}P(|Y_n-y|\geq \epsilon/2)=0$$
- For $\max\{0,X_n\}$, we condition on the sign of $xX_n$:
$$\begin{aligned} &\lim_{n\rightarrow\infty}P(|\max\{0,X_n\}-\max\{0,x\}|\geq\epsilon) \\ =&\lim_{n\rightarrow\infty}P(|\max\{0,X_n\}-\max\{0,x\}|\geq\epsilon\mid xX_n\geq0)P(xX_n\geq0)\\ &+\lim_{n\rightarrow\infty}P(|\max\{0,X_n\}-\max\{0,x\}|\geq\epsilon\mid xX_n<0)P(xX_n<0) \\ =&\lim_{n\rightarrow\infty}P(|\max\{0,X_n\}-\max\{0,x\}|\geq\epsilon\mid xX_n\geq0)\end{aligned}$$
where the second term vanishes because $xX_n<0$ requires $|X_n-x|\geq|x|$, so $P(xX_n<0)\rightarrow0$ when $x\neq0$ (and $P(xX_n<0)=0$ when $x=0$).
- If $x>0$, then given $xX_n\geq0$ we have $X_n\geq0$, so $\max\{0,X_n\}=X_n$ and $\max\{0,x\}=x$, and
$$\lim_{n\rightarrow\infty}P(|\max\{0,X_n\}-\max\{0,x\}|\geq\epsilon)=\lim_{n\rightarrow\infty}P(|X_n-x|\geq\epsilon\mid xX_n\geq0)=0$$
- If $x<0$, then given $xX_n\geq0$ we have $X_n\leq0$, so $\max\{0,X_n\}=\max\{0,x\}=0$, and
$$\lim_{n\rightarrow\infty}P(|\max\{0,X_n\}-\max\{0,x\}|\geq\epsilon)=\lim_{n\rightarrow\infty}P(0\geq\epsilon\mid xX_n\geq0)=0$$
- If $x=0$, then $\{|\max\{0,X_n\}|\geq\epsilon\}\subseteq\{|X_n|\geq\epsilon\}$, whose probability converges to zero.
- Thus, in all cases,
$$\lim_{n\rightarrow\infty}P(|\max\{0,X_n\}-\max\{0,x\}|\geq\epsilon)=0$$
- We have $|X_n| = \max\{0, X_n\}+\max\{0, -X_n\}$. Since $-X_n$ converges to $-x$ in probability, both $\max\{0, X_n\}$ and $\max\{0, -X_n\}$ converge by the previous parts, and it follows that their sum, $|X_n|$, converges to $\max\{0, x\}+\max\{0, -x\}=|x|$ in probability.
- Finally, we have
$$\begin{aligned}P(|X_nY_n-xy|\geq\epsilon)&=P(|(X_n-x)(Y_n-y)+xY_n+yX_n-2xy|\geq\epsilon) \\&\leq P(|(X_n-x)(Y_n-y)|\geq\epsilon/2)+P(|xY_n+yX_n-2xy|\geq\epsilon/2) \\&\leq P(|X_n-x|\geq\sqrt{\epsilon/2})+P(|Y_n-y|\geq\sqrt{\epsilon/2})+P(|xY_n+yX_n-2xy|\geq\epsilon/2)\end{aligned}$$
where the last step uses the fact that $|(X_n-x)(Y_n-y)|\geq\epsilon/2$ implies $|X_n-x|\geq\sqrt{\epsilon/2}$ or $|Y_n-y|\geq\sqrt{\epsilon/2}$. Since $xY_n$ and $yX_n$ both converge in probability to $xy$, the sum $xY_n+yX_n$ converges to $2xy$, and the last probability converges to 0. The first two probabilities converge to 0 as well, and we conclude that
$$\lim_{n\rightarrow\infty}P(|X_nY_n-xy|\geq\epsilon)=0$$
so $X_nY_n$ converges to $xy$ in probability.
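- A numerical sanity check of all five claims (my own sketch, with an assumed toy setup): take $X_n=1+U_n/n$ and $Y_n=-2+V_n/n$ with $U_n,V_n$ uniform on $[-1,1]$, so $X_n\rightarrow1$ and $Y_n\rightarrow-2$ in probability; for moderate $n$, each transformed sequence should already sit within $\epsilon$ of its claimed limit.

```python
import random

# Sketch (assumed toy setup, not from the book): X_n = 1 + U_n/n -> 1 and
# Y_n = -2 + V_n/n -> -2 in probability, with U_n, V_n ~ uniform[-1, 1].
# Each transformed sequence should converge to the corresponding limit.
def empirical_tail(f, limit, n, eps=0.05, trials=5_000, seed=3):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = 1.0 + rng.uniform(-1.0, 1.0) / n
        y = -2.0 + rng.uniform(-1.0, 1.0) / n
        if abs(f(x, y) - limit) >= eps:
            hits += 1
    return hits / trials

checks = [
    (lambda x, y: 3 * x, 3.0),        # c X_n with c = 3, limit c x
    (lambda x, y: x + y, -1.0),       # X_n + Y_n, limit x + y
    (lambda x, y: max(0.0, x), 1.0),  # max{0, X_n}, limit max{0, x}
    (lambda x, y: abs(x), 1.0),       # |X_n|, limit |x|
    (lambda x, y: x * y, -2.0),       # X_n Y_n, limit x y
]

if __name__ == "__main__":
    for f, limit in checks:
        print(limit, empirical_tail(f, limit, n=100))  # every tail is 0 here
```

At $n=100$ the deviations are bounded deterministically (e.g. $|3X_n-3|\leq0.03$, $|X_nY_n+2|\leq0.0301$), all below $\epsilon=0.05$, so every estimated tail is exactly zero.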
Problem 7.
A sequence $X_n$ of random variables is said to converge to a number $c$ in the mean square if
$$\lim_{n\rightarrow\infty}E[(X_n-c)^2]=0$$
- (a) Show that convergence in the mean square implies convergence in probability.
- (b) Give an example that shows that convergence in probability does not imply convergence in the mean square.
SOLUTION
- (a) Suppose that $X_n$ converges to $c$ in the mean square. Using the Markov inequality, we have
$$P(|X_n-c|\geq\epsilon)=P((X_n-c)^2\geq\epsilon^2)\leq\frac{E[(X_n-c)^2]}{\epsilon^2}$$
Taking the limit as $n\rightarrow\infty$, we obtain
$$\lim_{n\rightarrow\infty}P(|X_n-c|\geq\epsilon)=0$$
- (b) In Example 5.8, we have convergence in probability to 0, but $E[Y_n^2] = n^4/n = n^3$, which diverges to infinity, so there is no convergence in the mean square.
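- Part (b) can be checked exactly (my sketch, reusing the distribution of Example 5.8): the tail $P(|Y_n|\geq\epsilon)=1/n$ vanishes, yet the second moment $E[(Y_n-0)^2]=n^4/n=n^3$ blows up.

```python
from fractions import Fraction

# Sketch: exact computation for Example 5.8, where P(Y_n = n^2) = 1/n and
# P(Y_n = 0) = 1 - 1/n.  Convergence in probability holds (tail = 1/n -> 0),
# but E[(Y_n - 0)^2] = n^4 / n = n^3 diverges: no mean square convergence.
def tail_and_second_moment(n):
    tail = Fraction(1, n)                    # P(|Y_n - 0| >= eps), eps <= n^2
    second_moment = Fraction(1, n) * n ** 4  # E[Y_n^2]
    return tail, second_moment

if __name__ == "__main__":
    for n in (10, 100):
        print(n, tail_and_second_moment(n))
```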