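0x00 Preface
For the last course of my student career I picked the one rumored to be the hardest, and it really did turn out to be brutally hard; I still don't fully understand it...
So I decided to organize the material from the slides and my lecture notes as a way of reviewing.
(In other words, I may not fully understand everything written here; the doubtful parts will be revised after I have asked about them.)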
Useful Inequalities
Randomized algorithms rely on a large number of inequalities. To be able to recall them when needed, some are worth memorizing.
0x01 Union Bound
Randomized Algorithm - Chapter 3.2 (P45)
The sum of the individual probabilities of $n$ random events is at least the probability that at least one of them occurs.
Let $E_i$ be a random event, then we have
$$\Pr\left[\cup_{i=1}^{n}E_i\right] \le \sum_{i=1}^{n}\Pr(E_i)$$
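As a sanity check of the Union Bound, here is a minimal sketch on a toy sample space (two fair dice). The three events and the helper `prob` are illustrative assumptions, not part of the lecture material.

```python
from fractions import Fraction
from itertools import product

# Sample space: all outcomes of two fair six-sided dice (toy example).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

# Three overlapping events E_1, E_2, E_3 (chosen arbitrarily).
events = [
    lambda o: o[0] == 6,          # first die shows 6
    lambda o: o[1] == 6,          # second die shows 6
    lambda o: o[0] + o[1] >= 10,  # the sum is at least 10
]

p_union = prob(lambda o: any(e(o) for e in events))  # Pr[E_1 or E_2 or E_3]
p_sum = sum(prob(e) for e in events)                 # Pr[E_1] + Pr[E_2] + Pr[E_3]

print(p_union, p_sum)
```

Because the events overlap, the sum over-counts shared outcomes, which is exactly why the bound is an inequality rather than an equality.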
0x02 Markov’s Inequality
Let $Y$ be a random variable assuming only non-negative values. Then
$$\text{for all } t>0,~\Pr[Y \ge t]\le \frac{E[Y]}{t}$$
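To see how loose or tight Markov's inequality is, the sketch below compares the exact tail of a fair die with the bound $E[Y]/t$. The die is an assumed toy distribution, and `tail` and `markov_bound` are hypothetical helper names.

```python
from fractions import Fraction

# Y: a fair six-sided die, a non-negative random variable (toy example).
values = range(1, 7)
p = Fraction(1, 6)
mean = sum(v * p for v in values)  # E[Y] = 7/2

def tail(t):
    """Exact Pr[Y >= t]."""
    return sum(p for v in values if v >= t)

def markov_bound(t):
    """Markov's upper bound E[Y] / t."""
    return mean / t

for t in range(1, 7):
    print(t, tail(t), markov_bound(t))
```

For small $t$ the bound exceeds 1 and says nothing; it only becomes informative once $t > E[Y]$.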
0x03 Chebyshev’s Inequality
Let $X$ be a random variable with expectation $\mu_X$ and standard deviation $\sigma_X$, then
$$\text{for any }t>0,~\Pr[|X-\mu_X|\ge t\sigma_X] \le \frac{1}{t^2}$$
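A small exact check of Chebyshev's inequality, assuming $X \sim \mathrm{Binomial}(20, 1/2)$ as an illustrative distribution. The event is tested in squared form, $(X-\mu_X)^2 \ge t^2\sigma_X^2$, so that everything stays in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb

n = 20  # assumed toy parameter: X ~ Binomial(20, 1/2)
pmf = {k: Fraction(comb(n, k), 2**n) for k in range(n + 1)}
mu = sum(k * q for k, q in pmf.items())               # expectation, equals 10
var = sum((k - mu) ** 2 * q for k, q in pmf.items())  # variance, equals 5

def deviation_prob(t):
    """Exact Pr[|X - mu| >= t * sigma], tested as (X - mu)^2 >= t^2 * var."""
    return sum(q for k, q in pmf.items() if (k - mu) ** 2 >= t * t * var)

for t in (1, 2, 3):
    print(t, float(deviation_prob(t)), 1 / t**2)
```

For this distribution the true tail at $t=2$ is about 0.041 against a bound of 0.25, so Chebyshev holds but is far from tight.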
0x04 Chernoff’s Bound
Randomized Algorithm - Chapter 4.1 (P67)
The Chernoff bound comes in three forms, all stated for a sum of independent Poisson trials.
Let $X_1, X_2, \cdots, X_n$ be independent Poisson trials such that, for $1 \le i \le n$, $\Pr[X_i=1]=p_i$, where $0<p_i<1$. Then
Chernoff’s Bound(1)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } \delta>0,$$
$$\Pr[X>(1+\delta)\mu]<\left[ \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right]^{\mu}$$
Chernoff’s Bound(2)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$\Pr[X<(1-\delta)\mu]<\left[ \frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}} \right]^{\mu}$$
Chernoff’s Bound(3)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$\Pr[|X-\mu| >\delta\mu]<2e^{-\frac{\delta^2}{3}\mu}$$
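The three forms can be compared against exact tail probabilities. The sketch below assumes the special case of identical trials, $X \sim \mathrm{Binomial}(n, p)$, with hypothetical parameters $n=100$, $p=0.5$, $\delta=0.5$; all function names are illustrative.

```python
from math import comb, exp

def binom_pmf(n, p, k):
    """Pr[X = k] for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_tail(n, p, delta):
    """Exact Pr[X > (1 + delta) * mu]."""
    mu = n * p
    return sum(binom_pmf(n, p, k) for k in range(n + 1) if k > (1 + delta) * mu)

def lower_tail(n, p, delta):
    """Exact Pr[X < (1 - delta) * mu]."""
    mu = n * p
    return sum(binom_pmf(n, p, k) for k in range(n + 1) if k < (1 - delta) * mu)

def chernoff_upper(mu, delta):
    """Bound (1): (e^delta / (1+delta)^(1+delta))^mu."""
    return (exp(delta) / (1 + delta) ** (1 + delta)) ** mu

def chernoff_lower(mu, delta):
    """Bound (2): (e^-delta / (1-delta)^(1-delta))^mu."""
    return (exp(-delta) / (1 - delta) ** (1 - delta)) ** mu

def chernoff_two_sided(mu, delta):
    """Bound (3): 2 * exp(-delta^2 * mu / 3)."""
    return 2 * exp(-delta**2 * mu / 3)

n, p, delta = 100, 0.5, 0.5  # assumed example parameters
mu = n * p
print(upper_tail(n, p, delta), chernoff_upper(mu, delta))
print(lower_tail(n, p, delta), chernoff_lower(mu, delta))
print(upper_tail(n, p, delta) + lower_tail(n, p, delta), chernoff_two_sided(mu, delta))
```

Form (3) is the weakest of the three but the easiest to manipulate, which is why it tends to be the one used in later calculations.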
0x05 Proofs in detail
Chebyshev’s Inequality in 0x03
Let $X$ be a random variable with expectation $\mu_X$ and standard deviation $\sigma_X$, then
$$\text{for any }t>0,~\Pr[|X-\mu_X|\ge t\sigma_X] \le \frac{1}{t^2}$$
Proof: square both sides of the event and apply Markov's inequality (0x02) to the non-negative variable $Y$:
$$\begin{aligned}
&\Pr \left( |X-\mu_X| \ge t\sigma_X \right) = \Pr \left( (X-\mu_X)^2 \ge (t\sigma_X)^2 \right) \\
&\textbf{set } Y \triangleq (X-\mu_X)^2 \ge 0 \\
&\Pr \left( Y \ge (t\sigma_X)^2 \right) \le \frac{E(Y)}{(t\sigma_X)^2} \\
&\because E(Y) = E\left( (X-\mu_X)^2 \right) = \sigma_X^2 \\
&\therefore \Pr \left( Y \ge (t\sigma_X)^2 \right) \le \frac{\sigma_X^2}{(t\sigma_X)^2} = \frac{1}{t^2}
\end{aligned}$$
Chernoff’s Bound in 0x04
Let $X_1, X_2, \cdots, X_n$ be independent Poisson trials such that, for $1 \le i \le n$, $\Pr[X_i=1]=p_i$, where $0<p_i<1$. Then
Chernoff’s Bound(1)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } \delta>0,$$
$$\Pr[X>(1+\delta)\mu]<\left[ \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right]^{\mu}$$
For the random variables (R.V.) $X_1, X_2, \cdots, X_n$:
$$\begin{aligned}
& \Pr(X_i=1) = p_i,~\Pr(X_i=0) = 1-p_i \\
& \mu = \sum_{i=1}^{n}p_i,~X = \sum_{i=1}^{n}X_i,~E(X)=\mu
\end{aligned}$$
Applying Markov's inequality to $X$ directly only gives the weak bound
$$\Pr(X>(1+\delta)\mu) \le \frac{E(X)}{(1+\delta)\mu} = \frac{1}{1+\delta}$$
Instead, for any $\lambda > 0$, apply it to $e^{\lambda X}$:
$$\Pr(X>(1+\delta)\mu) = \Pr(e^{\lambda X}>e^{\lambda(1+\delta)\mu}) \le \frac{E(e^{\lambda X})}{e^{\lambda(1+\delta)\mu}}\le \frac{e^{\mu(e^{\lambda}-1)}}{e^{\lambda(1+\delta)\mu}}$$
The last step uses independence and $1+y \le e^{y}$: $E(e^{\lambda X}) = \prod_{i=1}^{n}\left(1+p_i(e^{\lambda}-1)\right) \le e^{\mu(e^{\lambda}-1)}$.
Setting $\lambda = \ln(1+\delta)$ turns the right-hand side into $\left( \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right)^{\mu}$, which proves the bound.
Chernoff’s Bound(2)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$\Pr[X<(1-\delta)\mu]<\left[ \frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}} \right]^{\mu}$$
For any $\lambda > 0$, Markov's inequality applied to $e^{-\lambda X}$ gives $\Pr[X<(1-\delta)\mu] = \Pr[e^{-\lambda X} > e^{-\lambda(1-\delta)\mu}] \le E(e^{-\lambda X})/e^{-\lambda(1-\delta)\mu}$, where (using independence and $1+y \le e^{y}$):
$$\begin{aligned}
E(e^{-\lambda X}) &= E(e^{-\lambda(\sum_{i=1}^{n}X_i)}) \\
&= E(\prod_{i=1}^{n} e^{-\lambda X_i}) = \prod_{i=1}^{n}E(e^{-\lambda X_i}) \\
&= \prod_{i=1}^{n}(p_i \cdot e^{-\lambda} + (1-p_i)) \\
&= \prod_{i=1}^{n}( 1 + p_i (e^{-\lambda}-1)) \\
&\le e^{\mu(e^{-\lambda}-1)}
\end{aligned}$$
Substituting into the original expression:
$$\begin{aligned}
\Pr[X < (1-\delta)\mu] &\le \frac{E(e^{-\lambda X})}{e^{-\lambda (1-\delta) \mu}} \\
&\le \frac{e^{\mu(e^{-\lambda}-1)}}{e^{-\lambda (1-\delta) \mu}} \\
&= e^{\mu(e^{-\lambda}-1+\lambda-\lambda\delta)}
\end{aligned}$$
Let $f(\lambda) = e^{-\lambda}-1+\lambda-\lambda\delta$. Setting $f'(\lambda) = -e^{-\lambda} + 1 - \delta = 0$ gives $\lambda = -\ln (1-\delta)$, the minimizer of $f$.
Hence
$$\Pr[X<(1-\delta)\mu] < e^{\mu f(-\ln(1-\delta))} = \left( \frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}} \right)^{\mu}$$
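The choice $\lambda = -\ln(1-\delta)$ can be checked numerically. The following is a rough sketch that grid-searches the exponent $f(\lambda)$ and compares the result with the closed form; the value $\delta = 0.3$, the grid resolution, and the helper names are assumptions made for illustration.

```python
from math import exp, log

def f(lam, delta):
    """Exponent from the proof: e^{-lam} - 1 + lam - lam * delta."""
    return exp(-lam) - 1 + lam - lam * delta

def argmin_on_grid(delta, steps=100_000, lam_max=5.0):
    """Grid search for the minimiser of f(., delta) on (0, lam_max]."""
    best_lam, best_val = None, float("inf")
    for i in range(1, steps + 1):
        lam = lam_max * i / steps
        val = f(lam, delta)
        if val < best_val:
            best_lam, best_val = lam, val
    return best_lam, best_val

delta = 0.3                                   # assumed example value
lam_star = -log(1 - delta)                    # closed form from f'(lam) = 0
lam_grid, val_grid = argmin_on_grid(delta)
closed_form = -delta - (1 - delta) * log(1 - delta)  # f(lam_star), simplified

print(lam_star, lam_grid)
print(closed_form, val_grid)
```

Since $f''(\lambda) = e^{-\lambda} > 0$, the function is convex, so the stationary point found from $f'(\lambda)=0$ is indeed the global minimum.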
Chernoff’s Bound(3)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$\Pr[|X-\mu| >\delta\mu]<2e^{-\frac{\delta^2}{3}\mu}$$
First remove the absolute value:
$$\Pr[|X-\mu| > \delta\mu] = \Pr[X-\mu > \delta\mu] + \Pr[X-\mu < -\delta\mu]$$
For the first term:
$$\begin{aligned}
\Pr[X-\mu > \delta\mu] &= \Pr[X > (\delta+1)\mu] \\
&< \left( \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right)^{\mu} \\
&= e^{\mu \cdot (\delta - (1+\delta) \ln (1+\delta))} \\
&< e^{-\frac{\delta^2}{3}\mu}
\end{aligned}$$
The same argument gives
$$\Pr[X-\mu < -\delta\mu] < e^{-\frac{\delta^2}{3}\mu}$$
Adding the two:
$$\begin{aligned}
\Pr[|X-\mu| > \delta\mu] &= \Pr[X-\mu > \delta\mu] + \Pr[X-\mu < -\delta\mu] \\
&< e^{-\frac{\delta^2}{3}\mu} + e^{-\frac{\delta^2}{3}\mu} \\
&= 2e^{-\frac{\delta^2}{3}\mu}
\end{aligned}$$
Hence $\Pr[|X-\mu|>\delta\mu]<2e^{-\frac{\delta^2}{3}\mu}$, which completes the proof.
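The step from the exact exponents to $-\delta^2/3$ is stated without proof. The sketch below is a numerical check on a grid over $0<\delta<1$, not a proof; the grid resolution and function names are arbitrary assumptions.

```python
from math import log

def upper_exponent(d):
    """Exponent of Bound (1) per unit of mu: d - (1+d) * ln(1+d)."""
    return d - (1 + d) * log(1 + d)

def lower_exponent(d):
    """Exponent of Bound (2) per unit of mu: -d - (1-d) * ln(1-d)."""
    return -d - (1 - d) * log(1 - d)

# delta values 0.001, 0.002, ..., 0.999 (an assumed grid on the open interval)
grid = [i / 1000 for i in range(1, 1000)]

upper_ok = all(upper_exponent(d) < -d * d / 3 for d in grid)
lower_ok = all(lower_exponent(d) < -d * d / 3 for d in grid)

print(upper_ok, lower_ok)
```

Both exponents behave like $-\delta^2/2$ for small $\delta$, so the constant 3 leaves some slack; it is chosen so that a single expression covers both tails on the whole interval.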
Balls and Bins
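I used to think that putting balls into boxes was just a matter of the pigeonhole principle or of counting permutations and combinations,
but the advanced algorithms course digs considerably deeper than that...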
0x01 Balls and Bins
$m$ balls, $n$ bins. You randomly throw each ball to some bin.
$X_i$: number of balls in the $i$-th bin.
Let $k \triangleq \max(X_1, X_2, \cdots, X_n)$.
Question: expectation and distribution of $k$?
- $m = o(\sqrt{n})$; (Case 1)
  - prove $\Pr(k>1)=o(1)$.
  - $k=1$ w.h.p.
- $m = \Theta(\sqrt{n})$; (Case 2, Birthday Paradox)
  - compute $\Pr(k>1)$ again.
  - $k=1$ or $2$ w.h.p.
- $m=n$; (Case 3)
  - find suitable $x$, such that $\Pr(k \le x)=1-o(1)$
  - $k=\Theta(\frac{\ln n}{\ln \ln n})$ w.h.p.
- $m \ge n\ln n$; (Case 4)
  - $k=\Theta (\frac{m}{n})$ w.h.p.
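The setup is easy to simulate. Below is a minimal sketch of the experiment; `max_load` is a hypothetical helper, and the seeded `random.Random` instance is only there to make runs repeatable.

```python
import random
from collections import Counter

def max_load(m, n, rng):
    """Throw m >= 1 balls into n bins uniformly at random; return k = max_i X_i."""
    counts = Counter(rng.randrange(n) for _ in range(m))
    return max(counts.values())

rng = random.Random(0)  # fixed seed, an arbitrary choice

# One sample of k for each of a few (m, n) regimes from the list above.
for m, n in [(10, 10_000), (100, 10_000), (10_000, 10_000), (100_000, 10_000)]:
    print(m, n, max_load(m, n, rng))
```

A single run says little about the distribution; repeating the experiment and tabulating $k$ gives an empirical picture of the four cases.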
0xFF Proofs in detail
Case 1
- $m = o(\sqrt{n})$
  - prove $\Pr(k>1)=o(1)$.
  - $k=1$ w.h.p.
  - $m=1$: $\Pr(k=1) = 1-o(1)$
  - $m=2$: $\begin{cases} \Pr(k=1)=1-1/n \\ \Pr(k=2)=1/n \end{cases}$
  - $m=\,?$: $\Pr(k=1)=1-o(1)$
The statement $\Pr(k=1)=1-o(1)$ can equivalently be read as:
$$\Pr(\max(X_1, X_2, \cdots, X_n)\ge 2) = o(1)$$
Then, by the Union Bound from Useful Inequalities (and since all bins are symmetric):
$$\begin{aligned}
\Pr(X_1 \ge 2~\text{or}~X_2 \ge 2~\text{or}~\cdots~\text{or}~X_n \ge 2) ~&\le \sum_{i=1}^{n}\Pr(X_i \ge 2) \\
& = n \cdot \Pr(X_1 \ge 2)
\end{aligned}$$
Here,
$$\begin{aligned}
\Pr(X_1 \ge 2) ~&\le \binom{m}{2} \left(\frac{1}{n} \right)^2 = \Theta\left(\frac{m^2}{n^2}\right) \\
\Pr(X_1 \ge 2) ~&= \sum_{j=2}^{m}\Pr(X_1=j) \\
&= \sum_{j=2}^{m} \binom{m}{j}\cdot\left(\frac{1}{n}\right)^j\left(1-\frac{1}{n}\right)^{m-j} \\
&= 1- \Pr(X_1=0) - \Pr(X_1=1) \\
&= 1-\left(1-\frac{1}{n}\right)^m - m\cdot \frac{1}{n} \cdot \left(1-\frac{1}{n}\right)^{m-1} \\
& = \Theta\left(\frac{m^2}{n^2}\right)
\end{aligned}$$
Substituting back:
$$n \cdot \Pr(X_1 \ge 2) = \Theta(m^2/n)$$
which is $o(1)$ precisely when $m = o(\sqrt{n})$.
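For concrete numbers, the exact collision probability can be compared with the union-bound estimate $n\binom{m}{2}/n^2 = m(m-1)/(2n)$. This is a small sketch in exact rational arithmetic; the function names and the sample values of $m$ and $n$ are assumptions.

```python
from fractions import Fraction

def prob_no_collision(m, n):
    """Exact Pr(k = 1): all m balls land in distinct bins."""
    p = Fraction(1)
    for i in range(m):
        p *= Fraction(n - i, n)  # ball i+1 must avoid the i occupied bins
    return p

def union_bound(m, n):
    """n * C(m, 2) / n^2 = m(m-1) / (2n), the bound on Pr(k >= 2)."""
    return Fraction(m * (m - 1), 2 * n)

n = 10_000  # assumed example: sqrt(n) = 100
for m in (2, 5, 10, 20):
    exact = 1 - prob_no_collision(m, n)
    print(m, float(exact), float(union_bound(m, n)))
```

For $m$ well below $\sqrt{n}$ the bound and the exact value nearly coincide, which matches the $\Theta(m^2/n)$ estimate.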
Case 2
- $m = \Theta(\sqrt{n})$; (Birthday Paradox)
  + compute $\Pr(k>1)$ again.
  + $k=1$ or $2$ w.h.p.
Write $m = c\sqrt{n}$, and let $E_i$ denote the event that the $i$-th ball lands in a bin not hit by balls $1, \cdots, i-1$:
$$\begin{aligned}
m = \Theta(\sqrt{n})~&=c\sqrt{n} \\
\Pr(X_1 \ge 2) ~&\le \binom{m}{2} \left(\frac{1}{n} \right)^2 \approx \frac{c^2}{2n} \\
\Pr(k > 1) ~&\le n \cdot \Pr(X_1 \ge 2) \le \frac{c^2}{2} \\
\Pr(k = 1) ~& = \frac{n-1}{n} \cdot \frac{n-2}{n} \cdot \frac{n-3}{n} \cdots \frac{n-m+1}{n} \\
&= \Pr(E_1 \cdots E_m) = \Pr(E_1)\Pr(E_2|E_1)\Pr(E_3|E_1E_2)\cdots \\
&= \left(1-\frac{1}{n}\right) \cdot \left(1-\frac{2}{n}\right) \cdot \left(1-\frac{3}{n}\right) \cdots \left(1-\frac{m-1}{n}\right)
\end{aligned}$$
Every factor is at least the smallest one, $1-\frac{m-1}{n}$, so:
$$\begin{aligned}
\Pr(k = 1) ~&= \left(1-\frac{1}{n}\right) \cdot \left(1-\frac{2}{n}\right) \cdot \left(1-\frac{3}{n}\right) \cdots \left(1-\frac{m-1}{n}\right)\\
&\ge \left(1-\frac{m-1}{n}\right)^{m-1} \\
&= \left(1-\frac{m-1}{n}\right)^{\frac{n}{m-1}\cdot{\frac{(m-1)^2}{n}}} \sim \left(\frac{1}{e}\right)^{\frac{m^2}{n}}
\end{aligned}$$
Moreover, since $1-x \le e^{-x}$:
$$\begin{aligned}
&\left(1-\frac{1}{n}\right) \cdot \left(1-\frac{2}{n}\right) \cdot \left(1-\frac{3}{n}\right) \cdots \left(1-\frac{m-1}{n}\right) \\
\le~ & e^{-1/n} \cdot e^{-2/n} \cdot e^{-3/n} \cdots e^{-(m-1)/n} \\
=~ & e^{-m(m-1)/2n} \approx e^{-m^2/2n} < 1 \\
\therefore ~ & \Pr(k \ge 2) = 1 - \Pr(k = 1) \ge 1- e^{-m(m-1)/2n} \approx 1- e^{-c^2/2}
\end{aligned}$$
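The classic instance of this case is the birthday problem with $n = 365$ and $m = 23$. The sketch below evaluates the exact product and the lower bound $1 - e^{-m(m-1)/2n}$ obtained from $1-x \le e^{-x}$; the function names are illustrative.

```python
from math import exp

def prob_collision(m, n):
    """Exact Pr(k >= 2) = 1 - prod_{i=1}^{m-1} (1 - i/n)."""
    p_distinct = 1.0
    for i in range(1, m):
        p_distinct *= 1 - i / n
    return 1 - p_distinct

def lower_bound(m, n):
    """1 - exp(-m(m-1)/(2n)), which follows from 1 - x <= e^{-x}."""
    return 1 - exp(-m * (m - 1) / (2 * n))

m, n = 23, 365  # the birthday-paradox parameters
print(prob_collision(m, n), lower_bound(m, n))
```

Here $m \approx 1.2\sqrt{n}$, and the collision probability already exceeds one half, so in this regime $k$ is neither 1 nor 2 with overwhelming probability.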
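As for $k \ge 3$:
(The blackboard notes for this part were rather disorganized; I could not make sense of them even after a good half hour, so this is set aside for now.)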
Preparation for Case 3
For the proof of Case 3 we first need an approximate bound on the binomial coefficient:
$$\left(\frac{m}{x}\right)^x \le \binom{m}{x} \le \left(\frac{em}{x}\right)^x$$
First show $\tbinom{m}{x} = \frac{m!}{x!(m-x)!} \sim \frac{m^x}{x!}$ (for fixed $x$ as $m \to \infty$):
$$\begin{aligned}
\lim\limits_{m \rightarrow \infty}\frac{\tbinom{m}{x}}{\frac{m^x}{x!}} &= \lim\limits_{m \rightarrow \infty}\frac{m(m-1)(m-2)\cdots(m-x+1)}{m^x} \\
&= \lim\limits_{m \rightarrow \infty} 1\cdot\left(1-\frac{1}{m}\right)\left(1-\frac{2}{m}\right)\cdots\left(1-\frac{x-1}{m}\right) \\
&= 1
\end{aligned}$$
Here we bring in Stirling's formula, the standard approximation of the factorial:
$$n! \sim \sqrt{2 \pi n}\left(\frac{n}{e}\right)^n$$
so that
$$\frac{m^x}{x!} \sim \frac{m^x}{\sqrt{2\pi x}(\frac{x}{e})^x}=\frac{e^xm^x}{\sqrt{2\pi x}x^x}=\frac{e^x}{\sqrt{2\pi x}}\left(\frac{m}{x}\right)^x \le \left(\frac{em}{x}\right)^x$$
Moreover, for $x \ge 1$,
$$\frac{e^x}{\sqrt{2\pi x}} > 1$$
so
$$\frac{e^x}{\sqrt{2\pi x}}\left(\frac{m}{x}\right)^x \ge \left(\frac{m}{x}\right)^x$$
that is,
$$\left(\frac{m}{x}\right)^x \le \binom{m}{x} \le \left(\frac{em}{x}\right)^x$$
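The derivation above is asymptotic, but the two-sided bound itself holds for every $1 \le x \le m$. A brute-force check over a small range (an arbitrary choice) makes this easy to confirm; `check_bounds` is a hypothetical helper.

```python
from math import comb, e

def check_bounds(m, x):
    """Return True if (m/x)^x <= C(m, x) <= (e*m/x)^x."""
    lower = (m / x) ** x
    upper = (e * m / x) ** x
    return lower <= comb(m, x) <= upper

# Exhaustive check for all 1 <= x <= m <= 60 (range chosen arbitrarily).
all_ok = all(check_bounds(m, x) for m in range(1, 61) for x in range(1, m + 1))
print(all_ok)

# A few sample rows: lower bound, exact value, upper bound.
for m, x in [(10, 3), (30, 10), (60, 20)]:
    print(m, x, (m / x) ** x, comb(m, x), (e * m / x) ** x)
```

The lower bound is tight at $x = 1$ and $x = m$, while the upper bound always leaves room; only the upper bound is needed in Case 3.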
Case 3
- $m=n$
  + find suitable $x$, such that $\Pr(k \le x)=1-o(1)$
  + $k=\Theta(\frac{\ln n}{\ln \ln n})$ w.h.p.
Let $x = \frac{\ln n}{\ln \ln n}$. First the upper bound on $k$:
$$\Pr(k \le x) = 1-o(1)$$
for which it suffices to show:
$$\Pr(k \ge x) = o(1)$$
Then, by the Union Bound (with $m=n$):
$$\Pr(k \ge x) \le n \cdot \Pr(X_1 \ge x) \le n \cdot \binom{m}{x}\left( \frac{1}{n} \right)^x = n \cdot \binom{n}{x}\left( \frac{1}{n} \right)^x$$
In the previous subsection, Stirling's formula gave us:
$$\left(\frac{m}{x}\right)^x \le \binom{m}{x} \le \left(\frac{em}{x}\right)^x$$
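Substituting:
$$n \cdot \binom{n}{x}\left( \frac{1}{n} \right)^x \le n\cdot \left( \frac{en}{x} \right)^x \left( \frac{1}{n} \right)^x = n\cdot \left( \frac{e}{x} \right)^x$$
This is $o(1)$ once $x$ carries a large enough constant factor: for $x = c\,\frac{\ln n}{\ln\ln n}$ with a constant $c > 1$ we have $\left(\frac{e}{x}\right)^x = n^{-c+o(1)}$, so $n\cdot\left(\frac{e}{x}\right)^x = n^{1-c+o(1)} = o(1)$. With $c = 1$ exactly the expression does not tend to 0, but the constant does not affect the $\Theta(\cdot)$ statement.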
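A quick numeric look at the union-bound expression $n\,(e/x)^x$ shows how much the constant in front of $\frac{\ln n}{\ln\ln n}$ matters. In this sketch `c` is a hypothetical scaling parameter, so that $x = c\,\ln n / \ln\ln n$; it is not notation from the lecture.

```python
from math import e, log

def union_bound_value(n, c):
    """n * (e/x)^x with x = c * ln(n) / ln(ln(n))."""
    x = c * log(n) / log(log(n))
    return n * (e / x) ** x

for n in (1e6, 1e9, 1e12):
    print(n, union_bound_value(n, 1), union_bound_value(n, 3))
```

For $c = 1$ the value grows with $n$, whereas for $c = 3$ it shrinks rapidly, consistent with an upper bound of the form $k \le c\,\frac{\ln n}{\ln\ln n}$ for a constant $c > 1$.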
Next the lower bound on $k$:
$$\Pr(k \ge c \cdot x) = 1-o(1)$$
To this end, write:
$$\Pr(k \le c \cdot x) = \Pr(E_1 \land \cdots \land E_n)$$
where $E_i$ denotes the event $X_i \le c \cdot x$, and
$$Y_i=\begin{cases} 1, & E_i\text{ does not occur}\\ 0, & E_i\text{ occurs} \end{cases}$$
Then:
$$\Pr(k \le c \cdot x) = \Pr(\forall i, Y_i=0) = \Pr\left(\sum_{i=1}^{n}Y_i=0\right)$$
By Chebyshev's inequality, this is at most:
$$\Pr \left( \left|\sum_{i=1}^{n}Y_i - E\left(\sum_{i=1}^{n}Y_i\right) \right| \ge E\left(\sum_{i=1}^{n}Y_i\right) \right) \le \frac{\sigma^2(\sum_{i=1}^{n}Y_i)}{(E(\sum_{i=1}^{n}Y_i))^2}$$
(The derivation of the expectation and variance is long and is set aside for now; I will fill it in later when time permits.) Hence:
$$\begin{aligned}
\Pr(k<cx) &= \Pr(Y_1+Y_2+\cdots+Y_n=0) \\
&\le \frac{Var(\sum_{i=1}^{n}Y_i)}{E^2(\sum_{i=1}^{n}Y_i)} = O\left(\frac{n}{(n^{1-c})^2}\right) \sim \frac{1}{n^{1/3}},~~~\therefore c=1/3
\end{aligned}$$
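Combining both directions, w.h.p.
$$\frac{\ln n}{3\ln\ln n}<k<\frac{c\,\ln n}{\ln\ln n}$$
for a suitable constant $c > 1$ (the union-bound estimate $n\,(e/x)^x = o(1)$ needs this constant factor), i.e. $k=\Theta\left(\frac{\ln n}{\ln\ln n}\right)$.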
Consider the case with $n$ balls and $n$ bins, and let $X$ be the random variable of the number of empty bins. Compute $E(X)$, and the deviation between $X$ and $E(X)$.
The result should be in the form $\Pr(|X-E(X)|>a)<b$.
Let $Z_i$ indicate whether the $i$-th bin is non-empty: $Z_i=1$ if it holds at least one ball, and $Z_i=0$ if it is empty.
Then $Y=\sum_{i=1}^{n}Z_i$ counts the non-empty bins, and
$$E(Y)=E\left(\sum_{i=1}^{n}Z_i\right)=\sum_{i=1}^{n}E(Z_i)=nE(Z_1)$$
where
$$E(Z_1)=\Pr(Z_1=1)\cdot 1 + \Pr(Z_1=0)\cdot 0 = 1 - \left(1-\frac{1}{n}\right)^n \sim 1-e^{-1}$$
So
$$E(X) = E(n-Y) = n-E(Y) = n\left(1-\frac{1}{n}\right)^n \sim e^{-1}n$$
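The value $E(X) = n(1-\frac{1}{n})^n \approx n/e$ can be verified by brute force for tiny $n$ and compared with $n/e$ for larger $n$. Both function names below are assumptions made for this sketch.

```python
from itertools import product
from math import e

def expected_empty_bins(n, m=None):
    """E(X) = n * (1 - 1/n)^m for m balls in n bins (m defaults to n)."""
    if m is None:
        m = n
    return n * (1 - 1 / n) ** m

def brute_force_mean_empty(n):
    """Average number of empty bins over all n^n equally likely placements."""
    total = 0
    for placement in product(range(n), repeat=n):
        total += n - len(set(placement))  # bins that received no ball
    return total / n**n

print(brute_force_mean_empty(4), expected_empty_bins(4))
print(expected_empty_bins(1000), 1000 / e)
```

The brute-force enumeration is only feasible for very small $n$, but it confirms that linearity of expectation gives the exact mean even though the indicator variables are dependent.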
For the deviation, let $Z$ denote the number of empty bins (the $X$ above). For $\lambda > 0$:
$$\mu = E[Z] = n\left(1-\frac{1}{n}\right)^n \sim ne^{-1}$$
$$\Pr[|Z-\mu|\ge \lambda]\le 2\cdot \exp\left(-\frac{\lambda^2}{2n}\right)$$
In particular, when $m \gg n$:
$$\mu = E[Z] = n\left(1-\frac{1}{n}\right)^m \sim ne^{-m/n}$$
$$\Pr[|Z-\mu|\ge \lambda]\le 2\cdot \exp\left(-\frac{\lambda^2(n-1/2)}{n^2-\mu^2}\right)$$
Case 4
- $m \ge n\ln n$
  + $k=\Theta (\frac{m}{n})$ w.h.p.
We want to show:
$$\Pr\left(k \ge c \cdot \frac{m}{n}\right) = o(1)$$
that is, to bound:
$$\Pr\left(X_1 \ge c\frac{m}{n}~~\text{or}~~X_2 \ge c\frac{m}{n}~~\text{or}~\cdots~\text{or}~~X_n \ge c\frac{m}{n}\right)$$
By the Union Bound,
$$\Pr\left(k \ge c \cdot \frac{m}{n}\right) \le n \cdot \Pr\left(X_1 \ge c \frac{m}{n}\right)$$
First the upper bound:
$$\Pr \left(X_1 \ge c\frac{m}{n} \right) \le \binom{m}{c\frac{m}{n}} \left( \frac{1}{n} \right)^{c\frac{m}{n}} \le \left( \frac{em}{c\frac{m}{n}} \right)^{c\frac{m}{n}} \left( \frac{1}{n} \right)^{c\frac{m}{n}} = \left( \frac{e}{c} \right)^{c\frac{m}{n}}$$
Since $m \ge n\ln n$, for a sufficiently large constant $c > e$ (e.g. $c = e^2$):
$$\Pr\left(X_1 \ge c\frac{m}{n}\right) \le \left( \frac{e}{c} \right)^{c\frac{m}{n}} \le \left( \frac{e}{c} \right)^{c\ln n} = n^{-c\ln(c/e)} = o(1/n)$$
so that $n \cdot \Pr\left(X_1 \ge c\frac{m}{n}\right) = o(1)$.
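To get a feel for the single-bin bound $(e/c)^{cm/n}$, the sketch below compares it with the exact binomial tail of $X_1 \sim \mathrm{Binomial}(m, 1/n)$. The parameters $n = 20$, $m = 60$ (just above $n \ln n \approx 59.9$) and the function names are assumptions; the threshold $c\,m/n$ is rounded up to an integer.

```python
from math import ceil, comb, e, log

def tail_prob(m, n, t):
    """Exact Pr(X_1 >= t) for X_1 ~ Binomial(m, 1/n)."""
    p = 1 / n
    return sum(comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(t, m + 1))

def tail_bound(m, n, c):
    """(e/c)^(c*m/n), the bound from the text (informative only for c > e)."""
    return (e / c) ** (c * m / n)

n, m = 20, 60  # assumed small example with m >= n * ln(n)
for c in (3, 4, 8):
    t = ceil(c * m / n)
    print(c, t, tail_prob(m, n, t), tail_bound(m, n, c), n * tail_bound(m, n, c))
```

The last column is the union bound over all $n$ bins; it only drops below 1 once $c$ is comfortably larger than $e$, which is why the constant has to be chosen large enough.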
Next the lower bound, by Chernoff’s Bound:
$$\Pr\left( \left| Y_1 + \cdots + Y_m - E(Y_1 + \cdots + Y_m) \right| \right) \le~?$$
where $Y_i$ indicates that the $i$-th ball is thrown into the first bin, so that $X_1 = \sum_{i=1}^{m}Y_i$ and
$$Y_i=\begin{cases} 1, & \text{with probability } 1/n \\ 0, & \text{with probability } 1-1/n \end{cases}$$
Applying Chernoff’s Bound(3) with $\mu = m/n$ and $\delta = c_1$, $0 < c_1 < 1$:
$$\Pr\left( \left|X_1 - \frac{m}{n}\right| > c_1\frac{m}{n} \right) \le 2 \cdot \exp\left(-\frac{c_1^2}{3}\cdot\frac{m}{n}\right) \le 2\cdot \exp\left(-\frac{c_1^2}{3}\ln n\right) = 2 \frac{1}{n^{\frac{c_1^2}{3}}} = o(1)$$
For $0 < c_1 < 1$ the exponent $\frac{c_1^2}{3}$ is below 1, so this is $o(1)$ rather than $o(\frac{1}{n})$. Since $k \ge X_1$, an $o(1)$ failure probability for this single bin already gives the lower bound $k \ge (1-c_1)\frac{m}{n}$ w.h.p.