A Probabilistic Perspective: Chapter 2 Solutions

2.1.1.
The sample space for two children is $\{BB, BG, GB, GG\}$. Let $A$ = at least one child is a boy, so $|A|=3$. Of those three outcomes, two include a girl:
$$P(\text{other child is a girl}\mid A)=\frac{2}{3}$$
2.1.2.
Now we are told the first child is a boy, which says nothing about the second child:
$$P(\text{second child is a girl})=\frac{1}{2}$$
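A quick Monte Carlo sanity check of both answers (a sketch; the encoding 0 = girl / 1 = boy, the seed, and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
kids = rng.integers(0, 2, size=(n, 2))  # two independent children per family

# 2.1.1: condition on "at least one boy", ask P(the other is a girl)
at_least_one_boy = kids.max(axis=1) == 1
p1 = (kids[at_least_one_boy].min(axis=1) == 0).mean()

# 2.1.2: condition on "first child is a boy", ask P(second is a girl)
first_is_boy = kids[:, 0] == 1
p2 = (kids[first_is_boy, 1] == 0).mean()

print(p1, p2)  # ~0.667 and ~0.5
```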

2.3
$$\begin{align}
\mathrm{var}[x+y]&=E[(x+y)^2]-E^2[x+y]\\
&=E[x^2]+E[y^2]+2E[xy]-(E[x]+E[y])^2\\
&=\mathrm{var}[x]+\mathrm{var}[y]+2(E[xy]-E[x]E[y])\\
&=\mathrm{var}[x]+\mathrm{var}[y]+2\,\mathrm{cov}(x,y)
\end{align}$$
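The identity is easy to verify numerically (a sketch; the particular coupling between x and y below is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)  # deliberately correlated with x

lhs = np.var(x + y)
# bias=True matches np.var's ddof=0 convention, so the identity holds exactly
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y, bias=True)[0, 1]
print(lhs, rhs)  # agree up to floating-point rounding
```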
2.4
Given: $p(positive|ill)=0.99$, $p(negative|not\ ill)=0.99$, $p(ill)=10^{-4}$. By Bayes' rule,
$$p(ill|positive)=\frac{0.99\times 10^{-4}}{0.99\times 10^{-4}+0.01\times(1-10^{-4})}\approx 0.0098$$
Despite the accurate test, the posterior is under 1% because the disease is so rare.
Compare the worked example from the book's text:
$$p(positive|ill)=0.8,\quad p(ill)=0.004,\quad p(positive|not\ ill)=0.1$$
$$p(positive)=p(positive|ill)p(ill)+p(positive|not\ ill)p(not\ ill)=0.8\times 0.004+0.1\times(1-0.004)=0.1028$$
$$p(ill|positive)=\frac{p(positive|ill)p(ill)}{p(positive)}=\frac{0.8\times 0.004}{0.1028}\approx 0.031$$
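The same computation as a small helper (a hypothetical function name; the arguments mirror the quantities above):

```python
def p_ill_given_positive(p_pos_given_ill, p_pos_given_healthy, p_ill):
    """Bayes' rule: P(ill | positive) from sensitivity, false-positive rate, prior."""
    num = p_pos_given_ill * p_ill
    return num / (num + p_pos_given_healthy * (1 - p_ill))

print(p_ill_given_positive(0.99, 0.01, 1e-4))  # exercise 2.4: ~0.0098
print(p_ill_given_positive(0.8, 0.1, 0.004))   # text example:  ~0.031
```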

2.5
A = prize behind the first picked door
B = prize behind the final picked door (after switching)
$$P(A)=1/3,\quad P(\lnot A)=2/3$$
If the first pick was wrong, the host's reveal leaves only the prize door to switch to, so $P(B|\lnot A)=1$; if it was right, switching always loses, so $P(B|A)=0$. Then
$$P(B)=P(B|A)P(A)+P(B|\lnot A)P(\lnot A)=0\cdot\frac{1}{3}+1\cdot\frac{2}{3}=\frac{2}{3}$$
so switching wins with probability 2/3.
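A direct simulation (a sketch; it uses the fact from the proof that switching wins exactly when the first pick was wrong, so the host's door-opening step need not be modeled explicitly):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
prize = rng.integers(0, 3, n)  # door hiding the prize
pick = rng.integers(0, 3, n)   # contestant's first pick

print((pick == prize).mean())  # staying wins:   ~1/3
print((pick != prize).mean())  # switching wins: ~2/3
```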

2.6
1.
$$P(H|e_1,e_2)=\frac{P(e_1,e_2|H)P(H)}{P(e_1,e_2)}$$
Set (ii) is sufficient.

2.
With the conditional independence assumption,
$$P(e_1,e_2|H)=P(e_1|H)P(e_2|H)$$
so sets (i) and (ii) are now sufficient, and since
$$P(e_1,e_2)=\sum_H P(H)P(e_1|H)P(e_2|H)$$
set (iii) is sufficient as well.

2.7
See the Wikipedia article on pairwise independence.
Let $x$ and $y$ be independent fair bits, uniform on $\{0,1\}$, and $z=x\oplus y$ ($x$ XOR $y$). Any two of $x,y,z$ are independent, but the three are not mutually independent, since $z$ is determined by $x$ and $y$.
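Exhaustive enumeration over the four equally likely outcomes confirms this (a minimal sketch):

```python
from itertools import product

# 0/1 fair bits; each (x, y) pair has probability 1/4, z = x XOR y
triples = [(x, y, x ^ y) for x, y in product([0, 1], repeat=2)]

def prob(pred):
    return sum(pred(t) for t in triples) / len(triples)

# every pair is independent: P(a=1, b=1) = 1/4 = P(a=1) * P(b=1)
for i, j in [(0, 1), (0, 2), (1, 2)]:
    print(prob(lambda t: t[i] == 1 and t[j] == 1))  # 0.25 each
# but not mutually independent: P(x=1, y=1, z=1) = 0, not 1/8
print(prob(lambda t: t == (1, 1, 1)))               # 0.0
```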

2.8
($\Rightarrow$) If $x\perp y|z$ then $p(x,y|z)=p(x|z)p(y|z)$, so we can simply take $g(x,z)=p(x|z)$ and $h(y,z)=p(y|z)$.
($\Leftarrow$) Conversely, suppose $p(x,y|z)=g(x,z)h(y,z)$ for some functions $g$ and $h$. Then
$$\begin{align}
p(x|z)&=\sum_y p(x,y|z)=g(x,z)\sum_y h(y,z)\\
p(y|z)&=\sum_x p(x,y|z)=h(y,z)\sum_x g(x,z)\\
1&=\sum_{x,y}p(x,y|z)=\sum_x g(x,z)\sum_y h(y,z)
\end{align}$$
Multiplying the first two lines and applying the third,
$$p(x|z)p(y|z)=g(x,z)h(y,z)\sum_x g(x,z)\sum_y h(y,z)=g(x,z)h(y,z)=p(x,y|z)$$
so $x\perp y|z$.

2.9
(i) True.
(ii) False.

2.10
Let $x\sim\mathrm{Ga}(a,b)$ with $\mathrm{Ga}(x|a,b)=\frac{b^a}{\Gamma(a)}x^{a-1}e^{-bx}$, and $y=1/x$. By the change-of-variables formula,
$$\begin{align}
p_y(y)&=p_x(x)\left|\frac{dx}{dy}\right|,\qquad x=\frac{1}{y},\quad \frac{dx}{dy}=-\frac{1}{y^2}\\
&=\frac{b^a}{\Gamma(a)}\,y^{-(a-1)}\,e^{-b/y}\cdot y^{-2}\\
&=\frac{b^a}{\Gamma(a)}\,y^{-(a+1)}\,e^{-b/y}=\mathrm{IG}(y|a,b)
\end{align}$$
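A sampling check of the change of variables (a sketch; note scipy parameterizes the gamma by scale = 1/rate, and its `invgamma` with `scale=b` matches the IG(a, b) density above):

```python
import numpy as np
from scipy import stats

a, b = 3.0, 2.0
rng = np.random.default_rng(0)
x = stats.gamma(a, scale=1 / b).rvs(200_000, random_state=rng)  # x ~ Ga(a, b)

# y = 1/x should follow IG(a, b); its mean is b / (a - 1) for a > 1
print((1 / x).mean(), b / (a - 1))  # both ~1.0
print(stats.kstest(1 / x, stats.invgamma(a, scale=b).cdf).pvalue)  # large p-value
```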

2.11
Converting to polar coordinates and integrating over $\theta$ first:
$$\begin{align}
Z^2&=\int_0^{2\pi}d\theta\int_0^{\infty}r\exp\left(-\frac{r^2}{2\sigma^2}\right)dr\\
&=2\pi\int_0^{\infty}r\exp\left(-\frac{r^2}{2\sigma^2}\right)dr\\
&=2\pi\left[-\sigma^2\exp\left(-\frac{r^2}{2\sigma^2}\right)\right]_0^{\infty}=2\pi\sigma^2
\end{align}$$
So $Z^2=2\pi\sigma^2$, and $Z=\sigma\sqrt{2\pi}$.
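Numerical quadrature agrees (a sketch; the value of sigma is arbitrary):

```python
import numpy as np
from scipy.integrate import quad

sigma = 1.7
Z, _ = quad(lambda x: np.exp(-x**2 / (2 * sigma**2)), -np.inf, np.inf)
print(Z, sigma * np.sqrt(2 * np.pi))  # both ~4.2613
```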

2.12
$$\begin{align}
I(X;Y)&=\sum_{x,y}p(x,y)\log\frac{p(x,y)}{p(x)p(y)}\\
&=\sum_{x,y}p(x,y)\log\frac{p(x|y)}{p(x)}\\
&=\sum_{x,y}p(x,y)\left(\log p(x|y)-\log p(x)\right)\\
&=-H(X|Y)-\sum_x \log p(x)\left(\sum_y p(x,y)\right)\\
&=H(X)-H(X|Y)
\end{align}$$
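The identity can be checked on any discrete joint distribution (a sketch; the 2x3 table below is arbitrary):

```python
import numpy as np

p = np.array([[0.10, 0.20, 0.15],   # joint p(x, y), sums to 1
              [0.25, 0.05, 0.25]])
px, py = p.sum(axis=1), p.sum(axis=0)

mi_def = np.sum(p * np.log(p / np.outer(px, py)))  # definition of I(X;Y)

H_x = -np.sum(px * np.log(px))
H_x_given_y = -np.sum(p * np.log(p / py))          # p(x|y) = p(x,y) / p(y)
print(mi_def, H_x - H_x_given_y)                   # identical
```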

2.13
$$\begin{align}
I(X;Y)&=H(x)-H(x|y)\\
&=H(x)+H(y)-H(x,y)\\
&=\log 2\pi e\sigma^2-\frac{1}{2}\log\left((2\pi e)^2\sigma^4(1-\rho^2)\right)\\
&=-\frac{1}{2}\log(1-\rho^2)
\end{align}$$
For $\rho=0$: $I(x,y)=0$. When $\mathrm{Cov}(x,y)=0$ (for jointly Gaussian variables), knowing $x$ gives no information about $y$ and vice versa.
For $\rho=\pm 1$: $I(x,y)=\infty$. All information conveyed by $x$ is shared with $y$: knowing $x$ determines the value of $y$ and vice versa.
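Evaluating the closed form for a few values of $\rho$ (a sketch; it just confirms that the entropy decomposition collapses to $-\frac{1}{2}\log(1-\rho^2)$, independent of $\sigma$):

```python
import numpy as np

sigma = 2.0
for rho in [0.0, 0.5, 0.9, 0.99]:
    H_x = 0.5 * np.log(2 * np.pi * np.e * sigma**2)                      # = H_y
    H_xy = 0.5 * np.log((2 * np.pi * np.e)**2 * sigma**4 * (1 - rho**2))
    print(rho, 2 * H_x - H_xy, -0.5 * np.log(1 - rho**2))  # last two columns match
```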

2.14
(i)
Since $H(X)=H(Y)$, $r=1-\frac{H(Y|X)}{H(X)}=\frac{H(Y)-H(Y|X)}{H(X)}=\frac{I(X,Y)}{H(X)}$.
(ii)
$I(X,Y)\geq 0$ (mutual information is a KL divergence, hence non-negative), so $r\geq 0$; and $H(Y|X)\geq 0$, so $r\leq 1$.
(iii)
$r=0$ iff $I(X,Y)=0$, i.e., $X$ and $Y$ are independent.
(iv)
$r=1$ iff $H(Y|X)=0$, i.e., $Y$ is a deterministic function of $X$.

2.15
$$\begin{align}
\hat\theta&=\operatorname*{argmin}_\theta KL(P_{emp}||q(\theta))\\
&=\operatorname*{argmin}_\theta \sum_x P_{emp}(x)\left(\log P_{emp}(x)-\log q(x;\theta)\right)\\
&=\operatorname*{argmax}_\theta \sum_x P_{emp}(x)\log q(x;\theta)\qquad\text{(the entropy of }P_{emp}\text{ does not depend on }\theta\text{)}\\
&=\operatorname*{argmax}_\theta \sum_{x\in Dataset}\log q(x;\theta)\qquad\text{(}E_{P_{emp}}\text{ is }\tfrac{1}{N}\times\text{the sum over the dataset)}
\end{align}$$
which is exactly the maximum likelihood estimate.
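A concrete Bernoulli illustration (a sketch; the toy dataset and the grid over theta are arbitrary): the theta minimizing KL to the empirical distribution coincides with the MLE, the sample mean.

```python
import numpy as np

data = np.array([1, 1, 0, 1, 0, 1, 1, 0])
p_emp = np.array([(data == 0).mean(), (data == 1).mean()])

thetas = np.linspace(0.01, 0.99, 981)
q = np.stack([1 - thetas, thetas])  # q(x; theta) for x in {0, 1}
kl = np.sum(p_emp[:, None] * np.log(p_emp[:, None] / q), axis=0)

print(thetas[np.argmin(kl)], data.mean())  # both 0.625
```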

2.16
pdf of the beta distribution:
$$\mathrm{Beta}(x|\alpha,\beta)=\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$$
mode: setting the derivative of the pdf to zero,
$$\frac{d}{dx}\,x^{\alpha-1}(1-x)^{\beta-1}=0\;\Rightarrow\;(\alpha-1)(1-x)=(\beta-1)x\;\Rightarrow\;x=\frac{\alpha-1}{\alpha+\beta-2}$$
moments:
$$E(x^N)=\frac{1}{B(\alpha,\beta)}\int_0^1 x^{\alpha+N-1}(1-x)^{\beta-1}\,dx=\frac{B(\alpha+N,\beta)}{B(\alpha,\beta)}$$
mean ($N=1$):
$$E(x)=\frac{B(\alpha+1,\beta)}{B(\alpha,\beta)}=\frac{\alpha}{\alpha+\beta}$$
var ($N=2$):
$$E(x^2)-E^2(x)=\frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}-\frac{\alpha^2}{(\alpha+\beta)^2}=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$$
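Checking all three formulas against scipy (mean and variance via `stats.beta(...).stats`; the mode formula evaluated directly):

```python
from scipy import stats

a, b = 3.0, 5.0
mean, var = stats.beta(a, b).stats(moments="mv")

print(mean, a / (a + b))                        # 0.375 both
print(var, a * b / ((a + b)**2 * (a + b + 1)))  # ~0.0260 both
print((a - 1) / (a + b - 2))                    # mode = 1/3
```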

2.17
Let $m=f(x,y)=\min(x,y)$ with $x,y\sim U(0,1)$ independent. The density of $m$ and its expectation:
$$\begin{align}
p(m)&=p(x=m,y\geq m)+p(x\geq m,y=m)=2(1-m)\\
E(m)&=\int_0^1 2m(1-m)\,dm=\int_0^1 (2m-2m^2)\,dm=\left.m^2-\frac{2}{3}m^3\right|_0^1=\frac{1}{3}
\end{align}$$
The problem can also be solved geometrically: the solid under $z=\min(x,y)$ over the unit square is a pyramid (a generalized cone) with base area 1 and height 1, so its volume is $\frac{1}{3}\cdot 1\cdot 1=\frac{1}{3}$.
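And a one-line simulation (a sketch; seed and sample size arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.random((2, 1_000_000))  # two independent U(0, 1) samples
print(np.minimum(x, y).mean())     # ~1/3
```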
