Chapter 1 (Sample Space and Probability): Independence

These are reading notes for *Introduction to Probability*.

Independence

  • We have introduced the conditional probability $P(A|B)$ to capture the partial information that event $B$ provides about event $A$. An interesting and important special case arises when the occurrence of $B$ provides no such information and does not alter the probability that $A$ has occurred, i.e.,
    $$P(A|B)=P(A)$$
  • When the above equality holds, we say that $A$ is independent of $B$ ($A$ and $B$ are independent events). The equation above is equivalent to
    $$P(A\cap B)=P(A)P(B)$$
    • We adopt this latter relation as the definition of independence because it can be used even when $P(B)=0$, in which case $P(A|B)$ is undefined.
    • If $A$ and $B$ are independent, the occurrence of $B$ does not provide any new information on the probability of $A$ occurring.

Pitfall: A common first thought is that two events are independent if they are disjoint, but in fact the opposite is true: two disjoint events $A$ and $B$ with $P(A)>0$ and $P(B)>0$ are never independent (e.g., $A$ and $A^C$).
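This pitfall is easy to verify by direct enumeration. A minimal sketch (the die example is my own, not from the text), using exact arithmetic:

```python
from fractions import Fraction

# One roll of a fair die: each outcome has probability 1/6.
outcomes = range(1, 7)
P = lambda event: Fraction(sum(1 for o in outcomes if o in event), 6)

# A = "even", B = "at most 2": these overlap, yet they are independent.
A = {2, 4, 6}
B = {1, 2}
assert P(A & B) == P(A) * P(B)          # 1/6 == 1/2 * 1/3

# Disjoint events with positive probability are never independent:
# P(C ∩ D) = 0 while P(C)P(D) > 0.
C, D = {1}, {2}
assert P(C & D) == 0 and P(C) * P(D) > 0
```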


Properties

  • Among the four pairs of events $(A,B)$, $(A,\bar B)$, $(\bar A,B)$, $(\bar A,\bar B)$: if any one pair is independent, then the other three pairs are independent as well.

Conditional Independence

  • We noted earlier that the conditional probabilities of events form a legitimate probability law. We can thus talk about independence of various events with respect to this conditional law.

  • In particular, given an event $C$, the events $A$ and $B$ are called conditionally independent if
    $$P(A\cap B|C)=P(A|C)P(B|C)$$
  • To derive an alternative characterization of conditional independence, we use the definition of conditional probability and the multiplication rule to write
    $$P(A\cap B|C)=\frac{P(A\cap B\cap C)}{P(C)}=\frac{P(C)P(B|C)P(A|B\cap C)}{P(C)}=P(B|C)P(A|B\cap C)$$
  • We now compare the preceding expression with the defining relation $P(A\cap B|C)=P(A|C)P(B|C)$, and after eliminating the common factor $P(B|C)$, assumed nonzero, we see that conditional independence is the same as the condition
    $$P(A|B\cap C)=P(A|C)$$
    In words, this relation states that if $C$ is known to have occurred, the additional knowledge that $B$ also occurred does not change the probability of $A$.
  • Interestingly, independence of two events $A$ and $B$ with respect to the unconditional probability law does not imply conditional independence, and vice versa, as illustrated by the next example.

Example 1.21.

  • There are two coins, a blue and a red one. We choose one of the two at random, each being chosen with probability $1/2$, and proceed with two independent tosses. The coins are biased: with the blue coin, the probability of heads in any given toss is $0.99$, whereas for the red coin it is $0.01$.
  • Let $B$ be the event that the blue coin was selected. Let also $H_i$ be the event that the $i$th toss resulted in heads. Given the choice of a coin, the events $H_1$ and $H_2$ are independent. Thus,
    $$P(H_1\cap H_2|B)=P(H_1|B)P(H_2|B)=0.99\cdot0.99$$
  • On the other hand, the events $H_1$ and $H_2$ are not independent. Intuitively, if we are told that the first toss resulted in heads, this leads us to suspect that the blue coin was selected. Mathematically,
    $$P(H_1)=P(B)P(H_1|B)+P(B^C)P(H_1|B^C)=\frac{1}{2}\cdot0.99+\frac{1}{2}\cdot0.01=\frac{1}{2}$$
    Similarly, we have $P(H_2)=1/2$. Now notice that
    $$\begin{aligned}P(H_1\cap H_2)&=P(B)P(H_1\cap H_2|B)+P(B^C)P(H_1\cap H_2|B^C)\\&=\frac{1}{2}\cdot0.99\cdot0.99+\frac{1}{2}\cdot0.01\cdot0.01\approx\frac{1}{2}\end{aligned}$$
  • Thus, the events $H_1$ and $H_2$ are dependent, even though they are conditionally independent given $B$.
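The numbers in this example can be checked by simulation; a minimal sketch (seed and trial count are arbitrary choices):

```python
import random

random.seed(0)
trials = 200_000
n_h1 = n_h2 = n_h1h2 = 0
for _ in range(trials):
    blue = random.random() < 0.5            # pick a coin at random
    p_heads = 0.99 if blue else 0.01        # biased coins
    h1 = random.random() < p_heads          # tosses are independent
    h2 = random.random() < p_heads          # given the chosen coin
    n_h1 += h1
    n_h2 += h2
    n_h1h2 += h1 and h2

p1, p2, p12 = n_h1 / trials, n_h2 / trials, n_h1h2 / trials
# Unconditionally: P(H1) ≈ P(H2) ≈ 1/2, but P(H1 ∩ H2) ≈ 0.49, not 1/4.
assert abs(p1 - 0.5) < 0.01 and abs(p2 - 0.5) < 0.01
assert abs(p12 - 0.4901) < 0.01             # exact: (0.99² + 0.01²)/2
```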

Summary

  • Two events $A$ and $B$ are independent if $P(A\cap B)=P(A)P(B)$. If in addition $P(B)>0$, independence is equivalent to $P(A|B)=P(A)$.
  • If $A$ and $B$ are independent, so are $A$ and $B^C$, $A^C$ and $B$, and $A^C$ and $B^C$.
  • Two events $A$ and $B$ are conditionally independent given $C$ if $P(A\cap B|C)=P(A|C)P(B|C)$. If in addition $P(B\cap C)>0$, conditional independence is equivalent to $P(A|B\cap C)=P(A|C)$.
  • Independence does not imply conditional independence, and vice versa.

Independence of a Collection of Events

We say that the events $A_1,A_2,\ldots,A_n$ are independent if
$$P\left(\bigcap_{i\in S}A_i\right)=\prod_{i\in S}P(A_i)\qquad\text{for every subset }S\text{ of }\{1,2,\ldots,n\}$$


  • For the case of three events, $A_1$, $A_2$, and $A_3$, independence amounts to satisfying the four conditions
    $$P(A_1\cap A_2)=P(A_1)P(A_2)$$
    $$P(A_1\cap A_3)=P(A_1)P(A_3)$$
    $$P(A_2\cap A_3)=P(A_2)P(A_3)$$
    $$P(A_1\cap A_2\cap A_3)=P(A_1)P(A_2)P(A_3)$$
    The first three conditions simply assert that any two events are independent, a property known as pairwise independence. But the fourth condition is also important and does not follow from the first three. Conversely, the fourth condition does not imply the first three.

  • The intuition behind the independence of a collection of events is analogous to the case of two events. Independence means that the occurrence or non-occurrence of any number of the events from that collection carries no information on the remaining events or their complements.
  • For example, if the events $A_1$, $A_2$, $A_3$, $A_4$ are independent, one obtains relations such as
    $$P(A_1\cup A_2|A_3\cap A_4)=P(A_1\cup A_2)$$
    (since $P(A_1\cup A_2|A_3\cap A_4)=P(A_1|A_3\cap A_4)+P(A_2|A_3\cap A_4)-P(A_1\cap A_2|A_3\cap A_4)=P(A_1)+P(A_2)-P(A_1\cap A_2)=P(A_1\cup A_2)$)
    or
    $$P(A_1\cup A_2^C|A_3^C\cap A_4)=P(A_1\cup A_2^C)$$
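The gap between pairwise independence and full independence can be checked by enumeration. A minimal sketch using the standard two-coin example with $A_3$ = "the two tosses agree" (this particular example is not from the text):

```python
from fractions import Fraction
from itertools import product

# Sample space: two independent fair coin tosses, 4 equally likely outcomes.
outcomes = list(product("HT", repeat=2))
P = lambda ev: Fraction(sum(1 for o in outcomes if ev(o)), len(outcomes))

A1 = lambda o: o[0] == "H"          # first toss is heads
A2 = lambda o: o[1] == "H"          # second toss is heads
A3 = lambda o: o[0] == o[1]         # the two tosses agree

both = lambda e, f: lambda o: e(o) and f(o)
all3 = lambda o: A1(o) and A2(o) and A3(o)

# Pairwise independent: each pair satisfies P(Ai ∩ Aj) = P(Ai)P(Aj) = 1/4.
assert P(both(A1, A2)) == P(A1) * P(A2)
assert P(both(A1, A3)) == P(A1) * P(A3)
assert P(both(A2, A3)) == P(A2) * P(A3)

# But not independent: P(A1 ∩ A2 ∩ A3) = 1/4, while the product is 1/8.
assert P(all3) == Fraction(1, 4) != P(A1) * P(A2) * P(A3)
```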

Independent Trials and the Binomial Probabilities


  • If an experiment involves a sequence of independent but identical stages, we say that we have a sequence of independent trials.

  • In the special case where there are only two possible results at each stage, we say that we have a sequence of independent Bernoulli trials.
    • We can visualize independent Bernoulli trials by means of a sequential (tree) description, as in the case where $n=3$.
    • By multiplying the conditional probabilities along the corresponding path of the tree, we see that any particular outcome that involves $k$ heads and $3-k$ tails has probability $p^k(1-p)^{3-k}$.
    • This formula extends to the case of a general number $n$ of tosses: the probability of any particular $n$-long sequence that contains $k$ heads and $n-k$ tails is $p^k(1-p)^{n-k}$, for all $k$ from $0$ to $n$.
      [Figure: sequential tree description of a three-toss Bernoulli experiment]

  • Let us now consider the probability
    $$p(k)=P(k\text{ heads come up in an }n\text{-toss sequence})$$
    Each sequence with $k$ heads has probability $p^k(1-p)^{n-k}$, and the number of distinct $n$-toss sequences with $k$ heads is $\binom{n}{k}$, so
    $$p(k)=\binom{n}{k}p^k(1-p)^{n-k},\qquad k=0,1,\ldots,n$$
    where we use the notation
    $$\binom{n}{k}=\frac{n!}{k!\,(n-k)!}$$
    The numbers $\binom{n}{k}$ (read as "$n$ choose $k$") are known as the binomial coefficients, while the probabilities $p(k)$ are known as the binomial probabilities.
    Note that the binomial probabilities $p(k)$ must add to 1, thus showing the binomial formula
    $$\sum_{k=0}^{n}\binom{n}{k}p^k(1-p)^{n-k}=1$$
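The binomial probabilities and the binomial formula can be computed directly; a minimal sketch using Python's `math.comb`:

```python
from math import comb

def binomial_pmf(n, p):
    """Binomial probabilities p(k) = C(n,k) p^k (1-p)^(n-k), k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

probs = binomial_pmf(10, 0.3)
assert abs(sum(probs) - 1.0) < 1e-12    # the binomial formula: p(k) sum to 1
assert abs(probs[3] - comb(10, 3) * 0.3**3 * 0.7**7) < 1e-15
```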

Exercises

Problem 30.
A hunter has two hunting dogs. One day, on the trail of some animal, the hunter comes to a place where the road diverges into two paths. He knows that each dog, independent of the other, will choose the correct path with probability $p$. The hunter decides to let each dog choose a path; if they agree, he takes that path, and if they disagree, he randomly picks a path. Is his strategy better than just letting one of the two dogs decide on a path?

SOLUTION

  • The events that lead to the correct path are:
    (i) both dogs choose the correct path, which has probability $p^2$;
    (ii) the dogs disagree and the hunter's random pick is the correct path, which has probability $2p(1-p)\cdot\frac{1}{2}$.
  • The above events are disjoint, so we can add the probabilities:
    $$p^2+2p(1-p)\cdot\frac{1}{2}=p^2+p(1-p)=p$$
  • Thus, the two strategies are equally effective.
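The claim can be verified by exact enumeration; a minimal sketch (the value $p=7/10$ is an arbitrary illustration):

```python
from fractions import Fraction
from itertools import product

p = Fraction(7, 10)   # illustrative value for each dog's accuracy

# Enumerate (dog1 correct?, dog2 correct?, coin flip) with exact weights.
prob_correct = Fraction(0)
for d1, d2, coin in product([True, False], repeat=3):
    weight = (p if d1 else 1 - p) * (p if d2 else 1 - p) * Fraction(1, 2)
    if d1 == d2:
        chosen_correct = d1            # dogs agree: follow them
    else:
        chosen_correct = coin          # dogs disagree: flip a fair coin
    if chosen_correct:
        prob_correct += weight

assert prob_correct == p               # same as trusting a single dog
```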

Problem 33.
Using a biased coin to make an unbiased decision. Alice and Bob want to choose between the opera and the movies by tossing a fair coin. Unfortunately, the only available coin is biased (though the bias is not known exactly). How can they use the biased coin to make a decision so that either option (opera or the movies) is equally likely to be chosen?

SOLUTION

  • Flip the coin twice. If the outcome is heads-tails, choose the opera; if the outcome is tails-heads, choose the movies. Otherwise, repeat the process until a decision can be made.
  • Let $A_k$ be the event that a decision was made at the $k$th round. Conditional on the event $A_k$, the two choices are equally likely: if $p$ is the (unknown) probability of heads,
    $$P(\text{opera}|A_k)=\frac{p(1-p)}{p(1-p)+(1-p)p}=\frac{1}{2}$$
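This scheme (often called the von Neumann trick) is easy to simulate; a minimal sketch with an arbitrarily chosen bias of $0.8$:

```python
import random

def unbiased_flip(biased, rng):
    """Von Neumann trick: flip twice; heads-tails -> 1, tails-heads -> 0,
    otherwise retry until the two flips disagree."""
    while True:
        a, b = biased(rng), biased(rng)
        if a != b:
            return a

random.seed(1)
biased = lambda rng: rng.random() < 0.8   # heavily biased coin
results = [unbiased_flip(biased, random) for _ in range(100_000)]
freq = sum(results) / len(results)
assert abs(freq - 0.5) < 0.01             # the decision is essentially fair
```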

Problem 41.
Consider a game show with an infinite pool of contestants, where at each round $i$, contestant $i$ obtains a number by spinning a continuously calibrated wheel. The contestant with the smallest number thus far survives. Successive wheel spins are independent and we assume that there are no ties. Let $N$ be the round at which contestant $1$ is eliminated. For any positive integer $n$, find $P(N=n)$.

SOLUTION 1

  • For $i\leq j$, let $A_{i,j}$ be the event that contestant $i$'s number is the smallest of the numbers of contestants $1,\ldots,j$. We have
    $$P(N=n)=P(A_{1,n-1}\cap A_{n,n})=P(A_{1,n-1})\,P(A_{n,n}|A_{1,n-1})$$
    where, by symmetry,
    $$P(A_{1,n-1})=\frac{1}{n-1}$$
  • We claim that $P(A_{n,n}|A_{1,n-1})=P(A_{n,n})=\frac{1}{n}$. The reason is that by symmetry, we have
    $$P(A_{n,n}|A_{i,n-1})=P(A_{n,n}|A_{1,n-1}),\qquad i=1,\ldots,n-1$$
    while by the total probability theorem,
    $$P(A_{n,n})=\sum_{i=1}^{n-1}P(A_{i,n-1})P(A_{n,n}|A_{i,n-1})=P(A_{n,n}|A_{1,n-1})\sum_{i=1}^{n-1}P(A_{i,n-1})=P(A_{n,n}|A_{1,n-1})$$
    Hence
    $$P(N=n)=\frac{1}{(n-1)n}$$

SOLUTION 2

  • Let us fix a particular choice of $n$. Think of an outcome of the experiment as an ordering of the values of the $n$ contestants, so that there are $n!$ equally likely outcomes. The event $\{N=n\}$ occurs if and only if the first contestant's number is smallest among the first $n-1$ contestants, and contestant $n$'s number is the smallest among the first $n$ contestants. This event can occur in $(n-2)!$ different ways, namely, all the possible ways of ordering contestants $2,\ldots,n-1$. Thus, the probability of this event is $(n-2)!/n!=1/(n(n-1))$, in agreement with the previous solution.

Alternatively, condition on the first contestant's number $x$ (uniform on $[0,1]$): contestant $1$ is eliminated at round $n$ exactly when contestants $2,\ldots,n-1$ all draw numbers above $x$ and contestant $n$ draws below it, so $P(N=n\mid x)=(1-x)^{n-2}x$. Integrating over $x\in[0,1]$ gives $\int_0^1(1-x)^{n-2}x\,dx=\frac{1}{(n-1)n}$, the same answer.
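The answer $P(N=n)=\frac{1}{(n-1)n}$ can be checked by simulation; a minimal sketch (contestant 1 is eliminated the first time a later contestant draws a smaller number):

```python
import random

def round_eliminated(rng):
    """Return N, the round at which contestant 1 is eliminated."""
    x1 = rng.random()                  # contestant 1's number
    n = 2
    while True:
        if rng.random() < x1:          # contestant n beats contestant 1
            return n
        n += 1

random.seed(2)
trials = 100_000
counts = {}
for _ in range(trials):
    n = round_eliminated(random)
    counts[n] = counts.get(n, 0) + 1

# P(N = n) = 1/((n-1)n): 1/2 for n = 2, 1/6 for n = 3, ...
assert abs(counts[2] / trials - 1 / 2) < 0.01
assert abs(counts[3] / trials - 1 / 6) < 0.01
```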


Problem 42. Gambler’s ruin.
A gambler makes a sequence of independent bets. In each bet, he wins $1 with probability $p$, and loses $1 with probability $1-p$. Initially, the gambler has $k, and plays until he either accumulates $n or has no money left. What is the probability that the gambler will end up with $n?

SOLUTION

  • Let us denote by $A$ the event that he ends up with $n, and by $F$ the event that he wins the first bet. Denote also by $w_k$ the probability of event $A$, if he starts with $k. We apply the total probability theorem to obtain
    $$w_k=P(A|F)P(F)+P(A|F^C)P(F^C)=pP(A|F)+qP(A|F^C),\qquad 0<k<n$$
    where $q=1-p$. By the independence of the bets, $P(A|F)=w_{k+1}$ and $P(A|F^C)=w_{k-1}$. Thus, we have $w_k=pw_{k+1}+qw_{k-1}$, which can be written as
    $$w_{k+1}-w_k=r(w_k-w_{k-1})$$
    where $r=q/p$. We will solve for $w_k$ in terms of $p$ and $q$ using iteration, and the boundary values $w_0=0$ and $w_n=1$.
  • We have $w_{k+1}-w_k=r^k(w_1-w_0)$, and since $w_0=0$,
    $$w_{k+1}-w_k=r^kw_1$$
    We then have
    $$w_k=w_1+(w_2-w_1)+\cdots+(w_k-w_{k-1})=(1+r+\cdots+r^{k-1})w_1$$
    Since $w_n=1$, we can solve for $w_1$ and therefore for $w_k$:
    $$w_1=\begin{cases}\dfrac{1-r}{1-r^n}&\text{if }r\neq1\\[2mm]\dfrac{1}{n}&\text{if }r=1\end{cases}$$
    so that
    $$w_k=\begin{cases}\dfrac{1-r^k}{1-r^n}&\text{if }r\neq1\\[2mm]\dfrac{k}{n}&\text{if }r=1\end{cases}$$
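The closed-form solution can be checked against the recurrence itself; a minimal sketch that solves $w_k=pw_{k+1}+qw_{k-1}$ by repeated sweeps (parameter values are arbitrary):

```python
def ruin_closed_form(k, n, p):
    """Probability of reaching $n from $k, via the derived formula."""
    r = (1 - p) / p
    if r == 1:
        return k / n
    return (1 - r**k) / (1 - r**n)

def ruin_value_iteration(k, n, p, sweeps=20_000):
    """Solve w_k = p*w_{k+1} + (1-p)*w_{k-1}, w_0 = 0, w_n = 1, iteratively."""
    w = [0.0] * (n + 1)
    w[n] = 1.0
    for _ in range(sweeps):
        for i in range(1, n):
            w[i] = p * w[i + 1] + (1 - p) * w[i - 1]
    return w[k]

assert abs(ruin_closed_form(5, 10, 0.6) - ruin_value_iteration(5, 10, 0.6)) < 1e-9
assert ruin_closed_form(5, 10, 0.5) == 0.5    # fair game: w_k = k/n
```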

Problem 46. Laplace’s rule of succession.
Consider $m+1$ boxes with the $k$th box containing $k$ red balls and $m-k$ white balls, where $k$ ranges from $0$ to $m$. We choose a box at random (all boxes are equally likely) and then choose a ball at random from that box, $n$ successive times (the ball drawn is replaced each time, and a new ball is selected independently). Suppose a red ball was drawn each of the $n$ times. What is the probability that if we draw a ball one more time it will be red? Estimate this probability for large $m$.

SOLUTION

  • We want to find the conditional probability $P(E|R_n)$, where $E$ is the event of a red ball drawn at time $n+1$, and $R_n$ is the event of a red ball drawn each of the $n$ preceding times. By definition,
    $$P(E|R_n)=\frac{P(E\cap R_n)}{P(R_n)}$$
    and by using the total probability theorem (conditioning on the box chosen), we obtain
    $$P(R_n)=\frac{1}{m+1}\sum_{k=0}^{m}\left(\frac{k}{m}\right)^n$$
  • For large $m$, we can view $P(R_n)$ as a piecewise constant approximation to an integral:
    $$P(R_n)=\frac{1}{m+1}\sum_{k=0}^{m}\left(\frac{k}{m}\right)^n\approx\int_0^1x^n\,dx=\frac{1}{n+1}$$
    Similarly,
    $$P(E\cap R_n)=P(R_{n+1})\approx\frac{1}{n+2}$$
    so that
    $$P(E|R_n)\approx\frac{n+1}{n+2}$$
    Thus, for large $m$, drawing a red ball one more time is almost certain when $n$ is large.
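The quality of the approximation can be checked numerically; a minimal sketch comparing the exact ratio with $(n+1)/(n+2)$ for $m=1000$, $n=10$ (arbitrary illustrative values):

```python
def prob_next_red(m, n):
    """Exact P(E | R_n) for m+1 boxes: ratio of P(R_{n+1}) to P(R_n)."""
    s_n = sum((k / m) ** n for k in range(m + 1))
    s_n1 = sum((k / m) ** (n + 1) for k in range(m + 1))
    return s_n1 / s_n          # the 1/(m+1) factors cancel

exact = prob_next_red(m=1000, n=10)
approx = 11 / 12               # (n+1)/(n+2) with n = 10
assert abs(exact - approx) < 0.005
```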

Problem 48. The Borel-Cantelli lemma.
Consider an infinite sequence of trials. The probability of success at the $i$th trial is some positive number $p_i$. Let $N$ be the event that there is no success, and let $I$ be the event that there is an infinite number of successes.

  • (a) Assume that the trials are independent and that $\sum_{i=1}^\infty p_i=\infty$. Show that $P(N)=0$ and $P(I)=1$.
  • (b) Assume that $\sum_{i=1}^\infty p_i<\infty$. Show that $P(I)=0$.

SOLUTION
(a)

  • The event $N$ is a subset of the event that there were no successes in the first $n$ trials, so that
    $$P(N)\leq\prod_{i=1}^n(1-p_i)$$
    Taking logarithms, and using the inequality $\log(1-x)\leq-x$,
    $$\log P(N)\leq\sum_{i=1}^n\log(1-p_i)\leq\sum_{i=1}^n(-p_i)$$
    Taking the limit as $n$ tends to infinity, we obtain $\log P(N)=-\infty$, or $P(N)=0$.
  • Let now $L_n$ be the event that there is a finite number of successes and that the last success occurs at the $n$th trial. We use the already established result $P(N)=0$, and apply it to the sequence of trials after trial $n$, to obtain $P(L_n)=0$. The event $I^C$ (finitely many successes) is the union of the disjoint events $L_n$, $n\geq1$, and $N$, so that
    $$P(I^C)=P(N)+\sum_{n=1}^\infty P(L_n)=0$$
    and $P(I)=1$.

(b)

  • Let $S_i$ be the event that the $i$th trial is a success. Fix some number $n$ and for every $i>n$, let $F_i$ be the event that the first success after time $n$ occurs at time $i$. Note that $F_i\subset S_i$. Finally, let $A_n$ be the event that there is at least one success after time $n$. Note that $I\subset A_n$. Furthermore, the event $A_n$ is the union of the disjoint events $F_i$, $i>n$. Therefore,
    $$P(I)\leq P(A_n)=\sum_{i=n+1}^{\infty}P(F_i)\leq\sum_{i=n+1}^{\infty}P(S_i)=\sum_{i=n+1}^{\infty}p_i$$
    We take the limit of both sides as $n\rightarrow\infty$. Because of the assumption $\sum_{i=1}^\infty p_i<\infty$, the right-hand side converges to zero. This implies that $P(I)=0$.
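The two regimes can be illustrated numerically with $p_i=1/i^2$ (summable) and $p_i=1/i$ (not summable); a minimal sketch:

```python
# Tail sums sum_{i>n} p_i for a summable sequence p_i = 1/i^2: they vanish,
# which is what drives P(I) = 0 in part (b). Here the tail beyond n = 1000
# is roughly 1/1000 (truncated at 200000 terms for computability).
tail = sum(1 / i**2 for i in range(1001, 200_001))
assert 0 < tail < 0.001

# For p_i = 1/i the series diverges, so part (a) applies: partial sums grow
# without bound, here already past ln(10^6) + 0.577 ~ 14.39.
partial = sum(1 / i for i in range(1, 1_000_001))
assert partial > 14
```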