These are reading notes on Introduction to Probability.
Probability Models 概率模型
- A probabilistic model is a mathematical description of an uncertain situation. Its two main ingredients are listed below and are visualized in Fig. 1.2.
Sample space: 样本空间
Sample Spaces and Events 样本空间和事件
- Every probabilistic model involves an underlying process, called the experiment, that will produce exactly one out of several possible outcomes.
- It is important to note that in our formulation of a probabilistic model, there is only one experiment. So, three tosses of a coin constitute (组成) a single experiment rather than three experiments.
- The set of all possible outcomes is called the sample space of the experiment, and is denoted by $\Omega$.
- A subset of the sample space, that is, a collection of possible outcomes, is called an event (事件 / 随机事件).
- Let $A$ and $B$ be two events. The event $A\cup B$ is called the union (和事件) of $A$ and $B$, also written $A+B$.
- The event $A\cap B$ is called the intersection (积事件) of $A$ and $B$, also written $AB$.
- The event $A-B$ is called the difference (差事件) of $A$ and $B$; note that $A-B=A\bar B$.
- The event $\Omega-A$ is called the complement (补事件) of $A$, also written $\bar A$ or $A^C$.
- Supplement: De Morgan's laws (德摩根律)
$$(\mathop{\cup}\limits_{n}S_n)^c=\mathop{\cap}\limits_{n}S_n^c,\qquad(\mathop{\cap}\limits_{n}S_n)^c=\mathop{\cup}\limits_{n}S_n^c$$
PROOF
[Hint: if $x\in(\mathop{\cup}\limits_{n}S_n)^c$, then $x$ belongs to no $S_n$, so $x\in S_n^c$ for every $n$, that is, $x\in\mathop{\cap}\limits_{n}S_n^c$. The converse inclusion follows by reversing the steps.]
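De Morgan's laws can be sanity-checked on a small finite example. The sketch below is only an illustration, not a proof: the universe `omega` and the sets in `family` are hypothetical values chosen for this check.

```python
# Hypothetical finite universe and family of sets, chosen only for illustration.
omega = set(range(10))
family = [{0, 1, 2}, {2, 3, 4}, {4, 5, 6}]


def complement(s):
    """Complement of s relative to the universe omega."""
    return omega - s


union_all = set().union(*family)
intersection_all = set.intersection(*family)

# First law: the complement of the union is the intersection of the complements.
lhs_1 = complement(union_all)
rhs_1 = set.intersection(*(complement(s) for s in family))

# Second law: the complement of the intersection is the union of the complements.
lhs_2 = complement(intersection_all)
rhs_2 = set().union(*(complement(s) for s in family))

print(lhs_1 == rhs_1, lhs_2 == rhs_2)
```

Both comparisons should hold for any choice of `omega` and `family`, as long as every set in `family` is a subset of `omega`.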
Sequential Models 序贯模型
- Many experiments have an inherently sequential character; for example, tossing a coin three times. It is then often useful to describe the experiment and the associated sample space by means of a tree-based sequential description(序贯树形图), as in Fig. 1.3.
die: 骰子
Note that every node of the tree can be identified with an event. For example, the node labeled by a 1 can be identified with the event {(1, 1), (1, 2), (1, 3), (1, 4)} that the result of the first roll is 1.
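The sequential description can be mirrored in code by enumerating outcomes roll by roll. This is a minimal sketch, assuming the experiment is two rolls of a four-sided die, as the listed event suggests; the names `faces`, `sample_space`, and `first_roll_is_1` are introduced here for illustration.

```python
from itertools import product

# Assumption: a four-sided die rolled twice.
faces = [1, 2, 3, 4]

# Each outcome is a pair (first roll, second roll): one leaf of the tree.
sample_space = list(product(faces, repeat=2))

# The node "first roll is 1" corresponds to the event collecting all leaves below it.
first_roll_is_1 = [outcome for outcome in sample_space if outcome[0] == 1]

print(len(sample_space))
print(first_roll_is_1)
```

The pair of rolls is treated as a single outcome, which matches the point above that the whole sequence is one experiment.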
Probability Laws 概率律
- The probability law assigns to every event $A$ a number $P(A)$, called the probability of $A$, satisfying the following axioms.
- Corollary (推论): since $\Omega$ and $\varnothing$ are disjoint, additivity gives
$$1=P(\Omega)=P(\Omega\cup\varnothing)=P(\Omega)+P(\varnothing)=1+P(\varnothing)\\\therefore P(\varnothing)=0$$
Properties of Probability Laws
- Subtraction rule (减法公式): $P(A-B)=P(A\bar B)=P(A)-P(AB)$
- Addition rule (加法公式): $P(A\cup B)=P(A)+P(B)-P(AB)$
- Generalization: $P(A\cup B\cup C)=P(A)+P(B)+P(C)-P(AB)-P(AC)-P(BC)+P(ABC)$
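These identities can be checked with exact arithmetic on a finite sample space. The sketch below assumes 12 equally likely outcomes; the events `A`, `B`, `C` and the helper `prob` are hypothetical choices for illustration.

```python
from fractions import Fraction

# Assumption: 12 equally likely outcomes.
omega = set(range(1, 13))


def prob(event):
    """Probability of an event under the uniform law on omega."""
    return Fraction(len(event & omega), len(omega))


A = {n for n in omega if n % 2 == 0}   # even outcomes
B = {n for n in omega if n % 3 == 0}   # multiples of 3
C = {n for n in omega if n <= 4}       # small outcomes

# Subtraction rule: P(A - B) = P(A) - P(AB)
diff = prob(A) - prob(A & B)

# Addition rule for two and for three events
two = prob(A) + prob(B) - prob(A & B)
three = (prob(A) + prob(B) + prob(C)
         - prob(A & B) - prob(A & C) - prob(B & C)
         + prob(A & B & C))

print(prob(A - B) == diff, prob(A | B) == two, prob(A | B | C) == three)
```

Using `Fraction` avoids floating-point rounding, so the equalities can be compared exactly.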
Problem 9
A partition of the sample space $\Omega$ is a collection of disjoint events $S_1,\ldots,S_n$ such that $\Omega=\cup_{i=1}^n S_i$.
- (a) Show that for any event $A$, we have
$$P(A)=\sum_{i=1}^nP(A\cap S_i)$$
- (b) Use part (a) to show that for any events $A$, $B$, and $C$, we have
$$P(A)=P(A\cap B)+P(A\cap C)+P(A\cap B^C\cap C^C)-P(A\cap B\cap C)$$
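One possible partition for part (b) is $\{B,\ B^C\cap C,\ B^C\cap C^C\}$. The sketch below checks both parts numerically on a hypothetical uniform sample space; the events `A`, `B`, `C` are arbitrary illustrative choices, and the check does not replace the proof.

```python
from fractions import Fraction

# Assumption: 12 equally likely outcomes.
omega = set(range(1, 13))


def prob(event):
    """Probability of an event under the uniform law on omega."""
    return Fraction(len(event), len(omega))


A = {1, 2, 3, 6, 7, 11}
B = {n for n in omega if n % 2 == 0}
C = {n for n in omega if n % 3 == 0}
Bc, Cc = omega - B, omega - C

# A candidate partition of omega built from B and C.
partition = [B, Bc & C, Bc & Cc]
assert set().union(*partition) == omega              # the pieces cover omega
assert sum(len(s) for s in partition) == len(omega)  # and they are disjoint

# Part (a): P(A) is the sum of P(A ∩ S_i) over the partition.
part_a = sum(prob(A & S) for S in partition)

# Part (b): the right-hand side of the identity to be shown.
part_b = prob(A & B) + prob(A & C) + prob(A & Bc & Cc) - prob(A & B & C)

print(prob(A), part_a, part_b)
```

The three printed values agree, which is consistent with rewriting $P(A\cap B^C\cap C)$ as $P(A\cap C)-P(A\cap B\cap C)$.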
Problem 10.
Show the formula
$$P((A\cap B^C)\cup(A^C\cap B))=P(A)+P(B)-2P(A\cap B),$$
which gives the probability that exactly one of the events $A$ and $B$ will occur.
SOLUTION
$$\begin{aligned}&P((A\cap B^C)\cup(A^C\cap B))\\ =&P(A\cap B^C)+P(A^C\cap B)\\ =&P(A)-P(A\cap B)+P(B)-P(A\cap B)\end{aligned}$$
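A concrete instance makes the formula easy to verify by hand. The sketch below assumes a fair six-sided die with the hypothetical events `A` (even roll) and `B` (roll greater than 3).

```python
from fractions import Fraction

# Assumption: a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Probability of an event under the uniform law on omega."""
    return Fraction(len(event), len(omega))


A = {2, 4, 6}   # the roll is even
B = {4, 5, 6}   # the roll is greater than 3

# A ∩ B^C is the set difference A - B, and likewise for A^C ∩ B.
exactly_one = (A - B) | (B - A)

direct = prob(exactly_one)
by_formula = prob(A) + prob(B) - 2 * prob(A & B)

print(sorted(exactly_one), direct, by_formula)
```

Here exactly one of the events occurs for the rolls 2 and 5, so both computations give 1/3.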
Problem 11 Bonferroni’s inequality (邦费罗尼不等式)
- (a) Prove that for any two events $A$ and $B$, we have
$$P(A\cap B)\geq P(A)+P(B)-1$$
SOLUTION
- (a) We have $P(A\cup B)=P(A)+P(B)-P(A\cap B)$ and $P(A\cup B)\leq1$, which implies part (a).
Alternatively:
$$\begin{aligned}1+P(A\cap B)&=2P(A\cap B)+P(A\cap B^C)+P(A^C\cap B)+P(A^C\cap B^C)\\&=(P(A\cap B)+P(A\cap B^C))+(P(A\cap B)+P(A^C\cap B))+P(A^C\cap B^C)\\&=P(A)+P(B)+P(A^C\cap B^C)\\&\geq P(A)+P(B)\end{aligned}$$
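The alternative derivation shows that the slack in Bonferroni's inequality is exactly $P(A^C\cap B^C)$. The sketch below checks this over every pair of events for a small, hypothetical non-uniform discrete law; the outcome labels and probabilities in `law` are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical non-uniform discrete law on four outcomes (probabilities sum to 1).
law = {
    "a": Fraction(1, 2),
    "b": Fraction(1, 4),
    "c": Fraction(1, 8),
    "d": Fraction(1, 8),
}
omega = frozenset(law)


def prob(event):
    """Probability of an event as the sum of its outcomes' probabilities."""
    return sum((law[s] for s in event), Fraction(0))


# All 2**4 subsets of omega.
events = [frozenset(c)
          for r in range(len(omega) + 1)
          for c in combinations(sorted(omega), r)]

# Bonferroni: P(A ∩ B) >= P(A) + P(B) - 1 for every pair of events.
bonferroni_holds = all(prob(A & B) >= prob(A) + prob(B) - 1
                       for A in events for B in events)

# The slack 1 + P(A ∩ B) - P(A) - P(B) equals P(A^C ∩ B^C).
slack_matches = all(1 + prob(A & B) - prob(A) - prob(B) == prob(omega - A - B)
                    for A in events for B in events)

print(bonferroni_holds, slack_matches)
```

Equality in the inequality therefore holds exactly when $P(A^C\cap B^C)=0$.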
Problem 13. Continuity property of probabilities (概率的连续性)
- (a) Let $A_1, A_2, \ldots$ be an infinite sequence of events, which is “monotonically increasing,” meaning that $A_n\subset A_{n+1}$ for every $n$. Let $A=\cup_{n=1}^\infty A_n$. Show that
$$P(A)=\lim_{n\rightarrow\infty}P(A_n)$$
[Hint: Express the event $A$ as a union of countably many disjoint sets.]
- (b) Suppose now that the events are “monotonically decreasing,” i.e., $A_{n+1}\subset A_n$ for every $n$. Let $A=\cap_{n=1}^\infty A_n$. Show that $P(A)=\lim_{n\rightarrow\infty}P(A_n)$.
- (c) Consider a probabilistic model whose sample space is the real line. Show that
$$P([0,\infty))=\lim_{n\rightarrow\infty}P([0,n])\\ \lim_{n\rightarrow\infty}P([n,\infty))=0$$
SOLUTION
- (a) Let $B_1=A_1$ and, for $n\geq2$, $B_n=A_n\cap A_{n-1}^C$. The events $B_n$ are disjoint, and we have $\cup_{k=1}^nB_k=A_n$, and $\cup_{k=1}^\infty B_k=A$. By the additivity axiom,
$$P(A)=P(\cup_{k=1}^\infty B_k)=\sum_{k=1}^\infty P(B_k)=\lim_{n\rightarrow\infty}\sum_{k=1}^nP(B_k)=\lim_{n\rightarrow\infty}P(\cup_{k=1}^nB_k)=\lim_{n\rightarrow\infty}P(A_n)$$
- (b) [Hint: Apply the result of part (a) to the complements $A_n^C$, which are monotonically increasing with union $A^C$.]
- (c) For the first equality, use the result from part (a) with $A_n=[0,n]$ and $A=[0,\infty)$. For the second, use the result from part (b) with $A_n=[n,\infty)$ and $A=\varnothing$.
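The construction in part (a) can be illustrated with a concrete law. The sketch below assumes a hypothetical law on the positive integers with $P(\{k\})=2^{-k}$ and takes $A_n=\{1,\ldots,n\}$, so that $B_k=\{k\}$ and $A$ is the whole sample space. It is a numerical illustration of the limit, not a proof.

```python
from fractions import Fraction


def p(k):
    """Hypothetical law on {1, 2, ...}: P({k}) = 2**-k."""
    return Fraction(1, 2 ** k)


def prob_A_n(n):
    """P(A_n) for A_n = {1, ..., n}, summed over the disjoint pieces B_k = {k}."""
    return sum((p(k) for k in range(1, n + 1)), Fraction(0))


for n in (1, 2, 5, 10, 20):
    increasing = prob_A_n(n)      # P(A_n), increasing toward P(A) = 1
    tail = 1 - prob_A_n(n)        # P({n+1, n+2, ...}), decreasing toward 0
    print(n, float(increasing), float(tail))
```

The tail events $\{n+1, n+2, \ldots\}$ are monotonically decreasing with empty intersection, which mirrors the second limit in part (c).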
Discrete Models 离散模型
Discrete probability law (离散概率律)
Note that we are using here the simpler notation $P(s_i)$ to denote the probability of the event $\{s_i\}$, instead of the more precise $P(\{s_i\})$.
Discrete uniform probability law (离散均匀概率律, 古典概型)
- In the special case where the probabilities $P(s_1),\ldots,P(s_n)$ are all the same (by necessity equal to $1/n$), we obtain the following.
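Under a discrete uniform law, the probability of an event is the number of its elements divided by $n$. The sketch below applies this to three tosses of a fair coin, treated as a single experiment with eight equally likely outcomes; the event `exactly_two_heads` is a hypothetical example.

```python
from fractions import Fraction
from itertools import product

# Three tosses of a fair coin form one experiment with 2**3 equally likely outcomes.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]


def prob(event):
    """Uniform law: number of outcomes in the event divided by n."""
    return Fraction(len(event), len(sample_space))


exactly_two_heads = [s for s in sample_space if s.count("H") == 2]

print(exactly_two_heads, prob(exactly_two_heads))
```

The event contains HHT, HTH, and THH, so its probability is 3/8.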
Continuous Models 连续模型
- Probabilistic models with continuous sample spaces differ from their discrete counterparts in that the probabilities of the single-element events may not be sufficient to characterize the probability law.
- Geometric probability (几何概型): the continuous uniform probability law
Example 1.4
- A wheel of fortune (幸运轮) is continuously calibrated from 0 to 1, so the possible outcomes of an experiment consisting of a single spin are the numbers in the interval $\Omega=[0,1]$. Assuming a fair wheel, it is appropriate to consider all outcomes equally likely, but what is the probability of the event consisting of a single element? It cannot be positive, because then, using the additivity axiom, it would follow that events with a sufficiently large number of elements would have probability larger than 1. Therefore, the probability of any event that consists of a single element must be 0.
- In this example, it makes sense to assign probability $b-a$ to any subinterval $[a,b]$ of $[0,1]$, and to calculate the probability of a more complicated set by evaluating its “length”.
- The legitimacy of using length as a probability law hinges on the fact that the unit interval has an uncountably infinite number of elements.
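The length-based law can be illustrated by simulation. The sketch below is a Monte Carlo estimate, assuming Python's pseudo-random `random.Random.random()` is an adequate stand-in for a fair spin; the function name `estimate_interval_probability`, the interval $[0.2, 0.5]$, the seed, and the number of spins are all hypothetical choices.

```python
import random


def estimate_interval_probability(a, b, num_spins=100_000, seed=0):
    """Estimate P([a, b]) as the fraction of simulated spins landing in [a, b]."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    hits = sum(1 for _ in range(num_spins) if a <= rng.random() <= b)
    return hits / num_spins


# The estimate should be close to the interval length b - a = 0.3.
estimate = estimate_interval_probability(0.2, 0.5)
print(estimate)
```

The estimate only approximates $b-a$, with an error that shrinks as `num_spins` grows. A simulation with finitely many floating-point values cannot demonstrate that single points have probability 0; that conclusion rests on the additivity argument in the example.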