Moment Generating Function
Definition
For a random variable $X$, if $E[e^{tX}]$ exists $\forall t\in(-h,h)$ for some $h>0$, the moment generating function is $M(t)=E[e^{tX}]$.
Properties
$X$ is a random variable with MGF $M_X(t)$ that exists $\forall t\in(-h,h)$ where $h>0$:
- $Y=aX+b$ where $a,b\in\mathbb{R}$ and $a\ne 0$. Then $M_Y(t)=e^{bt}M_X(at)\ \forall t\in\left(-\frac{h}{|a|},\frac{h}{|a|}\right)$
- $M(0)=1$, and
- $M^{(k)}(0)=E[X^k]$ where $M^{(k)}(t)=\frac{d^k}{dt^k}M(t)$ for $k=1,2,\dots$
- Uniqueness: $M_X(t)=M_Y(t)\ \forall t\in(-h,h)\iff X$ and $Y$ have the same distribution
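A quick sanity check of these properties, as a minimal sketch (assuming SymPy is available; the Exponential($\lambda$) example is chosen only for illustration):

```python
# Minimal sketch: compute the MGF of X ~ Exponential(lambda) by direct
# integration and verify M(0) = 1 and M^(k)(0) = E[X^k].
import sympy as sp

t, x = sp.symbols("t x", real=True)
lam = sp.symbols("lambda", positive=True)

pdf = lam * sp.exp(-lam * x)                 # Exponential(lambda) pdf on [0, oo)
M = sp.simplify(sp.integrate(sp.exp(t * x) * pdf, (x, 0, sp.oo),
                             conds="none"))  # M(t) = lambda/(lambda - t), t < lambda

print(M)                                     # lambda/(lambda - t)
print(M.subs(t, 0))                          # M(0) = 1
print(sp.diff(M, t).subs(t, 0))              # M'(0)  = E[X]   = 1/lambda
print(sp.diff(M, t, 2).subs(t, 0))           # M''(0) = E[X^2] = 2/lambda^2
```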
Multivariate Random Variables
Definition
Suppose $X$ and $Y$ are r.v. defined on the sample space $S$,
- Joint CDF of $X$ and $Y$: $F(x,y)=P(\underbrace{X\le x,\,Y\le y}_{\text{intersection of 2 events}})\ \forall(x,y)\in\mathbb{R}^2$
- Marginal CDF of $X$ and $Y$ (holds for both discrete and continuous r.v.):
    - $F_1(x)/F_X(x)=\lim\limits_{y\to\infty}F(x,y)=P(X\le x),\ \forall x\in\mathbb{R}$
    - $F_2(y)/F_Y(y)=\lim\limits_{x\to\infty}F(x,y)=P(Y\le y),\ \forall y\in\mathbb{R}$
- Marginal PDF of $X$ and $Y$:
    - For discrete r.v.,
        - $f_X(x)=P(X=x)=\displaystyle\sum_{\text{all } y}f(x,y)\ \forall x\in\mathbb{R}$
        - $f_Y(y)=P(Y=y)=\displaystyle\sum_{\text{all } x}f(x,y)\ \forall y\in\mathbb{R}$
    - For continuous r.v.,
        - $f_X(x)=\int_{-\infty}^{\infty}f(x,y)\,dy\ \forall x\in\mathbb{R}$
        - $f_Y(y)=\int_{-\infty}^{\infty}f(x,y)\,dx\ \forall y\in\mathbb{R}$
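A minimal numeric sketch of the marginal computations (the joint pmf table is hypothetical, chosen only for illustration; assumes NumPy):

```python
# Tabulate a discrete joint pmf f(x, y) and recover the marginals by
# summing over the other variable.
import numpy as np

# Rows index x in {0, 1}, columns index y in {0, 1, 2}.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.20, 0.25, 0.15]])
assert np.isclose(joint.sum(), 1.0)   # a valid pmf sums to 1

f_x = joint.sum(axis=1)               # f_X(x) = sum over all y of f(x, y)
f_y = joint.sum(axis=0)               # f_Y(y) = sum over all x of f(x, y)
print(f_x)                            # [0.4 0.6]
print(f_y)                            # [0.3 0.45 0.25]
```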
Independence
Recall: events $A$ and $B$ are independent $\iff P(A\cap B)=P(A)P(B)$
Theorem
$X$ and $Y$ are r.v.
- If joint CDF $F(x,y)$ and marginal CDFs $F_1(x)$ and $F_2(y)$, then $X$ and $Y$ are independent $\iff F(x,y)=F_1(x)F_2(y)\ \forall(x,y)\in\mathbb{R}^2$
- If joint pdf/pmf $f(x,y)$, marginal pdfs/pmfs $f_1(x)$ and $f_2(y)$, and $A_1=\{x:f_1(x)>0\}$, $A_2=\{y:f_2(y)>0\}$, then $X$ and $Y$ are independent $\iff f(x,y)=f_1(x)f_2(y)\ \forall(x,y)\in(A_1\times A_2)$
- (Factorization theorem of independence) If joint pdf/pmf $f(x,y)$ and $A_1=\{x:f_1(x)>0\}$, $A_2=\{y:f_2(y)>0\}$, then $X$ and $Y$ are independent $\iff\exists\, g(x)\ge 0,\ h(y)\ge 0$ such that $f(x,y)=g(x)h(y)\ \forall(x,y)\in(A_1\times A_2)$
- If the support $A$ is not rectangular, then $X$ and $Y$ must be dependent, i.e. $\exists(x,y)$ s.t. $x\in A_1$, $y\in A_2$, but $(x,y)\notin A$, i.e. $f_1(x)>0$, $f_2(y)>0$, $f(x,y)=0$
- If $X$ and $Y$ are independent random variables and $g,h$ are functions, then $g(X)$ and $h(Y)$ are independent (the converse is false)
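A minimal sketch of the factorization criterion on a hypothetical discrete table (assumes NumPy): independence holds exactly when the joint table equals the outer product of its marginals.

```python
import numpy as np

# An independent pair: the joint pmf is the outer product of its marginals.
f_x = np.array([0.4, 0.6])
f_y = np.array([0.3, 0.45, 0.25])
joint = np.outer(f_x, f_y)
print(np.allclose(joint, np.outer(joint.sum(axis=1), joint.sum(axis=0))))  # True

# Perturb two cells (keeping the total at 1): independence breaks.
dep = joint.copy()
dep[0, 0] += 0.05
dep[1, 1] -= 0.05
print(np.allclose(dep, np.outer(dep.sum(axis=1), dep.sum(axis=0))))        # False
```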
Joint Expectations
Definition
- Suppose $X$ and $Y$ are bivariate discrete random variables and $h(x,y)$ is a real-valued function. Then, if $\displaystyle\sum_{(x,y)\in A}|h(x,y)|\,f(x,y)<\infty$, $E[h(X,Y)]=\displaystyle\sum_{(x,y)\in A}h(x,y)f(x,y)$. Otherwise, $E[h(X,Y)]$ DNE.
- Suppose $X$ and $Y$ are bivariate continuous random variables and $h(x,y)$ is a real-valued function. Then, if $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}|h(x,y)|\,f(x,y)\,dx\,dy<\infty$, $E[h(X,Y)]=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}h(x,y)f(x,y)\,dx\,dy$. Otherwise, $E[h(X,Y)]$ DNE.
Properties
- Linearity: $E[aX+bY+c]=aE[X]+bE[Y]+c$
- If $X$ and $Y$ are independent, then $\forall$ functions $g,h$: $E[g(X)h(Y)]=E[g(X)]E[h(Y)]$
Covariance
$Cov(X,Y)=E[(X-\mu_X)(Y-\mu_Y)]$, where $\mu_X=E[X]$, $\mu_Y=E[Y]$
If $Cov(X,Y)=0$, we say $X$ and $Y$ are uncorrelated.
Properties
- $Cov(X,Y)=E[XY]-E[X]E[Y]$
- $X$ and $Y$ are independent $\implies Cov(X,Y)=0$ (the converse is false in general)
- $Cov(X,X)=Var(X)$
- $Var(aX+bY)=a^2Var(X)+b^2Var(Y)+2ab\,Cov(X,Y)$
- $Var\big[\sum_{i=1}^n a_iX_i\big]=\sum_{i=1}^n a_i^2Var(X_i)+\sum_{i\ne j}a_ia_jCov(X_i,X_j)$
- $Cov(X+Y,Z)=Cov(X,Z)+Cov(Y,Z)$
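A Monte Carlo sketch of the covariance identities (assumes NumPy; the constants and the construction of $Y$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)      # correlated with x by construction
a, b = 2.0, -3.0

cov_xy = np.cov(x, y)[0, 1]
# Cov(X, Y) = E[XY] - E[X]E[Y]
print(cov_xy, (x * y).mean() - x.mean() * y.mean())

# Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y)
lhs = np.var(a * x + b * y)
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * cov_xy
print(lhs, rhs)                        # agree up to Monte Carlo error
```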
Correlation Coefficient
$\rho(X,Y)=\dfrac{Cov(X,Y)}{\sqrt{Var(X)}\sqrt{Var(Y)}}$
$-1\le\rho\le 1$ measures the strength of the linear relationship between $X$ and $Y$.
Conditional Distribution
$P(A\mid B)=\dfrac{P(A\cap B)}{P(B)}$, given $P(B)>0$
Suppose $X$ and $Y$ are bivariate random variables with joint pmf/pdf $f(x,y)$, then
- conditional pmf/pdf of $X$ given $Y=y$ is $f_1(x\mid y)=\frac{f(x,y)}{f_2(y)}$, given $f_2(y)>0$
- conditional pmf/pdf of $Y$ given $X=x$ is $f_2(y\mid x)=\frac{f(x,y)}{f_1(x)}$, given $f_1(x)>0$
Properties
$X$ and $Y$ are random variables with marginal pdf/pmf $f_1(x)$ and $f_2(y)$ and marginal supports $A_1$ and $A_2$:
- $X$ and $Y$ are independent $\iff f_1(x\mid y)=f_1(x)\ \forall x\in A_1$, or equivalently $f_2(y\mid x)=f_2(y)\ \forall y\in A_2$
- Product Rule: $f(x,y)=f_1(x\mid y)f_2(y)=f_2(y\mid x)f_1(x)$
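A minimal sketch of conditioning and the product rule on the same hypothetical joint pmf used earlier (assumes NumPy):

```python
import numpy as np

joint = np.array([[0.10, 0.20, 0.10],   # rows: x in {0, 1}
                  [0.20, 0.25, 0.15]])  # cols: y in {0, 1, 2}
f_y = joint.sum(axis=0)

cond_x_given_y = joint / f_y             # f_1(x | y) = f(x, y) / f_2(y)
print(cond_x_given_y[:, 0])              # f_1(x | y=0) = [1/3, 2/3]
print(cond_x_given_y.sum(axis=0))        # each conditional pmf sums to 1

# Product rule: f(x, y) = f_1(x | y) f_2(y)
print(np.allclose(cond_x_given_y * f_y, joint))  # True
```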
Conditional Expectations
For a function $g$, the conditional expectation of $g(Y)$ given $X=x$ is:
$$E[g(Y)\mid X=x]=\begin{cases}\displaystyle\sum_{\text{all } y}g(y)f_2(y\mid x)&\text{if } Y \text{ is a discrete r.v.}\\[1ex]\displaystyle\int_{-\infty}^{\infty}g(y)f_2(y\mid x)\,dy&\text{if } Y \text{ is a continuous r.v.}\end{cases}$$

unless $\sum_{\text{all } y}|g(y)|f_2(y\mid x)$ / $\int_{-\infty}^{\infty}|g(y)|f_2(y\mid x)\,dy$ does not converge, in which case $E[g(Y)\mid X=x]$ DNE.
- $g(Y)=Y$: $E[Y\mid X=x]$ is called the conditional mean
- $g(y)=(y-E[Y\mid X=x])^2\implies Var[Y\mid X=x]=E\big[(Y-E[Y\mid X=x])^2\mid X=x\big]=E[Y^2\mid X=x]-\big(E[Y\mid X=x]\big)^2$ is called the conditional variance
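Continuing the same hypothetical table, a minimal sketch computing the conditional mean and conditional variance of $Y$ given $X=x$ (assumes NumPy):

```python
import numpy as np

joint = np.array([[0.10, 0.20, 0.10],
                  [0.20, 0.25, 0.15]])
y_vals = np.array([0.0, 1.0, 2.0])
f_x = joint.sum(axis=1)

cond_y_given_x = joint / f_x[:, None]    # f_2(y | x), one row per value of x
cond_mean = cond_y_given_x @ y_vals      # E[Y | X = x]
cond_var = cond_y_given_x @ y_vals**2 - cond_mean**2
print(cond_mean)                         # e.g. E[Y | X=0] = 1.0
print(cond_var)                          # e.g. Var[Y | X=0] = 0.5
```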
Properties
- Independence: If $X$ and $Y$ are independent, then $\forall g(\cdot)$ and $h(\cdot)$: $E[g(X)\mid Y=y]=E[g(X)]$ and $E[h(Y)\mid X=x]=E[h(Y)]$
- Substitution Rule: $E[h(X,Y)\mid X=x]=E[h(x,Y)\mid X=x]$
- Double Expectation: $E[g(Y)]=E\big[E[g(Y)\mid X]\big]$
- $Var[Y]=E[\underbrace{Var(Y\mid X)}_{\text{function } h(x)=Var[Y\mid X=x] \text{ applied to r.v. } X}]+Var[E[Y\mid X]]$
- To compute $E[Var(Y\mid X)]$:
    - find $Var(Y\mid X)$: figure out the expression for $h(x)=Var[Y\mid X=x]$, then substitute the random variable $X$ for $x$ in that expression
    - calculate $E[h(X)]$
- To compute $Var[E[Y\mid X]]$:
    - find $E[Y\mid X]$
    - calculate $Var[\overbrace{E[Y\mid X]}^{\tilde{h}(X)}]$
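A Monte Carlo sketch of the decomposition $Var[Y]=E[Var(Y\mid X)]+Var[E[Y\mid X]]$, using a hypothetical Gamma-Poisson hierarchy where both pieces are known in closed form (assumes NumPy):

```python
# X ~ Gamma(2, 1), Y | X = x ~ Poisson(x).
# For a Poisson, E[Y|X] = Var(Y|X) = X, so both terms reduce to moments of X.
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x = rng.gamma(shape=2.0, scale=1.0, size=n)
y = rng.poisson(lam=x)

total = np.var(y)
e_var = np.mean(x)           # E[Var(Y|X)] = E[X]   = 2
var_e = np.var(x)            # Var[E[Y|X]] = Var(X) = 2
print(total, e_var + var_e)  # both approximately 4
```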
Joint Moment Generating Function (MGF)
$X$ and $Y$ are random variables. If $E[e^{t_1X+t_2Y}]$ exists $\forall t_1\in(-h_1,h_1)$ and $t_2\in(-h_2,h_2)$ for some $h_1,h_2>0$, then $M(t_1,t_2)=E[e^{t_1X+t_2Y}]$, for all $t_1,t_2$ such that $E[\cdot]$ exists, is called the joint MGF.
Marginal MGFs from the joint MGF
Given $M(t_1,t_2)$:
- $M_X(t_1)=M(t_1,0)=E[e^{t_1X+0Y}]=E[e^{t_1X}]$
- $M_Y(t_2)=M(0,t_2)=E[e^{0X+t_2Y}]=E[e^{t_2Y}]$
Properties
$X$ and $Y$ are random variables with joint MGF $M(t_1,t_2)$. Then $X$ and $Y$ are independent $\iff M(t_1,t_2)=M_X(t_1)M_Y(t_2)$.
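A minimal sketch of the factorization property for an independent pair (assumes SymPy; Exponential(1) marginals chosen only for illustration):

```python
# For independent X, Y ~ Exponential(1), compute the joint MGF by direct
# integration and check it factorizes into the product of the marginal MGFs.
import sympy as sp

x, y = sp.symbols("x y", nonnegative=True)
t1, t2 = sp.symbols("t1 t2", negative=True)   # restrict t < 1 so E[.] exists

f = sp.exp(-x) * sp.exp(-y)                   # joint pdf = product (independence)
M_joint = sp.integrate(sp.exp(t1*x + t2*y) * f, (x, 0, sp.oo), (y, 0, sp.oo))
M_x = sp.integrate(sp.exp(t1*x) * sp.exp(-x), (x, 0, sp.oo))
M_y = sp.integrate(sp.exp(t2*y) * sp.exp(-y), (y, 0, sp.oo))

print(sp.simplify(M_joint - M_x * M_y))       # 0, i.e. M(t1,t2) = M_X(t1) M_Y(t2)
```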
Multinomial Distribution
$(X_1,X_2,\dots,X_k)\sim Multinomial(n;p_1,p_2,\dots,p_k)$ is a discrete random vector with joint pmf

$$f(x_1,x_2,\dots,x_k)=\begin{cases}\frac{n!}{x_1!x_2!\cdots x_k!}p_1^{x_1}p_2^{x_2}\cdots p_k^{x_k}&x_i=0,1,\dots,n,\ i=1,2,\dots,k,\ \sum_{i=1}^k x_i=n\\0&\text{o.w.}\end{cases}$$

for $0<p_i<1$, $\sum_{i=1}^k p_i=1$.
Concrete example: there are $k$ boxes; randomly pick one of the $k$ boxes, with probability $p_i$ of picking the $i^{th}$ box. Repeat the pick $n$ times independently. $X_i$ is the number of times box $i$ was picked, $i=1,2,\dots,k$.
Properties
$(X_1,X_2,\dots,X_k)\sim Multinomial(n;p_1,p_2,\dots,p_k)$
- Joint MGF: $M(t_1,t_2,\dots,t_k)=E[e^{t_1X_1+t_2X_2+\dots+t_kX_k}]=(p_1e^{t_1}+\dots+p_ke^{t_k})^n\ \forall(t_1,t_2,\dots,t_k)\in\mathbb{R}^k$
- $X_i\sim Binomial(n,p_i)$ for $i=1,2,\dots,k$
- $T=X_i+X_j\ (i\ne j)\implies T\sim Binomial(n,p_i+p_j)$
- $E[X_i]=np_i$, $Var[X_i]=np_i(1-p_i)$, $Cov(X_i,X_j)=-np_ip_j\ (i\ne j)$
- $X_i\mid X_j=x_j\sim Bin\big(n-x_j,\frac{p_i}{1-p_j}\big),\ i\ne j$
- $X_i\mid X_i+X_j=t\sim Bin\big(t,\frac{p_i}{p_i+p_j}\big),\ i\ne j$
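A simulation sketch of the moment formulas (assumes NumPy; $n$ and $p$ are arbitrary):

```python
# Simulate multinomial draws and check E[X_i] = n p_i and
# Cov(X_i, X_j) = -n p_i p_j.
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, np.array([0.2, 0.3, 0.5])
draws = rng.multinomial(n, p, size=1_000_000)   # rows are (X_1, X_2, X_3)

print(draws.mean(axis=0), n * p)                 # E[X_i] = n p_i
print(np.cov(draws[:, 0], draws[:, 1])[0, 1],    # sample Cov(X_1, X_2)
      -n * p[0] * p[1])                          # theory: -n p_1 p_2 = -0.6
```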
Bivariate Normal Distribution
$\vec{X}=\begin{pmatrix}X_1\\X_2\end{pmatrix}\sim BVN(\vec{\mu},\Sigma)$ where $X_1$ and $X_2$ are continuous random variables with joint pdf

$$f(x_1,x_2)=\frac{1}{2\pi|\Sigma|^{1/2}}\exp\left\{-\frac{(x-\mu)^T\Sigma^{-1}(x-\mu)}{2}\right\},$$

where $x=\begin{pmatrix}x_1\\x_2\end{pmatrix}$, $\mu=\begin{pmatrix}\mu_1\\\mu_2\end{pmatrix}$, and $\Sigma=\begin{pmatrix}\sigma_1^2&\rho\sigma_1\sigma_2\\\rho\sigma_1\sigma_2&\sigma_2^2\end{pmatrix}$ is positive definite.
Properties
- $X_1,X_2$ have joint MGF $M(t_1,t_2)=E[e^{t_1X_1+t_2X_2}]=\exp\{t^T\mu+\frac{1}{2}t^T\Sigma t\}\ \forall t=\begin{pmatrix}t_1\\t_2\end{pmatrix}\in\mathbb{R}^2$
- $M_{X_1}(t_1)=M(t_1,0)=\exp\{t_1\mu_1+\frac{1}{2}t_1^2\sigma_1^2\}\to X_1\sim N(\mu_1,\sigma_1^2)$ and $M_{X_2}(t_2)=M(0,t_2)=\exp\{t_2\mu_2+\frac{1}{2}t_2^2\sigma_2^2\}\to X_2\sim N(\mu_2,\sigma_2^2)$
- $Cov(X_1,X_2)=\rho\sigma_1\sigma_2$
- $\rho=0\iff X_1$ and $X_2$ are independent (for the bivariate normal, uncorrelated and independent are equivalent)
- $c=\begin{pmatrix}c_1\\c_2\end{pmatrix}\ne 0$: $c^TX=c_1X_1+c_2X_2\sim N(\mu^Tc,\ c^T\Sigma c)$
- For a nonsingular matrix $A\in\mathbb{R}^{2\times 2}$ and $b\in\mathbb{R}^2$: $AX+b\sim BVN(A\mu+b,\ A\Sigma A^T)$
- $(X-\mu)^T\Sigma^{-1}(X-\mu)\sim\chi^2_{(2)}$
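A simulation sketch of these properties (assumes NumPy; all parameters are arbitrary):

```python
# Sample from a BVN and check Cov(X1, X2) = rho*sigma1*sigma2 and that
# c^T X is normal with the stated mean and variance.
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([1.0, -2.0])
s1, s2, rho = 2.0, 1.5, 0.6
Sigma = np.array([[s1**2,     rho*s1*s2],
                  [rho*s1*s2, s2**2    ]])

X = rng.multivariate_normal(mu, Sigma, size=1_000_000)
print(np.cov(X.T)[0, 1], rho * s1 * s2)          # ~1.8

c = np.array([2.0, -1.0])
z = X @ c                                        # c^T X for every sample
print(z.mean(), c @ mu)                          # mean: mu^T c = 4
print(z.var(),  c @ Sigma @ c)                   # variance: c^T Sigma c = 11.05
```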