- The most important way to characterize a random variable is through the probabilities of the values that it can take. For a discrete random variable $X$, these are captured by the probability mass function (PMF for short) of $X$, denoted $p_X$. In particular, for any real number $x$, the probability mass of $x$, denoted $p_X(x)$, is the probability of the event $\{X = x\}$. Thus, from the additivity and normalization axioms, we have
$$\sum_{x}p_X(x)=1$$
In what follows, we will often omit the braces from the event/set notation when no ambiguity can arise. In particular, we will usually write $P(X = x)$ in place of the more correct notation $P(\{X = x\})$.
We will use upper case characters to denote random variables, and lower case characters to denote real numbers such as the numerical values of a random variable.
The Bernoulli Random Variable
- Consider the toss of a coin, which comes up a head with probability $p$, and a tail with probability $1 - p$. The Bernoulli random variable takes the two values 1 and 0, depending on whether the outcome is a head or a tail:
$$X=\begin{cases}1, & \text{if a head},\\ 0, & \text{if a tail}.\end{cases}$$
Its PMF is
$$p_X(k)=\begin{cases}p, & \text{if } k = 1,\\ 1-p, & \text{if } k = 0.\end{cases}$$
The Binomial Random Variable
- A coin is tossed $n$ times. At each toss, the coin comes up a head with probability $p$, and a tail with probability $1 - p$, independent of prior tosses. Let $X$ be the number of heads in the $n$-toss sequence. We refer to $X$ as a binomial random variable with parameters $n$ and $p$. The PMF of $X$ consists of the binomial probabilities:
$$p_X(k) = P(X = k) = \binom{n}{k}p^k(1-p)^{n-k},\qquad k = 0, 1, \ldots, n.$$
The normalization property, specialized to the binomial random variable, is written as
$$\sum_{k=0}^n\binom{n}{k}p^k(1-p)^{n-k}=1$$
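The binomial PMF and its normalization property are easy to check numerically. A minimal Python sketch (the values $n = 9$ and $p = 0.3$ are arbitrary choices for illustration):

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 9, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Normalization: the probabilities over k = 0, ..., n sum to 1.
print(sum(pmf))  # ≈ 1.0 (up to floating-point error)
```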
Form of the binomial PMF.
- Let $k^*=\lfloor (n + 1)p\rfloor$. The PMF $p_X(k)$ is monotonically nondecreasing with $k$ in the range from $0$ to $k^*$, and is monotonically decreasing with $k$ for $k\geq k^*$.
$$\frac{p_X(k)}{p_X(k-1)}=\frac{(n+1)p-kp}{k-kp}$$
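The monotonicity claim follows from this ratio being at least 1 for $k \le k^*$ and below 1 beyond it, and can be checked directly (a Python sketch; $n = 10$ and $p = 0.4$ are arbitrary choices, giving $k^* = \lfloor 11 \cdot 0.4 \rfloor = 4$):

```python
from math import comb, floor

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.4
k_star = floor((n + 1) * p)  # k* = 4 for these parameters

# Nondecreasing from 0 up to k*, then decreasing for k >= k*.
assert all(binomial_pmf(k, n, p) >= binomial_pmf(k - 1, n, p)
           for k in range(1, k_star + 1))
assert all(binomial_pmf(k + 1, n, p) < binomial_pmf(k, n, p)
           for k in range(k_star, n))
print("mode at k* =", k_star)
```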
Problem 6.
The Celtics and the Lakers are set to play a playoff series of $n$ basketball games, where $n$ is odd. The Celtics have a probability $p$ of winning any one game, independent of other games. For any $k > 0$, find the values of $p$ for which $n = 2k + 1$ is better for the Celtics than $n = 2k - 1$.
SOLUTION
- Let $N$ be the number of Celtics' wins in the first $2k - 1$ games. If $A$ denotes the event that the Celtics win with $n = 2k + 1$, and $B$ denotes the event that the Celtics win with $n = 2k - 1$, then
$$P(A)=P(N\geq k+1)+P(N=k)\cdot(1-(1-p)^2)+P(N=k-1)\cdot p^2$$
$$P(B)=P(N\geq k)=P(N=k)+P(N\geq k+1)$$
and therefore
$$\begin{aligned}P(A)-P(B)&=P(N=k-1)\cdot p^2-P(N=k)\cdot (1-p)^2\\&=\frac{(2k-1)!}{(k-1)!k!}p^k(1-p)^k(2p-1)\end{aligned}$$
It follows that $P(A) > P(B)$ if and only if $p > \frac{1}{2}$. Thus, a longer series is better for the better team.
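The conclusion can be confirmed by computing both series-win probabilities exactly: winning a best-of-$n$ series is equivalent to winning at least $(n+1)/2$ of all $n$ games, even though in practice the series may stop early. A Python sketch (the choice $k = 3$ and the three values of $p$ are arbitrary illustrations):

```python
from math import comb

def win_prob(p, n):
    """Probability of winning a best-of-n series: at least (n+1)/2
    wins in n independent games, each won with probability p."""
    need = (n + 1) // 2
    return sum(comb(n, m) * p**m * (1 - p)**(n - m)
               for m in range(need, n + 1))

k = 3
for p in (0.4, 0.5, 0.6):
    diff = win_prob(p, 2 * k + 1) - win_prob(p, 2 * k - 1)
    print(p, diff)  # positive iff p > 1/2
```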
The Geometric Random Variable
- Suppose that we repeatedly and independently toss a coin with probability of a head equal to $p$, where $0 < p < 1$. The geometric random variable is the number $X$ of tosses needed for a head to come up for the first time. Its PMF is given by
$$p_X(k)=(1-p)^{k-1}p,\qquad k=1,2,\ldots$$
- More generally, we can interpret the geometric random variable in terms of repeated independent trials until the first “success.”
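As a quick numerical check of the geometric PMF (a Python sketch; $p = 0.25$ and the truncation point $K = 200$ are arbitrary choices):

```python
def geometric_pmf(k, p):
    """P(X = k): k - 1 tails followed by the first head on toss k."""
    return (1 - p)**(k - 1) * p

p = 0.25
# The probabilities sum to 1; the partial sum up to K equals 1 - (1-p)^K,
# so the remaining tail vanishes geometrically as K grows.
K = 200
partial = sum(geometric_pmf(k, p) for k in range(1, K + 1))
print(partial)  # ≈ 1.0
```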
The Poisson Random Variable
- A Poisson random variable has a PMF given by
$$p_X(k)=e^{-\lambda}\frac{\lambda^k}{k!},\qquad k=0,1,2,\ldots,$$
where $\lambda$ is a positive parameter characterizing the PMF. This is a legitimate PMF because
$$\sum_{k=0}^\infty e^{-\lambda}\frac{\lambda^k}{k!}=e^{-\lambda}e^{\lambda}=1$$
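The same normalization can be observed numerically: partial sums of the Poisson PMF approach 1. A Python sketch ($\lambda = 3$ is an arbitrary choice):

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with parameter lam."""
    return exp(-lam) * lam**k / factorial(k)

lam = 3.0
# Truncating the infinite sum at k = 100 leaves a negligible tail.
total = sum(poisson_pmf(k, lam) for k in range(100))
print(total)  # ≈ 1.0
```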
Form of the Poisson PMF.
- The PMF $p_X(k)$ increases monotonically with $k$ up to the point where $k$ reaches the largest integer not exceeding $\lambda$, and after that point decreases monotonically with $k$.
$$\frac{p_X(k)}{p_X(k-1)}=\frac{\lambda}{k}$$
Poisson approximation property
- The Poisson PMF with parameter $\lambda$ is a good approximation for a binomial PMF with parameters $n$ and $p$:
$$e^{-\lambda}\frac{\lambda^k}{k!}\approx \frac{n!}{k!(n-k)!}p^k(1-p)^{n-k},\qquad \text{if } k\ll n,$$
provided $\lambda = np$, $n$ is very large, and $p$ is very small. In this case, using the Poisson PMF may result in simpler models and calculations.
- For example, let $n = 100$ and $p = 0.01$. Then the probability of $k = 5$ successes in $n = 100$ trials is calculated using the binomial PMF as $0.00290$. Using the Poisson PMF with $\lambda = np = 100 \cdot 0.01 = 1$, this probability is approximated by $0.00306$.
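The numerical example above can be reproduced directly in Python:

```python
from math import comb, exp, factorial

n, p, k = 100, 0.01, 5
lam = n * p  # lambda = np = 1

binomial = comb(n, k) * p**k * (1 - p)**(n - k)
poisson = exp(-lam) * lam**k / factorial(k)

print(binomial)  # ≈ 0.00290
print(poisson)   # ≈ 0.00306
```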
- Proof: Consider the PMF of a binomial random variable with parameters $n$ and $p = \lambda/n$, and let $n\rightarrow\infty$ and $p\rightarrow 0$ while $np$ is fixed at the given value $\lambda$:
$$p_X(k)=\frac{n!}{(n-k)!k!}p^k(1-p)^{n-k}=\frac{n(n-1)\cdots(n-k+1)}{n^k}\,\frac{\lambda^k}{k!}\left(1-\frac{\lambda}{n}\right)^{n-k}$$
For each fixed $k$ and $j = 1,\ldots,k$, we have
$$\frac{n-k+j}{n}\rightarrow 1,\qquad \left(1-\frac{\lambda}{n}\right)^{-k}\rightarrow 1,\qquad \left(1-\frac{\lambda}{n}\right)^{n}\rightarrow e^{-\lambda}.$$
Thus, for each fixed $k$, as $n\rightarrow\infty$ we obtain
$$p_X(k)\rightarrow e^{-\lambda}\frac{\lambda^k}{k!}$$
Functions of Random Variables
- Given a random variable $X$, one may generate other random variables by applying various transformations on $X$. If $Y = g(X)$ is a function of a random variable $X$, then $Y$ is also a random variable, since it provides a numerical value for each possible outcome.
- If $X$ is discrete with PMF $p_X$, then $Y$ is also discrete, and its PMF $p_Y$ can be calculated using the PMF of $X$:
$$p_Y(y)=\sum_{\{x\mid g(x)=y\}}p_X(x)$$
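This formula translates directly into code: accumulate the masses of all $x$ that map to the same $y$. A minimal Python sketch (the example, $X$ uniform on $\{-2,\ldots,2\}$ with $g(x) = x^2$, is an arbitrary illustration):

```python
from collections import defaultdict

def pmf_of_function(p_X, g):
    """PMF of Y = g(X): p_Y(y) is the sum of p_X(x) over {x | g(x) = y}."""
    p_Y = defaultdict(float)
    for x, px in p_X.items():
        p_Y[g(x)] += px
    return dict(p_Y)

# X uniform on {-2, -1, 0, 1, 2}; Y = X^2 takes the values 0, 1, 4.
p_X = {x: 0.2 for x in (-2, -1, 0, 1, 2)}
p_Y = pmf_of_function(p_X, lambda x: x * x)
print(p_Y)  # {4: 0.4, 1: 0.4, 0: 0.2}
```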
References
- *Introduction to Probability*