Chapter 4 (Further Topics on Random Variables): Covariance and Correlation

These are reading notes for *Introduction to Probability*.

Covariance

  • The covariance of two random variables $X$ and $Y$, denoted by $\text{cov}(X, Y)$, is defined by
    $$\begin{aligned}\text{cov}(X,Y)&=E\big[(X-E[X])(Y-E[Y])\big] \\&=E[XY]-E[X]E[Y]\end{aligned}$$
    • When $\text{cov}(X, Y) = 0$, we say that $X$ and $Y$ are uncorrelated.
    • Roughly speaking, a positive or negative covariance indicates that the values of $X-E[X]$ and $Y-E[Y]$ obtained in a single experiment "tend" to have the same or the opposite sign, respectively. Thus the sign of the covariance provides an important qualitative indicator of the relationship between $X$ and $Y$.
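As a quick numerical check (a minimal sketch of my own, not from the book), the two equivalent forms of the definition can be compared on simulated data; the joint distribution of $X$ and $Y$ below is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary dependent pair: Y = 0.5 X + noise, so cov(X, Y) = 0.5 var(X) = 0.5.
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)

cov_centered = np.mean((x - x.mean()) * (y - y.mean()))  # E[(X - E[X])(Y - E[Y])]
cov_moments = np.mean(x * y) - x.mean() * y.mean()       # E[XY] - E[X]E[Y]

print(cov_centered, cov_moments)  # equal up to floating point, both close to 0.5
```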

Properties of covariances

  • For any random variables $X$, $Y$, and $Z$, and any scalars $a$ and $b$, we have
    $$\text{cov}(X,X)=\text{var}(X)$$
    $$\text{cov}(X,aY+b)=a\cdot \text{cov}(X,Y)$$
    $$\text{cov}(X,Y+Z)=\text{cov}(X,Y)+\text{cov}(X,Z)$$
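These identities are easy to spot-check in code (my own sketch; the scalars $a$, $b$ and the joint distributions are arbitrary choices). In fact all three hold exactly for sample moments as well, not just in expectation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
X = rng.normal(size=n)
Y = 0.3 * X + rng.normal(size=n)  # dependent on X (arbitrary choice)
Z = rng.exponential(size=n)       # independent of X and Y
a, b = 2.0, -3.0

def cov(u, v):
    # Sample version of E[UV] - E[U]E[V], normalized by n like np.var.
    return np.mean(u * v) - u.mean() * v.mean()

print(np.isclose(cov(X, X), X.var()))                    # cov(X, X) = var(X)
print(np.isclose(cov(X, a * Y + b), a * cov(X, Y)))      # scaling; shift b drops out
print(np.isclose(cov(X, Y + Z), cov(X, Y) + cov(X, Z)))  # additivity
```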
  • Note that if $X$ and $Y$ are independent, we have $\text{cov}(X, Y) = E[XY]-E[X]E[Y]=0$. Thus, if $X$ and $Y$ are independent, they are also uncorrelated. However, the converse is generally not true.
    • Assume that $X$ and $Y$ satisfy
      $$E[X\mid Y=y]=E[X],\quad \text{for all } y$$
      Then, assuming $X$ and $Y$ are discrete, the total expectation theorem implies that
      $$\begin{aligned}E[XY]&=\sum_y p_Y(y)E[XY\mid Y=y]=\sum_y y\,p_Y(y)E[X\mid Y=y]\\ &=\sum_y y\,p_Y(y)E[X]=E[X]E[Y]\end{aligned}$$
      so $X$ and $Y$ are uncorrelated. The argument for the continuous case is similar.

Example 4.13.

  • The pair of random variables $(X, Y)$ takes the values $(1, 0), (0, 1), (-1, 0)$, and $(0, -1)$, each with probability $1/4$. Therefore,
    $$\text{cov}(X,Y)=E[XY]-E[X]E[Y]=0-0=0$$
    and $X$ and $Y$ are uncorrelated.
  • However, $X$ and $Y$ are not independent since, for example, a nonzero value of $X$ fixes the value of $Y$ to zero.
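Since the joint PMF has only four points, both claims can be verified exactly by enumeration (a short sketch of my own):

```python
# Joint PMF of Example 4.13: four equally likely points.
pmf = {(1, 0): 0.25, (0, 1): 0.25, (-1, 0): 0.25, (0, -1): 0.25}

EX = sum(p * x for (x, y), p in pmf.items())
EY = sum(p * y for (x, y), p in pmf.items())
EXY = sum(p * x * y for (x, y), p in pmf.items())
print(EXY - EX * EY)  # 0.0 -> X and Y are uncorrelated

# But not independent: P(Y = 0) = 1/2, while P(Y = 0 | X = 1) = 1.
p_x1 = sum(p for (x, y), p in pmf.items() if x == 1)
p_y0 = sum(p for (x, y), p in pmf.items() if y == 0)
p_y0_and_x1 = sum(p for (x, y), p in pmf.items() if x == 1 and y == 0)
print(p_y0, p_y0_and_x1 / p_x1)  # 0.5 vs 1.0
```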

Correlation Coefficient


  • The correlation coefficient $\rho(X,Y)$ of two random variables $X$ and $Y$ that have positive variances is defined as
    $$\rho(X, Y) =\frac{\text{cov}(X, Y)}{\sqrt{\text{var}(X)\,\text{var}(Y)}}$$

The simpler notation $\rho$ will also be used when $X$ and $Y$ are clear from the context.

  • It may be viewed as a normalized version of the covariance $\text{cov}(X, Y)$, and in fact, it can be shown that $\rho$ ranges from $-1$ to $1$.
    • If $\rho>0$ (or $\rho < 0$), then the values of $X - E[X]$ and $Y-E[Y]$ "tend" to have the same (or opposite, respectively) sign. The size of $|\rho|$ provides a normalized measure of the extent to which this is true.
    • In fact, always assuming that $X$ and $Y$ have positive variances, it can be shown that $\rho = 1$ (or $\rho = -1$) if and only if there exists a positive (or negative, respectively) constant $c$ such that
      $$Y-E[Y]=c(X-E[X])$$
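A small numeric illustration (again my own sketch, with arbitrary choices): an exact linear relation gives $\rho = \pm 1$, and adding independent noise pulls $|\rho|$ strictly inside $(-1, 1)$.

```python
import numpy as np

rng = np.random.default_rng(2)

def rho(x, y):
    # Sample correlation: cov(X, Y) / sqrt(var(X) var(Y)).
    c = np.mean(x * y) - x.mean() * y.mean()
    return c / np.sqrt(x.var() * y.var())

x = rng.normal(size=50_000)
print(rho(x, 3.0 * x + 7.0))                # +1: positive linear relation
print(rho(x, -3.0 * x + 7.0))               # -1: negative linear relation
print(rho(x, x + rng.normal(size=50_000)))  # about 1/sqrt(2), strictly inside (-1, 1)
```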

Problem 20. Schwarz inequality

Show that for any random variables $X$ and $Y$, we have
$$(E[XY])^2\leq E[X^2]E[Y^2]$$

SOLUTION

  • We may assume that $E[Y^2]\neq 0$; otherwise, we have $Y = 0$ with probability 1, and hence $E[XY] = 0$, so the inequality holds.
  • We have
    $$\begin{aligned}0&\leq E\Big[\Big(X-\frac{E[XY]}{E[Y^2]}Y\Big)^2\Big]\\ &=E\Big[X^2-2\frac{E[XY]}{E[Y^2]}XY+\frac{(E[XY])^2}{(E[Y^2])^2}Y^2\Big] \\&=E[X^2]-2\frac{E[XY]}{E[Y^2]}E[XY]+\frac{(E[XY])^2}{(E[Y^2])^2}E[Y^2] \\&=E[X^2]-\frac{(E[XY])^2}{E[Y^2]}\end{aligned}$$
    i.e., $(E[XY])^2\leq E[X^2]E[Y^2]$.
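Since the Schwarz inequality also holds for sample moments (it is Cauchy-Schwarz on the sample vectors), a quick sanity check is possible; the distributions below are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_t(df=5, size=100_000)  # arbitrary heavy-tailed choice
y = rng.exponential(size=100_000)

lhs = np.mean(x * y) ** 2
rhs = np.mean(x ** 2) * np.mean(y ** 2)
print(lhs <= rhs)  # True for any samples
```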

Problem 21. Correlation coefficient.

Consider the correlation coefficient
$$\rho(X, Y) =\frac{\text{cov}(X, Y)}{\sqrt{\text{var}(X)\,\text{var}(Y)}}$$
of two random variables $X$ and $Y$ that have positive variances. Show that:

  • $(a)$ $|\rho(X, Y)|\leq1$. [Hint: Use the Schwarz inequality from the preceding problem.]
  • $(b)$ If $Y - E[Y]$ is a positive (or negative) multiple of $X - E[X]$, then $\rho(X, Y) = 1$ [or $\rho(X, Y) = -1$, respectively].
  • $(c)$ If $\rho(X, Y) = 1$ [or $\rho(X, Y) = -1$], then, with probability 1, $Y - E[Y]$ is a positive (or negative, respectively) multiple of $X - E[X]$.

SOLUTION

  • $(a)$ Let $\tilde X = X - E[X]$ and $\tilde Y = Y - E[Y]$. Using the Schwarz inequality, we get
    $$\rho(X, Y)^2 =\frac{(E[\tilde X\tilde Y])^2}{E[\tilde X^2]E[\tilde Y^2]}\leq1$$
    and hence $|\rho(X, Y)|\leq1$.
  • $(b)$ If $\tilde Y = a\tilde X$, then
    $$\rho(X, Y)=\frac{E[\tilde X\cdot a\tilde X]}{\sqrt{E[\tilde X^2]E[(a\tilde X)^2]}}=\frac{a}{|a|}$$
  • $(c)$ If $|\rho(X, Y)| = 1$, the calculation in the solution of Problem 20 yields
    $$\begin{aligned}E\Big[\Big(\tilde X-\frac{E[\tilde X\tilde Y]}{E[\tilde Y^2]}\tilde Y\Big)^2\Big]&=E[\tilde X^2]-\frac{(E[\tilde X\tilde Y])^2}{E[\tilde Y^2]} \\&=E[\tilde X^2]\big(1-(\rho(X,Y))^2\big) \\&=0\end{aligned}$$
    Thus, with probability 1, the random variable
    $$\tilde X-\frac{E[\tilde X\tilde Y]}{E[\tilde Y^2]}\tilde Y$$
    is equal to zero. It follows that, with probability 1,
    $$\tilde X=\frac{E[\tilde X\tilde Y]}{E[\tilde Y^2]}\tilde Y=\sqrt{\frac{E[\tilde X^2]}{E[\tilde Y^2]}}\,\rho(X,Y)\,\tilde Y$$
    i.e., the sign of the constant ratio of $\tilde X$ and $\tilde Y$ is determined by the sign of $\rho(X, Y)$.

Covariance Matrix

  • Let $X_1,\dots,X_n$ be $n$ random variables, with $\mu_i=E(X_i)$ and $\sigma_{ij}=\text{cov}(X_i,X_j)$. The matrix
    $$\begin{aligned}\boldsymbol \Sigma&=(\sigma_{ij})_{n\times n} \\&=\begin{bmatrix}E[(X_1-\mu_1)(X_1-\mu_1)] & E[(X_1-\mu_1)(X_2-\mu_2)] & \cdots & E[(X_1-\mu_1)(X_n-\mu_n)] \\ E[(X_2-\mu_2)(X_1-\mu_1)] & E[(X_2-\mu_2)(X_2-\mu_2)] & \cdots & E[(X_2-\mu_2)(X_n-\mu_n)] \\ \vdots & \vdots & \ddots & \vdots \\ E[(X_n-\mu_n)(X_1-\mu_1)] & E[(X_n-\mu_n)(X_2-\mu_2)] & \cdots & E[(X_n-\mu_n)(X_n-\mu_n)]\end{bmatrix}\end{aligned}$$
    is called the covariance matrix of $X_1,\dots,X_n$.

  • Writing $\mathbf{X}=\left[\begin{array}{c}X_{1} \\ \vdots \\ X_{n}\end{array}\right]\in\mathbb{R}^n$ and $\boldsymbol \mu=\left[\begin{array}{c}\mu_{1} \\ \vdots \\ \mu_{n}\end{array}\right]\in\mathbb{R}^n$, we have
    $$\boldsymbol \Sigma=E\left[(\mathbf{X}-\boldsymbol\mu)(\mathbf{X}-\boldsymbol\mu)^{\top}\right]$$
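In code, $\boldsymbol\Sigma$ is just an averaged outer product of the centered samples; the NumPy sketch below (my own, with an arbitrary 3-dimensional example) reproduces `np.cov`.

```python
import numpy as np

rng = np.random.default_rng(4)

# 100,000 samples of X = A Z with Z standard normal, so the true
# covariance matrix is A @ A.T (an arbitrary illustrative choice).
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
X = rng.normal(size=(100_000, 3)) @ A.T

mu = X.mean(axis=0)
centered = X - mu
Sigma = centered.T @ centered / len(X)  # sample version of E[(X - mu)(X - mu)^T]

print(np.allclose(Sigma, np.cov(X.T, bias=True)))  # matches NumPy's estimator
print(np.round(Sigma - A @ A.T, 2))                # close to the true A A^T
```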

The covariance matrix is positive semidefinite: for any $\boldsymbol{x}\in\mathbb{R}^n$,

$$\begin{aligned}\boldsymbol{x}^{\top} \boldsymbol \Sigma \boldsymbol{x}&=\boldsymbol{x}^{\top} E\left[(\mathbf{X}-\boldsymbol\mu)(\mathbf{X}-\boldsymbol\mu)^{\top}\right]\boldsymbol{x} \\&= E\left[\boldsymbol{x}^{\top}(\mathbf{X}-\boldsymbol\mu)(\mathbf{X}-\boldsymbol\mu)^{\top}\boldsymbol{x}\right] \\&=E\left[\big((\mathbf{X}-\boldsymbol\mu)^{\top}\boldsymbol{x}\big)^{\top}\big((\mathbf{X}-\boldsymbol\mu)^{\top}\boldsymbol{x}\big)\right] \\&=E\left[\left\|(\mathbf{X}-\boldsymbol\mu)^{\top}\boldsymbol{x}\right\|^{2}\right] \\&\geq0\end{aligned}$$
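Numerically, positive semidefiniteness shows up as nonnegative eigenvalues (a quick check of my own; the distribution is an arbitrary non-Gaussian choice):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(10_000, 4)) ** 3  # arbitrary non-Gaussian 4-dim vector
Sigma = np.cov(X.T, bias=True)

# All eigenvalues of a covariance matrix are >= 0 (up to floating-point error).
print(np.linalg.eigvalsh(Sigma).min() >= -1e-10)

# Equivalently, x^T Sigma x >= 0 for any vector x.
x = rng.normal(size=4)
print(x @ Sigma @ x >= 0)
```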

Variance of the Sum of Random Variables

  • If $X_1, X_2, \dots, X_n$ are random variables with finite variance, we have
    $$\text{var}(X_1+X_2)=\text{var}(X_1)+\text{var}(X_2)+2\,\text{cov}(X_1,X_2)$$
    and, more generally (a numerical spot-check follows the proof),
    $$\text{var}\Big(\sum_{i=1}^nX_i\Big)=\sum_{i=1}^n\text{var}(X_i)+\sum_{\{(i,j)\,|\,i\neq j\}}\text{cov}(X_i,X_j)$$

PROOF

  • For brevity, we denote $\tilde X_i=X_i-E[X_i]$. Then
    $$\begin{aligned}\text{var}\Big(\sum_{i=1}^nX_i\Big)&=E\Big[\Big(\sum_{i=1}^n\tilde X_i\Big)^2\Big] \\&=E\Big[\sum_{i=1}^n\sum_{j=1}^n\tilde X_i\tilde X_j\Big] \\&=\sum_{i=1}^n\sum_{j=1}^nE[\tilde X_i\tilde X_j] \\&=\sum_{i=1}^nE[\tilde X_i^2]+\sum_{\{(i,j)\,|\,i\neq j\}}E[\tilde X_i\tilde X_j] \\&=\sum_{i=1}^n\text{var}(X_i)+\sum_{\{(i,j)\,|\,i\neq j\}}\text{cov}(X_i,X_j)\end{aligned}$$
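The two-variable identity can be checked on samples (a sketch of mine; it holds exactly for sample moments when variance and covariance use the same normalization):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
X1 = rng.normal(size=n)
X2 = 0.8 * X1 + rng.normal(size=n)  # arbitrary correlated pair

def cov(u, v):
    # Normalized by n, matching np.var's default.
    return np.mean(u * v) - u.mean() * v.mean()

lhs = np.var(X1 + X2)
rhs = np.var(X1) + np.var(X2) + 2 * cov(X1, X2)
print(np.isclose(lhs, rhs))  # True
```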

Example 4.15.

$n$ people throw their hats in a box and then pick a hat at random. Let us find the variance of $X$, the number of people who pick their own hat.

SOLUTION

  • We have
    $$X = X_1 +\cdots+ X_n$$
    where $X_i$ is the random variable that takes the value $1$ if the $i$th person selects his/her own hat, and takes the value $0$ otherwise. Noting that $X_i$ is Bernoulli with parameter $p = P(X_i = 1) = 1/n$, we obtain
    $$\begin{aligned}E[X_i]&=\frac{1}{n}\\ \text{var}(X_i)&=\frac{1}{n}\Big(1-\frac{1}{n}\Big)\end{aligned}$$
    For $i \neq j$, we have
    $$\begin{aligned}\text{cov}(X_i, X_j) &= E[X_iX_j] - E[X_i] E[X_j] \\&= P(X_i = 1\text{ and } X_j = 1)-\frac{1}{n^2} \\&=P(X_i=1)P(X_j=1\mid X_i=1)-\frac{1}{n^2} \\&=\frac{1}{n}\cdot\frac{1}{n-1}-\frac{1}{n^2} \\&=\frac{1}{n^2(n-1)}\end{aligned}$$
    Therefore,
    $$\begin{aligned}\text{var}(X)&=\text{var}\Big(\sum_{i=1}^nX_i\Big) \\&=\sum_{i=1}^n\text{var}(X_i)+\sum_{\{(i,j)\,|\,i\neq j\}}\text{cov}(X_i,X_j) \\&=n\cdot \frac{1}{n}\Big(1-\frac{1}{n}\Big)+n(n-1)\cdot \frac{1}{n^2(n-1)} \\&=1\end{aligned}$$
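A Monte Carlo check (my own sketch) agrees: both the mean and the variance of the number of matches are close to 1.

```python
import numpy as np

rng = np.random.default_rng(7)

def hat_matches(n, trials=100_000):
    # Count the fixed points of a uniformly random permutation, `trials` times.
    counts = np.empty(trials)
    for t in range(trials):
        counts[t] = np.sum(rng.permutation(n) == np.arange(n))
    return counts

X = hat_matches(n=10)
print(X.mean(), X.var())  # both are close to 1 (for any n >= 2)
```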