1. Bayes’ Formula
$P(A|B)=\frac{P(B|A)P(A)}{P(B)}=\frac{P(A\cap B)}{P(B)}$
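As a quick numerical check of the formula (the disease-screening numbers below are made up for illustration):

```python
# Bayes' formula: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical example: A = "has disease", B = "test is positive".
p_a = 0.01              # prior P(A)
p_b_given_a = 0.95      # P(B|A), test sensitivity
p_b_given_not_a = 0.05  # P(B|not A), false-positive rate

# Total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Posterior P(A|B): surprisingly small because the prior is small
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 4))
```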
2. Basic Statistics
2.1 Expected Value
$E(X)=P(x_1)x_1+P(x_2)x_2+\ldots+P(x_n)x_n$
2.2 Variance
$\sigma^2=E[(X-\mu)^2]$
$\sigma^2(aX)=E[(aX-a\mu)^2]=a^2\sigma^2(X)$
2.3 Covariance
$Cov(X,Y)=E[(X-E[X])(Y-E[Y])]=E[XY]-E[X]E[Y]$
The covariance of $X$ and itself is the variance of $X$:
$Cov(X,X)=E[(X-E[X])(X-E[X])]=\sigma_X^2$
$Cov(X,Y)=\sigma_{XY}$
If $a$, $b$, and $c$ are constants, then
$Cov(a+bX,cY)=Cov(a,cY)+Cov(bX,cY)=bc\,Cov(X,Y)$
The relationship between covariance and variance:
$\sigma_{X \pm Y}^2=\sigma_X^2+\sigma_Y^2\pm 2Cov(X,Y)$
$\sigma_{aX \pm bY}^2=a^2\sigma_X^2+b^2\sigma_Y^2\pm 2ab\,Cov(X,Y)$
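The covariance identities above are easy to verify numerically; a minimal sketch with NumPy (simulated data, population-style estimators):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)  # correlated with x

def cov(u, v):
    # population covariance: E[(U - E[U])(V - E[V])]
    return np.mean((u - u.mean()) * (v - v.mean()))

a, b, c = 2.0, 3.0, -1.5
# Cov(a + bX, cY) = b * c * Cov(X, Y): constants shift out, scales factor out
lhs = cov(a + b * x, c * y)
rhs = b * c * cov(x, y)
print(np.isclose(lhs, rhs))  # True

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
print(np.isclose(np.var(x + y), np.var(x) + np.var(y) + 2 * cov(x, y)))  # True
```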
2.4 Correlation
$\rho=\frac{Cov(X,Y)}{\sigma_X\sigma_Y}$
Correlation has no units and ranges from $-1$ to $+1$.
Variance of correlated variables:
$\sigma_{X \pm Y}^2=\sigma_X^2+\sigma_Y^2\pm 2Cov(X,Y)=\sigma_X^2+\sigma_Y^2\pm 2\rho\sigma_X\sigma_Y$
2.5 Sums of Random Variables
If $X$ and $Y$ are any random variables:
$E[X+Y]=E[X]+E[Y]$
If $X$ and $Y$ are independent:
$Var(X+Y)=Var(X)+Var(Y)$
If $X$ and $Y$ are not independent:
$Var[X+Y]=Var(X)+Var(Y)+2Cov(X,Y)$
2.6 Skewness & Kurtosis
$Skewness=\frac{E(X-\mu_x)^3}{\sigma_x^3}$
Positive Skewness : Mode < Median < Mean
Negative Skewness : Mode > Median > Mean
$Kurtosis=\frac{E[(X-\mu_X)^4]}{(E[(X-\mu_X)^2])^2}$
Excess kurtosis = sample kurtosis - 3
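A sketch of computing sample skewness and kurtosis directly from the definitions above (simulated normal data, so both should land near 0 and 3):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200_000)

mu, sigma = x.mean(), x.std()
z = (x - mu) / sigma
skewness = np.mean(z**3)  # E[(X - mu)^3] / sigma^3
kurtosis = np.mean(z**4)  # E[(X - mu)^4] / (E[(X - mu)^2])^2
excess_kurtosis = kurtosis - 3

# For a normal sample, skewness is close to 0 and kurtosis close to 3
print(round(skewness, 2), round(kurtosis, 2))
```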
3. Common Probability Distribution
4. Central Limit Theorem
5. Measure of Central Tendency
6. Measurement of Dispersion
7. Sampling & Estimation
7.1 Sampling Mean
Assume the random variables $X_i$ are i.i.d., with $E[X_i]=\mu$ and $V[X_i]=\sigma^2$.
When the population mean $\mu$ is not observable, it is estimated using the sample mean estimator, $\hat{\mu}$ (written $\overline{X}$ in mathematical expressions). In this case, $\hat{\mu}$ is an estimator of the unknown population parameter $\mu$.
The mean estimator is unbiased because the expected value of mean estimator is the same as the population mean.
$E[\hat{\mu}]=\frac{1}{n}\sum_{i=1}^n E[x_i]=\mu$
The variance of the mean estimator decreases as the number of observations increases, and so larger samples are better to estimate population mean.
$V[\hat{\mu}]=\frac{1}{n^2}\sum_{i=1}^n V[x_i]=\frac{\sigma^2}{n}$
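A small simulation illustrating $V[\hat{\mu}]=\sigma^2/n$ (the population parameters below are chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 4.0          # population variance (arbitrary choice)
n, trials = 50, 20_000

# Draw many samples of size n and compute each sample mean
samples = rng.normal(loc=1.0, scale=np.sqrt(sigma2), size=(trials, n))
means = samples.mean(axis=1)

# The variance of the sample mean should be close to sigma^2 / n = 0.08
print(round(means.var(), 3), sigma2 / n)
```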
7.2 Sample Variance
Similarly to $\hat{\mu}$, the population variance is estimated using the sample variance estimator, denoted by $\hat{\sigma}^2$.
$\hat{\sigma}^2=\frac{1}{n}\sum_{i=1}^n(x_i-\hat{\mu})^2$
Unlike the sample mean estimator, the sample variance estimator is biased.
$E[\hat{\sigma}^2]=\sigma^2-\frac{\sigma^2}{n}=\frac{n-1}{n}\sigma^2$
The sample variance estimator $S^2$ is unbiased, as $E[S^2]=\sigma^2$:
$S^2=\frac{1}{n-1}\sum_{i=1}^n(x_i-\hat{\mu})^2=\frac{n}{n-1}\hat{\sigma}^2$
The expression $(n-1)$ is known as the degrees of freedom.
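NumPy's `ddof` argument switches between the two estimators, which makes the bias easy to see by simulation (a sketch with a deliberately small $n$ so the bias factor $(n-1)/n$ is visible):

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 5, 200_000

samples = rng.normal(size=(trials, n))  # true variance = 1

# Biased estimator: divide by n (numpy's default, ddof=0)
biased = samples.var(axis=1, ddof=0).mean()
# Unbiased estimator S^2: divide by n - 1 (ddof=1)
unbiased = samples.var(axis=1, ddof=1).mean()

print(round(biased, 3))    # close to (n-1)/n = 0.8
print(round(unbiased, 3))  # close to 1.0
```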
7.3 Standard Error of Sample Mean
Standard error of the sample mean is the standard deviation of distribution of the sample mean.
Known population variance: $SE(\hat{\mu})=\frac{\sigma}{\sqrt{n}}$
Unknown population variance: $SE(\hat{\mu})=\frac{S}{\sqrt{n}}$
7.4 The Central Limit Theorem (CLT)
When selecting simple random samples of size $n$ from a population with mean $\mu$ and finite variance $\sigma^2$, the sampling distribution of the sample mean approaches a normal distribution with mean $\mu$ and variance $\sigma^2/n$ as the sample size becomes large ($n \geq 30$).
$\hat{\mu}_n=\overline{X}\sim N(\mu,\frac{\sigma^2}{n})$
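A minimal CLT sketch: sample means of a clearly non-normal (exponential) population still end up approximately $N(\mu, \sigma^2/n)$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, trials = 50, 100_000

# Exponential population with mean 1 and variance 1 (clearly non-normal)
samples = rng.exponential(scale=1.0, size=(trials, n))
means = samples.mean(axis=1)

# CLT: sample mean is approximately N(mu, sigma^2/n) = N(1, 1/50)
print(round(means.mean(), 2))  # close to 1.0
print(round(means.var(), 3))   # close to 1/50 = 0.02
```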
8. Hypothesis Testing
8.1 Null and Alternative Hypotheses
The null hypothesis ($H_0$), which specifies a parameter value that is assumed to be true.
The alternative hypothesis ($H_1$), which defines the range of values where the null should be rejected.
The test statistic, which has a known distribution when the null is true. In most cases, the test statistic follows a standard normal distribution.
The size of the test, which captures the willingness to make a mistake and falsely reject a null hypothesis that is true.
The test size ($\alpha$) is chosen to reflect the willingness to mistakenly reject a true null hypothesis; it is set by the tester. The most common test size is $5\%$. Smaller test sizes (e.g., $1\%$ or even $0.1\%$) are used when it is especially important to avoid incorrectly rejecting a true null.
The critical value, which is a value that is compared to the test statistic to determine whether to reject the null hypothesis.
The decision rule, which combines the test statistic and critical value to determine whether to reject the null hypothesis.
The test power, which measures the probability that a false null is rejected.
8.2 Test of Mean/Means
$T=\frac{\hat{\mu}-\mu_0}{S/\sqrt{n}} \sim t_{n-1}$
- $n-1$ refers to the degrees of freedom
- when $n$ is small (i.e., less than 30), the Student's $t$ has been documented to provide a better approximation than the normal.
$T=\frac{\hat{\mu}-\mu_0}{\sigma/\sqrt{n}}\sim N(0,1)$
- Consider a test of the null hypothesis about a mean: $H_0:\mu=\mu_0$
- When the true value of the mean ($\mu$) is equal to the value assumed by the null ($\mu_0$), the asymptotic distribution leads to the test statistic
- The test statistic $T$ (also known as the t-statistic) is asymptotically standard normally distributed according to the CLT.
$T=\frac{\hat{\mu}_Z}{\hat{\sigma}_Z/\sqrt{n}}=\frac{\hat{\mu}_X-\hat{\mu}_Y}{\sqrt{\frac{\hat{\sigma}_X^2+\hat{\sigma}_Y^2-2\hat{\sigma}_{XY}}{n}}}$
- Testing whether the means of two series are equal, $H_0: \mu_X=\mu_Y$
- If the null hypothesis is true, then $E[Z_i]=E[X_i]-E[Y_i]=\mu_X-\mu_Y=0$
- When $X_i$ and $Y_i$ are both i.i.d. and mutually independent, the test statistic for testing that the means are equal is:
$T=\frac{\hat{\mu}_X-\hat{\mu}_Y}{\sqrt{\frac{\hat{\sigma}_X^2}{n_X}+\frac{\hat{\sigma}_Y^2}{n_Y}}}$
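A sketch of the independent two-sample test statistic on simulated data (sample sizes and means below are made up; the two populations genuinely differ, so the null should be rejected):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(loc=0.0, scale=1.0, size=400)
y = rng.normal(loc=0.5, scale=1.2, size=300)

# Test statistic for H0: mu_X = mu_Y with independent i.i.d. samples
t_stat = (x.mean() - y.mean()) / np.sqrt(
    x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y)
)

# Two-sided 5% test against the standard normal critical value 1.96
reject = abs(t_stat) > 1.96
print(reject)  # True: the population means genuinely differ
```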
8.3 Difference Between One- and Two-Tailed/Sided Tests
One-tailed test: tests whether a value is greater than or less than a given number.
$H_0:\mu\geq0 \qquad H_1:\mu<0$
$H_0:\mu\leq0 \qquad H_1:\mu>0$
When testing against a one-sided alternative:
$\alpha = 10\% \qquad Critical\;Value=+1.28\;or\;-1.28$
$\alpha = 5\% \qquad Critical\;Value=+1.645\;or\;-1.645$
$\alpha = 1\% \qquad Critical\;Value=+2.326\;or\;-2.326$
Two-tailed test: tests whether a value is equal to a given number.
$H_0:\mu=0 \qquad H_1:\mu\neq0$
When testing against a two-sided alternative:
$\alpha = 10\% \qquad Critical\;Value=\pm1.645$
$\alpha = 5\% \qquad Critical\;Value=\pm1.96$
$\alpha = 1\% \qquad Critical\;Value=\pm2.576$
8.4 Type I & Type II Errors
| Decision | $H_0$ is true | $H_0$ is false |
|---|---|---|
| Fail to reject $H_0$ | Correct ($1-\alpha$) | Type II error ($\beta$) |
| Reject $H_0$ | Type I error ($\alpha$): significance level | Correct ($1-\beta$): power of test |
A Type I error occurs when the null is true, but the null is rejected.
The probability of a Type I error is denoted by the Greek letter $\alpha$, which is also referred to as the test size/significance level.
A Type II error occurs when the alternative is true, but the null is not rejected.
The probability of a Type II error is denoted by the Greek letter $\beta$.
In practice, $\beta$ should be small so that the power of the test, defined as $1-\beta$, is high.
8.5 Confidence Interval Approach
A confidence interval is a range of parameter values that complements the rejection region.
Two-sided test
$CI=[Mean\;Estimate \pm (critical\;value)\times standard\;error]$
One-sided test
If the rejection region is on the left:
$CI=[Mean\;Estimate-(critical\;value)\times standard\;error,\;+\infty)$
If the rejection region is on the right:
$CI=(-\infty,\;Mean\;Estimate+(critical\;value)\times standard\;error]$
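A sketch of the two-sided confidence-interval construction, using the $5\%$ critical value $1.96$ (the data are simulated):

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(loc=10.0, scale=2.0, size=100)

mean = x.mean()
# Standard error of the sample mean (population variance unknown, so use S)
se = x.std(ddof=1) / np.sqrt(len(x))

# Two-sided 95% confidence interval: mean estimate +/- 1.96 * SE
lo, hi = mean - 1.96 * se, mean + 1.96 * se
print(round(lo, 2), round(hi, 2))
```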
9. Regression
9.1 Simple Linear Regression
$Y_i=b_0+b_1X_i+\varepsilon_i$
$Y_i$: dependent or explained variable
$X_i$: independent or explanatory variable
$b_0$: intercept coefficient
The intercept term $b_0$ can be interpreted to mean that when the independent variable is zero, the dependent variable equals $b_0$.
$\hat{b}_0=\overline{Y}-\hat{b}_1\overline{X}$
An estimated slope coefficient of $b_1$ indicates that the dependent variable will change by $b_1$ units for every one-unit change in the independent variable.
$\hat{b}_1=\frac{Cov(X,Y)}{\sigma^2(X)}=\hat{\rho}_{xy}\frac{\sigma_y}{\sigma_x}$
$\varepsilon_i$: error term/shock
The error term is the portion of the dependent variable that cannot be explained by the independent variable.
The error term is assumed to have mean 0, so that
$E[Y]=E[b_0+b_1X+\varepsilon]=b_0+b_1E[X]$
Ordinary Least Squares (OLS)
OLS estimation is the process of estimating the population parameter $b_i$ using the corresponding sample estimate $\hat{b}_i$, which minimizes the sum of squared residuals.
The OLS sample coefficients are those that:
$minimize\sum\varepsilon_i^2=\sum[Y_i-(\hat{b}_0+\hat{b}_1X_i)]^2$
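The closed-form OLS solutions $\hat{b}_1=Cov(X,Y)/\sigma^2(X)$ and $\hat{b}_0=\overline{Y}-\hat{b}_1\overline{X}$ can be sketched directly (simulated data with known true coefficients):

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=500)
y = 2.0 + 3.0 * x + rng.normal(size=500)  # true b0 = 2, b1 = 3

# OLS slope: b1_hat = Cov(X, Y) / Var(X)
b1_hat = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
# OLS intercept: b0_hat = mean(Y) - b1_hat * mean(X)
b0_hat = y.mean() - b1_hat * x.mean()

print(round(b0_hat, 2), round(b1_hat, 2))  # close to 2 and 3
```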
9.2 Multiple Linear Regression
The multiple linear regression model
$Y_i=b_0+b_1X_{1i}+b_2X_{2i}+\cdots+b_kX_{ki}+\varepsilon_i$
The predicted value of the dependent variable
$\hat{Y}=\hat{b}_0+\hat{b}_1X_1+\hat{b}_2X_2+\cdots+\hat{b}_kX_k$
9.3 Total Sum of Squares
Total Sum of Squares
$TSS=\sum(Y_i-\overline{Y})^2$
Explained Sum of Squares
$ESS=\sum(\hat{Y}_i-\overline{Y})^2$
Residual Sum of Squares
$RSS=\sum(Y_i-\hat{Y}_i)^2$
$\sum(Y_i-\overline{Y})^2=\sum(\hat{Y}_i-\overline{Y})^2+\sum(Y_i-\hat{Y}_i)^2$
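The decomposition TSS = ESS + RSS holds exactly for an OLS fit with an intercept; a numerical sketch:

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)

# Fit OLS (closed-form slope and intercept) so the decomposition holds
b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

tss = np.sum((y - y.mean()) ** 2)
ess = np.sum((y_hat - y.mean()) ** 2)
rss = np.sum((y - y_hat) ** 2)

print(np.isclose(tss, ess + rss))                # True
print(np.isclose(ess / tss, 1 - rss / tss))      # True: R^2 both ways
```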
9.4 Measures of Fitness
$R^2=\frac{ESS}{TSS}=1-\frac{RSS}{TSS}$
The coefficient of determination: A more intuitive measure of the “goodness of fit” of the regression. It is interpreted as a percentage of variation in the dependent variable explained by the independent variable. Its limits are 0 ≤ R 2 ≤ 1 0\leq R^2 \leq 1 0≤R2≤1.
$r^2=R^2\to r=\pm\sqrt{R^2}$
Note that in a simple two-variable regression, the square root of $R^2$ is the correlation coefficient ($r$) between $X_i$ and $Y_i$.
Adjusted $R^2 = 1-\frac{RSS/(n-k-1)}{TSS/(n-1)}$
Adjusted $R^2 = 1-\frac{n-1}{n-k-1}(1-R^2)$
Adding a new variable to the model never decreases the $R^2$.
The adjusted $R^2$ is a modified version of $R^2$ that does not necessarily increase when a new independent variable is added.
Adjusted $R^2 \leq R^2$.
Adjusted $R^2$ may be less than zero.
9.5 ANOVA Table
| | df | SS | MSS |
|---|---|---|---|
| Explained | $k$ | ESS | ESS$/k$ |
| Residual | $n-k-1$ | RSS | RSS$/(n-k-1)$ |
| Total | $n-1$ | TSS | |
9.6 Joint Hypothesis Testing
$F=\frac{ESS/k}{RSS/(n-k-1)}$
An F-test is used to test whether at least one slope coefficient is significantly different from zero.
$H_0:b_1=b_2=b_3=\cdots=b_k=0$
$H_1: at\;least\;one\;b_j\neq0\;(j=1\;to\;k)$
The F-test assesses the effectiveness of the model as a whole in explaining the dependent variable.
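A sketch of computing the F-statistic from ESS and RSS for a simulated two-regressor model (the $5\%$ critical value $\approx 3.09$ for $F(2, 97)$ is an assumption taken from standard tables):

```python
import numpy as np

rng = np.random.default_rng(9)
n, k = 100, 2
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.5 * x2 + rng.normal(size=n)

# Fit the multiple regression by least squares
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

ess = np.sum((y_hat - y.mean()) ** 2)
rss = np.sum((y - y_hat) ** 2)

# F = (ESS / k) / (RSS / (n - k - 1))
f_stat = (ess / k) / (rss / (n - k - 1))
print(f_stat > 3.09)  # True: the slopes are genuinely nonzero
```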
10. Stationary Time Series
11. Non-Stationary Time Series
13. Measuring Returns, Volatility and Correlation
13.1 Simple Return
The simple return on an asset bought at time $t-1$ and sold at time $t$ can be expressed as the percentage change in the market variable between the end of day $t-1$ and the end of day $t$.
$R_t=\frac{P_t-P_{t-1}}{P_{t-1}}$
The return of an asset over multiple periods is the product of the simple returns in each period:
$1+R_T=\prod_{t=1}^T(1+R_t)$
13.2 Continuously Compounded Returns
Continuously compounded returns are also known as log returns. They are computed as the difference of the natural logarithms of the price.
$r_t=\ln P_t-\ln P_{t-1}$
The relationship between simple and log returns:
$1+R_t=e^{r_t}$
The main advantage of log returns is that the total return over multiple periods is simply the sum of the single-period log returns. However, the accuracy of the log return approximation is poor when the simple return is large.
$r_T=\sum_{t=1}^T r_t$
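A short sketch contrasting simple and log returns on a hypothetical price series:

```python
import numpy as np

prices = np.array([100.0, 102.0, 99.0, 101.0, 105.0])  # hypothetical prices

# Simple returns: R_t = (P_t - P_{t-1}) / P_{t-1}
simple = prices[1:] / prices[:-1] - 1
# Log returns: r_t = ln(P_t) - ln(P_{t-1})
log_ret = np.diff(np.log(prices))

# Multi-period: product of (1 + R_t) vs. sum of log returns
total_simple = np.prod(1 + simple) - 1
total_log = np.sum(log_ret)

print(np.isclose(1 + total_simple, np.exp(total_log)))  # True
print(round(total_simple, 4))  # 0.05: overall move from 100 to 105
```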
13.3 Measuring Volatility
The volatility of a financial asset is usually measured by the standard deviation of its returns.
Volatility scales with the square root of the holding period. When volatility is measured daily, it is common to convert daily volatility to annualized volatility by scaling by $\sqrt{252}$.
$\sigma_{annual}=\sqrt{252}\,\sigma_{daily}$
The variance (also called the variance rate) of returns is estimated using the standard estimator:
$\hat{\sigma}^2=\frac{1}{T}\sum_{t=1}^T(r_t-\hat{\mu})^2$
13.4 Two methods to test normality of a distribution
The Jarque-Bera Test
It is used to formally test whether the sample skewness and kurtosis are compatible with the assumption that the returns are normally distributed.
$H_0: Skewness=0 \quad and \quad Kurtosis=3$
The test statistic is
$JB=(T-1)\left(\frac{\hat{s}^2}{6}+\frac{(\hat{K}-3)^2}{24}\right)$
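A sketch of the JB statistic on simulated normal returns, using the $(T-1)$ scaling from the formula above (the $5\%$ critical value of $\chi^2_2$, $5.99$, is the usual comparison point):

```python
import numpy as np

rng = np.random.default_rng(10)
r = rng.normal(size=10_000)  # simulated returns, normal by construction
T = len(r)

z = (r - r.mean()) / r.std()
skew = np.mean(z**3)  # sample skewness
kurt = np.mean(z**4)  # sample kurtosis

# JB statistic; asymptotically chi-squared with 2 degrees of freedom under H0
jb = (T - 1) * (skew**2 / 6 + (kurt - 3) ** 2 / 24)

# Compare against the 5% critical value of chi2(2), 5.99, to decide on H0
print(round(jb, 2))
```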
Power Laws
Normal random variables have thin tails, so the probability of a return larger than $K\sigma$ declines rapidly as $K$ increases, whereas many other distributions have tails that decline less quickly for large deviations.