Data2002 - WEEK2

Poisson distribution

A Poisson random variable counts the number of events occurring in a fixed interval; its distribution gives the probability of observing each possible count.

For a Poisson distribution, both E(X) and Var(X) are equal to the parameter λ.

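Simulating in R

lambda <- 6  # example rate; replace with the value of interest
plot(table(rpois(n = 10000, lambda = lambda)), ylab = "Count")

     OR

library(dplyr)  # provides the %>% pipe

rpois(n = 10000, lambda = 6) %>% table() %>% plot(ylab = "Count")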
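
As a quick check of the property E(X) = Var(X) = λ, the sketch below draws a Poisson sample and compares its sample mean and sample variance. The seed and the variable names are illustrative choices, not part of the lecture.

```r
set.seed(2002)                      # fixed seed so the run is reproducible
lambda <- 6                         # same rate as the simulation above
x <- rpois(n = 10000, lambda = lambda)

sample_mean <- mean(x)              # estimates E(X)
sample_var  <- var(x)               # estimates Var(X)

# Both values should be close to lambda
c(mean = sample_mean, var = sample_var)
```

With 10000 draws the two estimates should typically agree with λ to about one decimal place.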

Chi-squared tests for discrete distributions

Suppose we have a sample x_1, x_2, …, x_n and want to test whether it comes from a given distribution function F_0(x | θ_1, θ_2, ..., θ_h), where the θ_l are the parameters of the distribution.

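The general chi-squared goodness-of-fit test groups the data into k categories and uses the test statistic

T = \sum_{i=1}^{k} \frac{(Y_i - e_i)^2}{e_i}

where \displaystyle Y_i is the observed frequency in category i and \displaystyle e_i = n p_i is the expected frequency.

However, the parameters \theta are usually unknown and have to be estimated from the sample. Then p_i is replaced by \widehat{p_i}, so that e_i = n \widehat{p_i}. The observed value of the test statistic is denoted t_0.

Approximate p-value:

P(\chi^{2}_{k-1-q} \geq t_0), where q is the number of parameters we need to estimate.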

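Example from the lecture:

  • Hypothesis: H0: the data come from a Poisson distribution vs H1: the data do not come from a Poisson distribution.

  • Assumptions: The expected frequencies satisfy e_i = n p_i ≥ 5. Observations are independent.

  • Test statistic: T = \sum_{i=1}^{k} \frac{(Y_i - e_i)^2}{e_i}. Under H0, T is approximately \chi^{2}_{k-1-q}; here the degrees of freedom are 2.

  • The observed test statistic: t0 = 1.43

  • P-value: P(\chi^{2}_{2} \geq 1.43) = 0.489, computed in R with pchisq(1.43, 2, lower.tail = FALSE).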

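Note: the probability of the last category is taken as the remainder, p_k = 1 - (p_1 + ... + p_{k-1}), so that the probabilities sum to 1.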

Conclusion: Since the p-value is greater than 0.05, we do not reject the null hypothesis. The data are consistent with a Poisson distribution.

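Directly in R: chisq.test(yr, p = pr), where yr holds the observed counts y_i and pr the estimated probabilities \widehat{p_i} (both from the second figure). Note that the degrees of freedom reported by this call are wrong: chisq.test uses k - 1 and does not subtract the q estimated parameters, so the p-value should be recomputed with pchisq.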
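
The steps of the test can be sketched end to end as follows. The counts are hypothetical (they are not the lecture data), and the sketch assumes that no observation exceeds 3, so the sample mean can be computed from the grouped counts and the last category is treated as "3 or more".

```r
# Hypothetical data (not from the lecture): how often 0, 1, 2, 3 events were seen
x_vals <- 0:3
y      <- c(30, 40, 20, 10)           # observed frequencies y_i
n      <- sum(y)

lambda_hat <- sum(x_vals * y) / n      # estimate lambda by the sample mean (q = 1)

# Estimated category probabilities; the last category is "3 or more",
# so its probability is the remainder and the probabilities sum to 1
p_hat <- dpois(0:2, lambda = lambda_hat)
p_hat <- c(p_hat, 1 - sum(p_hat))

e <- n * p_hat                         # expected frequencies e_i = n * p_i
stopifnot(all(e >= 5))                 # assumption: every e_i is at least 5

t0 <- sum((y - e)^2 / e)               # observed test statistic
k  <- length(y)                        # number of categories
q  <- 1                                # one estimated parameter (lambda)
df <- k - 1 - q
p_value <- pchisq(t0, df = df, lower.tail = FALSE)

# chisq.test returns the same statistic but uses df = k - 1,
# so its p-value is not the one wanted here
ct <- chisq.test(y, p = p_hat)
c(t0 = t0, df = df, p_value = p_value, df_chisq_test = unname(ct$parameter))
```

Only the statistic from chisq.test is reused in practice; the p-value comes from pchisq with the corrected degrees of freedom.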

The conditional probability

P(A|B) = P(A\cap B)/P(B)

Bayes' rule

P(B|A) = \frac{P(A|B) * P(B)}{P(A|B)*P(B) + P(A|B^{C}) *P(B^{C})}
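
A small numeric sketch of Bayes' rule, reading B as "has the disease" and A as "tests positive". The three input probabilities are made-up values for illustration only.

```r
# Hypothetical inputs (not from the lecture)
p_B      <- 0.01   # P(B): prior probability of B
p_A_B    <- 0.95   # P(A | B)
p_A_notB <- 0.05   # P(A | B^C)

# Bayes' rule: P(B | A) = P(A|B) P(B) / (P(A|B) P(B) + P(A|B^C) P(B^C))
p_B_A <- (p_A_B * p_B) / (p_A_B * p_B + p_A_notB * (1 - p_B))
p_B_A
```

Because B is rare in this example, P(B|A) stays well below P(A|B) even though A is much more likely under B.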

                   Actual + (D^{+})   Actual - (D^{-})   Total
Test + (S^{+})            a                  b            a+b
Test - (S^{-})            c                  d            c+d
Total                    a+c                b+d         a+b+c+d

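1. False negative rate (test negative, given the subject is actually positive) = c/(a+c)

2. False positive rate (test positive, given the subject is actually negative) = b/(b+d)

3. Sensitivity (test positive, given the subject is actually positive) = a/(a+c)

4. Specificity (test negative, given the subject is actually negative) = d/(b+d)

5. Precision (actually positive, given the test is positive) = a/(a+b)

6. Negative predictive value (actually negative, given the test is negative) = d/(c+d)

7. Accuracy = (a+d)/(a+b+c+d)

8. Prevalence = (a+c)/(a+b+c+d)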
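
The eight quantities can be computed from the four cell counts as in the sketch below. The counts are hypothetical, and the names tp, fp, fn, tn are illustrative stand-ins for a, b, c, d in the table above.

```r
# Hypothetical counts (not from the lecture)
tp <- 90    # a: test +, actual +
fp <- 30    # b: test +, actual -
fn <- 10    # c: test -, actual +
tn <- 870   # d: test -, actual -
n  <- tp + fp + fn + tn

metrics <- c(
  false_negative_rate = fn / (tp + fn),
  false_positive_rate = fp / (fp + tn),
  sensitivity         = tp / (tp + fn),
  specificity         = tn / (fp + tn),
  precision           = tp / (tp + fp),
  npv                 = tn / (fn + tn),   # negative predictive value
  accuracy            = (tp + tn) / n,
  prevalence          = (tp + fn) / n
)
round(metrics, 3)
```

Sensitivity and the false negative rate sum to 1, as do specificity and the false positive rate, which is a handy sanity check.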

Prospective and retrospective

Prospective (cohort study): A prospective study is based on subjects who are initially identified as disease-free and classified by presence or absence of a risk factor. A random sample from each group is followed in time (prospectively) until eventually classified by disease outcome.

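In short: subjects start disease-free, are classified by presence or absence of the risk factor (R), and are then observed for the actual disease outcome (D).

We can estimate P(D^{+}|R^{+}) as well as P(D^{-}|R^{+}), but we cannot estimate probabilities conditional on D, because we did not sample from the D groups.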

Retrospective (case control) studies: A retrospective study is based on random samples from each of the two outcome categories which are followed back (retrospectively) to determine the presence or absence of the risk factor for each individual.

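In short: subjects are grouped by their disease outcome (D), and we then look back at whether the risk factor (R) was present.

We can estimate P(R^{+}|D^{+}) as well as P(R^{-}|D^{-}), but we cannot estimate probabilities conditional on R, because we did not sample from the R groups.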

Relative risk

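RR = P(D^{+}|R^{+})/P(D^{+}|R^{-}) = \frac{a(c+d)}{c(a+b)}

Here the 2×2 table has rows R^{+}, R^{-} and columns D^{+}, D^{-}, so that P(D^{+}|R^{+}) = a/(a+b) and P(D^{+}|R^{-}) = c/(c+d).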

If D and R are independent then P(D|R) = P(D) so RR = 1 

RR < 1: the disease is less likely to occur in the group with the risk factor.

RR > 1: the disease is more likely to occur in the group with the risk factor.

Relative risk can only be estimated from a prospective study, since it requires P(D|R).
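
A minimal sketch of the relative risk calculation, assuming a prospective 2×2 table with rows R+, R- and columns D+, D-. The counts are hypothetical.

```r
# Hypothetical prospective study counts (not from the lecture)
tab <- matrix(c(30, 70,
                10, 90),
              nrow = 2, byrow = TRUE,
              dimnames = list(risk = c("R+", "R-"), disease = c("D+", "D-")))

risk_exposed   <- tab["R+", "D+"] / sum(tab["R+", ])   # P(D+ | R+) = a / (a + b)
risk_unexposed <- tab["R-", "D+"] / sum(tab["R-", ])   # P(D+ | R-) = c / (c + d)

rr <- risk_exposed / risk_unexposed
rr
```

In this made-up table the disease is about three times as likely in the group with the risk factor.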

Odds ratio

O(A) = P(A)/(1 - P(A))

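O(D^{+}|R^{+}) = P(D^{+}|R^{+})/P(D^{-}|R^{+})

The odds ratio compares the odds of disease in the two risk groups: OR = O(D^{+}|R^{+})/O(D^{+}|R^{-}) = \frac{ad}{bc}

If D and R are independent then P(D|R) = P(D) and OR = 1. The converse also holds: OR = 1 implies that D and R are independent.

Large odds ratios (OR > 1) imply an increased risk of disease and small odds ratios (OR < 1) imply a decreased risk of disease.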

Standard errors and confidence intervals for odds ratios

The asymptotic standard error of log(\widehat{OR}) is \sqrt{\frac{1}{a}+\frac{1}{b} + \frac{1}{c} + \frac{1}{d}}

A confidence interval for log\Theta, where \Theta is the true odds ratio, is approximately log(\widehat{OR}) \pm z^{*} \sqrt{\frac{1}{a}+\frac{1}{b} + \frac{1}{c} + \frac{1}{d}}, where z^{*} is the standard normal quantile for the chosen confidence level (about 1.96 for 95%).

Exponentiating the endpoints gives an approximate confidence interval for the odds ratio:

(exp( log(\widehat{OR}) - z^{*} \sqrt{\frac{1}{a}+\frac{1}{b} + \frac{1}{c} + \frac{1}{d}} ), exp( log(\widehat{OR}) + z^{*} \sqrt{\frac{1}{a}+\frac{1}{b} + \frac{1}{c} + \frac{1}{d}} ))
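
The interval can be computed as in the sketch below. It assumes a 2×2 table with rows R+, R- and columns D+, D-, uses the standard cross-product estimate \widehat{OR} = ad/(bc), and takes a 95% level; the counts are hypothetical.

```r
# Hypothetical counts (not from the lecture)
tab <- matrix(c(30, 70,
                10, 90),
              nrow = 2, byrow = TRUE,
              dimnames = list(risk = c("R+", "R-"), disease = c("D+", "D-")))

# Cross-product estimate of the odds ratio: ad / (bc)
or_hat <- (tab["R+", "D+"] * tab["R-", "D-"]) / (tab["R+", "D-"] * tab["R-", "D+"])

# Asymptotic standard error of log(OR): sqrt(1/a + 1/b + 1/c + 1/d)
se_log_or <- sqrt(sum(1 / tab))

z      <- qnorm(0.975)                               # z* for a 95% interval
ci_log <- log(or_hat) + c(-1, 1) * z * se_log_or     # interval on the log scale
ci_or  <- exp(ci_log)                                # back-transform to the OR scale

list(or_hat = or_hat, se_log_or = se_log_or, ci_or = ci_or)
```

If the resulting interval excludes 1, the data are not consistent with independence of D and R at the chosen level. Note that the interval is symmetric on the log scale but not on the odds ratio scale.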
