ANOVA Study Notes


5.1 One-way Analysis of Variance

5.1.1 Some notations:

$I$ samples: each treatment level corresponds to one sample $i$

$\mu_1, \mu_2, \ldots, \mu_I$: treatment means

$J_1, J_2, \ldots, J_I$: sample sizes

$N$: total number of observations across all samples, $N = J_1 + J_2 + \cdots + J_I$

$x_{ij}$: the $j$-th observation in the $i$-th sample

$\bar{x}_{i.}$: sample mean of the $i$-th sample, $\bar{x}_{i.} = \frac{\sum_{j=1}^{J_i} x_{ij}}{J_i}$

$\bar{x}_{..}$: sample grand mean, $\bar{x}_{..} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J_i} x_{ij}}{N} = {\color{red} \frac{\sum_{i=1}^{I} J_i \bar{x}_{i.}}{N}}$
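As a quick check of the notation, here is a minimal Python sketch on a small made-up data set. The three samples and all variable names are illustrative assumptions, not part of the original notes. It computes $J_i$, $N$, the sample means and the grand mean, and confirms that the grand mean equals the size-weighted average of the sample means.

```python
# Hypothetical data: I = 3 samples of unequal size (values invented for illustration).
samples = [
    [4, 5, 6],
    [6, 7, 8, 7],
    [9, 10, 11],
]

I = len(samples)                                      # number of samples (treatments)
J = [len(s) for s in samples]                         # sample sizes J_i
N = sum(J)                                            # total number of observations
sample_means = [sum(s) / len(s) for s in samples]     # x-bar_{i.}
grand_mean = sum(x for s in samples for x in s) / N   # x-bar_{..}

# The same grand mean, computed as the weighted average of the sample means.
weighted = sum(Ji * m for Ji, m in zip(J, sample_means)) / N

print("I =", I, "J =", J, "N =", N)
print("sample means:", sample_means)
print("grand mean:", grand_mean, "weighted average:", weighted)
```

Note that the grand mean is generally not the plain average of the sample means unless all $J_i$ are equal.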

The hypotheses are:

$H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ vs. $H_1$: two or more of the $\mu_i$ are different.

5.1.2 Assumptions:

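Treatment populations (the populations behind each sample) must

  1. be normal

  2. have the same variance $\sigma^2$ (homogeneity of variance, also called homoscedasticity)

Samples (drawn by random sampling) must be

  1. independent of each other

Note

  1. When normality does not hold, a nonparametric test can be used instead.

  2. Why equal variances are required: although ANOVA stands for Analysis of Variance, its purpose is to test whether the group means are equal. For the means to be comparable, the groups need to share a common variance.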

5.1.3 ANOVA Brief Introduction

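The basic idea of ANOVA is to attribute the variation in the data to two sources and compare them:

  1. Within-group variation arises only from random fluctuation inside each level, i.e. from uncontrollable random factors (random uncertainty).

  2. Between-group variation may arise from

    1) random error: measurement error and differences between individuals

    2) experimental conditions: differences caused by the different treatments, i.e. differences between the treatment means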

5.1.4 SSTr (Treatment Sum of Squares):

SSTr measures the spread of the sample means around the sample grand mean. It quantifies how different the treatment means are from each other, i.e. the inter-sample (between-group) variation.

In other words, we gauge how much the treatment means differ by measuring how far each sample mean deviates from the sample grand mean. (My own understanding: the spread of the sample means around the sample grand mean acts as an estimate of the spread of the treatment means.)

If the sample means are widely scattered around the sample grand mean, the treatment means are very likely to differ, so we reject $H_0$.

Formula for SSTr (the first form is the definition, the second is more convenient for computation):

$$SSTr = \sum_{i=1}^{I} J_i (\bar{x}_{i.} - \bar{x}_{..})^2 = \sum_{i=1}^{I} J_i \bar{x}_{i.}^2 - N \bar{x}_{..}^2$$

Note
SSTr is large, i.e. the sample means are spread out widely $\Rightarrow$ we have reason to conclude that the treatment means differ, and thus reject $H_0$.
SSTr is small, i.e. the sample means are all close to the sample grand mean $\Rightarrow$ we have no reason to doubt that the treatment means are equal, and do not reject $H_0$.
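The two forms of SSTr can be compared numerically. The sketch below uses a small invented data set (the values and the helper name `mean` are assumptions for illustration) and evaluates both the defining formula and the computational shortcut.

```python
# Hypothetical data: three samples with means 5, 7 and 10.
samples = [[4, 5, 6], [6, 7, 8, 7], [9, 10, 11]]

def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

N = sum(len(s) for s in samples)
grand_mean = sum(sum(s) for s in samples) / N

# Definition: size-weighted squared distance of each sample mean from the grand mean.
sstr_def = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)

# Computational shortcut: sum_i J_i * xbar_i^2 - N * xbar^2.
sstr_short = sum(len(s) * mean(s) ** 2 for s in samples) - N * grand_mean ** 2

print(round(sstr_def, 6), round(sstr_short, 6))
```

Both expressions agree up to floating-point rounding. The shortcut is handy for hand calculation, while the definition is usually the numerically safer choice in code.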

5.1.5 SSE (Error Sum of Squares):

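We have covered the formula for SSTr and what it measures, but one question remains: how large must SSTr be for us to reject $H_0$ and conclude that the treatment means are not equal?

This is where SSE comes in: we compare SSTr against SSE.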

SSE measures the variation in the individual sample points around their respective sample means:

$$\begin{aligned} SSE &= \sum_{i=1}^{I}\sum_{j=1}^{J_i} (x_{ij} - \bar{x}_{i.})^2 \\ &= \sum_{i=1}^{I}\sum_{j=1}^{J_i} x_{ij}^2 - \sum_{i=1}^{I} J_i \bar{x}_{i.}^2 \end{aligned}$$

Based on the sample variance formula:

$$s_i^2 = \frac{\sum_{j=1}^{J_i} (x_{ij} - \bar{x}_{i.})^2}{{\color{red} J_i} - 1} \;\Rightarrow\; \sum_{j=1}^{J_i} (x_{ij} - \bar{x}_{i.})^2 = s_i^2 (J_i - 1) \tag{1}$$
Substituting (1) into the definition of SSE gives
$$SSE = \sum_{i=1}^{I} s_i^2 (J_i - 1)$$
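The three expressions for SSE (definition, computational shortcut, and the form based on sample variances) can be checked against each other. This is a minimal sketch on invented data. It relies on `statistics.variance` from the Python standard library, which uses the $J_i - 1$ denominator.

```python
import statistics

# Hypothetical data, invented for illustration.
samples = [[4, 5, 6], [6, 7, 8, 7], [9, 10, 11]]

def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

# Definition: squared deviations of each observation from its own sample mean.
sse_def = sum((x - mean(s)) ** 2 for s in samples for x in s)

# Shortcut: sum of squared observations minus sum_i J_i * xbar_i^2.
sse_short = (sum(x ** 2 for s in samples for x in s)
             - sum(len(s) * mean(s) ** 2 for s in samples))

# Via sample variances: sum_i s_i^2 * (J_i - 1).
sse_var = sum(statistics.variance(s) * (len(s) - 1) for s in samples)

print(sse_def, sse_short, sse_var)
```

The variance-based form is useful when only summary statistics (sample sizes and sample variances) are reported.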

5.1.6 SST (Total Sum of Squares):

SST measures the variation of the individual observations around the sample grand mean:

$$\begin{aligned} SST &= \sum_{i=1}^{I}\sum_{j=1}^{J_i} (x_{ij} - \bar{x}_{..})^2 \\ &= \sum_{i=1}^{I}\sum_{j=1}^{J_i} x_{ij}^2 - N \bar{x}_{..}^2 \end{aligned}$$

Key identity:
$$SST = SSTr + SSE$$
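The notes state this identity without proof. One standard way to see it, sketched here using only the definitions of SST, SSTr and SSE, is to split each deviation from the grand mean into a within-sample part and a between-sample part. The cross term vanishes because the deviations within a sample sum to zero.

```latex
\begin{aligned}
SST &= \sum_{i=1}^{I}\sum_{j=1}^{J_i}
       \bigl[(x_{ij}-\bar{x}_{i.}) + (\bar{x}_{i.}-\bar{x}_{..})\bigr]^2 \\
    &= \underbrace{\sum_{i=1}^{I}\sum_{j=1}^{J_i}(x_{ij}-\bar{x}_{i.})^2}_{SSE}
     + 2\sum_{i=1}^{I}(\bar{x}_{i.}-\bar{x}_{..})
       \underbrace{\sum_{j=1}^{J_i}(x_{ij}-\bar{x}_{i.})}_{=\,0}
     + \underbrace{\sum_{i=1}^{I}J_i(\bar{x}_{i.}-\bar{x}_{..})^2}_{SSTr} \\
    &= SSE + SSTr
\end{aligned}
```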

5.1.7 ANOVA Table

d.f.(SSTr) = $I - 1$
d.f.(SSE) = $N - I$

$MSTr$ (Treatment Mean Square) $= \frac{SSTr}{I-1}$
$MSE$ (Error Mean Square) $= \frac{SSE}{N-I}$


ANOVA Table

| Source of variation | d.f. | Sum of squares | Mean square | F value |
|---|---|---|---|---|
| Treatment | $I - 1$ | $SSTr = \sum_{i=1}^{I} J_i (\bar{x}_{i.} - \bar{x}_{..})^2$ | $MSTr = \frac{SSTr}{I-1}$ | $\frac{MSTr}{MSE}$ |
| Error | $N - I$ | $SSE = \sum_{i=1}^{I}\sum_{j=1}^{J_i} (x_{ij} - \bar{x}_{i.})^2$ | $MSE = \frac{SSE}{N-I}$ | |
| Total | $N - 1$ | $SST = \sum_{i=1}^{I}\sum_{j=1}^{J_i} (x_{ij} - \bar{x}_{..})^2$ | | |
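To make the table concrete, the following sketch fills in every cell for a small invented data set. The function name `one_way_anova` and the dictionary keys are hypothetical choices for illustration. The formulas are the ones listed in the table.

```python
def one_way_anova(samples):
    """Return the one-way ANOVA table quantities for a list of samples."""
    I = len(samples)
    N = sum(len(s) for s in samples)
    means = [sum(s) / len(s) for s in samples]
    grand_mean = sum(sum(s) for s in samples) / N

    # Between-sample and within-sample sums of squares.
    sstr = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

    # Mean squares: each sum of squares divided by its degrees of freedom.
    mstr = sstr / (I - 1)
    mse = sse / (N - I)
    return {
        "df_treatment": I - 1, "df_error": N - I, "df_total": N - 1,
        "SSTr": sstr, "SSE": sse, "SST": sstr + sse,
        "MSTr": mstr, "MSE": mse, "F": mstr / mse,
    }

# Hypothetical data, invented for illustration.
table = one_way_anova([[4, 5, 6], [6, 7, 8, 7], [9, 10, 11]])
for name, value in table.items():
    print(f"{name:>12}: {value:.4f}")
```

Here SST is obtained from the identity SST = SSTr + SSE rather than recomputed from the raw data.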

Key points
1. Hypotheses:

$$\begin{cases} H_0: \mu_1 = \mu_2 = \cdots = \mu_I \\ H_1: \text{two or more means are not equal} \end{cases}$$

2. Test statistic: $F = \frac{MSTr}{MSE}$

3. Decision criteria:
- If $H_0$ is true, $F$ tends to be near 1
- If $H_0$ is false, $F$ tends to exceed 1 (so the test is right-tailed)

4. Underlying principle:

$$\begin{cases} E(SSTr) = (I-1)\sigma^2 & \text{if } H_0 \text{ is true} \\ E(SSTr) > (I-1)\sigma^2 & \text{if } H_0 \text{ is false} \end{cases}$$

Regardless of whether $H_0$ is true, $E({\color{red} SSE}) = (N-I)\sigma^2$

$\Longrightarrow$ $$\begin{cases} E(MSTr) = \frac{(I-1)\sigma^2}{I-1} = \sigma^2 & \text{if } H_0 \text{ is true} \\ E(MSTr) > \sigma^2 & \text{if } H_0 \text{ is false} \end{cases}$$

Regardless of whether $H_0$ is true, $E({\color{red} MSE}) = E\left(\frac{SSE}{N-I}\right) = \frac{(N-I)\sigma^2}{N-I} = \sigma^2$

$\Longrightarrow$ $$\begin{cases} \text{if } H_0 \text{ is true, } F \text{ is near 1} \\ \text{if } H_0 \text{ is false, } F \text{ tends to be greater than 1} \end{cases}$$

5. F test
Suppose the computed F value is $f$ and the significance level is $\alpha$.
If ${\color{red} F_{\alpha}(I-1, N-I)} < f$, we reject $H_0$ at significance level $\alpha$. The data then indicate a significant difference among the treatment means.

6. The equivalent p-value form of the F test
If the p-value $P(F \ge f) < 0.05$, where $F \sim F(I-1, N-I)$ under $H_0$ and $f$ is the observed F value, there is moderate evidence against $H_0$ in favor of $H_1$.
There is a significant difference among the treatment means.
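The critical value and the p-value in points 5 and 6 can be sketched with the Python standard library alone. The helpers `f_sf` and `f_critical` below are hypothetical names written for this illustration. They use the relation between the F upper tail and the regularized incomplete beta function, integrated numerically with Simpson's rule, and they assume an error d.f. of at least 2. The observed value $f = 22.225$ with d.f. $(2, 7)$ is an invented example, not a result from the notes. In practice a statistics library such as SciPy (`scipy.stats.f`) would normally be used instead.

```python
import math

def f_sf(f, d1, d2, steps=2000):
    """Upper-tail probability P(F >= f) for F ~ F(d1, d2).

    Uses P(F >= f) = I_x(d2/2, d1/2) with x = d2 / (d2 + d1 * f), where I_x is
    the regularized incomplete beta function, evaluated by Simpson's rule.
    Assumption: f > 0 and d2 >= 2, so the integrand stays bounded on [0, x].
    """
    a, b = d2 / 2.0, d1 / 2.0
    x = d2 / (d2 + d1 * f)
    h = x / steps                      # steps must be even for Simpson's rule

    def g(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = g(0.0) + g(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return total * h / 3 / math.exp(log_beta)

def f_critical(alpha, d1, d2):
    """Critical value F_alpha(d1, d2), found by bisection on f_sf."""
    lo, hi = 1e-9, 1.0
    while f_sf(hi, d1, d2) > alpha:    # widen the bracket until it holds the root
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_sf(mid, d1, d2) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical example: I = 3 samples, N = 10 observations, observed f = 22.225.
f_obs, d1, d2, alpha = 22.225, 2, 7, 0.05
crit = f_critical(alpha, d1, d2)
p_value = f_sf(f_obs, d1, d2)
print(f"critical value: {crit:.3f}, p-value: {p_value:.5f}")
print("reject H0" if f_obs > crit else "fail to reject H0")
```

The two decision rules are equivalent: $f > F_{\alpha}(I-1, N-I)$ holds exactly when the p-value is below $\alpha$.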

5.2 An Alternate Parameterization

We now introduce another way to parameterize the comparison of treatment means.

Decomposition

  • Assume $X_{ij} \sim N(\mu_i, \sigma^2)$
  • Error term $\varepsilon_{ij} \sim N(0, \sigma^2)$
  • Split each $X_{ij}$ into two parts:
    $X_{ij} = \mu_i + \varepsilon_{ij}$

Define new variables

  • Define the population grand mean
    $\mu = \frac{1}{I}\sum_{i=1}^{I} \mu_i$

  • Define the $i$-th treatment effect $$\alpha_i = \mu_i - \mu \tag{2}$$
    It follows that $\sum_{i=1}^{I} \alpha_i = 0$, since $\sum_{i=1}^{I} (\mu_i - \mu) = I\mu - I\mu = 0$.
    From (2), we know $\mu_i = \alpha_i + \mu$.

  • Thus, we can rewrite $X_{ij}$ as $X_{ij} = \mu_i + \varepsilon_{ij} = {\color{red} \mu + \alpha_i + \varepsilon_{ij}}$, where $\sum_{i=1}^{I} \alpha_i = 0$.

  • For the one-way ANOVA,
    $H_0: \mu_1 = \mu_2 = \cdots = \mu_I \Leftrightarrow H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_I = {\color{red} 0}$

Why are they all equal to 0?
$\because \mu_i = \alpha_i + \mu$,
$\mu$ is fixed,
$\mu_1 = \cdots = \mu_I$,
$\therefore \alpha_1 = \cdots = \alpha_I$
Also $\because \sum_{i=1}^{I} \alpha_i = 0$,
$\therefore \alpha_1 = \alpha_2 = \cdots = \alpha_I = {\color{red} 0}$
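The relationship between the treatment means and the treatment effects can be illustrated numerically. The sketch below uses invented values for $\mu_i$ (they are assumptions, since population means are unknown in practice) and checks that the effects sum to zero and vanish when all means are equal.

```python
# Hypothetical treatment means mu_i (invented for illustration).
mus = [5.0, 7.0, 10.0]
mu = sum(mus) / len(mus)              # population grand mean (unweighted average)
alphas = [m - mu for m in mus]        # treatment effects alpha_i = mu_i - mu
print("alphas:", alphas, "sum:", sum(alphas))

# Under H0 every mu_i is the same, so every alpha_i is zero.
mus_h0 = [7.0, 7.0, 7.0]
mu_h0 = sum(mus_h0) / len(mus_h0)
alphas_h0 = [m - mu_h0 for m in mus_h0]
print("alphas under H0:", alphas_h0)
```

The sum of the effects is zero up to floating-point rounding, which mirrors the constraint $\sum_{i=1}^{I} \alpha_i = 0$.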

5.3 Comparison with Random Effects Model

Comparison of the fixed effects model and the random effects model

| Item | Differences | $H_0$ |
|---|---|---|
| Fixed Effect Model | Treatments are chosen deliberately by the experimenters. Interest is in the specific treatment means. | $\mu_1 = \cdots = \mu_I$ |
| Random Effect Model | Treatments are chosen at random from a population of possible treatments. No particular treatment is of interest. | For every treatment in the population, the treatment means are equal. |

Like the fixed effects model, the random effects model also makes a normality assumption: the population of treatment means is normal.

5.4 Unbalanced Design?

Balanced: $J_1 = J_2 = \cdots = J_I$
A balanced design is preferable: when every level has the same sample size, unequal variances usually have only a small effect on the F test, i.e. the test is more robust.
If the design is unbalanced and the equal-variance assumption is in doubt, use

  1. a nonparametric test
  2. the Welch test
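The notes name the Welch test without giving its formula. As a hedged sketch, the function below computes the usual form of Welch's F statistic and its approximate denominator degrees of freedom, using weights $w_i = J_i / s_i^2$. The function name `welch_anova` and the data are illustrative assumptions. It also assumes every sample has at least two observations and a nonzero variance.

```python
import statistics

def welch_anova(samples):
    """Welch's F statistic with its degrees of freedom (no equal-variance assumption)."""
    k = len(samples)
    n = [len(s) for s in samples]
    means = [statistics.mean(s) for s in samples]
    # Weights J_i / s_i^2: precise samples (small variance, large size) count more.
    w = [ni / statistics.variance(s) for ni, s in zip(n, samples)]
    W = sum(w)
    weighted_mean = sum(wi * m for wi, m in zip(w, means)) / W

    between = sum(wi * (m - weighted_mean) ** 2 for wi, m in zip(w, means)) / (k - 1)
    lam = sum((1 - wi / W) ** 2 / (ni - 1) for wi, ni in zip(w, n))

    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)     # generally not an integer
    return f_stat, df1, df2

# Hypothetical unbalanced data, invented for illustration.
f_stat, df1, df2 = welch_anova([[4, 5, 6], [6, 7, 8, 7], [9, 10, 11]])
print(f"Welch F = {f_stat:.3f} on ({df1}, {df2:.3f}) d.f.")
```

The statistic is then compared with an $F(df_1, df_2)$ distribution in the same right-tailed way as the ordinary F test.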