5.1 One-way Analysis of Variance
5.1.1 Some notations:
$I$ samples: each treatment level corresponds to one sample, indexed by $i$
$\mu_{1},\mu_{2},\dots,\mu_{I}$: treatment means
$J_{1},J_{2},\dots,J_{I}$: sample sizes
$N$: total number of observations across all samples, $N = J_{1}+J_{2}+\dots+J_{I}$
$x_{ij}$: the $j$-th observation in the $i$-th sample
$\bar{x}_{i.}$: sample mean of the $i$-th sample, $\bar{x}_{i.} = \frac{\sum_{j=1}^{J_{i}} x_{ij}}{J_{i}}$
$\bar{x}_{..}$: sample grand mean, $\bar{x}_{..} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J_{i}} x_{ij}}{N} = {\color{Red} \frac{\sum_{i=1}^{I} J_{i}\bar{x}_{i.}}{N}}$
The hypotheses are:
$H_{0}: \mu_{1} = \mu_{2} = \dots = \mu_{I}$ vs. $H_{1}$: two or more of the $\mu_{i}$ are different.
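The notation above can be made concrete with a small sketch. The data below are hypothetical values made up purely for illustration (three samples of unequal size); the variable names are likewise my own choice, not part of the original notes.

```python
# Hypothetical data: I = 3 samples with unequal sizes (values invented for illustration).
samples = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 2.0, 3.0, 4.0, 5.0],
]

I = len(samples)                   # number of samples (treatments)
J = [len(s) for s in samples]      # sample sizes J_1, ..., J_I
N = sum(J)                         # total number of observations

# Sample means: xbar_i. = (sum over j of x_ij) / J_i
sample_means = [sum(s) / len(s) for s in samples]

# Grand mean, computed in the two equivalent ways given in the notation list
grand_mean = sum(x for s in samples for x in s) / N
grand_mean_weighted = sum(Ji * m for Ji, m in zip(J, sample_means)) / N

print(I, J, N)
print(sample_means)
print(grand_mean, grand_mean_weighted)
```

The two grand-mean expressions agree because weighting each sample mean by its sample size recovers the plain sum of all observations.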
5.1.2 Assumptions:
Treatment populations (the populations behind each sample) must
- be normally distributed
- have the same variance $\sigma^{2}$ (homogeneity of variance, i.e. homoscedasticity)
Samples (drawn by random sampling) must be
- independent of each other
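Note:
- When normality does not hold, a nonparametric test can be used instead.
- Why equal variances are required: although ANOVA stands for Analysis of Variance, its purpose is to test whether the group means are equal. For the means to be comparable, the variances need to be equal.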
5.1.3 ANOVA Brief Introduction
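Basic principle of ANOVA: the differences among the treatment-group means are attributed to two sources of variation.
- Within-group variation arises only from random fluctuation inside each level, i.e. from uncontrollable random factors (random uncertainty).
- Between-group variation may arise from
1) random error: measurement error and differences between individuals
2) experimental conditions: differences caused by the different treatments, i.e. differences between the treatment means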
5.1.4 SSTr (Treatment Sum of Squares):
SSTr is the spread of the sample means around the sample grand mean. It measures how different the treatment means are from each other, i.e. the inter-sample (between-group) variation.
In other words, by measuring how far the individual sample means deviate from the sample grand mean, we assess how much the treatment means differ. (My own reading is that the spread of the sample means around the sample grand mean acts as an estimate of the spread of the treatment means.)
If the sample means are widely scattered around the sample grand mean, it is quite likely that the treatment means differ, so we are inclined to reject $H_{0}$.
Formula for SSTr (the first form is the definition, the second is convenient for computation):
$$
SSTr = \sum_{i=1}^{I} J_{i}(\bar{x}_{i.}-\bar{x}_{..})^2 = \sum_{i=1}^{I} J_{i}\bar{x}_{i.}^2 - N\bar{x}_{..}^2
$$
Note:
- SSTr is large, i.e. the sample means spread out widely $\Rightarrow$ we have reason to conclude that the treatment means differ, and thus reject $H_{0}$.
- SSTr is small, i.e. the sample means are all close to the sample grand mean $\Rightarrow$ the data are consistent with equal treatment means, and we do not reject $H_{0}$.
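As a quick check that the definition and the computational form of SSTr agree, here is a minimal sketch. The data are hypothetical and the function name `sstr_forms` is introduced only for this example.

```python
def sstr_forms(samples):
    """Return SSTr computed by the definition and by the shortcut formula."""
    sizes = [len(s) for s in samples]
    n_total = sum(sizes)
    means = [sum(s) / len(s) for s in samples]
    grand = sum(sum(s) for s in samples) / n_total

    # Definition: sum_i J_i * (xbar_i. - xbar_..)^2
    by_definition = sum(j * (m - grand) ** 2 for j, m in zip(sizes, means))
    # Shortcut: sum_i J_i * xbar_i.^2 - N * xbar_..^2
    by_shortcut = sum(j * m ** 2 for j, m in zip(sizes, means)) - n_total * grand ** 2
    return by_definition, by_shortcut


# Hypothetical data, invented for illustration.
samples = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 2.0, 3.0, 4.0, 5.0],
]
sstr_def, sstr_short = sstr_forms(samples)
print(sstr_def, sstr_short)
```

The shortcut form avoids a second pass over the deviations, but it subtracts two large numbers, so the definition is usually the numerically safer choice.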
5.1.5 SSE (Error Sum of Squares):
We have just covered the formula for SSTr and what it is used for, but one question remains: how large must SSTr be before we reject $H_{0}$ and conclude that the treatment means are not equal?
To answer this we introduce SSE and compare SSTr against it.
SSE measures the variation in the individual sample points around their respective sample means:
$$
\begin{aligned}
SSE &= \sum_{i=1}^{I}\sum_{j=1}^{J_{i}}(x_{ij}-\bar{x}_{i.})^2 \\
&= \sum_{i=1}^{I}\sum_{j=1}^{J_{i}} x_{ij}^2 - \sum_{i=1}^{I} J_{i}\bar{x}_{i.}^2
\end{aligned}
$$
Based on the sample variance formula:
$$
s_{i}^2 = \frac{\sum_{j=1}^{J_{i}}(x_{ij}-\bar{x}_{i.})^2}{{\color{Red} J_{i}}-1}
\;\Rightarrow\;
\sum_{j=1}^{J_{i}}(x_{ij}-\bar{x}_{i.})^2 = s_{i}^2(J_{i}-1) \tag{1}
$$
Substituting (1) into the definition of SSE,
$$
SSE = \sum_{i=1}^{I} s_{i}^2(J_{i}-1)
$$
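The three expressions for SSE (definition, computational form, and the form based on the sample variances) can be compared numerically. This is a sketch on hypothetical data; `sse_forms` is a name made up for the example, and `statistics.variance` is the standard-library sample variance with the $J_{i}-1$ denominator.

```python
from statistics import mean, variance


def sse_forms(samples):
    """Return SSE computed three ways: definition, shortcut, sample variances."""
    means = [mean(s) for s in samples]

    # Definition: squared deviations of each observation from its own sample mean
    by_definition = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    # Shortcut: sum of squared observations minus sum_i J_i * xbar_i.^2
    by_shortcut = (sum(x * x for s in samples for x in s)
                   - sum(len(s) * m ** 2 for s, m in zip(samples, means)))
    # Via sample variances: sum_i (J_i - 1) * s_i^2
    by_variances = sum((len(s) - 1) * variance(s) for s in samples)
    return by_definition, by_shortcut, by_variances


# Hypothetical data, invented for illustration.
samples = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 2.0, 3.0, 4.0, 5.0],
]
sse_def, sse_short, sse_var = sse_forms(samples)
print(sse_def, sse_short, sse_var)
```

The variance-based form is handy when only the summary statistics $J_{i}$ and $s_{i}^2$ are reported rather than the raw observations.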
5.1.6 SST (Total Sum of Squares):
SST measures the variation of the individual observations around the sample grand mean:
$$
\begin{aligned}
SST &= \sum_{i=1}^{I}\sum_{j=1}^{J_{i}}(x_{ij}-\bar{x}_{..})^2 \\
&= \sum_{i=1}^{I}\sum_{j=1}^{J_{i}} x_{ij}^2 - N\bar{x}_{..}^2
\end{aligned}
$$
Key identity:
$$
SST = SSTr + SSE
$$
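The identity is stated without proof in these notes. A short derivation, using only the definitions of the three sums of squares, could run as follows: add and subtract the sample mean inside the square, then note that the cross term vanishes because deviations from a sample mean sum to zero within each sample.

$$
\begin{aligned}
SST &= \sum_{i=1}^{I}\sum_{j=1}^{J_{i}}\bigl[(x_{ij}-\bar{x}_{i.}) + (\bar{x}_{i.}-\bar{x}_{..})\bigr]^2 \\
&= \sum_{i=1}^{I}\sum_{j=1}^{J_{i}}(x_{ij}-\bar{x}_{i.})^2
 + \sum_{i=1}^{I} J_{i}(\bar{x}_{i.}-\bar{x}_{..})^2
 + 2\sum_{i=1}^{I}(\bar{x}_{i.}-\bar{x}_{..})\underbrace{\sum_{j=1}^{J_{i}}(x_{ij}-\bar{x}_{i.})}_{=\,0} \\
&= SSE + SSTr
\end{aligned}
$$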
5.1.7 ANOVA Table
d.f.(SSTr) = I - 1
d.f.(SSE) = N - I
$$
MSTr\ (\text{Treatment Mean Square}) = \frac{SSTr}{I-1}
$$
$$
MSE\ (\text{Error Mean Square}) = \frac{SSE}{N-I}
$$
ANOVA Table
Source of variation | Degrees of freedom | Sum of squares | Mean square | F value |
---|---|---|---|---|
Treatment (factor) | $I - 1$ | $SSTr = \sum_{i=1}^{I} J_{i}(\bar{x}_{i.}-\bar{x}_{..})^2$ | $MSTr = \frac{SSTr}{I-1}$ | $\frac{MSTr}{MSE}$ |
Error | $N - I$ | $SSE = \sum_{i=1}^{I}\sum_{j=1}^{J_{i}}(x_{ij}-\bar{x}_{i.})^2$ | $MSE = \frac{SSE}{N-I}$ | |
Total | $N - 1$ | $SST = \sum_{i=1}^{I}\sum_{j=1}^{J_{i}}(x_{ij}-\bar{x}_{..})^2$ | | |
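The rows of the table can be filled in mechanically from raw data. The following is a minimal sketch; the function name `anova_table`, the dictionary layout, and the data are all assumptions made for this illustration.

```python
def anova_table(samples):
    """Build the one-way ANOVA table as a dict of rows (a sketch, not a library API)."""
    k = len(samples)                              # I, the number of treatments
    sizes = [len(s) for s in samples]
    n = sum(sizes)                                # N, the total number of observations
    means = [sum(s) / len(s) for s in samples]
    grand = sum(sum(s) for s in samples) / n

    sstr = sum(j * (m - grand) ** 2 for j, m in zip(sizes, means))
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    sst = sum((x - grand) ** 2 for s in samples for x in s)   # computed directly

    mstr = sstr / (k - 1)
    mse = sse / (n - k)
    return {
        "Treatment": {"df": k - 1, "SS": sstr, "MS": mstr, "F": mstr / mse},
        "Error": {"df": n - k, "SS": sse, "MS": mse},
        "Total": {"df": n - 1, "SS": sst},
    }


# Hypothetical data, invented for illustration.
samples = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 2.0, 3.0, 4.0, 5.0],
]
table = anova_table(samples)
for source, row in table.items():
    print(source, row)
```

SST is computed directly rather than as SSTr + SSE, so comparing the Total row with the sum of the other two rows doubles as a check of the identity and of the degrees of freedom, $(I-1) + (N-I) = N-1$.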
Key points:
1. Hypotheses:
$$
\left\{\begin{matrix} H_{0}: \mu_{1}=\mu_{2}=\dots=\mu_{I} \\ H_{1}: \text{two or more means are not equal} \end{matrix}\right.
$$
2. Test statistic: $F = \frac{MSTr}{MSE}$
3. Decision criteria:
- If $H_{0}$ is true, $F$ is near 1
- If $H_{0}$ is false, $F$ tends to be greater than 1 (so the test is right-tailed)
4. Underlying rationale:
$$
\left\{\begin{matrix} E(SSTr) = (I-1)\sigma^2 \text{ if } H_{0} \text{ is true} \\ E(SSTr) > (I-1)\sigma^2 \text{ if } H_{0} \text{ is false} \end{matrix}\right.
$$
Whether or not $H_{0}$ is true, however,
$$
E({\color{Red}SSE}) = (N-I)\sigma^2
$$
$\Longrightarrow$
$$
\left\{\begin{matrix} E(MSTr) = \frac{(I-1)\sigma^2}{I-1} = \sigma^2 \text{ if } H_{0} \text{ is true} \\ E(MSTr) > \sigma^2 \text{ if } H_{0} \text{ is false} \end{matrix}\right.
$$
Whether or not $H_{0}$ is true, however,
$$
E({\color{Red}MSE}) = E\left(\frac{SSE}{N-I}\right) = \frac{(N-I)\sigma^2}{N-I} = \sigma^2
$$
$\Longrightarrow$
$$
\left\{\begin{matrix} \text{if } H_{0} \text{ is true, } F \text{ is near } 1 \\ \text{if } H_{0} \text{ is false, } F \text{ tends to be} > 1 \end{matrix}\right.
$$
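The notes state the inequality $E(SSTr) > (I-1)\sigma^2$ under a false $H_{0}$ without showing where it comes from. One way to see it, assuming only independent observations with common variance $\sigma^2$, and writing $\tilde{\mu} = \frac{1}{N}\sum_{i=1}^{I} J_{i}\mu_{i}$ (a symbol introduced here, not in the original notes), is to take expectations in the computational form of SSTr, using $E(\bar{x}_{i.}^2) = \mu_{i}^2 + \sigma^2/J_{i}$ and $E(\bar{x}_{..}^2) = \tilde{\mu}^2 + \sigma^2/N$:

$$
\begin{aligned}
E(SSTr) &= \sum_{i=1}^{I} J_{i}\left(\mu_{i}^2 + \frac{\sigma^2}{J_{i}}\right) - N\left(\tilde{\mu}^2 + \frac{\sigma^2}{N}\right) \\
&= (I-1)\sigma^2 + \sum_{i=1}^{I} J_{i}\mu_{i}^2 - N\tilde{\mu}^2 \\
&= (I-1)\sigma^2 + \sum_{i=1}^{I} J_{i}(\mu_{i}-\tilde{\mu})^2
\end{aligned}
$$

The last sum is zero exactly when all $\mu_{i}$ are equal and strictly positive otherwise, which gives both cases above. Dividing by $I-1$ yields $E(MSTr) = \sigma^2 + \frac{1}{I-1}\sum_{i=1}^{I} J_{i}(\mu_{i}-\tilde{\mu})^2$.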
5. F test
Suppose the computed F value is $f$ and the significance level is $\alpha$. If ${\color{Red} F_{\alpha}(I-1,N-I)} < f$, we reject $H_{0}$ at significance level $\alpha$. The data then indicate a significant difference among the treatment means.
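6. The p-value form of the F test
The p-value is the right-tail probability $P(F \ge f)$, where $F \sim F(I-1, N-I)$ under $H_{0}$ and $f$ is the observed F value. If the p-value is below 0.05, there is moderate evidence against $H_{0}$ in favor of $H_{1}$: there is a significant difference among the treatment means.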
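Points 5 and 6 can be sketched in code. In practice a statistics library would supply the F distribution; the version below uses only the Python standard library, so the helper names (`reg_inc_beta`, `f_sf`, `f_critical`) are my own, and the observed value `f_obs` with its degrees of freedom is hypothetical. It relies on the standard relation $P(F \ge f) = I_{x}(d_2/2,\, d_1/2)$ with $x = d_2/(d_2 + d_1 f)$, where $I_{x}$ is the regularized incomplete beta function, evaluated here by a continued fraction.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd-numbered term of the continued fraction
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_sf(f, d1, d2):
    """Right-tail probability P(F >= f) for F ~ F(d1, d2), i.e. the p-value."""
    if f <= 0.0:
        return 1.0
    return reg_inc_beta(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))


def f_critical(alpha, d1, d2):
    """Upper-alpha critical value F_alpha(d1, d2), found by bisection on f_sf."""
    lo, hi = 0.0, 1.0
    while f_sf(hi, d1, d2) > alpha:   # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f_sf(mid, d1, d2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical example: I = 3 treatments, N = 12 observations, observed f = 17.9.
f_obs, d1, d2 = 17.9, 2, 9
alpha = 0.05
crit = f_critical(alpha, d1, d2)
p_value = f_sf(f_obs, d1, d2)
reject = f_obs > crit
print(crit, p_value, reject)
```

Comparing `f_obs` with the critical value and comparing the p-value with $\alpha$ are two views of the same decision: $f > F_{\alpha}(I-1, N-I)$ holds exactly when $P(F \ge f) < \alpha$.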
5.2 An Alternate Parameterization
We now introduce another way to compare treatment means.
Decomposition
- Assume $X_{ij} \sim N(\mu_{i},\sigma^2)$
- Error term $\varepsilon_{ij} \sim N(0,\sigma^2)$
- Each $X_{ij}$ is split into two parts:
$X_{ij} = \mu_{i} + \varepsilon_{ij}$
Define new variables
- Define the population grand mean $\mu = \frac{1}{I}\sum_{i=1}^{I}\mu_{i}$
- Define the $i$-th treatment effect
$$
\alpha_{i} = \mu_{i} - \mu \tag{2}
$$
Summing (2) over $i$, we easily get $\sum_{i=1}^{I}\alpha_{i} = 0$. From (2), we also know $\mu_{i} = \alpha_{i} + \mu$.
- Thus the equation for $X_{ij}$ becomes $X_{ij} = \mu_{i} + \varepsilon_{ij} = {\color{Red}\mu + \alpha_{i} + \varepsilon_{ij}}$, where $\sum_{i=1}^{I}\alpha_{i} = 0$
- For the one-way ANOVA,
$$
H_{0}: \mu_{1}=\mu_{2}=\dots=\mu_{I} \;\Leftrightarrow\; H_{0}: \alpha_{1}=\alpha_{2}=\dots=\alpha_{I} = {\color{Red} 0}
$$
Why are they all equal to 0?
$\because \mu_{i} = \alpha_{i} + \mu$, $\mu$ is fixed, and $\mu_{1} = \dots = \mu_{I}$,
$\therefore \alpha_{1} = \dots = \alpha_{I}$.
Moreover, $\because \sum_{i=1}^{I}\alpha_{i} = 0$,
$\therefore \alpha_{1} = \alpha_{2} = \dots = \alpha_{I} = {\color{Red} 0}$.
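To make the parameterization tangible, the sketch below forms plug-in estimates of $\mu$ and the $\alpha_{i}$ by replacing each $\mu_{i}$ with its sample mean. This estimation step is my own illustration rather than something the notes describe, and the data are hypothetical. Because $\mu$ is defined as the unweighted average of the $\mu_{i}$, the estimate uses the unweighted average of the sample means, which differs from the sample grand mean when the design is unbalanced.

```python
# Hypothetical data, invented for illustration.
samples = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 2.0, 3.0, 4.0, 5.0],
]

# Estimate each treatment mean mu_i by the corresponding sample mean.
sample_means = [sum(s) / len(s) for s in samples]

# Plug-in estimate of mu = (1/I) * sum_i mu_i: unweighted average of the sample means.
mu_hat = sum(sample_means) / len(sample_means)

# Plug-in estimates of the treatment effects alpha_i = mu_i - mu.
alpha_hat = [m - mu_hat for m in sample_means]

print(mu_hat)
print(alpha_hat)
print(sum(alpha_hat))   # the estimated effects sum to zero, mirroring the constraint
```

The estimated effects inherit the constraint $\sum_{i}\alpha_{i} = 0$ by construction, so only $I-1$ of them are free, matching the $I-1$ degrees of freedom of the treatment sum of squares.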
5.3 Comparison with Random Effects Model
Comparison of the fixed effects model and the random effects model
Item | Differences | $H_{0}$ |
---|---|---|
Fixed Effects Model | Treatments are chosen deliberately by the experimenters. Interest is in the specific treatment means. | $\mu_{1} = \dots = \mu_{I}$ |
Random Effects Model | Treatments are chosen at random from a population of possible treatments. There is no interest in any particular treatment. | For every treatment in the population, the treatment means are equal. |
Like the fixed effects model, the random effects model carries a normality assumption: it assumes that the population of treatment means is normal.
5.4 Unbalanced Design?
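Balanced: $J_{1} = J_{2} = \dots = J_{I}$
A balanced design is actually preferable, because equal sample sizes across the levels make it easier to satisfy the equal-variance requirement.
In a balanced experiment the effect of unequal variances is usually not large, so the test is more robust.
If the design is unbalanced and equal variances cannot be assumed, use
- a nonparametric test
- the Welch test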
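The notes only name the Welch test. As a rough sketch of what it computes, the function below follows the usual textbook form of Welch's heteroscedastic one-way ANOVA: each sample is weighted by $w_{i} = J_{i}/s_{i}^2$ and the denominator degrees of freedom are adjusted. The function name `welch_anova` and the data are assumptions for illustration, and the result should be checked against a statistics package before being relied on.

```python
from statistics import mean, variance


def welch_anova(samples):
    """Welch's one-way ANOVA: return (F statistic, df1, df2).

    Assumes every sample has at least two observations and nonzero variance.
    """
    k = len(samples)
    sizes = [len(s) for s in samples]
    means = [mean(s) for s in samples]
    weights = [n / variance(s) for n, s in zip(sizes, samples)]   # w_i = J_i / s_i^2
    w_total = sum(weights)

    # Variance-weighted grand mean
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / w_total

    numerator = sum(w * (m - weighted_mean) ** 2
                    for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / w_total) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    f_stat = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)   # generally not an integer
    return f_stat, df1, df2


# Hypothetical unbalanced data with visibly different spreads.
samples = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 3.0, 5.0, 7.0, 9.0],
]
f_welch, df1, df2 = welch_anova(samples)
print(f_welch, df1, df2)
```

With only two samples this statistic reduces to the square of Welch's two-sample t statistic, and `df2` reduces to the Welch-Satterthwaite degrees of freedom, which is a convenient sanity check.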