Analysis of Variance (2): Two-Factor ANOVA with Excel Examples


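The previous article introduced the basic idea of one-factor ANOVA, from the hypotheses and the test statistic through to the final decision, together with a worked example in Excel. Multi-factor ANOVA rests on the same idea: it also compares variances through an $F$ test statistic. This article therefore concentrates on the test statistics and decision rules for the two-factor case, followed by Excel examples.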

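Two-factor ANOVA resembles the one-factor method but differs in one respect: with two factors, the result can be affected not only by each factor on its own but also by the interaction between them. The sections below give the data layout, the sums of squared deviations, and the ANOVA table, first for two-factor ANOVA without replication and then for two-factor ANOVA with replication.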

Two-factor ANOVA without replication

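Two-factor ANOVA without replication studies how changes in two factors ($A$ and $B$) affect the data when each combination of levels is observed only once. The hypotheses are set up as in the one-factor case: $H_0$: the $r \times k$ population means are all equal; $H_1$: the $r \times k$ population means are not all equal. The data are laid out as follows: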

| Factor $A$ \ Factor $B$ | $B_1$ | $B_2$ | $\cdots$ | $B_k$ | Mean $\bar{x}_{i\cdot}$ |
| --- | --- | --- | --- | --- | --- |
| $A_1$ | $x_{11}$ | $x_{12}$ | $\cdots$ | $x_{1k}$ | $\bar{x}_{1\cdot}$ |
| $A_2$ | $x_{21}$ | $x_{22}$ | $\cdots$ | $x_{2k}$ | $\bar{x}_{2\cdot}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | | $\vdots$ | $\vdots$ |
| $A_r$ | $x_{r1}$ | $x_{r2}$ | $\cdots$ | $x_{rk}$ | $\bar{x}_{r\cdot}$ |
| Mean $\bar{x}_{\cdot j}$ | $\bar{x}_{\cdot 1}$ | $\bar{x}_{\cdot 2}$ | $\cdots$ | $\bar{x}_{\cdot k}$ | $\bar{\bar{x}}$ |

Here the column means (one per level of $B$, each averaging over $i = 1, 2, \ldots, r$) are

$$\bar{x}_{\cdot 1}=\frac{1}{r}\sum_{i=1}^{r} x_{i1}, \quad \bar{x}_{\cdot 2}=\frac{1}{r}\sum_{i=1}^{r} x_{i2}, \quad \ldots, \quad \bar{x}_{\cdot k}=\frac{1}{r}\sum_{i=1}^{r} x_{ik},$$

and the row means (one per level of $A$, each averaging over $j = 1, 2, \ldots, k$) are

$$\bar{x}_{1\cdot}=\frac{1}{k}\sum_{j=1}^{k} x_{1j}, \quad \bar{x}_{2\cdot}=\frac{1}{k}\sum_{j=1}^{k} x_{2j}, \quad \ldots, \quad \bar{x}_{r\cdot}=\frac{1}{k}\sum_{j=1}^{k} x_{rj}.$$

The grand mean $\bar{\bar{x}}$ is

$$\bar{\bar{x}} = \frac{1}{r}\sum_{i=1}^{r}\bar{x}_{i\cdot} = \frac{1}{k}\sum_{j=1}^{k}\bar{x}_{\cdot j} = \frac{1}{rk}\sum_{i=1}^{r}\sum_{j=1}^{k} x_{ij}.$$
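As a quick numerical check of these definitions, the following minimal Python sketch computes the row means, column means, and grand mean of a small table and confirms that the three expressions for the grand mean agree. The 2 x 3 data and all variable names are made up for illustration and do not come from the article.

```python
# Hypothetical 2 x 3 table: rows are levels of factor A, columns are levels of factor B.
x = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
]
r, k = len(x), len(x[0])

# Row means (one per level of A) and column means (one per level of B).
row_means = [sum(row) / k for row in x]
col_means = [sum(x[i][j] for i in range(r)) / r for j in range(k)]

# Grand mean, computed in the three equivalent ways given above.
grand_from_cells = sum(sum(row) for row in x) / (r * k)
grand_from_rows = sum(row_means) / r
grand_from_cols = sum(col_means) / k

print("row means:   ", row_means)
print("column means:", col_means)
print("grand mean:  ", grand_from_cells, grand_from_rows, grand_from_cols)
```

For this table the row means are 2 and 4, the column means are 1.5, 3 and 4.5, and all three grand-mean expressions give 3.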

The sums of squared deviations for each source of variation are given first.

Sums of squares and degrees of freedom

Total sum of squares
$$SS_T = \sum_{j=1}^{k}\sum_{i=1}^{r}(x_{ij}-\bar{\bar{x}})^2,$$
with $df_T = kr-1$ degrees of freedom.

Sum of squares for factor A
This measures the difference in effect between the levels of factor A (its main effect):
$$SS_A = \sum_{j=1}^{k}\sum_{i=1}^{r}(\bar{x}_{i\cdot}-\bar{\bar{x}})^2 = k\sum_{i=1}^{r}(\bar{x}_{i\cdot}-\bar{\bar{x}})^2,$$
with $df_A = r-1$ degrees of freedom.

Sum of squares for factor B
This measures the difference in effect between the levels of factor B (its main effect):
$$SS_B = \sum_{j=1}^{k}\sum_{i=1}^{r}(\bar{x}_{\cdot j}-\bar{\bar{x}})^2 = r\sum_{j=1}^{k}(\bar{x}_{\cdot j}-\bar{\bar{x}})^2,$$
with $df_B = k-1$ degrees of freedom.

Sum of squares for the random error
$$SS_e = \sum_{j=1}^{k}\sum_{i=1}^{r}(x_{ij}-\bar{x}_{i\cdot}-\bar{x}_{\cdot j}+\bar{\bar{x}})^2,$$
with $df_e = (k-1)(r-1)$ degrees of freedom.

The sums of squares satisfy the identity
$$SS_T = SS_A + SS_B + SS_e.$$
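The decomposition can be checked numerically. The sketch below evaluates each sum of squares directly from its definition and compares both sides of the identity. The helper name `two_way_ss` and the data are assumptions made up for illustration.

```python
def two_way_ss(x):
    """Sums of squares for an r x k table with one observation per cell."""
    r, k = len(x), len(x[0])
    row_means = [sum(row) / k for row in x]
    col_means = [sum(x[i][j] for i in range(r)) / r for j in range(k)]
    grand = sum(sum(row) for row in x) / (r * k)

    ss_t = sum((x[i][j] - grand) ** 2 for i in range(r) for j in range(k))
    ss_a = k * sum((m - grand) ** 2 for m in row_means)
    ss_b = r * sum((m - grand) ** 2 for m in col_means)
    ss_e = sum(
        (x[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r)
        for j in range(k)
    )
    return {"SS_T": ss_t, "SS_A": ss_a, "SS_B": ss_b, "SS_e": ss_e}


# Made-up 2 x 3 data (rows: levels of A, columns: levels of B).
ss = two_way_ss([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
print(ss)
print("identity holds:", abs(ss["SS_T"] - (ss["SS_A"] + ss["SS_B"] + ss["SS_e"])) < 1e-9)
```

With this data $SS_T = 16$, $SS_A = 6$, $SS_B = 9$ and $SS_e = 1$, so the two sides of the identity match.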

ANOVA table for the two-factor design without replication

| Source | Sum of squares | Degrees of freedom | Mean square | $F$ | Critical value |
| --- | --- | --- | --- | --- | --- |
| Factor $A$ | $SS_A$ | $df_A$ | $MS_A=\frac{SS_A}{r-1}$ | $F_A=\frac{MS_A}{MS_e}$ | $F_{\alpha}(df_A, df_e)$ |
| Factor $B$ | $SS_B$ | $df_B$ | $MS_B=\frac{SS_B}{k-1}$ | $F_B=\frac{MS_B}{MS_e}$ | $F_{\alpha}(df_B, df_e)$ |
| Error $e$ | $SS_e$ | $df_e$ | $MS_e=\frac{SS_e}{(r-1)(k-1)}$ | | |
| Total | $SS_T=SS_A+SS_B+SS_e$ | $rk-1$ | | | |

At significance level $\alpha$, the effect of a factor is judged significant when its $F$ value exceeds the critical value in the last column, or equivalently when its $p$-value (the right-tail probability of the $F$ distribution beyond the observed $F$) is below $\alpha$.
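The rows of this table can be assembled in a few lines. The following is a minimal sketch under stated assumptions: the function name `anova_without_replication` and the data are hypothetical, the layout needs at least two levels per factor and a non-zero error mean square, and the $p$-values and critical values are left out because they require the $F$ distribution.

```python
def anova_without_replication(x):
    """Build the ANOVA table for an r x k layout with one observation per cell."""
    r, k = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (r * k)
    row_means = [sum(row) / k for row in x]
    col_means = [sum(col) / r for col in zip(*x)]

    ss_a = k * sum((m - grand) ** 2 for m in row_means)
    ss_b = r * sum((m - grand) ** 2 for m in col_means)
    ss_t = sum((v - grand) ** 2 for row in x for v in row)
    ss_e = ss_t - ss_a - ss_b  # from SS_T = SS_A + SS_B + SS_e

    df_a, df_b = r - 1, k - 1
    df_e = df_a * df_b
    ms_a, ms_b, ms_e = ss_a / df_a, ss_b / df_b, ss_e / df_e
    return {
        "A": {"SS": ss_a, "df": df_a, "MS": ms_a, "F": ms_a / ms_e},
        "B": {"SS": ss_b, "df": df_b, "MS": ms_b, "F": ms_b / ms_e},
        "error": {"SS": ss_e, "df": df_e, "MS": ms_e},
        "total": {"SS": ss_t, "df": r * k - 1},
    }


# Made-up 2 x 3 data, used only for illustration.
table = anova_without_replication([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
for source, row in table.items():
    print(source, row)
```

For this data $MS_A = 6$, $MS_B = 4.5$ and $MS_e = 0.5$, giving $F_A = 12$ and $F_B = 9$.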

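Two-factor ANOVA with replication

Two-factor ANOVA with replication covers experiments on two factors in which each combination of levels is observed repeatedly. With the same number of replicates $p$ in every cell, the data are laid out as follows: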

| Factor $A$ \ Factor $B$ | $B_1$ | $B_2$ | $\cdots$ | $B_k$ | Mean $\bar{x}_{i\cdot\cdot}$ |
| --- | --- | --- | --- | --- | --- |
| $A_1$ | $x_{111}$ | $x_{121}$ | $\cdots$ | $x_{1k1}$ | |
| | $x_{112}$ | $x_{122}$ | $\cdots$ | $x_{1k2}$ | |
| | $\vdots$ | $\vdots$ | | $\vdots$ | |
| | $x_{11p}$ | $x_{12p}$ | $\cdots$ | $x_{1kp}$ | |
| Mean $\bar{x}_{1j\cdot}$ | $\bar{x}_{11\cdot}$ | $\bar{x}_{12\cdot}$ | $\cdots$ | $\bar{x}_{1k\cdot}$ | $\bar{x}_{1\cdot\cdot}$ |
| $A_2$ | $x_{211}$ | $x_{221}$ | $\cdots$ | $x_{2k1}$ | |
| | $x_{212}$ | $x_{222}$ | $\cdots$ | $x_{2k2}$ | |
| | $\vdots$ | $\vdots$ | | $\vdots$ | |
| | $x_{21p}$ | $x_{22p}$ | $\cdots$ | $x_{2kp}$ | |
| Mean $\bar{x}_{2j\cdot}$ | $\bar{x}_{21\cdot}$ | $\bar{x}_{22\cdot}$ | $\cdots$ | $\bar{x}_{2k\cdot}$ | $\bar{x}_{2\cdot\cdot}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | | $\vdots$ | $\vdots$ |
| $A_r$ | $x_{r11}$ | $x_{r21}$ | $\cdots$ | $x_{rk1}$ | |
| | $x_{r12}$ | $x_{r22}$ | $\cdots$ | $x_{rk2}$ | |
| | $\vdots$ | $\vdots$ | | $\vdots$ | |
| | $x_{r1p}$ | $x_{r2p}$ | $\cdots$ | $x_{rkp}$ | |
| Mean $\bar{x}_{rj\cdot}$ | $\bar{x}_{r1\cdot}$ | $\bar{x}_{r2\cdot}$ | $\cdots$ | $\bar{x}_{rk\cdot}$ | $\bar{x}_{r\cdot\cdot}$ |
| Mean $\bar{x}_{\cdot j\cdot}$ | $\bar{x}_{\cdot 1\cdot}$ | $\bar{x}_{\cdot 2\cdot}$ | $\cdots$ | $\bar{x}_{\cdot k\cdot}$ | $\bar{\bar{x}}$ |

where

  • the mean of all $r \times k \times p$ observations is
    $$\bar{\bar{x}} = \frac{1}{rkp}\sum_{i=1}^{r}\sum_{j=1}^{k}\sum_{m=1}^{p} x_{ijm};$$

  • the mean of the $p$ replicates under $A_i$ and $B_j$ is
    $$\bar{x}_{ij\cdot} = \frac{1}{p}\sum_{m=1}^{p} x_{ijm},$$
    for example $\bar{x}_{11\cdot} = \frac{1}{p}\sum_{m=1}^{p} x_{11m}$ and $\bar{x}_{12\cdot} = \frac{1}{p}\sum_{m=1}^{p} x_{12m}$;

  • the mean of the $k \times p$ observations under $A_i$ is
    $$\bar{x}_{i\cdot\cdot} = \frac{1}{kp}\sum_{j=1}^{k}\sum_{m=1}^{p} x_{ijm},$$
    for example $\bar{x}_{1\cdot\cdot} = \frac{1}{kp}\sum_{j=1}^{k}\sum_{m=1}^{p} x_{1jm}$ and $\bar{x}_{2\cdot\cdot} = \frac{1}{kp}\sum_{j=1}^{k}\sum_{m=1}^{p} x_{2jm}$;

  • the mean of the $r \times p$ observations under $B_j$ is
    $$\bar{x}_{\cdot j\cdot} = \frac{1}{rp}\sum_{i=1}^{r}\sum_{m=1}^{p} x_{ijm},$$
    for example $\bar{x}_{\cdot 1\cdot} = \frac{1}{rp}\sum_{i=1}^{r}\sum_{m=1}^{p} x_{i1m}$ and $\bar{x}_{\cdot 2\cdot} = \frac{1}{rp}\sum_{i=1}^{r}\sum_{m=1}^{p} x_{i2m}$.

Note: the table above assumes an equal number of replicates in every cell. When the numbers of replicates differ, the means are computed in a similar way.
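The four kinds of mean can be illustrated with a small nested list. This is only a sketch: the 2 x 2 design with $p = 2$ replicates, the data values, and the variable names are assumptions chosen for illustration, with `data[i][j]` holding the replicates observed under $A_i$ and $B_j$.

```python
# data[i][j] holds the p replicates observed at level A_i and level B_j (made-up values).
data = [
    [[1.0, 3.0], [2.0, 4.0]],
    [[5.0, 7.0], [10.0, 12.0]],
]
r, k, p = len(data), len(data[0]), len(data[0][0])

# Cell means: average over the p replicates of each (A_i, B_j) combination.
cell_means = [[sum(cell) / p for cell in row] for row in data]

# Means per level of A: average over the k * p observations in row i.
a_means = [sum(sum(cell) for cell in row) / (k * p) for row in data]

# Means per level of B: average over the r * p observations in column j.
b_means = [sum(sum(data[i][j]) for i in range(r)) / (r * p) for j in range(k)]

# Grand mean over all r * k * p observations.
grand = sum(sum(sum(cell) for cell in row) for row in data) / (r * k * p)

print("cell means:", cell_means)
print("A means:   ", a_means)
print("B means:   ", b_means)
print("grand mean:", grand)
```

Here the cell means are 2, 3, 6 and 11, the means for $A_1$ and $A_2$ are 2.5 and 8.5, the means for $B_1$ and $B_2$ are 4 and 7, and the grand mean is 5.5.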

Sums of squares and degrees of freedom

Total sum of squares
$$SS_T = \sum_{i=1}^{r}\sum_{j=1}^{k}\sum_{m=1}^{p}(x_{ijm} - \bar{\bar{x}})^2,$$
with $df_T = rkp-1$ degrees of freedom.

Sum of squares for factor $A$
$$SS_A = \sum_{i=1}^{r}\sum_{j=1}^{k}\sum_{m=1}^{p}(\bar{x}_{i\cdot\cdot} - \bar{\bar{x}})^2 = kp\sum_{i=1}^{r}(\bar{x}_{i\cdot\cdot} - \bar{\bar{x}})^2,$$
with $df_A = r-1$ degrees of freedom.

Sum of squares for factor $B$
$$SS_B = \sum_{i=1}^{r}\sum_{j=1}^{k}\sum_{m=1}^{p}(\bar{x}_{\cdot j\cdot} - \bar{\bar{x}})^2 = rp\sum_{j=1}^{k}(\bar{x}_{\cdot j\cdot} - \bar{\bar{x}})^2,$$
with $df_B = k-1$ degrees of freedom.

Sum of squares for the random error
$$SS_e = \sum_{i=1}^{r}\sum_{j=1}^{k}\sum_{m=1}^{p}(x_{ijm} - \bar{x}_{ij\cdot})^2,$$
with $df_e = rk(p-1)$ degrees of freedom.

Sum of squares for the $AB$ interaction
$$SS_{AB} = SS_T - SS_A - SS_B - SS_e = \sum_{i=1}^{r}\sum_{j=1}^{k}\sum_{m=1}^{p}\left[(\bar{\bar{x}} + \bar{x}_{ij\cdot}) - (\bar{x}_{i\cdot\cdot} + \bar{x}_{\cdot j\cdot})\right]^2 = p\sum_{i=1}^{r}\sum_{j=1}^{k}(\bar{x}_{ij\cdot} - \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot j\cdot} + \bar{\bar{x}})^2,$$
with $df_{AB} = df_T - df_A - df_B - df_e = (r-1)(k-1)$ degrees of freedom.
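These formulas can be checked against each other numerically. The sketch below computes every sum of squares from its own definition for a balanced design and then verifies that $SS_T = SS_A + SS_B + SS_{AB} + SS_e$. The helper name `two_way_ss_replicated` and the data are assumptions made up for illustration, and the shortcut of averaging cell means is valid only because every cell has the same number of replicates.

```python
def two_way_ss_replicated(data):
    """Sums of squares for data[i][j][m]: level A_i, level B_j, replicate m."""
    r, k, p = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(c) / p for c in row] for row in data]
    a = [sum(row) / k for row in cell]        # means per level of A (balanced design)
    b = [sum(col) / r for col in zip(*cell)]  # means per level of B (balanced design)
    grand = sum(a) / r

    ss_t = sum((v - grand) ** 2 for row in data for c in row for v in c)
    ss_a = k * p * sum((m - grand) ** 2 for m in a)
    ss_b = r * p * sum((m - grand) ** 2 for m in b)
    ss_ab = p * sum(
        (cell[i][j] - a[i] - b[j] + grand) ** 2
        for i in range(r)
        for j in range(k)
    )
    ss_e = sum(
        (v - cell[i][j]) ** 2
        for i in range(r)
        for j in range(k)
        for v in data[i][j]
    )
    return {"SS_T": ss_t, "SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab, "SS_e": ss_e}


# Made-up 2 x 2 design with p = 2 replicates per cell.
data = [
    [[1.0, 3.0], [2.0, 4.0]],
    [[5.0, 7.0], [10.0, 12.0]],
]
ss = two_way_ss_replicated(data)
print(ss)
parts = ss["SS_A"] + ss["SS_B"] + ss["SS_AB"] + ss["SS_e"]
print("identity holds:", abs(ss["SS_T"] - parts) < 1e-9)
```

For this data $SS_T = 106$, $SS_A = 72$, $SS_B = 18$, $SS_{AB} = 8$ and $SS_e = 8$.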

ANOVA table for the two-factor design with replication

The mean square of each sum of squares ($MS = SS/df$) is denoted $MS_A$, $MS_B$, $MS_e$ and $MS_{AB}$ respectively.
As in one-factor ANOVA, the $F$ statistic is used for the test.

| Source | Sum of squares | Degrees of freedom | Mean square | $F$ | Critical value |
| --- | --- | --- | --- | --- | --- |
| Factor $A$ | $SS_A$ | $df_A$ | $MS_A$ | $F_A=\frac{MS_A}{MS_e}$ | $F_\alpha(df_A, df_e)$ |
| Factor $B$ | $SS_B$ | $df_B$ | $MS_B$ | $F_B=\frac{MS_B}{MS_e}$ | $F_\alpha(df_B, df_e)$ |
| Interaction $AB$ | $SS_{AB}$ | $df_{AB}$ | $MS_{AB}$ | $F_{AB}=\frac{MS_{AB}}{MS_e}$ | $F_\alpha(df_{AB}, df_e)$ |
| Error $e$ | $SS_e$ | $df_e$ | $MS_e$ | | |
| Total | $SS_T=SS_A+SS_B+SS_{AB}+SS_e$ | $df_T$ | | | |

When the interaction is not significant, the interaction sum of squares can be pooled with the random-error sum of squares to form a new error sum of squares:
$$SS_E = SS_{AB} + SS_e,$$
with $df_E = df_{AB} + df_e$ degrees of freedom and mean square $MS_E = SS_E/df_E$.
The $F$ values for $A$ and $B$ then become
$$F_A = MS_A / MS_E, \qquad F_B = MS_B / MS_E,$$
and they are compared with the critical values $F_\alpha(df_A, df_E)$ and $F_\alpha(df_B, df_E)$. The $p$-values are likewise computed with $df_E$ as the denominator degrees of freedom.
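The pooling step is simple arithmetic, sketched below. The function name `pooled_f` and the input numbers are hypothetical; the sketch only shows how the pooled quantities are formed and does not test whether the interaction is actually non-significant.

```python
def pooled_f(ss_a, ss_b, ss_ab, ss_e, df_a, df_b, df_ab, df_e):
    """F values for A and B after pooling the interaction into the error term."""
    ss_pooled = ss_ab + ss_e         # SS_E = SS_AB + SS_e
    df_pooled = df_ab + df_e         # df_E = df_AB + df_e
    ms_pooled = ss_pooled / df_pooled
    f_a = (ss_a / df_a) / ms_pooled  # F_A = MS_A / MS_E
    f_b = (ss_b / df_b) / ms_pooled  # F_B = MS_B / MS_E
    return {"SS_E": ss_pooled, "df_E": df_pooled, "MS_E": ms_pooled, "F_A": f_a, "F_B": f_b}


# Hypothetical values for a 2 x 2 design with 2 replicates per cell.
result = pooled_f(ss_a=72.0, ss_b=18.0, ss_ab=8.0, ss_e=8.0,
                  df_a=1, df_b=1, df_ab=1, df_e=4)
print(result)
```

With these inputs $SS_E = 16$, $df_E = 5$ and $MS_E = 3.2$, so $F_A = 22.5$ and $F_B = 5.625$, compared with $F_A = 36$ and $F_B = 9$ before pooling (where $MS_e = 2$).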

Examples in Excel

Excel version: Microsoft Office 2016

Example without replication

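  1. Open "Data Analysis" and choose the "Anova: Two-Factor Without Replication" tool
    (rows hold the data for factor A, columns hold the data for factor B).
  2. Select the data, set the significance level $\alpha$ (0.05 by default), and specify the output range.
  3. Confirming the dialog produces the analysis results.
    In the output, the sources of variation Rows, Columns and Error correspond to factor $A$, factor $B$ and the random error $e$. Among the columns, SS is the sum of squares, df the degrees of freedom, MS the mean square, F the $F$ value, P-value the $p$-value, and F crit the critical value of $F$.
  • The $p$-value can also be obtained with the Excel function =F.DIST.RT({F_value},df1,df2).
  • The critical value of $F$ can also be obtained with =F.INV(1-{alpha},df1,df2).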

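Example with replication

  1. As in the case without replication, choose the "Anova: Two-Factor With Replication" tool.

  2. Select the data, then fill in "Rows per sample" (in the example each level of factor A has 3 replicates), the significance level $\alpha$, and the output range.

  3. Confirming the dialog produces the results.
    In the output, the sources of variation Sample, Columns, Interaction and Within correspond to factor $A$, factor $B$, the $AB$ interaction and the random error $e$. Among the columns, SS is the sum of squares, df the degrees of freedom, MS the mean square, F the $F$ value, P-value the $p$-value, and F crit the critical value of $F$.

    As in the case without replication, the $p$-value and the critical value of $F$ can also be obtained with the Excel functions =F.DIST.RT({F_value},df1,df2) and =F.INV(1-{alpha},df1,df2).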
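In terms of the $F$ distribution, the two worksheet functions can be read as follows. Here $G$ is a symbol introduced only for this note: it denotes the cumulative distribution function of the $F$ distribution with $(df_1, df_2)$ degrees of freedom. F.DIST.RT returns the right-tail probability and F.INV inverts the left-tail probability, so the two decision rules are equivalent.

```latex
% G: CDF of the F distribution with (df_1, df_2) degrees of freedom (notation assumed here).
\begin{aligned}
p &= P\bigl(F_{df_1,\,df_2} > F_{\text{obs}}\bigr) = 1 - G(F_{\text{obs}}), \\
F_\alpha(df_1, df_2) &= G^{-1}(1-\alpha), \\
p < \alpha &\iff F_{\text{obs}} > F_\alpha(df_1, df_2).
\end{aligned}
```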


2019-07-26
