import pandas as pd
import numpy as np
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
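# Load the data from the Excel sheet
data = pd.read_excel(r"C:\Users\liuhao\Desktop\a.xls")
# One-way ANOVA: 数量 (quantity) is the response, 方法 (method) is the factor
model = ols('数量 ~ 方法', data).fit()
anovat = anova_lm(model)
anovat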
Variable    df     sum_sq   mean_sq      F          PR(>F)
方法        2.0    520.0    260.000000   9.176471   0.003818
Residual    12.0   340.0    28.333333    NaN        NaN
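The F statistic and p-value in this output follow directly from the df and sum_sq columns. The sketch below recomputes them from the printed values. It assumes `scipy` is available, which is not imported in the original code, and the variable names are illustrative.

```python
from scipy import stats

# Values copied from the ANOVA output above.
df_method, ss_method = 2, 520.0    # row 方法
df_resid, ss_resid = 12, 340.0     # row Residual

# Mean square = sum of squares / degrees of freedom.
ms_method = ss_method / df_method
ms_resid = ss_resid / df_resid

# F is the ratio of the factor mean square to the residual mean square.
f_stat = ms_method / ms_resid

# p-value: upper-tail probability of the F(2, 12) distribution.
p_value = stats.f.sf(f_stat, df_method, df_resid)

print(f"F = {f_stat:.6f}, p = {p_value:.6f}")
```

Since p is below 0.05, the difference between the levels of 方法 is significant at the 0.05 level.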
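Randomized block design
Experimental units are grouped into blocks and treatments are assigned at random within each block, so that variation between blocks is separated from the random error.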
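A minimal sketch of the randomization step, assuming six blocks and three treatments. The labels `admin_1` ... `admin_6` and `system_A` ... `system_C` are hypothetical and do not come from the data file.

```python
import random

# Hypothetical labels: 6 blocks (administrators) and 3 treatments (systems).
blocks = [f"admin_{i}" for i in range(1, 7)]
treatments = ["system_A", "system_B", "system_C"]

rng = random.Random(0)  # fixed seed so the plan is reproducible

plan = {}
for block in blocks:
    order = treatments[:]   # every block receives every treatment once
    rng.shuffle(order)      # randomize the run order within the block
    plan[block] = order

for block, order in plan.items():
    print(block, order)
```

Each block sees all treatments, so block-to-block differences can later be removed from the error term by including the block as a factor in the model.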
import pandas as pd
import numpy as np
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
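# Load the data from the Excel sheet
data = pd.read_excel(r"C:\Users\liuhao\Desktop\s.xls")
# 压力值 (stress score) is the response, 管理员 (administrator) is the block, 系统 (system) is the treatment
model = ols('压力值 ~ 管理员 + 系统', data).fit()
anovat = anova_lm(model)
anovat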
Variable    df     sum_sq   mean_sq   F          PR(>F)
管理员      5.0    30.0     6.0       3.157895   0.057399
系统        2.0    21.0     10.5      5.526316   0.024181
Residual    10.0   19.0     1.9       NaN        NaN
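The variable 管理员 (administrator) is the blocking factor. The difference between the levels of 系统 (system) has p = 0.024 < 0.05, so it is significant at the 0.05 level.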
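To see what the blocking factor contributes, the sketch below recomputes the test for 系统 from the printed sums of squares, once with the block term and once with the block term dropped. It assumes a balanced design (each 管理员 evaluated on each 系统 once), in which case dropping the block simply pools its sum of squares and degrees of freedom into the residual. It also assumes `scipy` is available, and the helper `f_test` is illustrative.

```python
from scipy import stats

# Values copied from the randomized block ANOVA output above.
ss_block, df_block = 30.0, 5      # 管理员 (blocking factor)
ss_system, df_system = 21.0, 2    # 系统 (treatment)
ss_resid, df_resid = 19.0, 10     # Residual


def f_test(ss_effect, df_effect, ss_error, df_error):
    """Return (F, p) for an effect tested against the given error term."""
    f_stat = (ss_effect / df_effect) / (ss_error / df_error)
    p_value = stats.f.sf(f_stat, df_effect, df_error)
    return f_stat, p_value


# With blocking: the error term is the residual of the two-factor model.
f_blocked, p_blocked = f_test(ss_system, df_system, ss_resid, df_resid)

# Without blocking (assumption: balanced design), the block SS and df
# are absorbed into the error term.
f_pooled, p_pooled = f_test(
    ss_system, df_system, ss_resid + ss_block, df_resid + df_block
)

print(f"with block:    F = {f_blocked:.6f}, p = {p_blocked:.6f}")
print(f"without block: F = {f_pooled:.6f}, p = {p_pooled:.6f}")
```

Under this assumption the p-value for 系统 rises above 0.05 once the block is ignored, which illustrates how blocking reduces the error term.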
Factorial experiment (multi-factor ANOVA)
import pandas as pd
import numpy as np
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
data = pd.read_excel(r"C:\Users\liuhao\Desktop\a.xls")
data.head()
# 课程:院校 is the interaction term; 课程*院校 alone would expand to both main effects plus the interaction
model = ols('分数 ~ 课程 + 院校 + 课程:院校', data).fit()
anovat = anova_lm(model)
anovat
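In the formula syntax used by statsmodels, `a*b` is shorthand for `a + b + a:b`, where `a:b` is the interaction alone. The sketch below checks this on a small made-up data set. The column names `score`, `course` and `school` are hypothetical stand-ins for 分数, 课程 and 院校, and the values are invented for illustration only.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Hypothetical balanced data: 2 courses x 3 schools x 2 replicates.
data = pd.DataFrame({
    "course": ["A"] * 6 + ["B"] * 6,
    "school": ["X", "X", "Y", "Y", "Z", "Z"] * 2,
    "score": [500, 580, 540, 460, 480, 400,
              420, 520, 500, 540, 480, 410],
})

# Main effects and interaction written out explicitly.
explicit = ols("score ~ course + school + course:school", data).fit()
# The same model written with the * shorthand.
shorthand = ols("score ~ course * school", data).fit()

table_explicit = anova_lm(explicit)
table_shorthand = anova_lm(shorthand)

print(table_explicit)
# Both formulas give the same decomposition of the sum of squares.
print(np.allclose(table_explicit["sum_sq"], table_shorthand["sum_sq"]))
```

The row labelled `course:school` is the interaction. A small p-value there would suggest that the effect of one factor depends on the level of the other.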