Assumptions and Uses of Mixed-Effects Models

It was Eisenhart (1947) who realized that there were actually two fundamentally different sorts of categorical
explanatory variables: he called these fixed effects and random effects. It will take a good deal of practice
before you are confident in deciding whether a particular categorical explanatory variable should be treated
as a fixed effect or a random effect, but in essence:
fixed effects influence only the mean of y;
random effects influence only the variance of y.
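The distinction can be checked with a small simulation. The sketch below is illustrative only: the baseline mean of 10, the shift of 3, the standard deviations and the group sizes are all made-up values, and `simulate` is a hypothetical helper written for this example.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is reproducible


def simulate(fixed_shift, random_sd, n_groups=500, n_per_group=10, error_sd=1.0):
    """Simulate y = 10 + fixed_shift + b_group + error and return all y values."""
    ys = []
    for _ in range(n_groups):
        # One draw per group: this is the random effect shared by its members.
        b = rng.gauss(0.0, random_sd)
        for _ in range(n_per_group):
            ys.append(10.0 + fixed_shift + b + rng.gauss(0.0, error_sd))
    return ys


baseline = simulate(fixed_shift=0.0, random_sd=0.0)
with_fixed = simulate(fixed_shift=3.0, random_sd=0.0)   # changes the mean only
with_random = simulate(fixed_shift=0.0, random_sd=2.0)  # changes the variance only

for name, ys in [("baseline", baseline),
                 ("fixed effect", with_fixed),
                 ("random effect", with_random)]:
    print(f"{name:>13}: mean = {statistics.fmean(ys):6.2f}, "
          f"variance = {statistics.variance(ys):5.2f}")
```

With these settings the fixed shift should move the mean by roughly 3 while leaving the variance near 1, whereas the random effect should leave the mean near 10 and raise the variance towards 1 + 2² = 5.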


The important point is that because the random effects come from a large population, there is not much
point in concentrating on estimating means of our small subset of factor levels, and no point at all in comparing
individual pairs of means for different factor levels. Much better to recognize them for what they are, random
samples from a much larger population, and to concentrate on their variance. This is the added variation
caused by differences between the levels of the random effects.



Variance components analysis is all about estimating the size of this variance, and working out its percentage
contribution to the overall variation.

There are five fundamental assumptions of linear mixed-effects models:
Within-group errors are independent with mean zero and variance σ².
Within-group errors are independent of the random effects.
The random effects are normally distributed with mean zero and covariance matrix Ψ.
The random effects are independent in different groups.
The covariance matrix does not depend on the group.
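These assumptions can be written compactly. The sketch below uses the Laird-Ware form that is common in the mixed-models literature; the symbols X_i, Z_i, β, b_i, ε_i, Ψ and the group index i = 1, ..., M are notation assumed here for illustration, not taken from the text above.

```latex
\begin{aligned}
y_i &= X_i \beta + Z_i b_i + \varepsilon_i, \qquad i = 1, \dots, M, \\
\mathbb{E}[\varepsilon_i] &= 0, \qquad \operatorname{Var}(\varepsilon_i) = \sigma^2 I, \\
b_i &\sim \mathcal{N}(0, \Psi), \\
b_i &\perp \varepsilon_j \ \text{for all } i, j, \qquad b_i \perp b_j \ \text{for } i \neq j .
\end{aligned}
```

Here y_i holds the responses in group i, X_i β is the fixed-effects part and Z_i b_i the random-effects part. The second line is the first assumption, the independence of b_i and ε_j is the second, the normal distribution of b_i is the third, the independence of b_i and b_j is the fourth, and the absence of a group index on Ψ is the fifth.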



The tricks with mixed-effects models are:
learning which variables are random effects;
specifying the fixed and random effects in the model formula;
getting the nesting structure of the random effects right;
remembering to load library(lme4) or library(nlme) at the outset.



The issues addressed by mixed-effects models fall into two broad categories: questions about experimental
design and the management of experimental error (e.g. where does most of the variation occur, and where would
increased replication be most profitable?); and questions about hierarchical structure, and the relative
magnitude of variation at different levels within the hierarchy (e.g. studies on the genetics of individuals
within families, families within parishes, and parishes within counties, to discover the relative importance
of genetic and phenotypic variation).
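To make "where does most of the variation occur" concrete, here is a minimal sketch that splits the variation in a balanced one-way layout into a between-group and a within-group component. It uses the classical method-of-moments (ANOVA) estimator rather than the likelihood-based fit that lme4 or nlme would perform, and the data, the group labels and the helper `variance_components` are invented for illustration.

```python
import statistics

# Hypothetical balanced data: 4 levels of a random effect, 3 measurements each.
groups = {
    "A": [10.0, 12.0, 11.0],
    "B": [14.0, 15.0, 16.0],
    "C": [9.0, 8.0, 10.0],
    "D": [13.0, 12.0, 14.0],
}


def variance_components(groups):
    """Method-of-moments estimates for a balanced one-way random-effects layout.

    Returns (between-group variance, within-group variance).
    """
    samples = list(groups.values())
    a = len(samples)          # number of groups
    n = len(samples[0])       # observations per group
    if any(len(s) != n for s in samples):
        raise ValueError("this sketch assumes a balanced design")
    means = [statistics.fmean(s) for s in samples]
    grand = statistics.fmean(means)
    ms_between = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    ms_within = sum((y - m) ** 2
                    for s, m in zip(samples, means)
                    for y in s) / (a * (n - 1))
    # E[MS_between] = sigma_within^2 + n * sigma_between^2, so solve for the
    # between-group variance and truncate at zero if the estimate is negative.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return var_between, ms_within


var_between, var_within = variance_components(groups)
total = var_between + var_within
print(f"between groups: {100 * var_between / total:.1f}% of total variance")
print(f"within groups:  {100 * var_within / total:.1f}% of total variance")
```

For these made-up numbers the between-group component is 19/3 and the within-group component is 1, so about 86.4% of the variation lies between groups. In a design like this, adding more groups would be more informative than adding more measurements per group.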



Most ANOVA models are based on the assumption that there is a single error term. But in hierarchical
studies and nested experiments, where the data are gathered at two or more different spatial scales, there
is a different error variance for each different spatial scale.

There are two reasonably clear-cut sets of circumstances where your first choice would be to use a linear
mixed-effects model: you want to do variance components analysis because all your explanatory variables are
categorical random effects and you do not have any fixed effects; or you do have fixed effects, but you also
have pseudoreplication of one sort or another (e.g. temporal pseudoreplication resulting from repeated
measurements on the same individuals).

To test whether one should use a model with mixed effects or just a plain old linear model, Douglas Bates
wrote in the R help archive: ‘I would recommend the likelihood ratio test against a linear model fit by lm.
The p-value returned from this test will be conservative because you are testing on the boundary of the
parameter space.’
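The arithmetic of that likelihood ratio test can be sketched as follows. The two log-likelihood values are hypothetical placeholders standing in for what maximum-likelihood fits of the linear model and the mixed model would report, and the helper names are invented for this example. The halved p-value is a commonly used approximation for testing a single variance component on the boundary; it is an addition here, not part of the quoted advice.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    if x <= 0:
        return 1.0
    # If Z is standard normal, P(Z^2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def lrt_random_intercept(loglik_lm, loglik_mixed):
    """Likelihood ratio test of a linear model against one extra variance component.

    Returns (LR statistic, naive p-value, halved p-value).
    """
    lr = max(0.0, 2.0 * (loglik_mixed - loglik_lm))
    naive_p = chi2_sf_1df(lr)  # conservative: ignores the boundary
    # Assumed approximation: the null distribution is treated as a 50:50
    # mixture of a point mass at zero and chi-squared with 1 df.
    mixture_p = 0.5 * naive_p if lr > 0 else 1.0
    return lr, naive_p, mixture_p


# Hypothetical log-likelihoods, for illustration only.
lr, naive_p, mixture_p = lrt_random_intercept(loglik_lm=-120.0, loglik_mixed=-118.0)
print(f"LR = {lr:.2f}, naive p = {naive_p:.4f}, halved p = {mixture_p:.4f}")
```

With these placeholder values the statistic is 4, the naive p-value is about 0.046 and the halved value about 0.023, which shows the direction of the conservatism: the unadjusted test reports a p-value that is too large. Both log-likelihoods should come from maximum-likelihood rather than REML fits for the comparison with lm to make sense.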
