GWAS Study Notes
SNP Filtering
1: Missing rates (GENO > 0.05)
Shortly we will apply a more stringent criterion, GENO > 0.05. In this case, 0.05 * 89 = 4.45 samples, meaning that if a SNP is missing in more than 4.45 samples (that is, in 5 or more of the 89 samples), that SNP will be removed from the dataset.
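The arithmetic above can be checked with a small sketch. It is only an illustration, not the tool the notes rely on: the function names, the SNP names, and the encoding of a missing genotype call as `None` are all assumptions made for this example.

```python
import math

def missing_rate(calls):
    """Fraction of samples with no genotype call (None) for one SNP."""
    return sum(c is None for c in calls) / len(calls)

def filter_by_missing_rate(snps, max_missing=0.05):
    """Keep only SNPs whose missing rate does not exceed max_missing."""
    return {name: calls for name, calls in snps.items()
            if missing_rate(calls) <= max_missing}

n_samples = 89
threshold = 0.05

# 0.05 * 89 = 4.45, so the smallest whole number of missing samples
# that exceeds the threshold is 5.
min_missing_to_remove = math.floor(threshold * n_samples) + 1

# Hypothetical toy data: 0 stands for any successful genotype call.
snps = {
    "snp_a": [None] * 4 + [0] * 85,  # 4 of 89 missing, rate about 0.045
    "snp_b": [None] * 5 + [0] * 84,  # 5 of 89 missing, rate about 0.056
}

kept = filter_by_missing_rate(snps, max_missing=threshold)
print(min_missing_to_remove, sorted(kept))
```

Here `snp_a` survives the filter and `snp_b` is dropped, which matches the 4.45-sample cutoff described above. If the notes are based on PLINK (an assumption; the tool is not named in this section), the corresponding option would be `--geno 0.05`.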
2: Minor allele frequencies (MAF; with a large number of SNPs, the filter can be set to MAF < 0.05)
MAF is the minor allele frequency. It can be used to exclude SNPs that are not informative because they show little variation in the sample set being analyzed. For instance, if a SNP shows variation in only 1 of the 89 individuals, it is not useful statistically and should be removed.
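To make the MAF filter concrete, here is a minimal sketch under stated assumptions: samples are diploid, each genotype is coded as the number of copies of one allele (0, 1, or 2), and missing calls are `None`. The function names and the toy SNPs are hypothetical and chosen only for illustration.

```python
def minor_allele_frequency(calls):
    """MAF for one SNP, assuming diploid genotypes coded 0/1/2 (None = missing)."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    # Each observed sample contributes two alleles.
    allele_freq = sum(observed) / (2 * len(observed))
    # The minor allele is whichever of the two alleles is rarer.
    return min(allele_freq, 1 - allele_freq)

def filter_by_maf(snps, min_maf=0.05):
    """Keep only SNPs whose MAF is at least min_maf."""
    return {name: calls for name, calls in snps.items()
            if minor_allele_frequency(calls) >= min_maf}

# Hypothetical toy data for 89 individuals.
snps = {
    "rare": [1] + [0] * 88,         # one heterozygous carrier: MAF = 1/178
    "common": [1] * 20 + [0] * 69,  # 20 heterozygous carriers: MAF = 20/178
}

kept = filter_by_maf(snps, min_maf=0.05)
print({name: round(minor_allele_frequency(c), 4) for name, c in snps.items()})
print(sorted(kept))
```

The `rare` SNP varies in only 1 of the 89 individuals, so its MAF is 1/178 (about 0.006) and it is removed, while `common` is kept. If the notes are based on PLINK (again an assumption), the corresponding option would be `--maf 0.05`.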