chapter10-batch effects

Latest recommended article published on 2024-01-27 22:08:59

盲人骑瞎马5555

Category: Bioinformatics

Article link: https://blog.csdn.net/wxw060709/article/details/104112610

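This article takes an in-depth look at the impact of batch effects in high-throughput data: why they arise, how to detect and understand them with techniques such as PCA, and how to adjust for them using linear models and factor analysis. Worked examples underline the importance of identifying and adjusting for batch effects in data analysis.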

1.Introduction to batch effects [Rmd]

Causes of batch effects: measurements are affected by laboratory conditions, reagent lots, and personnel differences.
This chapter covers how to detect, interpret, model, and adjust for batch effects.
With data from several laboratories, we can in fact estimate the per-laboratory effects γ, if we assume they average out to 0.
Or we can treat them as random effects and simply compute a new estimate and standard error from all the measurements.
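
To make the two options concrete, here is a minimal sketch on simulated data. It assumes a simple additive model (each measurement = true value + lab effect γ + noise), which these notes do not spell out; all constants and names below are hypothetical.

```python
import math
import random
from statistics import mean, stdev

random.seed(0)

TRUE_VALUE = 10.0                # hypothetical quantity that every lab measures
LAB_EFFECTS = [0.5, -0.2, -0.3]  # hypothetical gamma per lab, chosen to sum to 0
N_PER_LAB = 200                  # measurements per lab
NOISE_SD = 0.1                   # within-lab measurement noise

# Simulate measurement = TRUE_VALUE + gamma_lab + noise for each lab.
measurements = [
    [TRUE_VALUE + g + random.gauss(0, NOISE_SD) for _ in range(N_PER_LAB)]
    for g in LAB_EFFECTS
]

lab_means = [mean(ys) for ys in measurements]

# Option 1: if the gammas average out to 0, the mean of the lab means
# estimates the true value, and each lab's deviation estimates its gamma.
estimate = mean(lab_means)
gamma_hat = [m - estimate for m in lab_means]

# Option 2: treat the lab means as independent draws (random effects),
# so the standard error reflects lab-to-lab variation, not just noise.
standard_error = stdev(lab_means) / math.sqrt(len(lab_means))

print("estimate:", round(estimate, 3))
print("gamma_hat:", [round(g, 3) for g in gamma_hat])
print("standard error:", round(standard_error, 3))
```

Note that the random-effects standard error is driven by the spread between labs, so it stays large even when each lab takes many measurements.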

2.Confounding [Rmd]

Correlation is not causation
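Example of Simpson's Paradox: an example showing how confounding can mislead when the data are not examined carefully.
Simpson's paradox in baseball: a second, smaller example.
Confounding: High-throughput Example: a third example. Genetic data from different ethnic groups are compared, but because the sampling year also affects the measurements, the final conclusion needs scrutiny. As a check, testing for differential expression between the two sampling years within a single ethnic group also turns up a very large number of differentially expressed genes.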
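
The reversal behind Simpson's paradox is easy to reproduce with a few numbers. The sketch below uses made-up counts (not data from the chapter): group A does better than group B inside every stratum, yet worse once the strata are pooled, because the two groups are spread unevenly across the strata. The same mechanism is at work when ethnicity is confounded with sampling year.

```python
# (successes, trials) for each group within each stratum.
# These counts are invented purely for illustration.
counts = {
    "easy": {"A": (18, 20), "B": (64, 80)},
    "hard": {"A": (24, 80), "B": (4, 20)},
}


def rate(successes, trials):
    """Success rate as a fraction."""
    return successes / trials


# Success rate of each group within each stratum.
within = {
    stratum: {group: rate(*c) for group, c in groups.items()}
    for stratum, groups in counts.items()
}

# Success rate of each group after pooling the strata.
pooled = {}
for group in ("A", "B"):
    successes = sum(counts[s][group][0] for s in counts)
    trials = sum(counts[s][group][1] for s in counts)
    pooled[group] = rate(successes, trials)

for stratum, rates in within.items():
    print(stratum, rates)
print("pooled", pooled)
```

Group A is mostly observed in the "hard" stratum and group B in the "easy" one, so the pooled comparison mixes the group effect with the stratum effect.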

3.Confounding exercises

4.EDA with PCA [Rmd]

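Discovering Batch Effects with EDA: this part introduces how to detect batch effects.
Exploratory data analysis (EDA).
The example uses an unprocessed dataset from a public database.
Step 1: load the data. Step 2: two sets of data turn out to have a correlation coefficient of 1, so the duplicate is removed.
Calculating the PCs: compute the principal components.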
We have seen how PCA combined with EDA can be a powerful technique to detect and understand batches.
In a later section, we will see how we can use the PCs as estimates in factor analysis to improve model estimates.
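
The workflow above (load, drop perfectly correlated samples, compute PCs) can be sketched on simulated data. This is an illustration only, not the chapter's code: the matrix, the batch labels, and the size of the batch shift are all hypothetical, and the first PC is obtained by power iteration so the example needs nothing beyond the standard library.

```python
import math
import random

random.seed(1)

N_GENES = 200
BATCH = [0, 0, 0, 0, 1, 1, 1, 1]  # hypothetical batch label per sample

# Simulated expression matrix (genes x samples).
# The first 100 genes are shifted by +2 in batch 1.
data = [
    [random.gauss(0, 1) + (2.0 if b == 1 and g < 100 else 0.0) for b in BATCH]
    for g in range(N_GENES)
]
# Append an exact copy of sample 0 to mimic a sample that appears twice.
for row in data:
    row.append(row[0])


def column(matrix, j):
    return [row[j] for row in matrix]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Drop any sample that correlates perfectly with an earlier kept sample.
keep = []
for j in range(len(data[0])):
    if all(pearson(column(data, j), column(data, k)) < 1 - 1e-9 for k in keep):
        keep.append(j)

# Center each gene across the kept samples.
centered = []
for row in data:
    vals = [row[j] for j in keep]
    m = sum(vals) / len(vals)
    centered.append([v - m for v in vals])

# Sample-by-sample cross-product matrix (small: samples x samples).
k = len(keep)
gram = [[sum(r[a] * r[b] for r in centered) for b in range(k)] for a in range(k)]

# Power iteration: the leading eigenvector gives each sample's
# coordinate on the first PC, up to scale and sign.
pc1 = [random.random() for _ in range(k)]
for _ in range(500):
    w = [sum(gram[a][b] * pc1[b] for b in range(k)) for a in range(k)]
    norm = math.sqrt(sum(x * x for x in w))
    pc1 = [x / norm for x in w]

for j, score in zip(keep, pc1):
    print(f"sample {j}  batch {BATCH[j]}  PC1 {score:+.3f}")
```

In this simulation the sign of the first PC splits the samples by batch, which is the kind of pattern one would look for when plotting PCs against processing date or other batch surrogates.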