1.Introduction to batch effects [Rmd]
- Causes of batch effects: measurements are affected by laboratory conditions, reagent lots, and personnel differences.
- This chapter covers how to detect, interpret, model, and adjust for batch effects.
- With data from several laboratories, we can in fact estimate the laboratory effects γ, if we assume they average out to 0.
- Or we can consider them to be random effects and simply compute a new estimate and standard error from all measurements.
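The two options in the last two bullets can be sketched in a few lines. This is only a rough illustration: the lab names and measurement values are made up, and the random-effects standard error here simply treats each lab mean as one independent draw, which is an assumption of this sketch rather than something taken from the book.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements of the same quantity from three labs (made-up numbers).
labs = {
    "lab_A": [9.9, 10.1, 10.0],
    "lab_B": [10.4, 10.6, 10.5],
    "lab_C": [9.4, 9.6, 9.5],
}

lab_means = {name: mean(values) for name, values in labs.items()}

# Fixed-effects view: constrain the lab effects to average to 0, so the
# overall level is the mean of the lab means and each gamma is a deviation from it.
overall = mean(list(lab_means.values()))
gamma = {name: m - overall for name, m in lab_means.items()}

# Random-effects view: treat each lab mean as one draw and use the spread
# between labs to attach a standard error to the overall estimate.
se_overall = stdev(list(lab_means.values())) / sqrt(len(lab_means))

print("overall estimate:", round(overall, 3))
print("lab effects:", {k: round(v, 3) for k, v in gamma.items()})
print("standard error:", round(se_overall, 3))
```

Under the sum-to-zero constraint the estimated γ values are just each lab's deviation from the overall level, so they add up to 0 by construction.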
2.Confounding [Rmd]
- Correlation is not causation
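- Example of Simpson’s Paradox: an example showing how confounding distorts the conclusion when the data are not broken down carefully.
- Simpson’s paradox in baseball: a second, shorter example.
- Confounding: High-throughput Example: a third example. Gene expression is compared between ethnic groups, but because the sampling year also differs, the final conclusion deserves scrutiny. As a check, comparing two sampling years within the same ethnic group also turns up a very large number of differentially expressed genes.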
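To make the Simpson’s paradox pattern concrete, here is a minimal sketch. The counts below are invented for illustration and are not the data used in the book; the labels "easy" and "hard" stand in for majors with different admission rates.

```python
# Hypothetical counts, made up for illustration: (applicants, admitted).
counts = {
    ("easy", "men"): (80, 48),
    ("easy", "women"): (20, 14),
    ("hard", "men"): (20, 2),
    ("hard", "women"): (80, 16),
}


def rate(pairs):
    """Admission rate pooled over a list of (applicants, admitted) pairs."""
    applied = sum(a for a, _ in pairs)
    admitted = sum(x for _, x in pairs)
    return admitted / applied


# Pooled over majors: men appear to be admitted more often.
overall = {
    g: rate([v for (_, gender), v in counts.items() if gender == g])
    for g in ("men", "women")
}

# Stratified by major: women have the higher rate in both majors.
by_major = {key: rate([v]) for key, v in counts.items()}

print("overall:", overall)
for (major, gender), r in sorted(by_major.items()):
    print(major, gender, r)
```

The reversal appears because, in these made-up counts, most women apply to the hard major and most men to the easy one, so major is confounded with gender.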
3.Confounding exercises
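- library(dagdata): to run the code successfully, one should probably read the introduction at the front of the book carefully. Given time constraints, I am skipping that for now.
- The code mainly revisits the Simpson’s Paradox example, only switched to the hard major, and walks through the analysis in detail.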
4.EDA with PCA [Rmd]
- Discovering Batch Effects with EDA: this is where the chapter starts explaining how to detect batch effects.
- EDA stands for Exploratory Data Analysis.
- The example uses an unprocessed dataset from a public database.
- Step 1: load the data. Step 2: find two samples with a correlation coefficient of 1 and remove the duplicate.
- Calculating the PCs: compute the principal components.
- We have seen how PCA combined with EDA can be a powerful technique to detect and understand batches.
- In a later section, we will see how we can use the PCs as estimates in factor analysis to improve model estimates.
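The idea of using the first principal component to spot a batch can be sketched as follows. Everything here is an assumption of the sketch: the data are simulated, the batch labels are hypothetical, and power iteration is used only to keep the example free of external libraries (the book itself works in R).

```python
import random

random.seed(0)

n_genes, n_samples = 20, 6
batch = [0, 0, 0, 1, 1, 1]  # hypothetical batch labels, e.g. processing date

# Simulated expression matrix (genes x samples): small noise everywhere, plus a
# shift of 3 in the first 10 genes for the samples processed in batch 1.
X = [
    [random.gauss(0, 0.3) + (3.0 if g < 10 and batch[s] == 1 else 0.0)
     for s in range(n_samples)]
    for g in range(n_genes)
]

# Center each gene (row) so that PCA captures variation between samples.
Xc = [[x - sum(row) / n_samples for x in row] for row in X]

# Sample-by-sample cross-product matrix; its top eigenvector gives the
# PC1 score of each sample (up to sign and scale).
C = [
    [sum(Xc[g][i] * Xc[g][j] for g in range(n_genes)) for j in range(n_samples)]
    for i in range(n_samples)
]

# Power iteration: repeated multiplication converges to the top eigenvector.
v = [float(i + 1) for i in range(n_samples)]
for _ in range(100):
    w = [sum(C[i][j] * v[j] for j in range(n_samples)) for i in range(n_samples)]
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]
pc1 = v

# Fraction of total variance carried by PC1 (top eigenvalue over the trace).
top_eigenvalue = sum(pc1[i] * C[i][j] * pc1[j]
                     for i in range(n_samples) for j in range(n_samples))
explained = top_eigenvalue / sum(C[i][i] for i in range(n_samples))

for s in range(n_samples):
    print("sample", s, "batch", batch[s], "PC1", round(pc1[s], 3))
print("variance explained by PC1:", round(explained, 3))
```

In this simulation the sign of the PC1 score splits the samples by batch, which is the kind of pattern one would look for when plotting PCs against known variables such as date.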
5.EDA with PCA exercises
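- My impression: an analysis sometimes has to consider possible influencing factors carefully; otherwise confounding can mislead it into a wrong conclusion. PCA can help with this kind of analysis.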
6.Adjusting with linear models [Rmd]
- Adjusting for Batch Effects with Linear Models
- ComBat is a popular method and is based on using linear models to adjust for batch effects.
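As a rough sketch of what adjusting with a linear model means, the example below fits y = β0 + β1·group + γ·batch by ordinary least squares. It is a plain linear model on a single noiseless toy variable, not ComBat, and the design, the coefficients (1, 2, 5), and the helper names `solve` and `ols` are all made up for this illustration.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


# Hypothetical design: group is the variable of interest, batch is partly
# confounded with it (most group-1 samples were processed in batch 1).
group = [0, 0, 0, 0, 1, 1, 1, 1]
batch = [0, 0, 0, 1, 0, 1, 1, 1]

# Noiseless toy outcome: true group effect 2, true batch effect 5.
y = [1 + 2 * g + 5 * b for g, b in zip(group, batch)]

# Naive comparison of group means absorbs part of the batch effect.
mean1 = sum(v for v, g in zip(y, group) if g == 1) / group.count(1)
mean0 = sum(v for v, g in zip(y, group) if g == 0) / group.count(0)
naive = mean1 - mean0

# Including batch as a covariate recovers the group effect.
beta = ols([[1, g, b] for g, b in zip(group, batch)], y)

print("naive group difference:", naive)
print("adjusted group effect:", round(beta[1], 6))
print("estimated batch effect:", round(beta[2], 6))
```

The adjustment only works here because group and batch are not perfectly confounded; if every group-1 sample were in batch 1 and every group-0 sample in batch 0, the two columns would be identical and the model could not separate them.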
7.Adjusting with linear models exercises
- In this example, the main cause of the batch effect is again that the samples were obtained on different dates.
8.Factor analysis [Rmd]
- This also relies on PCA.
9.Factor analysis exercises
- Do not over-correct.
10.Adjusting with factor analysis [Rmd]
11.Adjusting with factor analysis exercises