Paper intensive reading (15): Mitigating the adverse impact of batch effects in sample pattern detection

Paper title: Mitigating the adverse impact of batch effects in sample pattern detection

Google Scholar citations: 6

Pages: 8

Publication date: 1 March 2018

Journal: Bioinformatics

Authors: Teng Fei¹, Tengjiao Zhang², Weiyang Shi³,* and Tianwei Yu¹,*

Affiliations: Emory University and Tongji University; same author as in Paper intensive reading (12)

Abstract:

Motivation: It is well known that batch effects exist in RNA-seq data and other profiling data. Although some methods do a good job adjusting for batch effects by modifying the data matrices, it is still difficult to remove the batch effects entirely. The remaining batch effect can cause artifacts in the detection of patterns in the data.

Results: In this study, we consider the batch effect issue in the pattern detection among the samples, such as clustering, dimension reduction and construction of networks between subjects. Instead of adjusting the original data matrices, we design an adaptive method to directly adjust the dissimilarity matrix between samples. In simulation studies, the method achieved better results recovering true underlying clusters, compared to the leading batch effect adjustment method ComBat. In real data analysis, the method effectively corrected distance matrices and improved the performance of clustering algorithms.

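1. For clustering, the method does better than ComBat.

2. At first glance the new method seems to have no name, but it does have one: QuantNorm.

3. In the real data analysis, the method did make a difference.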
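
Why is correcting only the dissimilarity matrix enough for clustering? Many clustering algorithms never look at the expression matrix itself, only at pairwise dissimilarities. The sketch below is a minimal, dependency-free average-linkage clustering that takes a precomputed dissimilarity matrix as its only input. The function name `average_linkage` and the toy matrix are made up for illustration; this is not the clustering code used in the paper.

```python
def average_linkage(D, k):
    """Agglomerative clustering on a precomputed dissimilarity matrix D.

    Repeatedly merges the two clusters with the smallest average pairwise
    dissimilarity until k clusters remain; returns one label per sample.
    """
    clusters = [[i] for i in range(len(D))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average dissimilarity over all cross-cluster sample pairs.
                total = sum(D[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    labels = [0] * len(D)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy example (hypothetical numbers): samples 0-1 and 2-3 form two tight pairs.
toy_D = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
toy_labels = average_linkage(toy_D, 2)
```

Any batch-corrected dissimilarity matrix can be passed in place of `toy_D`, which is why a correction at the matrix level can be combined with different clustering algorithms.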

Conclusions:

  • In this paper, we proposed novel approaches based on the interpolating quantile normalization. As the data become challenging, i.e. true clusters are closer to each other, and the batch effect is heterogeneous on different clusters, our methods outperform ComBat. Note: this is the novelty of the proposed method.
  • However, ComBat is a more general method, which adjusts the data matrix for many kinds of down-stream analysis, while our method focuses on adjusting the dissimilarity matrix between samples, mainly serving the purpose of pattern detection in the samples. It does not correct the raw count matrix to adjust for batch effects.
  • Our method modifies the dissimilarity matrix so that various clustering approaches can achieve better performance. Note: the method can be combined with a variety of clustering methods.
  • Limitations of the method: On the one hand, the vectorization approach may suffer from insufficient discrimination due to the lack of extreme values. On the other hand, the row/column iterative approach is more easily affected by the wrong extreme values since each column and each row are polarized.
  • The vectorization approach performed better on data with high similarity between batches. Note: this approach suits data that are highly similar across batches.
  • The preprocessing method can affect the result of the clustering analysis. Note: preprocessing matters.
  • The choice of the two preprocessing strategies may depend on data. Note: the preprocessing strategy depends on the data.
  • Although the iterative approach seems to have limitations explained earlier, we generally recommend this approach. Note: it can improve the robustness of the method.
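
To make the "vectorization approach" more concrete, here is a simplified sketch of one plausible reading of it, not the QuantNorm implementation. It splits the dissimilarity matrix into blocks by batch pair, flattens each block into a vector, and maps that vector onto a reference distribution by rank. Because blocks differ in size, reference quantiles are linearly interpolated. The function names, the choice of reference (within-batch dissimilarities of one reference batch, which must have at least two samples) and the toy numbers are all assumptions made for illustration.

```python
def interp_quantile_map(values, reference):
    """Replace each value by the reference quantile at the same relative rank.

    The inputs may have different lengths; reference quantiles are linearly
    interpolated between neighbouring sorted reference values.
    """
    ref = sorted(reference)
    n, m = len(values), len(ref)
    order = sorted(range(n), key=lambda i: values[i])  # indices, smallest first
    out = [0.0] * n
    for rank, idx in enumerate(order):
        # Fractional position in the sorted reference matching this rank.
        pos = (rank + 0.5) * m / n - 0.5
        pos = min(max(pos, 0.0), m - 1.0)
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        out[idx] = ref[lo] * (1.0 - frac) + ref[hi] * frac
    return out


def correct_blocks(D, batch, reference_batch):
    """Map every batch-pair block of D onto the reference batch's
    within-batch distribution (assumed reference choice)."""
    groups = {}
    for i, b in enumerate(batch):
        groups.setdefault(b, []).append(i)
    names = list(groups)
    ref_idx = groups[reference_batch]
    reference = [D[i][j] for i in ref_idx for j in ref_idx if i < j]
    out = [row[:] for row in D]
    for x, a in enumerate(names):
        for b in names[x:]:
            # Within-batch blocks use the upper triangle only; every result
            # is mirrored so the corrected matrix stays symmetric.
            cells = [(i, j) for i in groups[a] for j in groups[b]
                     if a != b or i < j]
            mapped = interp_quantile_map([D[i][j] for i, j in cells], reference)
            for (i, j), v in zip(cells, mapped):
                out[i][j] = out[j][i] = v
    return out


# Toy example (hypothetical numbers): batch B is shifted relative to batch A.
toy_batch = ["A", "A", "A", "B", "B", "B"]
toy_D = [
    [0.0, 0.1, 0.2, 0.6, 0.7, 0.8],
    [0.1, 0.0, 0.3, 0.7, 0.9, 0.6],
    [0.2, 0.3, 0.0, 0.8, 0.6, 0.9],
    [0.6, 0.7, 0.8, 0.0, 0.3, 0.5],
    [0.7, 0.9, 0.6, 0.3, 0.0, 0.4],
    [0.8, 0.6, 0.9, 0.5, 0.4, 0.0],
]
toy_corrected = correct_blocks(toy_D, toy_batch, "A")
```

After the mapping every block shares the value range of the reference block, which also illustrates the limitation quoted above: extreme values are squeezed out, so discrimination between clusters can suffer.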

Introduction:

  • The existence of batch effects increases the difficulty in comparing the data from different labs, platforms and processing times. Note: different labs, different platforms, different times.
  • If batch effects are ignored, the results can be wrong. For example, clustering mouse and human gene expression data groups the samples by species rather than by tissue; after adjusting for batch effects, the opposite conclusion is reached.
  • Many methods have been proposed:
  1. Johnson et al. (2007) proposed the empirical Bayes algorithm of ComBat, which removes the additive and multiplicative batch effects for each gene from each batch. Note: this is the current gold-standard method.
  2. Gagnon-Bartsch and Speed (2012) applied the removal of unwanted variation method to make adjustments according to the variations of the control genes, which are not differentially expressed (DE) among the batches.
  • A shortcoming shared by most methods, including ComBat: they attempt to modify the data matrix (N subjects × p genes) so that the measurements from different batches become comparable. ComBat appears to be more effective for the microarray data, which is less skewed than RNA-seq data. Note: it works better on microarray data and may be less effective on RNA-seq data.
  • Moreover, real data may have high irregularity such that the additive and the multiplicative parameters are insufficient to capture all batch effects. Note: the additive and multiplicative parameters cannot capture every batch effect.
  • Ad hoc approaches based on quantile normalization are introduced in this manuscript. Note: the paper proposes a method based on quantile normalization.
  • According to simulation results, clustering based on the normalized dissimilarity matrix obtained by our methods outperformed ComBat in recapturing the underlying cluster structure in the data, especially when the data were more challenging as the percentage of genes that differentiate the underlying clusters was small. Note: in simulations the method beats ComBat, especially on challenging datasets.
  • In real data analysis, we analyzed two datasets with dominating batch effects (Gilad and Mizrahi-Man, 2015; Zhang et al., 2016) and two scRNA-seq datasets where the batch effects are relatively weak (Muraro et al., 2016; Usoskin et al., 2015). Our methods improved the clustering accuracy and outperformed ComBat in both situations. Note: on several real datasets it also beats ComBat.
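
For reference, the "additive and multiplicative batch effects" refer to the location-and-scale model behind ComBat. The form below follows the usual presentation of Johnson et al. (2007); the symbol names are the conventional ones and are not taken from the paper summarized here, so treat the notation as an assumption.

```latex
Y_{ijg} = \alpha_g + X\beta_g + \gamma_{ig} + \delta_{ig}\,\varepsilon_{ijg}
```

Here $Y_{ijg}$ is the expression of gene $g$ in sample $j$ of batch $i$, $\alpha_g$ is the overall expression level of the gene, $X\beta_g$ captures the sample conditions, $\gamma_{ig}$ is the additive batch effect, $\delta_{ig}$ is the multiplicative batch effect, and $\varepsilon_{ijg}$ is the error term. A batch effect that does not fit this per-gene shift-and-scale form is exactly the kind of irregularity that the two parameters cannot capture.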

Paper structure:

1. Introduction

2. Materials and methods

2.1 Problem setup

2.2 Preprocessing

2.3 Interpolating quantile normalization for vectors of different lengths

2.4 Dissimilarity matrix correction

2.5 Clustering and evaluation methods

3. Results

3.1 Simulation study

3.2 ENCODE data for human and mouse tissues

3.3 Human-mouse brain RNA-seq data

3.4 Mouse neuron scRNA-seq data

3.5 Human pancreas scRNA-seq data

4. Discussion

Excerpts from the main text:

Questions:

1. What is quantile normalization?

  •  
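
A short working answer to question 1: quantile normalization forces several samples to share the same empirical distribution. Each sample is sorted, the values at each rank are averaged across samples to form a reference distribution, and every original value is then replaced by the reference value at its rank. The sketch below is a minimal, dependency-free version for equal-length samples; the function name and the toy numbers are made up for illustration, and ties are simply broken by position rather than averaged as some implementations do.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length samples (one list per sample)."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [sum(sc[k] for sc in sorted_cols) / len(columns)
                 for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices, smallest first
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # Each value is replaced by the reference value at its rank.
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy example (hypothetical numbers): two samples, three genes each.
toy_samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
toy_normalized = quantile_normalize(toy_samples)
```

After normalization both samples contain exactly the same set of values, only arranged according to each sample's own ranking. The interpolating variant in section 2.3 of the paper extends this idea to vectors of different lengths.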

2. What is the difference between RNA-seq and scRNA-seq?

  •  