WGCNA Analysis, Part 1: Clarifying the Concepts

1. Definition

WGCNA stands for Weighted Gene Co-expression Network Analysis.

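2. What it is used for

2.1 Studying a group of co-expressed genes together yields more information than looking at individual up- or down-regulated genes;
2.2 Identifying "hub genes" (genes that are closely connected to many other genes, occupy a central position in the network, and are likely to play an important role);
2.3 Exploring the relationship between gene modules (groups of co-expressed genes) and traits (for example, disease status).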

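3. Input data format

RPKM, FPKM, normalized counts, and similar measures are all acceptable, but the values must have been normalized on a per-sample basis. The WGCNA FAQ explains: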

Whether one uses RPKM, FPKM, or simply normalized counts doesn’t make a whole lot of difference for WGCNA analysis as long as all samples were processed the same way. These normalization methods make a big difference if one wants to compare expression of gene A to expression of gene B; but WGCNA calculates correlations for which gene-wise scaling factors make no difference. (Sample-wise scaling factors of course do, so samples do need to be normalized.)
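
The point about scaling factors can be checked numerically. The following is a minimal sketch, not part of the FAQ: the two toy genes, the scaling factors, and the `pearson` helper are all made up for illustration. It shows that multiplying each gene by its own constant leaves the Pearson correlation unchanged, while multiplying each sample by its own factor (for example, a sequencing-depth effect) does change it.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


random.seed(0)
n_samples = 20

# Hypothetical toy data: two partially correlated genes across 20 samples.
gene_a = [random.uniform(10, 100) for _ in range(n_samples)]
gene_b = [0.5 * a + random.uniform(0, 20) for a in gene_a]

baseline = pearson(gene_a, gene_b)

# Gene-wise scaling: each gene is multiplied by its own constant
# (as happens, for instance, when dividing by gene length).
gene_scaled = pearson([3.0 * a for a in gene_a], [0.2 * b for b in gene_b])

# Sample-wise scaling: each sample is multiplied by its own factor,
# and that factor is shared by all genes in the sample.
depth = [random.uniform(0.5, 2.0) for _ in range(n_samples)]
sample_scaled = pearson(
    [a * d for a, d in zip(gene_a, depth)],
    [b * d for b, d in zip(gene_b, depth)],
)

print("baseline:      ", baseline)
print("gene-scaled:   ", gene_scaled)
print("sample-scaled: ", sample_scaled)
```

The gene-scaled value matches the baseline up to floating-point error, whereas the sample-scaled value differs, which is why samples need to be normalized before correlations are computed.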

If data come from different batches, we recommend to check for batch effects and, if needed, adjust for them. We use ComBat for batch effect removal but other methods should also work.

Finally, we usually check quantile scatterplots to make sure there are no systematic shifts between samples; if sample quantiles show correlations (which they usually do), quantile normalization can be used to remove this effect.
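
To make the idea of quantile normalization concrete, here is a minimal sketch under simplifying assumptions: the input is a small hypothetical list of samples with no tied values, and the function name `quantile_normalize` is made up for this example. In practice an established implementation from a statistics package would be the safer choice; this only illustrates the principle that every sample is forced onto the same reference distribution while each gene keeps its rank within the sample.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples.

    `samples` is a list of equal-length lists, one value per gene.
    Ties are not handled specially in this sketch.
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]

    # Reference distribution: mean across samples at each rank.
    reference = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_genes)
    ]

    normalized = []
    for s in samples:
        # Gene indices ordered from lowest to highest value in this sample.
        order = sorted(range(n_genes), key=lambda g: s[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical toy data: 3 samples x 4 genes.
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 6.0, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
for row in normalized:
    print(row)
```

After normalization every sample contains exactly the same set of values, so any systematic shift between sample distributions is removed.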

4. Sample size requirement

At least 15 samples.

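5. How to filter genes

Filter genes by mean expression or by median absolute deviation (MAD), that is, remove genes with low expression or genes with little variation. Filtering by differential-expression fold change is not recommended. The WGCNA FAQ explains: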

Probesets or genes may be filtered by mean expression or variance (or their robust analogs such as median and median absolute deviation, MAD) since low-expressed or non-varying genes usually represent noise. Whether it is better to filter by mean expression or variance is a matter of debate; both have advantages and disadvantages, but more importantly, they tend to filter out similar sets of genes since mean and variance are usually related.
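
A filter of this kind might look like the sketch below. It is only an illustration: the gene names, the toy expression values, the `min_median` cutoff, and the `keep_fraction` parameter are all hypothetical, and suitable thresholds depend on the data set. Genes with low median expression are dropped first, and the remaining genes are ranked by MAD so that the most variable ones are kept.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation (unscaled) of a list of numbers."""
    center = median(values)
    return median([abs(v - center) for v in values])


def filter_genes(expression, min_median=1.0, keep_fraction=0.5):
    """Return gene names that pass a median-expression cutoff,
    keeping the top `keep_fraction` of them ranked by MAD."""
    expressed = {
        gene: values
        for gene, values in expression.items()
        if median(values) >= min_median
    }
    ranked = sorted(expressed, key=lambda g: mad(expressed[g]), reverse=True)
    n_keep = max(1, math.ceil(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Hypothetical toy data: gene name -> expression across 5 samples.
expression = {
    "gene_flat": [5.0, 5.1, 5.0, 4.9, 5.0],
    "gene_low": [0.0, 0.1, 0.0, 0.2, 0.0],
    "gene_variable": [2.0, 8.0, 5.0, 11.0, 3.0],
    "gene_moderate": [4.0, 6.0, 5.0, 7.0, 3.0],
}

kept = filter_genes(expression)
print(kept)
```

Note that the filter uses only each gene's own expression level and spread, never the sample labels, so it stays unsupervised.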

We do not recommend filtering genes by differential expression. WGCNA is designed to be an unsupervised analysis method that clusters genes based on their expression profiles. Filtering genes by differential expression will lead to a set of correlated genes that will essentially form a single (or a few highly correlated) modules. It also completely invalidates the scale-free topology assumption, so choosing soft thresholding power by scale-free topology fit will fail.

References

https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/faq.html
https://deneflab.github.io/HNA_LNA_productivity/WGCNA_analysis.html#1_data_inputcleaning
