RNA 30. GEPIA2: A Powerful TCGA and GTEx Data-Mining Tool for SCI Papers




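This installment introduces GEPIA2, a powerful tool for mining TCGA and GTEx data. In my view, readers without a programming background can use this online tool directly to analyze a single gene or multiple genes from their own research, and the results are quite good.

The 桓峰基因 WeChat official account publishes a series of transcriptome analysis tutorials; researchers who need bioinformatics support are welcome to contact us. The tutorials so far are: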

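RNA 1. All About Gene Expression: Based on GEO

RNA 2. Differentially Expressed Genes from GEO in SCI Papers: limma

RNA 3. Differentially Expressed Genes from TCGA in SCI Papers: DESeq2

RNA 4. Differential Expression from TCGA in SCI Papers: edgeR

RNA 5. Differential Gene Expression in SCI Papers: MA Plot

RNA 6. Differential Gene Expression: Volcano Plot (volcano)

RNA 7. Gene Expression in SCI Papers: Principal Component Analysis (PCA)

RNA 8. Differential Gene Expression in SCI Papers: Heatmap (heatmap)

RNA 9. Gene Expression in SCI Papers: GO Annotation

RNA 10. Gene Expression Enrichment in SCI Papers: KEGG

RNA 11. Gene Expression Enrichment in SCI Papers: GSEA

RNA 12. Tumor Immune Infiltration Estimation in SCI Papers: CIBERSORT

RNA 13. Differentially Expressed Genes in SCI Papers: WGCNA

RNA 14. Differentially Expressed Genes in SCI Papers: Protein-Protein Interaction Network (PPI)

RNA 15. Fusion Genes in SCI Papers: FusionGDB2

RNA 16. Fusion Genes in SCI Papers: Visualization

RNA 17. Screening Hub Genes in SCI Papers (Hub genes)

RNA 18. Gene Set Variation Analysis in SCI Papers: GSVA

RNA 19. Unsupervised Clustering in SCI Papers (ConsensusClusterPlus)

RNA 20. Single-Sample Immune Infiltration Analysis in SCI Papers (ssGSEA)

RNA 21. Single-Gene Enrichment Analysis in SCI Papers

RNA 22. Expression-Based Estimation of Stromal and Immune Cells in Malignant Tumor Tissue in SCI Papers (ESTIMATE)

RNA 23. Risk Factor Association Plots for Gene Expression Models in SCI Papers (ggrisk)

RNA 24. TCGA-Based Immune Infiltrating Cell Analysis in SCI Papers (TIMER)

RNA 25. Estimating the Population Abundance of Tissue-Infiltrating Immune and Stromal Cell Populations in SCI Papers (MCP-counter)

RNA 26. Gene Regulatory Network Inference from Transcriptome Data in SCI Papers (GENIE3)

RNA 27. Transcription Factor Binding Motif Enrichment to Regulatory Networks in SCI Papers (RcisTarget)

FigDraw 28. Drawing Radar Charts / Spider Charts in SCI Papers (RadarChart)

RNA 29. TCGA-Based Immune Infiltrating Cell Analysis in SCI Papers (TIMER2.0)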

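GEPIA2 is an updated version of GEPIA for analyzing RNA sequencing expression data from 9736 tumor samples and 8587 normal samples in the TCGA and GTEx projects, processed with a standard pipeline. GEPIA2 offers customizable functions such as tumor/normal differential expression analysis, profiling by cancer type or pathological stage, patient survival analysis, similar gene detection, correlation analysis, and dimensionality reduction analysis (http://gepia2.cancer-pku.cn/).

The RNA-Seq datasets used by GEPIA2 are based on the UCSC Xena project (http://xena.ucsc.edu) and are computed with a standard pipeline.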

d4ac333e57f3dce79135c351cfe04eaf.png

GEPIA has four modules, each of which can process data:

  • Single Gene Analysis

  • Cancer Type Analysis

  • Custom Data Analysis

  • Multiple Gene Analysis

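This online tool is especially well suited to researchers who want to publish a single-gene study: it is simple to use, and entering one gene is enough to see how that gene changes across cancer types (pan-cancer).

Let's walk through the examples provided on the site to see which analyses are available and what results they produce: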

Examples for GEPIA2 Usage

By using GEPIA2, experimental biologists can easily explore the large TCGA and GTEx datasets, ask specific questions, and test their hypotheses in a higher resolution.

For isoform analysis in boxplot and survival analyses, users can easily see that the POMT1-003 isoform is overexpressed in the ACC cancer type compared with normal tissue. Meanwhile, ACC patients with high expression of the POMT1-003 isoform have a worse prognosis.

d93dd6deaad48df662482c64cf4c614e.png


In addition, based on Isoform Usage, users can find that the SLC7A2-202 isoform of the SLC7A2 gene shows an isoform switch event in LIHC compared with other cancer types.

cd34b7cedd1e2f7bda58938e4df66fda.png
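Isoform usage is commonly understood as each isoform's share of its gene's total expression. The following is a minimal sketch of that calculation, assuming TPM-like abundance values for one sample. The function name, the isoform labels, and the numbers are hypothetical illustrations, not GEPIA2 output or its actual implementation.

```python
def isoform_usage(isoform_tpm):
    """Return each isoform's fraction of the gene's summed expression.

    isoform_tpm: dict mapping isoform ID -> expression (e.g. TPM) in one sample.
    """
    total = sum(isoform_tpm.values())
    if total == 0:
        # Gene not expressed in this sample: report zeros instead of dividing by 0.
        return {iso: 0.0 for iso in isoform_tpm}
    return {iso: value / total for iso, value in isoform_tpm.items()}


# Hypothetical values for three isoforms of one gene in one sample.
sample = {"iso_A": 6.0, "iso_B": 12.0, "iso_C": 2.0}
usage = isoform_usage(sample)
print(usage)
```

Comparing these fractions across cancer types is one simple way to spot a change in the dominant isoform, which is the kind of pattern an isoform switch refers to.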


Users can also use Isoform Structure to find that 3 isoforms of ERCC1 have different isoform structures.

378bdb9b0b526ac9f050f7546c20f2a1.png


For Survival Map, users can get the survival significance map of the gene HSPB6, which has significant results in BLCA, KIRP, LGG and SARC.

f1c1192406b8f14d946e0978c47dd9af.png


For gene signature analysis in similar genes detection, users can find that MIR155HG, CD8A, IL21R, CD27 and PTPN7 have the highest correlation with the T-cell exhausted signature in the LIHC cancer type.

8c2b43f47c73f8c7cb09d40926f3920b.png


For the combination of signature and subtype analysis in boxplot, GEPIA2 provides the expression distribution of the Th-1 like signature in the 3 COAD subtypes.

e87ed673915949576477bf6557f56e69.png
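To make the idea of a signature-by-subtype comparison concrete, here is a minimal sketch. It assumes a signature score is the mean log2(TPM + 1) of the signature genes in each sample, which is one common convention and not confirmed here as GEPIA2's exact method. The gene names, subtype labels, and values are hypothetical.

```python
import math
from collections import defaultdict


def signature_score(sample_tpm, signature_genes):
    """Mean log2(TPM + 1) over the signature genes in one sample."""
    values = [math.log2(sample_tpm[g] + 1) for g in signature_genes]
    return sum(values) / len(values)


def scores_by_subtype(samples, signature_genes):
    """Group per-sample signature scores by subtype label."""
    grouped = defaultdict(list)
    for subtype, tpm in samples:
        grouped[subtype].append(signature_score(tpm, signature_genes))
    return dict(grouped)


# Hypothetical two-gene signature and toy TPM values (not real data).
signature = ["GENE_A", "GENE_B"]
samples = [
    ("subtype_1", {"GENE_A": 3.0, "GENE_B": 15.0}),
    ("subtype_1", {"GENE_A": 1.0, "GENE_B": 7.0}),
    ("subtype_2", {"GENE_A": 0.0, "GENE_B": 1.0}),
]
grouped = scores_by_subtype(samples, signature)
print(grouped)
```

Each list in `grouped` holds the values that would feed one box of a per-subtype boxplot.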


For analyzing user-uploaded data, the features in custom data analysis enable users to classify their uploaded data into cancer subtypes or compare their own data with TCGA and GTEx data.

3f97ee4a29ee81911349f763c5701430.png


For running analyses on a local machine, GEPIA2 provides an API through the Python package gepia. Users can obtain analysis results in batch with this package.

a6c06918de679aeaac28213162985182.png


GEPIA2 also retained the original features of GEPIA:

In differential analysis and expression profile, users can easily discover differentially expressed genes, such as MPO in leukemia and UPK2 in bladder cancer.

MPO is specifically expressed in leukemia:

4f1203446cce5f72290c9e197463101e.png


d29fb9ccd26b13757cca97a9b8ba4638.png


UPK2 is specifically expressed in bladder cancer:

6a9dffe7f1c6929ed0eab1387a51bf0b.png


39c0af8cb61b2842497d003ea147335c.png
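A tumor-versus-normal comparison of this kind can be sketched as follows. The sketch assumes expression is compared on the log2(TPM + 1) scale and that the fold change is the difference of group medians, with an illustrative cutoff of 1. These are assumptions made for the example, and no significance test is included. The values are toy numbers, not TCGA or GTEx data.

```python
import math
from statistics import median


def log2_fold_change(tumor_tpm, normal_tpm):
    """Median log2(TPM + 1) in tumor minus median log2(TPM + 1) in normal."""
    tumor = [math.log2(x + 1) for x in tumor_tpm]
    normal = [math.log2(x + 1) for x in normal_tpm]
    return median(tumor) - median(normal)


def classify(log2fc, cutoff=1.0):
    """Label a gene using an illustrative |log2FC| cutoff only."""
    if log2fc >= cutoff:
        return "over-expressed"
    if log2fc <= -cutoff:
        return "under-expressed"
    return "unchanged"


# Toy TPM values for one gene (not real data).
tumor = [15.0, 31.0, 63.0]
normal = [1.0, 3.0, 7.0]
fc = log2_fold_change(tumor, normal)
label = classify(fc)
print(fc, label)
```

In a real analysis the fold-change filter would be combined with a statistical test and multiple-testing correction before calling a gene differentially expressed.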


The chromosomal distribution of over- or under-expressed genes can be plotted in Differential Genes.

Over-expressed genes:

5700d67ad774fdbad226566606c2cd18.png


Under-expressed genes:

c9e16cfdfb3cd5e1026088a855bd9d63.png


Both over-expressed and under-expressed genes:

1b3da2aa13ac1f54774da1396ad0b046.png


In Survival analysis, genes with the most significant association with patient survival can be identified, such as MCTS1 in breast cancer and HILPDA in liver cancer.

MCTS1 in breast cancer:

c9dc868569332b469d46c8b1444b1231.png

65bbd0ee95f843f0c83f04fc7fb78e78.png


HILPDA in liver cancer:

571bd8048157fd04560463f8bcf6e6cd.png

cfa32eb1c65e5ecf0cff81d283a8e195.png
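The survival curves in plots like these are typically Kaplan-Meier estimates for a high-expression and a low-expression group. Below is a minimal sketch, assuming patients are split at the median expression of the gene. The cutoff choice, the function names, and the toy cohort are assumptions for illustration, and the test that compares the two curves (usually a log-rank test) is omitted.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    Only time points with at least one observed death are returned.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve


def split_by_median(expression):
    """Return a boolean list: True where expression is above the cohort median."""
    cut = median(expression)
    return [x > cut for x in expression]


# Toy cohort (not real patient data): expression, months of follow-up, event flag.
expression = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
months = [30, 40, 50, 5, 10, 20]
event = [0, 1, 0, 1, 1, 0]

high = split_by_median(expression)
high_curve = kaplan_meier(
    [m for m, h in zip(months, high) if h],
    [e for e, h in zip(event, high) if h],
)
low_curve = kaplan_meier(
    [m for m, h in zip(months, high) if not h],
    [e for e, h in zip(event, high) if not h],
)
print(high_curve, low_curve)
```

In this toy cohort the high-expression group drops to lower survival earlier than the low-expression group, which is the pattern a "worse prognosis with high expression" plot shows.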


Gene expression is visualized by both a bodymap and a bar plot in General.

58e7d6670e21f918e95b46869c90623f.png


479fcb15eb98d68d40a9238ed8fa39ad.png


Gene expression by pathological stage is plotted in Stage plot.

be4d7c056ecc3148372b67ad5b55afd7.png
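One common way to ask whether expression differs across stages is a one-way ANOVA. The sketch below computes only the F statistic in plain Python, assuming log2(TPM + 1) values grouped by stage. The choice of test, the function name, and the values are assumptions for illustration, not a statement of what the Stage plot computes internally.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation between group means, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy log2(TPM + 1) values per pathological stage (not real data).
stages = {
    "Stage I": [1.0, 2.0, 3.0],
    "Stage II": [2.0, 3.0, 4.0],
    "Stage III": [3.0, 4.0, 5.0],
}
f_value = one_way_anova_f(list(stages.values()))
print(f_value)
```

Turning the F statistic into a p-value requires the F distribution, for which a statistics library such as SciPy (`scipy.stats.f_oneway`) is the usual choice.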


Users can compare the expression of one gene in multiple cancers by Boxplot, or compare multiple genes by a matrix plot in Multiple gene comparison.

Boxplot:

826373c226fb4532023b784b12b5589e.png


Matrix plot:

dd70768c7886f0cce8bc10ba5d9253bc.png


GEPIA provides pair-wise gene correlation analysis of a given set of TCGA and/or GTEx expression data. Normalization is optional and customizable.

79b9d867e390efe91a201483bfdc684e.png
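A pair-wise correlation of this kind can be sketched in a few lines. The example assumes a Pearson coefficient computed on log2(TPM + 1) values. The function names and toy values are hypothetical, and the optional normalization step mentioned above is not modeled.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def log2_tpm(values):
    """Transform TPM values to log2(TPM + 1)."""
    return [math.log2(v + 1) for v in values]


# Toy TPM values for two genes across four samples (not real data).
gene_a = [1.0, 3.0, 7.0, 15.0]
gene_b = [3.0, 7.0, 15.0, 31.0]
r = pearson(log2_tpm(gene_a), log2_tpm(gene_b))
print(r)
```

Ranking many genes by their coefficient against a query gene is the same calculation repeated, which is the basic idea behind finding genes with similar expression patterns.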


GEPIA provides Principal Component Analysis of multiple genes and cancer types in PCA, and presents results by 2D or 3D plots.

2D plots:

0fb5f2750a41ddd94f0b0f8202974a02.png


3D plots:

ca9c80370015f3de659ec215eb71937f.png


Variance distribution:

c8722025c7e08e64151fa7881629a317.png


Genes with similar expression patterns can be identified in Similar Genes; for example, PGAP3 and GRB7 are similar to ERBB2.

d3eaed8ab09d77590f144b0167ed62db.png


ERBB2:

e3e94a05278bb40db795b909f5aa9ef6.png


PGAP3:

f08deafb623e6c8f9add5a62a8f70c95.png


GRB7:

ed52874d68509631bd2c2ed754da1fd3.png

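It is very convenient to use and spares you from writing code, hunting for data, and drawing plots yourself. Researchers who need it are welcome to give it a try!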


References:

1. Chenwei Li, Zefang Tang, Wenjie Zhang, Zhaochen Ye, Fenglin Liu, GEPIA2021: integrating multiple deconvolution-based analysis into GEPIA, Nucleic Acids Research, Volume 49, Issue W1, 2 July 2021, Pages W242–W246, https://doi.org/10.1093/nar/gkab418

2. Tang, Z. et al. (2019) GEPIA2: an enhanced web server for large-scale expression profiling and interactive analysis. Nucleic Acids Res, 10.1093/nar/gkz430.
