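Is extracting KEGG pathway genes with R reliable?
One question has been bothering me for a while: can the genes that R extracts for a KEGG pathway actually be trusted?
The R package used here is 'KEGGREST'.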
library(KEGGREST)
listDatabases()
keggList("pathway","hsa")
keggGet("hsa00020")
[[1]]$GENE
[1] "1431"
[2] "CS; citrate synthase [KO:K01647] [EC:2.3.3.1]"
[3] "47"
[4] "ACLY; ATP citrate lyase [KO:K01648] [EC:2.3.3.8]"
[5] "50"
[6] "ACO2; aconitase 2 [KO:K01681] [EC:4.2.1.3]"
[7] "48"
[8] "ACO1; aconitase 1 [KO:K01681] [EC:4.2.1.3]"
[9] "3417"
[10] "IDH1; isocitrate dehydrogenase (NADP(+)) 1 [KO:K00031] [EC:1.1.1.42]"
[11] "3418"
[12] "IDH2; isocitrate dehydrogenase (NADP(+)) 2 [KO:K00031] [EC:1.1.1.42]"
[13] "3420"
[14] "IDH3B; isocitrate dehydrogenase (NAD(+)) 3 non-catalytic subunit beta [KO:K00030] [EC:1.1.1.41]"
[15] "3421"
[16] "IDH3G; isocitrate dehydrogenase (NAD(+)) 3 non-catalytic subunit gamma [KO:K00030] [EC:1.1.1.41]"
[17] "3419"
[18] "IDH3A; isocitrate dehydrogenase (NAD(+)) 3 catalytic subunit alpha [KO:K00030] [EC:1.1.1.41]"
[19] "55753"
[20] "OGDHL; oxoglutarate dehydrogenase L [KO:K00164] [EC:1.2.4.2]"
[21] "4967"
[22] "OGDH; oxoglutarate dehydrogenase [KO:K00164] [EC:1.2.4.2]"
[23] "1743"
[24] "DLST; dihydrolipoamide S-succinyltransferase [KO:K00658] [EC:2.3.1.61]"
[25] "1738"
[26] "DLD; dihydrolipoamide dehydrogenase [KO:K00382] [EC:1.8.1.4]"
[27] "8802"
[28] "SUCLG1; succinate-CoA ligase GDP/ADP-forming subunit alpha [KO:K01899] [EC:6.2.1.4 6.2.1.5]"
[29] "8801"
[30] "SUCLG2; succinate-CoA ligase GDP-forming subunit beta [KO:K01900] [EC:6.2.1.4 6.2.1.5]"
[31] "8803"
[32] "SUCLA2; succinate-CoA ligase ADP-forming subunit beta [KO:K01900] [EC:6.2.1.4 6.2.1.5]"
[33] "6389"
[34] "SDHA; succinate dehydrogenase complex flavoprotein subunit A [KO:K00234] [EC:1.3.5.1]"
[35] "6390"
[36] "SDHB; succinate dehydrogenase complex iron sulfur subunit B [KO:K00235] [EC:1.3.5.1]"
[37] "6391"
[38] "SDHC; succinate dehydrogenase complex subunit C [KO:K00236]"
[39] "6392"
[40] "SDHD; succinate dehydrogenase complex subunit D [KO:K00237]"
[41] "2271"
[42] "FH; fumarate hydratase [KO:K01679] [EC:4.2.1.2]"
[43] "4190"
[44] "MDH1; malate dehydrogenase 1 [KO:K00025] [EC:1.1.1.37]"
[45] "4191"
[46] "MDH2; malate dehydrogenase 2 [KO:K00026] [EC:1.1.1.37]"
[47] "5091"
[48] "PC; pyruvate carboxylase [KO:K01958] [EC:6.4.1.1]"
[49] "5105"
[50] "PCK1; phosphoenolpyruvate carboxykinase 1 [KO:K01596] [EC:4.1.1.32]"
[51] "5106"
[52] "PCK2; phosphoenolpyruvate carboxykinase 2, mitochondrial [KO:K01596] [EC:4.1.1.32]"
[53] "5161"
[54] "PDHA2; pyruvate dehydrogenase E1 subunit alpha 2 [KO:K00161] [EC:1.2.4.1]"
[55] "5160"
[56] "PDHA1; pyruvate dehydrogenase E1 subunit alpha 1 [KO:K00161] [EC:1.2.4.1]"
[57] "5162"
[58] "PDHB; pyruvate dehydrogenase E1 subunit beta [KO:K00162] [EC:1.2.4.1]"
[59] "1737"
[60] "DLAT; dihydrolipoamide S-acetyltransferase [KO:K00627] [EC:2.3.1.12]"
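The GENE field printed above alternates between an Entrez gene ID and a "SYMBOL; description" string, so the number of genes is half the vector length. Below is a minimal sketch of reshaping that vector into a table. The helper name `parse_kegg_genes` and the three-gene sample vector are only illustrative (the sample is copied from the first lines of the output above); they are not part of KEGGREST.

```r
# Hypothetical helper: reshape the alternating ID/description vector from $GENE.
parse_kegg_genes <- function(gene_field) {
  # The vector must hold ID/description pairs, so its length has to be even.
  stopifnot(length(gene_field) %% 2 == 0)
  ids  <- gene_field[c(TRUE, FALSE)]   # odd positions: Entrez gene IDs
  desc <- gene_field[c(FALSE, TRUE)]   # even positions: "SYMBOL; description ..."
  data.frame(
    entrez_id   = ids,
    symbol      = sub(";.*$", "", desc),             # text before the first ';'
    description = trimws(sub("^[^;]*;", "", desc)),  # text after the first ';'
    stringsAsFactors = FALSE
  )
}

# Sample copied from the first three gene entries printed above.
gene_field <- c(
  "1431", "CS; citrate synthase [KO:K01647] [EC:2.3.3.1]",
  "47",   "ACLY; ATP citrate lyase [KO:K01648] [EC:2.3.3.8]",
  "50",   "ACO2; aconitase 2 [KO:K01681] [EC:4.2.1.3]"
)

genes <- parse_kegg_genes(gene_field)
print(genes)
```

With a live result, the same helper could be applied as `parse_kegg_genes(keggGet("hsa00020")[[1]]$GENE)`. For the 60-element output shown above that would give 30 rows.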
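The output has 60 elements, alternating gene ID and description, so we end up with 30 genes. Are they reliable? Let's go back to the KEGG website and check.
I compared them one by one and found that every gene is there. For example, EC 1.1.1.41 has three genes in the R output (IDH3A, IDH3B and IDH3G), while the KEGG page shows only one entry for it; clicking into that entry lists all three.
If that is not convincing enough, you can also compare against the KEGG API directly, which again shows that all 30 genes are present.
So extracting KEGG pathway genes with the R package is reliable. In fact, the package is simply a client for the KEGG API.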
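The comparison against the KEGG API can be sketched as a set comparison between the IDs taken from `keggGet()` and the IDs returned by a second API query. The post does not say which API call was used, so the `keggLink()` call in the comments is an assumption, and `compare_gene_sets` is a hypothetical helper. The runnable part uses a few IDs copied from the output above so that it works offline.

```r
# Hypothetical helper: compare gene IDs from keggGet() with IDs from a
# second KEGG query.
compare_gene_sets <- function(ids_from_get, ids_from_link) {
  # KEGG link results carry an organism prefix such as "hsa:1431"; strip it.
  ids_from_link <- sub("^[a-z]+:", "", ids_from_link)
  list(
    only_in_get  = setdiff(ids_from_get, ids_from_link),
    only_in_link = setdiff(ids_from_link, ids_from_get),
    same_set     = setequal(ids_from_get, ids_from_link)
  )
}

# Offline illustration with three IDs copied from the output above.
res <- compare_gene_sets(
  ids_from_get  = c("1431", "47", "50"),
  ids_from_link = c("hsa:1431", "hsa:47", "hsa:50")
)
print(res$same_set)

# With network access, the inputs could be obtained like this (assumed usage,
# not taken from the original post):
#   library(KEGGREST)
#   gene_field    <- keggGet("hsa00020")[[1]]$GENE
#   ids_from_get  <- gene_field[c(TRUE, FALSE)]
#   ids_from_link <- unname(keggLink("hsa", "path:hsa00020"))
```

If `same_set` is `TRUE` and both difference vectors are empty, the two sources agree on the gene list.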