GWAS Data Download Explained (2)

weixin_49320263

Last modified 2023-07-27 13:47:45

Category: GWAS. Tags: database

First published 2023-07-24 16:23:41

Article link: https://blog.csdn.net/weixin_49320263/article/details/131862053

1. FinnGen database: Risteys FinnGen R11 + FinRegistry (https://risteys.finregistry.fi/)

As an example, search for "glaucoma" at https://risteys.finregistry.fi/

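Download the data:

Link: https://storage.googleapis.com/finngen-public-data-r9/summary_stats/finngen_R11_H7_GLAUCOMA.gz

Notes: R11 is the FinnGen release number and H7_GLAUCOMA is the endpoint name; substitute other values to download other datasets. The release number appears twice in the link, once in the bucket name (finngen-public-data-r9) and once in the file name (finngen_R11_...). In the link above the two disagree, so make them match the release you actually want before downloading.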
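
The substitution described above can be sketched as a small R helper. This is only an illustration: `finngen_url` is a hypothetical name, and the URL pattern is an assumption inferred from the link above, taking the bucket name and the file name to carry the same release number. Check the resulting address against the FinnGen documentation before relying on it.

```r
# Hypothetical helper (not part of any package): builds a summary-statistics
# URL from a release number and an endpoint name.
# Assumption: the bucket and the file name use the same release number.
finngen_url <- function(release, endpoint) {
  sprintf(
    "https://storage.googleapis.com/finngen-public-data-r%d/summary_stats/finngen_R%d_%s.gz",
    release, release, endpoint
  )
}

url <- finngen_url(9, "H7_GLAUCOMA")
print(url)

# The files are large, so the download is left commented out.
# download.file(url, destfile = basename(url), mode = "wb")
```

Keeping the release number in a single argument avoids the kind of r9/R11 mismatch seen in the link above.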

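Tidy the data in R:

Field meanings: see "Data description" in the FinnGen Documentation (file naming pattern and file structure): https://finngen.gitbook.io/documentation/data-description#summary-association-statistics

"#chrom": chromosome; "pos": position; "ref": reference allele; "alt": alternative allele (the effect allele); "rsids": variant identifier (rsID); "nearest_genes": nearest gene(s); "pval": p-value; "mlogp": -log10(p-value); "beta": effect size; "sebeta": standard error of the effect size; "af_alt": frequency of the alternative (effect) allele; "af_alt_cases": frequency of the alternative allele in cases; "af_alt_controls": frequency of the alternative allele in controls.

Use TwoSampleMR to prepare the exposure data and the outcome data.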
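
The script that follows filters on "pval". Since "mlogp" is -log10 of the p-value, the same cut-off can be expressed on that column, which may be more robust when a very small p-value has been stored as 0. The sketch below uses made-up rows (the rsIDs and p-values are not real data) to show that the two filters agree.

```r
# Toy rows with made-up values, using the column names described above
toy <- data.frame(
  rsids = c("rsA", "rsB", "rsC"),
  pval  = c(1e-8, 5e-8, 0.05),
  stringsAsFactors = FALSE
)

# "mlogp" is -log10 of "pval"
toy$mlogp <- -log10(toy$pval)

# Genome-wide significance (p < 5e-8) expressed on the -log10 scale
threshold <- -log10(5e-8)
strong <- toy[toy$mlogp > threshold, ]

print(strong)
```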

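# Read the downloaded file
setwd("D:\\")  # Set the working directory to the folder that holds the downloaded .gz file
library(data.table)
# fread() reads .gz files directly when the R.utils package is installed.
# data.table = FALSE returns a plain data.frame, the input type format_data() is documented to take.
a <- fread("finngen_R9_O15_PRE_OR_ECLAMPSIA.gz", header = TRUE, data.table = FALSE)
save(a, file = "Finngen.RData")
# List the column names
colnames(a)
# Keep strongly associated variants. If p < 5e-8 leaves too few variants, the threshold
# can be relaxed, but only with support from the literature.
ab <- subset(a, pval < 5e-8)
ab$phenotype <- "PRE_OR_ECLAMPSIA"
# load("整理.RData")
save(ab, file = "整理.RData")
# Format the data as TwoSampleMR expects for a two-sample analysis
library(TwoSampleMR)
# Exposure data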
exposure<-format_data(ab,
                      type = "exposure",
                      snp_col = "rsids",
                      phenotype_col = "phenotype",
                      beta_col = "beta",
                      se_col = "sebeta",
                      eaf_col="af_alt",
                      effect_allele_col = "alt",
                      other_allele_col = "ref",
                      pval_col = "pval")
# Remove variants in linkage disequilibrium (LD clumping)
exposure_data <- clump_data(exposure, clump_r2 = 0.001)

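# Outcome data
# Note: `ab` is reused here only to show the call. In a real analysis the outcome comes from
# a different GWAS, and its full, unfiltered summary statistics should be passed instead.
outcome <- format_data(ab,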
                     snps=exposure_data$SNP,
                     type = "outcome",
                     snp_col = "rsids",
                     phenotype_col = "phenotype",
                     beta_col = "beta",
                     se_col = "sebeta",
                     eaf_col="af_alt",
                     effect_allele_col = "alt",
                     other_allele_col = "ref",
                     pval_col = "pval")
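
Before `format_data()` is called, it may be worth checking the "rsids" column, because `snp_col` is expected to hold one identifier per row. The sketch below assumes that some rows can have an empty "rsids" value or several comma-separated IDs; that assumption should be checked against the actual file. The rows are made up, and keeping the first ID is just one possible choice.

```r
# Toy rows with made-up values; the empty, multi-ID and NA cases are assumptions
toy <- data.frame(
  rsids = c("rs1", "", "rs2,rs3", NA),
  pval  = c(1e-9, 1e-10, 1e-12, 1e-9),
  stringsAsFactors = FALSE
)

# Drop rows without any identifier
has_id <- !is.na(toy$rsids) & nzchar(toy$rsids)
clean <- toy[has_id, , drop = FALSE]

# Where several IDs are listed, keep only the first one
clean$rsids <- sub(",.*$", "", clean$rsids)

print(clean)
```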

Once the data are formatted, the analysis can proceed.
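
With TwoSampleMR, the usual next calls would be `harmonise_data(exposure_data, outcome)` followed by `mr()` on the result. To show roughly what such an analysis computes, here is a base-R sketch of a fixed-effect inverse-variance weighted (IVW) estimate. All numbers are made up, the variants are assumed to be already harmonised to the same effect allele, and this is an illustration rather than the package's own implementation.

```r
# Toy, made-up per-variant effects (assumed already harmonised)
beta_exposure <- c(0.10, 0.20, 0.15)
beta_outcome  <- c(0.05, 0.10, 0.075)
se_outcome    <- c(0.01, 0.01, 0.01)

# Weight each variant by the inverse variance of its outcome effect
weights <- 1 / se_outcome^2

# IVW estimate: weighted regression of outcome effects on exposure effects
# through the origin
ivw_beta <- sum(weights * beta_exposure * beta_outcome) /
  sum(weights * beta_exposure^2)
ivw_se <- sqrt(1 / sum(weights * beta_exposure^2))

print(c(estimate = ivw_beta, se = ivw_se))
```

In this toy example every variant has the same ratio of outcome effect to exposure effect, so the pooled estimate equals that ratio.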