DESeq2 is an R package for differential gene expression analysis; all of the steps below are run in R.
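1. Install DESeq2 in R
> source("https://bioconductor.org/biocLite.R")
> biocLite("DESeq2")
Note: on current Bioconductor releases biocLite has been replaced by BiocManager, so use install.packages("BiocManager") followed by BiocManager::install("DESeq2") instead.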
2. Load the gene count files and add column names
> setwd("C:\\Users\\18019\\Desktop\\counts")
> options(stringsAsFactors=FALSE)
> control1
> head(control1)
               gene_id control1
1 ENSG00000000003.14_2     1576
2  ENSG00000000005.5_2        0
3 ENSG00000000419.12_2      756
4 ENSG00000000457.13_3      301
5 ENSG00000000460.16_5      764
6 ENSG00000000938.12_2        0
> control2
> treat1
> treat2
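The right-hand sides of the assignments in this step are missing from the listing. As a minimal sketch of how one per-sample count file might be loaded with column names, assuming a tab-separated, two-column file without a header (the `__no_feature`-style rows seen later suggest htseq-count output). The helper `read_counts` and the file name `control1.count` are hypothetical, and the demo writes a tiny temporary file instead of reading real data:

```r
# Hypothetical helper: read a two-column, tab-separated count file and
# name the columns "gene_id" and <sample>.
read_counts <- function(path, sample) {
  read.table(path, sep = "\t", header = FALSE,
             col.names = c("gene_id", sample),
             stringsAsFactors = FALSE)
}

# Demo on a small temporary file (file name is an assumption; the three
# values are taken from the sample output shown in this step).
demo_file <- file.path(tempdir(), "control1.count")
writeLines(c("ENSG00000000003.14_2\t1576",
             "ENSG00000000005.5_2\t0",
             "ENSG00000000419.12_2\t756"), demo_file)

control1_demo <- read_counts(demo_file, "control1")
head(control1_demo)
```

The same helper could then be called once per sample (control2, treat1, treat2), changing only the path and the sample name.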
3. Merge the data
> raw_count
> head(raw_count)
                 gene_id control1 control2  treat1  treat2
1 __alignment_not_unique  7440131  2973831 7861484 8676884
2            __ambiguous   976485   412543 1014239 1179051
3           __no_feature  1860117   768637 1289737 1812056
4          __not_aligned  1198545   572588 1256232 1348068
5        __too_low_aQual        0        0       0       0
6   ENSG00000000003.14_2     1576      713    1589    1969
# Delete the first five rows
> raw_count_filt
> ENSEMBL
> row.names(raw_count_filt)
> raw_count_filt
> colnames(raw_count_filt)[1]
> head(raw_count_filt)
                ensembl_gene_id              gene_id control1 control2 treat1 treat2
ENSG00000000003 ENSG00000000003 ENSG00000000003.14_2     1576      713   1589   1969
ENSG00000000005 ENSG00000000005  ENSG00000000005.5_2        0        0      0      1
ENSG00000000419 ENSG00000000419 ENSG00000000419.12_2      756      384    806    984
ENSG00000000457 ENSG00000000457 ENSG00000000457.13_3      301      151    217    324
ENSG00000000460 ENSG00000000460 ENSG00000000460.16_5      764      312    564    784
ENSG00000000938 ENSG00000000938 ENSG00000000938.12_2        0        0      0      0
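The commands for this step are only partly recoverable, so here is a rough sketch of one way to get from four per-sample tables to a table like the one above. It is not necessarily what the original commands did: it filters the summary rows on their `__` prefix rather than deleting the first five rows by position, and every name ending in `_demo` is hypothetical. The toy values are a small subset of the numbers shown in this step.

```r
# Toy per-sample tables: two summary rows and two genes.
ids <- c("__no_feature", "__not_aligned",
         "ENSG00000000003.14_2", "ENSG00000000005.5_2")
samples_demo <- list(
  data.frame(gene_id = ids, control1 = c(1860117, 1198545, 1576, 0)),
  data.frame(gene_id = ids, control2 = c(768637, 572588, 713, 0)),
  data.frame(gene_id = ids, treat1   = c(1289737, 1256232, 1589, 0)),
  data.frame(gene_id = ids, treat2   = c(1812056, 1348068, 1969, 1))
)

# Join all samples on gene_id, one merge at a time.
raw_count_demo <- Reduce(function(x, y) merge(x, y, by = "gene_id"),
                         samples_demo)

# Drop the summary rows by name (assumption: they all start with "__").
filt_demo <- raw_count_demo[!grepl("^__", raw_count_demo$gene_id), ]

# Strip the version suffix: everything from the first dot onwards.
ensembl_demo <- sub("\\..*$", "", filt_demo$gene_id)

# Add the clean ID as the first column and as the row names.
filt_demo <- cbind(ensembl_gene_id = ensembl_demo, filt_demo)
row.names(filt_demo) <- ensembl_demo
head(filt_demo)
```

Filtering by prefix avoids removing real genes if the summary rows ever end up in a different position after the merge.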
4. Annotate the genes: get the gene_symbol
Use biomaRt to convert each ensembl_id to its gene_symbol
> library("biomaRt")
> library("curl")
> mart
> my_ensembl_gene_id
> options(timeout = 4000000)
> hg_symbols
> head(readcount)
  ensembl_gene_id              gene_id control1 control2 treat1 treat2 hgnc_symbol chromosome_name start_position
1 ENSG00000000003 ENSG00000000003.14_2     1576      713   1589   1969      TSPAN6
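The biomaRt calls themselves are missing from the listing, so the following is only a sketch of how the lookup and the final merge might look. It assumes the human dataset `hsapiens_gene_ensembl` (inferred from the ENSG IDs and the hgnc_symbol column) and the attribute names visible in the output header. `fetch_symbols` and the `_demo` objects are hypothetical. The lookup needs the biomaRt package and network access, so it is defined but not run, and the merge is shown on a small offline stand-in.

```r
# Sketch of the lookup: defined only, not called here.
fetch_symbols <- function(ensembl_ids) {
  mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "hgnc_symbol",
                   "chromosome_name", "start_position"),
    filters = "ensembl_gene_id",
    values = ensembl_ids,
    mart = mart
  )
}

# Offline stand-in for the getBM() result: only the one symbol that
# appears in the output above.
hg_symbols_demo <- data.frame(ensembl_gene_id = "ENSG00000000003",
                              hgnc_symbol = "TSPAN6")

# Small stand-in for the filtered count table.
counts_demo <- data.frame(
  ensembl_gene_id = c("ENSG00000000003", "ENSG00000000005"),
  control1 = c(1576, 0),
  treat1 = c(1589, 0)
)

# all.x = TRUE keeps genes without an annotation (their symbol is NA).
readcount_demo <- merge(counts_demo, hg_symbols_demo,
                        by = "ensembl_gene_id", all.x = TRUE)
readcount_demo
```

Without `all.x = TRUE`, `merge` would silently drop every gene that biomaRt did not return, which changes the number of rows passed on to later steps.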