[Gene Expression Data Processing] From Raw Sequencing Data to FPKM

FPKM (Fragments Per Kilobase of exon model per Million mapped reads) is a normalized measure of gene expression (abundance). This post describes how to obtain FPKM values for genes of interest, starting from raw sequencing data.
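To make the definition concrete, here is a minimal sketch of the basic FPKM arithmetic: fragment count, scaled by 10^9, divided by the product of gene length in bp and the total number of mapped fragments. The function name fpkm and the example numbers are made up for illustration. cufflinks itself estimates abundances with a statistical model, so its reported values will generally not match this simple formula exactly.

```shell
# fpkm FRAGMENTS GENE_LENGTH_BP TOTAL_MAPPED_FRAGMENTS
# Basic definition only (hypothetical helper, not part of any tool):
#   FPKM = fragments * 10^9 / (gene length in bp * total mapped fragments)
fpkm() {
    awk -v n="$1" -v len="$2" -v total="$3" \
        'BEGIN { printf "%.2f\n", n * 1e9 / (len * total) }'
}

# 500 fragments on a 2000 bp gene, 20 million mapped fragments in the sample
fpkm 500 2000 20000000   # prints 12.50
```

Dividing by gene length and by sequencing depth is what makes values comparable across genes and across samples.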

1. Software Download

1) fastq-dump: converts SRA files to FASTQ files.

 website: http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=software 

2) bowtie2: an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.

 website: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml 

3) cufflinks: assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples.

 website: http://cufflinks.cbcb.umd.edu/ 

4) gffread: converts GFF3 files to GTF files.

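 website: http://cufflinks.cbcb.umd.edu/ (this program is included in the cufflinks package)

5) samtools: converts and sorts SAM/BAM alignment files (used in the alignment step below).

 website: http://www.htslib.org/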

2. Operation

1) Download the genome.fa and genes.gff3 files from the genome website of your species, and download the SRA file from NCBI.

2) Format conversion

 $ fastq-dump -I --split-files SRR123456789.sra # convert sra file to fastq file

 $ gffread -E genes.gff3 -T -o genes.gtf # convert gff3 file to gtf file (-T selects GTF output; without it gffread writes GFF3)

3) Index files

 $ bowtie2-build genome.fa genome # build the index files with prefix "genome"

4) Alignment

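 $ bowtie2 -x genome -1 SRR123456789_1.fastq -2 SRR123456789_2.fastq -S SRR123456789.sam

 $ samtools view -bS SRR123456789.sam > SRR123456789.unsorted.bam # convert sam file to bam file

 $ samtools sort -o SRR123456789.bam SRR123456789.unsorted.bam # sort by coordinate (samtools 1.x syntax); cufflinks requires sorted input

5) FPKM values

 $ cufflinks -G genes.gtf -o result SRR123456789.bam # estimate expression of the annotated genes only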

After these steps, FPKM values can be extracted by gene ID from the genes.fpkm_tracking file in the result directory.
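The extraction can be sketched with awk. This is a minimal example, assuming the tracking file is tab-delimited with a header row that contains columns named gene_id and FPKM, which is my understanding of what cufflinks writes. The function name extract_fpkm and the gene ID in the usage comment are hypothetical. Columns are looked up by header name rather than by position, so the sketch does not depend on the exact column order.

```shell
# extract_fpkm TRACKING_FILE GENE_ID
# Prints "gene_id<TAB>FPKM" for every row whose gene_id matches GENE_ID.
# Assumes a tab-delimited file whose header contains "gene_id" and "FPKM".
extract_fpkm() {
    awk -F'\t' -v id="$2" '
        NR == 1 {
            # Map each header name to its column number
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("gene_id" in col) || !("FPKM" in col)) {
                print "unexpected header in tracking file" > "/dev/stderr"
                exit 1
            }
            next
        }
        $(col["gene_id"]) == id {
            printf "%s\t%s\n", $(col["gene_id"]), $(col["FPKM"])
        }
    ' "$1"
}

# Usage (GENE_ID is a placeholder for an ID from genes.gtf):
#   extract_fpkm result/genes.fpkm_tracking GENE_ID
```

If the gene ID is not present in the file, the function prints nothing.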
