A Snakemake workflow for upstream and downstream RNA-seq analysis

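The first workflow I wrote after learning Snakemake covers upstream quantification for RNA-seq, followed by downstream quality control and differential expression analysis.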

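The fastq files are processed with fastp and then aligned to the genome with STAR, which also produces the raw counts. FPKM and TPM are calculated using the non-redundant exon length as the gene length, and CPM results are generated as well.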
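As a rough illustration of the three normalizations named above, here is a minimal sketch using the standard textbook definitions of CPM, FPKM and TPM. It is not the code the workflow runs: the function names and the toy gene names are hypothetical, counts are assumed to come from a single sample, and lengths are assumed to be non-redundant exon lengths in base pairs.

```python
def cpm(counts):
    """Counts per million: scale each gene by the sample's total count."""
    total = sum(counts.values())
    return {gene: c / total * 1e6 for gene, c in counts.items()}


def fpkm(counts, lengths_bp):
    """Fragments per kilobase of gene length per million mapped fragments."""
    total = sum(counts.values())
    return {gene: c * 1e9 / (total * lengths_bp[gene]) for gene, c in counts.items()}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rate = {gene: c / lengths_bp[gene] for gene, c in counts.items()}
    denom = sum(rate.values())
    return {gene: r / denom * 1e6 for gene, r in rate.items()}


# Toy data for one sample (hypothetical gene names and values).
counts = {"geneA": 10, "geneB": 30, "geneC": 60}
lengths_bp = {"geneA": 1000, "geneB": 2000, "geneC": 3000}

print(cpm(counts))
print(fpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

TPM values always sum to one million within a sample, while FPKM values do not, which is the usual reason for preferring TPM when comparing expression proportions between samples.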

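For how the non-redundant exon length is calculated, see the earlier post "Transcriptome in Practice 02: Computing the Total Non-redundant Exon Length" (转录组实战02: 计算非冗余外显子长度之和).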
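The idea behind a non-redundant exon length can be sketched as an interval merge: overlapping exons from different transcripts of the same gene are collapsed so that each base is counted once. The function below is a hypothetical illustration, not the implementation from the referenced post or from any tool; it assumes all exons belong to one gene on one chromosome and use 1-based inclusive coordinates, as in GTF files.

```python
def nonredundant_exon_length(exons):
    """Sum the length of the union of exon intervals.

    exons: iterable of (start, end) tuples, 1-based and inclusive.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(exons):
        if cur_end is None or start > cur_end:
            # No overlap with the current merged block: close it and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged block if needed.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


# Two overlapping exons (100-200, 150-300) and one separate exon (400-500).
print(nonredundant_exon_length([(100, 200), (150, 300), (400, 500)]))  # 302
```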

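Quality control of the quantification results uses the "three plots" popularized by 生信技能树 (the "Bioinformatics Skill Tree" community): a PCA plot, a dendrogram and a heatmap.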

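Differential expression analysis between groups is performed with PyDESeq2, the Python implementation of DESeq2, and the results are shown as volcano and MA plots.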
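To make the volcano plot concrete, here is a small sketch of how each gene could be placed and labeled from a DESeq2-style results table, which reports a log2 fold change and an adjusted p-value per gene. This does not call PyDESeq2 and is not taken from the workflow. The function names are hypothetical, and the cutoffs (|log2FC| >= 1, padj < 0.05) are common conventions assumed here, not values stated in this post.

```python
import math


def classify_gene(log2fc, padj, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Label a gene as 'up', 'down' or 'ns' (not significant).

    The default cutoffs are illustrative assumptions.
    """
    if padj is None or math.isnan(padj):
        # Genes without an adjusted p-value are treated as not significant.
        return "ns"
    if padj < padj_cutoff and log2fc >= lfc_cutoff:
        return "up"
    if padj < padj_cutoff and log2fc <= -lfc_cutoff:
        return "down"
    return "ns"


def volcano_point(log2fc, padj):
    """Return the (x, y) position of a gene on a volcano plot."""
    return log2fc, -math.log10(padj)


print(classify_gene(2.0, 0.001))
print(volcano_point(2.0, 0.001))
```

An MA plot uses the same results table but puts the mean normalized expression on the x axis and the log2 fold change on the y axis.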

The workflow code is available at https://jihulab.com/BioQuest/SnakeMake-RNA-seq or https://github.com/BioQuestX/SnakeMake-RNA-seq

A Snakemake workflow for bulk RNA-seq

Adapters were removed with fastp, and the reads were mapped to the Ensembl genome with STAR.

For normalization, gtftools was used to calculate gene length and bioinfokit was used to produce TPM, FPKM and CPM results.

For quality control, a PCA plot, a dendrogram and a heatmap were used to show differences among samples or groups.

PyDESeq2 was used to perform differential expression analysis.

General settings

To configure this workflow, modify config/config.yaml according to your needs, following the explanations provided in the file.

Sample sheet

  • Add samples to config/samples.tsv. Only the column Sample is mandatory, but any additional columns can be added.

  • For each sample, add one or more sequencing units (runs, lanes or replicates) to the Unit column of config/samples.tsv.

  • For each sample, define the Group column (an experimental or clinical attribute).
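The sample sheet described above can be read with a few lines of Python. This is only a sketch under stated assumptions: the column names Sample, Unit and Group come from the text, but the layout (one row per sequencing unit), the example sample names and the function name are hypothetical, and the real config/samples.tsv in the repository may be organized differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sheet: one row per sequencing unit, tab-separated.
EXAMPLE_TSV = (
    "Sample\tUnit\tGroup\n"
    "ctrl_1\tlane1\tcontrol\n"
    "ctrl_1\tlane2\tcontrol\n"
    "treat_1\tlane1\ttreated\n"
)


def read_sample_sheet(handle):
    """Return (units per sample, group per sample) from a TSV file handle."""
    units = defaultdict(list)
    groups = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        sample = row["Sample"]
        units[sample].append(row["Unit"])
        group = row["Group"]
        # A sample should belong to exactly one group across all of its units.
        if groups.setdefault(sample, group) != group:
            raise ValueError(f"sample {sample!r} is assigned to more than one group")
    return dict(units), groups


units, groups = read_sample_sheet(io.StringIO(EXAMPLE_TSV))
print(units)
print(groups)
```

With a real file, the same function can be called as `read_sample_sheet(open("config/samples.tsv", newline=""))`.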

Report

[Figures: four report images, not reproduced here]
