TransDecoder: Predicting Coding Regions in Transcripts (Fungi)

Latest recommended article published on 2024-07-14 18:35:17

CAAS_IFR_zp

Article link: https://blog.csdn.net/m0_53945548/article/details/140287163

Installation

Official documentation: Home · TransDecoder/TransDecoder Wiki · GitHub

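# Download the TransDecoder v5.7.1 Singularity image
wget -c https://data.broadinstitute.org/Trinity/CTAT_SINGULARITY/MISC/TransDecoder/transdecoder.v5.7.1.simg
# Create and activate a dedicated environment, then install Singularity into it
mamba create -n TransDecoder
mamba activate TransDecoder
mamba install -c conda-forge singularity=3.8
# Verify the setup by printing the TransDecoder.LongOrfs help message
singularity exec -e transdecoder.v5.7.1.simg TransDecoder.LongOrfs -h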
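
Before starting a long run, it can help to confirm that the required tools are on the PATH and that the image was actually downloaded. The snippet below is only a minimal sketch: `check_tools` is a hypothetical helper name, not part of TransDecoder or Singularity, and the image file name is assumed to be the one downloaded above.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent command.
# Returns 1 if any command is missing, 0 otherwise.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: the image sits in the current directory under its download name.
IMG="transdecoder.v5.7.1.simg"
if check_tools singularity && [ -f "$IMG" ]; then
    echo "ready: $IMG"
else
    echo "not ready: install singularity and download $IMG first"
fi
```

The same function can be reused later for the homology search step, for example `check_tools diamond hmmsearch`.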

Usage

For downloading the Pfam-A database, see the CSDN post "PULpy安装与使用_plup安装-CSDN博客" (PULpy installation and usage).

For downloading the Swissprot database, see: Index of /pub/databases/uniprot/knowledgebase/complete

# Use de novo assembled transcripts in FASTA format as input, e.g. the output of Trinity
singularity exec -e transdecoder.v5.7.1.simg TransDecoder.LongOrfs -t transcripts.fasta
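# For more accurate predictions, search the candidate peptides from the previous step for homology to known proteins.
# The previous step writes its results to the transcripts.fasta.transdecoder_dir/ folder.
# This requires the Pfam-A database (which I already have) or the Swissprot or Uniref90 database (Swissprot is used here because it is fast).
# It also requires diamond or hmmsearch.
mamba install -c bioconda diamond hmmer
wget -c https://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/complete/uniprot_sprot.fasta.gz
gunzip uniprot_sprot.fasta.gz
# Build the diamond database from the Swissprot sequences
diamond makedb --in uniprot_sprot.fasta --db uniprot_sprot.fasta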
# Align the candidate peptides to Swissprot with diamond (adjust --threads to your machine)
diamond blastp -d uniprot_sprot.fasta -q transcripts.fasta.transdecoder_dir/longest_orfs.pep --evalue 1e-10 --max-target-seqs 1 --threads 90 > blastp.outfmt6
# Search the candidate peptides against Pfam with hmmsearch (adjust --cpu to your machine)
hmmsearch --cpu 90 -E 1e-10 --domtblout pfam.domtblout /path/to/Pfam-A.hmm transcripts.fasta.transdecoder_dir/longest_orfs.pep
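# Integrate the BLAST and Pfam search results into coding region selection (use the same -t file as for TransDecoder.LongOrfs)
singularity exec -e transdecoder.v5.7.1.simg TransDecoder.Predict -t transcripts.fasta --retain_pfam_hits pfam.domtblout --retain_blastp_hits blastp.outfmt6
# The final set of candidate coding regions is written to files named transcripts.fasta.transdecoder.*, with the extensions .pep, .cds, .gff3 and .bed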
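
Once the prediction finishes, a quick tally of the predicted ORFs is a useful sanity check. The snippet below is a rough sketch rather than part of TransDecoder: `summarize_orfs` is a hypothetical helper name, and it assumes that each FASTA header in the .pep file carries a tag such as `type:complete` or `type:5prime_partial`, as TransDecoder output typically does. If your headers look different, adjust the pattern.

```shell
# Hypothetical helper: count predicted ORFs in a TransDecoder .pep file and
# tally them by the "type:" tag found in each FASTA header.
# Assumption: headers contain tags such as type:complete or type:internal.
summarize_orfs() {
    local pep="$1"
    # Every FASTA record starts with ">", so counting those lines counts ORFs.
    echo "total: $(grep -c '^>' "$pep")"
    # Extract the type tag from each header, drop the "type:" prefix,
    # then count how often each type occurs.
    grep '^>' "$pep" \
        | grep -o 'type:[A-Za-z0-9_]*' \
        | cut -d: -f2 \
        | sort \
        | uniq -c \
        | awk '{print $2 ": " $1}'
}

# Assumption: the input was transcripts.fasta, as in the commands above.
PEP="transcripts.fasta.transdecoder.pep"
if [ -f "$PEP" ]; then
    summarize_orfs "$PEP"
fi
```

The function prints the total number of ORFs first, followed by one line per ORF type with its count. A very small share of complete ORFs may point to a fragmented assembly.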