抄自:http://www.bio-info-trainee.com/tag/soapfuse
一,软件安装
下载压缩包,解压后即可使用。
将source文件夹中的bin加入环境变量;
export PATH=/home/biotools/SOAPfuse-v1.27/source/
bin/:$PATH
PERL5LIB=$PERL5LIB: /home// biotools/SOAPfuse-v1.27/source/bin/perl_module; export PERL5LIB
PERL5LIB=$PERL5LIB: /home// biotools/SOAPfuse-v1.27/source/bin/perl_module; export PERL5LIB
二 、下载数据
首先下载5个文件:
我把这些文件都放在的当前文件夹下面的raw这个子文件夹,因为我要当前文件夹作为该软件的database文件夹!!!
然后运行命令!
我在SOAPfuse-v1.27文件下面运行:
perl ../SOAPfuse-v1.27/source/SOAPfuse-S00-Generate_SOAPfuse_database.pl \
-wg raw/hg19.fa -gtf raw/Homo_sapiens.GRCh37.75.gtf.gz -cbd raw/cytoBand.txt.gz -gf raw/HGNC_Gene_Family_dataset \
-rft raw/HumanRef_refseg_symbols_relationship.list \
-sd ../SOAPfuse-v1.27 -dd ./
这一步耗时很长,4~6小时,创造了transcript.fa和gene.fa,然后还对他们建立bwa和soap的index,所以有点慢!
构建成功会有提示:
Congratulations!You have constructed SOAPfuse database files successfully.These database files are all stored in directory you supplied:/home/jmzeng/biosoft/SOAPfuse/db_v1.27/They are all generated based on public data files you supplied:whole_genome_fasta_file: /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/hg19.fagtf_annotation_file: /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/Homo_sapiens.GRCh37.75.gtf.gzChr_Bandregion_file: /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/cytoBand.txt.gzHGNC_gene_family_file: /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/HGNC_Gene_Family_datasetgtf_segname2refseg_list: /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/HumanRef_refseg_symbols_relationship.list
这些目录很重要,接下来制作配置文件会用得着!
To use these database files, just set the 'DB_db_dir' in config file as belowed:
DB_db_dir = /home/jmzeng/biosoft/SOAPfuse/db_v1.27
配置文件需要修改下面5个
DB_db_dir = /DATABASE_DIR/PG_pg_dir = /TOOL_DIR/source/binPS_ps_dir = /TOOL_DIR/sourcePD_all_out = /out_directory/PA_all_fq_postfix = PostFix
其实你仔细阅读了说明书,你就知道该修改成什么样子了!
最后制作sample list文件
我这里只有一个sample,所以文件就一句话即可
####http://soap.genomics.org.cn/soapfuse.html
test test test 100
所以我的有下面两个文件,都是为了顺应作者的需求我才搞了test/test/test这么无聊的东西!!!
/home/jmzeng/test_for_soapfuse/test/test/test_1.fq.gz
/home/jmzeng/test_for_soapfuse/test/test/test_2.fq.gz
如果你有多个sample需要一起运行,你就要仔细读作者的readme了,它把这个配置文件搞得特别复杂!!!
三,运行命令
如果文件都准备好了,运行命令非常简单!!
perl SOAPfuse-RUN.pl -c <config_file> -fd <WHOLE_SEQ-DATA_DIR> -l <sample_list> -o <out_directory> [Options]
运行的非常慢!!!
因为需要重新比对,知道
四,数据结果解读