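GitHub link: https://github.com/sanger-pathogens/Roary
Roary tutorial link: http://sanger-pathogens.github.io/Roary/
1. Introduction to Roary
Roary is a fairly fast tool for pan-genome analysis. It takes GFF files as input and is usually used together with Prokka.
2. Installing Roary
I prefer installing with conda, mainly because it is convenient.
The conda installation commands given on GitHub: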
conda config --add channels r
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
conda install roary
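With these commands the download was very slow and kept failing.
conda install -c "bioconda/label/cf201901" roary
This command installed it successfully.
Note: when using conda on Linux, remember to work inside the conda environment (activate it before running roary).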
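A quick way to confirm that the environment is active is to check whether the roary executable can be found. The following is a minimal sketch only; the function name check_roary_env is hypothetical, and it relies on the -a option (check dependencies and print versions) listed in the usage text in section 4.

```shell
# Check that the roary executable is reachable in the current shell,
# which is normally only true after the conda environment is activated.
check_roary_env() {
    if command -v roary >/dev/null 2>&1; then
        roary -a    # -a: check dependencies and print versions
    else
        echo "roary not found; activate the conda environment first" >&2
        return 1
    fi
}
```

Call check_roary_env after activating the environment that roary was installed into. A non-zero return status means roary is not on the PATH.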
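3. Input data for Roary
Roary is usually used together with Prokka.
Roary takes GFF files as input. GFF files downloaded from NCBI, however, contain only the annotation and not the sequence, so the genomes usually need to be annotated with Prokka first.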
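Because Roary needs the sequence inside each GFF file, it can help to check the files before a run. The sketch below assumes the sequence is appended after a "##FASTA" line, as in Prokka output; the function names has_fasta_section and check_gffs are hypothetical.

```shell
# Succeed (status 0) if the GFF file contains a "##FASTA" line, i.e. the
# sequence is embedded after the annotation (assumed Prokka-style layout).
has_fasta_section() {
    grep -q '^##FASTA' "$1"
}

# Print every given GFF file that lacks an embedded sequence.
# Returns 1 if at least one such file was found, 0 otherwise.
check_gffs() {
    rc=0
    for gff in "$@"; do
        if ! has_fasta_section "$gff"; then
            echo "no ##FASTA section: $gff"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, check_gffs *.gff lists the files that would need to be annotated with Prokka before they can be passed to roary.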
4. Roary commands
Usage: roary [options] *.gff
Options: -p INT number of threads [1]
-o STR clusters output filename [clustered_proteins]
-f STR output directory [.]
-e create a multiFASTA alignment of core genes using PRANK
-n fast core gene alignment with MAFFT, use with -e
-i minimum percentage identity for blastp [95]
-cd FLOAT percentage of isolates a gene must be in to be core [99]
-qc generate QC report with Kraken
-k STR path to Kraken database for QC, use with -qc
-a check dependancies and print versions
-b STR blastp executable [blastp]
-c STR mcl executable [mcl]
-d STR mcxdeblast executable [mcxdeblast]
-g INT maximum number of clusters [50000]
-m STR makeblastdb executable [makeblastdb]
-r create R plots, requires R and ggplot2
-s dont split paralogs
-t INT translation table [11]
-ap allow paralogs in core alignment
-z dont delete intermediate files
-v verbose output to STDOUT
-w print version and exit
-y add gene inference information to spreadsheet, doesnt work with -e
-iv STR Change the MCL inflation value [1.5]
-h this help message
Example: Quickly generate a core gene alignment using 8 threads
roary -e --mafft -p 8 *.gff
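Note: Roary needs at least two input sequences (GFF files) to run. Also pay attention to where an error is reported, because it is easy to miss.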
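The two-file minimum can be checked before roary is started, so the failure shows up immediately with a clear message. This is a sketch under the assumption that input files end in .gff; the function names count_gff_args and run_roary_checked are hypothetical wrappers, not part of Roary.

```shell
# Count how many of the given arguments look like GFF files (*.gff).
count_gff_args() {
    n=0
    for arg in "$@"; do
        case "$arg" in
            *.gff) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Run roary with the given arguments only if at least two GFF files
# are among them; otherwise print a message to stderr and return 1.
run_roary_checked() {
    n=$(count_gff_args "$@")
    if [ "$n" -lt 2 ]; then
        echo "roary needs at least 2 GFF files, got $n" >&2
        return 1
    fi
    roary "$@"
}
```

Usage mirrors roary itself, for example run_roary_checked -e --mafft -p 8 *.gff.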
Default usage – create a pan genome without a core alignment
roary *.gff
Quickly generate a core gene alignment using 8 threads:
roary -e --mafft -p 8 *.gff
Save results to a different directory
roary -f output_dir *.gff
Change the minimum blastp percentage identity. It is not advised to go below 90% unless you know what you're doing.
roary -i 90 *.gff
Run a QC check to see if all the samples are what you think they are
roary -qc -k /path/to/kraken/db *.gff
Don't split clusters containing paralogs
roary -s *.gff
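These basic commands are all covered in the official documentation.
Official link: http://sanger-pathogens.github.io/Roary/
5. Output files
The output files are described in detail in the official documentation.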