AUGUSTUS is a gene prediction program. It can be used as an ab initio ("from the beginning") program, meaning it bases its predictions purely on the sequence. AUGUSTUS can also incorporate hints on the gene structure from extrinsic sources such as EST, MS/MS, protein alignments and syntenic genomic alignments. Since version 3.0, AUGUSTUS can also predict genes simultaneously in several aligned genomes (see README-cgp.md).
main page
manual
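1. Installation
Install with Docker (our server already has AUGUSTUS installed this way)
git clone https://github.com/Gaius-Augustus/Augustus.git ## if the network blocks the clone, download the package manually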
cd Augustus
docker build -t augustus .
Install with Singularity
git clone https://github.com/Gaius-Augustus/Augustus.git
cd Augustus
singularity build augustus.sif Singularity.def
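Install with conda
conda create -n augustus
conda activate augustus ## activate the new environment so the package is not installed into base
conda install -c bioconda augustus ## note: this installation method produced errors for us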
2. Usage
2.1 File preparation
- A genome file in FASTA format, or any other nucleotide sequences, e.g. the wild tomato genome file stored at /data/wild_tomato.fa
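Before running a prediction it can help to confirm that the input really is a FASTA file with the expected number of sequences. The following is a minimal sketch, not part of AUGUSTUS itself; the function names count_fasta_records and list_fasta_ids are hypothetical helpers introduced here for illustration.

```shell
# Count the records in a FASTA file: every record starts with a '>' header line.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Print the sequence IDs: the first whitespace-delimited field of each
# header line, with the leading '>' removed.
list_fasta_ids() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1"
}

# Example usage with the path from this tutorial:
#   count_fasta_records /data/wild_tomato.fa
#   list_fasta_ids /data/wild_tomato.fa | head
```

If the record count is 0, the file is probably not in FASTA format and AUGUSTUS is unlikely to produce useful output from it.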
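2.2 Running
2.2.1 List the pre-trained species models and choose the most suitable one
docker run augustus augustus --species=help
The command prints the species for which trained models already exist, 9 of which are plants. This example identifies the protein-coding genes in a wild tomato genome. The closest relative of wild tomato with a model is Solanum lycopersicum, so the tomato model is chosen.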
Identifier | Species |
---|---|
arabidopsis | Arabidopsis thaliana |
chlamy2011 OR chlamydomonas | Chlamydomonas reinhardtii |
galdieria | Galdieria sulphuraria |
coyote_tobacco | Nicotiana attenuata |
rice | Oryza brachyantha |
tomato | Solanum lycopersicum |
cacao | Theobroma cacao |
wheat | Triticum aestivum |
maize OR maize5 | Zea mays |
2.2.2 Predict genes with augustus
docker run -v /data:/docker_data augustus augustus --species=tomato /docker_data/wild_tomato.fa > augustus.out 2>augustus.err
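- docker -v mounts a host directory into the container. The syntax is -v host_directory:container_directory. The genome file in this example is stored at /data/wild_tomato.fa, so /data is mounted, and the container directory is named /docker_data.
- --species=tomato selects the species model. This example uses tomato.
- Standard output is written to augustus.out and standard error to augustus.err.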
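The mapping between the host path and the container path is easy to get wrong, so here is a small sketch that derives the command from a host file path. It assumes the image is named augustus and the container mount point is /docker_data, as in the command above. The function name augustus_docker_cmd is hypothetical, and the function only prints the command instead of running it.

```shell
# Print the docker command for a genome file on the host.
#   $1 = species model, $2 = absolute host path to the FASTA file
# The body runs in a subshell, so the variables stay local to the function.
augustus_docker_cmd() (
    species=$1
    host_dir=$(dirname "$2")     # host directory to mount, e.g. /data
    file_name=$(basename "$2")   # file name as seen inside /docker_data
    echo "docker run -v ${host_dir}:/docker_data augustus augustus --species=${species} /docker_data/${file_name}"
)

# Example usage:
#   augustus_docker_cmd tomato /data/wild_tomato.fa
```

Redirections such as > augustus.out 2> augustus.err are applied by the host shell, so both files are created in the current host directory, not inside the container.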
3. Results
Once the # comment lines are removed, augustus.out is essentially a GFF file
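Stripping the comment lines can be sketched as below. This assumes the default AUGUSTUS output layout, in which comment lines start with # and feature rows are tab-separated with the feature type (gene, transcript, CDS, ...) in the third column. The function names augustus_to_gff and count_genes are hypothetical helpers, not AUGUSTUS tools.

```shell
# Drop comment lines (starting with '#') and blank lines;
# what remains are the tab-separated GFF feature rows.
augustus_to_gff() {
    grep -v '^#' "$1" | grep -v '^[[:space:]]*$'
}

# Count predicted genes: rows whose third column is exactly "gene".
count_genes() {
    augustus_to_gff "$1" | awk -F '\t' '$3 == "gene" { n++ } END { print n + 0 }'
}

# Example usage:
#   augustus_to_gff augustus.out > augustus.gff
#   count_genes augustus.out
```

Note that the comment lines also carry the predicted protein sequences, so keep the original augustus.out if those are needed later.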