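A bioinformatics beginner's notes on everyday problems: installing BUSCO with Conda, step by step, and the issues encountered

This post walks through creating a new conda environment with Python 3.7 and installing BUSCO, augustus, hmmer and related tools in it, including fixing a conda version problem and testing the installation. It also covers configuring the conda environment variables and installing BUSCO 5.2.2, the latest version at the time of writing.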

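Contents

Preface

I. Installing BUSCO

II. Installation steps

1. Create a new conda environment with Python 3.7

2. Install augustus, hmmer, busco and biopython 1.77 in the busco-py3.7 environment

3. Test

4. Install BUSCO 5.2.2 (the latest version at the time of writing)

5. Configure the conda environment variables

III. References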



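Preface

BUSCO - Benchmarking Universal Single-Copy Orthologs
A tool written in Python for assessing the quality of transcriptome and genome assemblies.
Closely related species always share some conserved sequences. BUSCO compares these conserved sequences against an assembly and checks whether the assembly contains them, reporting each one as present in a single copy, present in multiple copies, partial, or missing. I mainly use it to search against the database and obtain single-copy gene families.

Author: ayunga
Link: https://www.jianshu.com/p/7d32d01e7a02
Source: Jianshu
Copyright belongs to the author. For commercial reuse, contact the author for permission; for non-commercial reuse, credit the source.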



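I. Installing BUSCO

Install BUSCO with conda.

II. Installation steps

1. Create a new conda environment with Python 3.7

The commands are as follows:

conda create -n busco-py3.7 python=3.7
conda activate busco-py3.7
# If "conda activate busco-py3.7" fails to activate the environment, use: source activate busco-py3.7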
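
To confirm that activation actually took effect, one option is to look at the CONDA_DEFAULT_ENV variable, which conda sets to the name of the active environment. The sketch below is a minimal check under that assumption; the helper name env_is_active is made up for illustration.

```shell
# Succeed when the named conda environment is the active one.
# CONDA_DEFAULT_ENV is set by conda when an environment is activated.
env_is_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if env_is_active "busco-py3.7"; then
    # Also show which Python the environment provides
    echo "busco-py3.7 is active: $(python --version 2>&1)"
else
    echo "busco-py3.7 is not active (current: ${CONDA_DEFAULT_ENV:-none})"
fi
```

If the second message appears after running the activation command, the shell has probably not been initialised for conda yet, which is what step 5 addresses.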

2. Install augustus, hmmer, busco and biopython 1.77 in the busco-py3.7 environment

The commands are as follows:

conda install -c bioconda augustus
conda install -c bioconda hmmer
conda install -c bioconda busco
conda install -c bioconda biopython=1.77
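
As an alternative to four separate calls, the same packages can be requested in a single conda install so that the solver resolves them together. This is a sketch, not the original author's method: adding the conda-forge channel mirrors the command used in step 4, and DRY_RUN is a hypothetical switch that defaults to printing the command instead of running it.

```shell
# Packages from this step; biopython stays pinned to 1.77 as in the original commands.
PKGS="augustus hmmer busco biopython=1.77"
INSTALL_CMD="conda install -y -c conda-forge -c bioconda $PKGS"

# DRY_RUN (hypothetical) defaults to 1, which only prints the command.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$INSTALL_CMD"
else
    # Word splitting is intentional here: no argument contains spaces.
    $INSTALL_CMD
fi
```

Run it with DRY_RUN=0 inside the activated busco-py3.7 environment to perform the installation.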

Installing augustus failed with the following messages:

conda install -c bioconda augustus
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.

Cause: checking the version with conda -V showed that conda was too old. Updating conda fixed it:

conda update -n base conda
conda update --all
# Run the update twice
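
Before updating, it can help to check whether the installed conda really is older than some minimum. The sketch below compares version strings with sort -V (version sort, available in GNU coreutils). MIN_VERSION is a hypothetical threshold chosen only for illustration, since the original notes do not say which version was too old.

```shell
# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical threshold; adjust to whatever your tools require.
MIN_VERSION="4.9.0"

if command -v conda >/dev/null 2>&1; then
    # "conda -V" prints e.g. "conda 4.10.3"; keep the second field
    current="$(conda -V 2>&1 | awk '{print $2}')"
    if version_ge "$current" "$MIN_VERSION"; then
        echo "conda $current is recent enough"
    else
        echo "conda $current is older than $MIN_VERSION; consider: conda update -n base conda"
    fi
else
    echo "conda not found on PATH"
fi
```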

3. Test

busco -h
usage: busco -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]

Welcome to BUSCO 5.2.2: the Benchmarking Universal Single-Copy Ortholog assessment tool.
For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide. Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO

optional arguments:
  -i SEQUENCE_FILE, --in SEQUENCE_FILE
                        Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set. Also possible to use a path to a directory containing multiple input files.
  -o OUTPUT, --out OUTPUT
                        Give your analysis run a recognisable short name. Output folders and files will be labelled with this name. WARNING: do not provide a path
  -m MODE, --mode MODE  Specify which BUSCO analysis mode to run.
                        There are three valid modes:
                        - geno or genome, for genome assemblies (DNA)
                        - tran or transcriptome, for transcriptome assemblies (DNA)
                        - prot or proteins, for annotated gene sets (protein)
  -l LINEAGE, --lineage_dataset LINEAGE
                        Specify the name of the BUSCO lineage to be used.
  --augustus            Use augustus gene predictor for eukaryote runs
  --augustus_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
                        Pass additional arguments to Augustus. All arguments should be contained within a single pair of quotation marks, separated by commas.
  --augustus_species AUGUSTUS_SPECIES
                        Specify a species for Augustus training.
  --auto-lineage        Run auto-lineage to find optimum lineage path
  --auto-lineage-euk    Run auto-placement just on eukaryote tree to find optimum lineage path
  --auto-lineage-prok   Run auto-lineage just on non-eukaryote trees to find optimum lineage path
  -c N, --cpu N         Specify the number (N=integer) of threads/cores to use.
  --config CONFIG_FILE  Provide a config file
  --datasets_version DATASETS_VERSION
                        Specify the version of BUSCO datasets, e.g. odb10
  --download [dataset [dataset ...]]
                        Download dataset. Possible values are a specific dataset name, "all", "prokaryota", "eukaryota", or "virus". If used together with other command line arguments, make sure to place this last.
  --download_base_url DOWNLOAD_BASE_URL
                        Set the url to the remote BUSCO dataset location
  --download_path DOWNLOAD_PATH
                        Specify local filepath for storing BUSCO dataset downloads
  -e N, --evalue N      E-value cutoff for BLAST searches. Allowed formats, 0.001 or 1e-03 (Default: 1e-03)
  -f, --force           Force rewriting of existing files. Must be used when output files with the provided name already exist.
  -h, --help            Show this help message and exit
  --limit N             How many candidate regions (contig or transcript) to consider per BUSCO (default: 3)
  --list-datasets       Print the list of available BUSCO datasets
  --long                Optimization Augustus self-training mode (Default: Off); adds considerably to the run time, but can improve results for some non-model organisms
  --metaeuk_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
                        Pass additional arguments to Metaeuk for the first run. All arguments should be contained within a single pair of quotation marks, separated by commas.
  --metaeuk_rerun_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
                        Pass additional arguments to Metaeuk for the second run. All arguments should be contained within a single pair of quotation marks, separated by commas.
  --offline             To indicate that BUSCO cannot attempt to download files
  --out_path OUTPUT_PATH
                        Optional location for results folder, excluding results folder name. Default is current working directory.
  -q, --quiet           Disable the info logs, displays only errors
  -r, --restart         Continue a run that had already partially completed.
  --tar                 Compress some subdirectories with many files to save space
  --update-data         Download and replace with last versions all lineages datasets and files necessary to their automated selection
  -v, --version         Show this version and exit
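
Putting the options from the usage line together, a genome-mode run might look like the sketch below. The input file, run name and thread count are hypothetical; eukaryota_odb10 is only an example lineage, and busco --list-datasets shows the ones available. Because the help text warns that -o must not be a path, the sketch checks the run name first and prints the command rather than executing it.

```shell
# Reject empty names and names containing a slash, since -o must not be a path.
valid_run_name() {
    case "$1" in
        ""|*/*) return 1 ;;
        *)      return 0 ;;
    esac
}

INPUT="assembly.fasta"      # hypothetical input FASTA
LINEAGE="eukaryota_odb10"   # example lineage dataset
RUN_NAME="busco_test"       # hypothetical run name
THREADS=4                   # hypothetical thread count

if valid_run_name "$RUN_NAME"; then
    # Printed, not executed, so the sketch is safe to paste
    echo "busco -i $INPUT -l $LINEAGE -o $RUN_NAME -m genome -c $THREADS"
else
    echo "invalid run name: $RUN_NAME" >&2
fi
```

To place results somewhere other than the current directory, the help text points to --out_path rather than a path in -o.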

4. Install BUSCO 5.2.2 (the latest version at the time of writing)

conda create -n busco5.2.2 -c conda-forge -c bioconda busco=5.2.2
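
After creating the environment, it still has to be activated (conda activate busco5.2.2) before busco is on the PATH. The sketch below then checks the installed version. It assumes that busco --version prints a line of the form "BUSCO 5.2.2", matching the banner in the help output above; the helper name busco_version_of is made up for illustration.

```shell
# Pull the version number out of a string such as "BUSCO 5.2.2"
# (assumed output shape of "busco --version").
busco_version_of() {
    printf '%s\n' "$1" | awk 'NR == 1 { print $NF }'
}

EXPECTED="5.2.2"

if command -v busco >/dev/null 2>&1; then
    found="$(busco_version_of "$(busco --version 2>&1)")"
    if [ "$found" = "$EXPECTED" ]; then
        echo "BUSCO $found is installed"
    else
        echo "expected BUSCO $EXPECTED but found: $found"
    fi
else
    echo "busco not found on PATH; run 'conda activate busco5.2.2' first"
fi
```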

5. Configure the conda environment variables

vi ~/.bashrc
# Add the following line to the file
. /mnt/RAID-5/MD0/luoky/mambaforge/etc/profile.d/conda.sh
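
Instead of editing the file by hand, the line can be appended from the command line. The sketch below adds a line only if that exact line is not already present, so running it repeatedly does not create duplicates. The helper name append_once is made up for illustration, and the conda.sh path must be replaced with the one from your own installation.

```shell
# Append a line to a file only if that exact line is not already there.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x matches whole lines, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example call (commented out so that pasting the sketch changes nothing):
# append_once ". /path/to/your/conda/etc/profile.d/conda.sh" "$HOME/.bashrc"
```

After the line is in place, open a new shell or run source ~/.bashrc so that conda activate becomes available.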

III. References

User guide BUSCO v5.5.0

Installing BUSCO with Conda - Jianshu

Index of /v4/data/lineages/ (BUSCO dataset download location)
