Downloading data from NCBI
One of the most important steps in genome analysis is gathering the data required for downstream research. This sometimes means obtaining assembled reference genomes (mostly bacterial) so that we can verify that the classifiers we trained, or the bins we detected, are correct and useful. Such verification is often done with a BLAST search against the candidate reference genomes. If you expect to run many search queries, it is convenient to set up your own BLAST database in advance. NCBI Web BLAST may not be a viable option for a long-running project with many experiments.
In this article, we will see how to download the set of all available bacterial reference genomes (assemblies) from either the GenBank or the RefSeq database. The process is not quite straightforward, hence this article dedicated to the task.
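To make the task concrete, here is a minimal Python sketch of one possible way to turn an NCBI assembly summary into a list of download links. It is not necessarily the method this article follows. It assumes the tab-separated `assembly_summary.txt` layout that NCBI publishes under its genomes directory (for example `genomes/refseq/bacteria/`), with a `#`-prefixed header row containing the `assembly_level` and `ftp_path` columns, and it assumes that each assembly directory holds a file named `<directory name>_genomic.fna.gz`. The function name `genomic_fasta_urls` and the sample rows are made up for illustration.

```python
def genomic_fasta_urls(summary_text, level="Complete Genome"):
    """Return genomic FASTA URLs for assemblies at the given assembly level.

    summary_text is the content of an assembly_summary.txt file (assumed layout:
    comment lines start with '#', and one of them is the tab-separated header).
    """
    header = None
    urls = []
    for line in summary_text.splitlines():
        if line.startswith("#"):
            # The header row is a comment line that names the columns.
            fields = line.lstrip("#").strip().split("\t")
            if "ftp_path" in fields:
                header = fields
            continue
        if header is None or not line.strip():
            continue
        row = dict(zip(header, line.split("\t")))
        path = row.get("ftp_path", "")
        # Skip other assembly levels and rows without a usable path.
        if row.get("assembly_level") != level or path in ("", "na"):
            continue
        # Assumed naming rule: the file is prefixed with the directory name.
        name = path.rstrip("/").rsplit("/", 1)[-1]
        urls.append(f"{path.rstrip('/')}/{name}_genomic.fna.gz")
    return urls


# Tiny made-up sample with only the columns used above (not real accessions).
SAMPLE = (
    "#   See the README for column descriptions\n"
    "# assembly_accession\tassembly_level\tftp_path\n"
    "GCF_000000001.1\tComplete Genome\t"
    "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/000/001/GCF_000000001.1_ASM1v1\n"
    "GCF_000000002.1\tContig\tna\n"
)
```

Calling `genomic_fasta_urls(SAMPLE)` yields a single URL for the first row. The second row is dropped because its assembly level does not match and its path is `na`. Looking columns up by header name rather than by position keeps the sketch working if columns are added to the file.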