SRA Toolkit (downloading data from NCBI)
References: SRA Toolkit - prefetch for fast downloading of NCBI SRA data - Jianshu (jianshu.com)
Download On Demand · ncbi/sra-tools Wiki (github.com)
Installation
$ wget https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.9.6/sratoolkit.2.9.6-ubuntu64.tar.gz
$ tar zxvf sratoolkit.2.9.6-ubuntu64.tar.gz
$ cd sratoolkit.2.9.6-ubuntu64
# Add the toolkit's bin directory to PATH
# Replace YOUR_PATH in the quoted string with the actual absolute path
$ echo 'export PATH=$PATH:YOUR_PATH/sratoolkit.2.9.6-ubuntu64/bin' >> ~/.bash_profile
$ source ~/.bash_profile
# Check that the installation succeeded
$ prefetch -V
prefetch : 2.9.6
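The version check above only covers prefetch. As a minimal sketch, the helper below reports whether each tool used in this note can be found on PATH. The function name check_tools is made up for illustration and is not part of SRA Toolkit.

```shell
# Report whether each named tool can be found on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the two SRA Toolkit commands used in this note.
check_tools prefetch fastq-dump || echo "Some tools are not on PATH; re-check ~/.bash_profile"
```

If a tool is reported missing right after installation, the usual cause is that ~/.bash_profile has not been sourced in the current shell.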
Usage
Taking the download of SRR8956151 as an example:
(biosoft) zhangziheng@jsj305-2:~/software$ prefetch SRR8956151
2023-08-07T07:29:59 prefetch.3.0.6: Current preference is set to retrieve SRA Normalized Format files with full base quality scores.
2023-08-07T07:30:00 prefetch.3.0.6: 1) Downloading 'SRR8956151'...
2023-08-07T07:30:00 prefetch.3.0.6: SRA Normalized Format file is being retrieved, if this is different from your preference, it may be due to current file availability.
2023-08-07T07:30:00 prefetch.3.0.6: Downloading via HTTPS...
2023-08-07T07:36:00 prefetch.3.0.6: HTTPS download succeed
2023-08-07T07:36:02 prefetch.3.0.6: 'SRR8956151' is valid
2023-08-07T07:36:02 prefetch.3.0.6: 1) 'SRR8956151' was downloaded successfully
2023-08-07T07:36:02 prefetch.3.0.6: 'SRR8956151' has 0 unresolved dependencies
When the download finishes, a SRR8956151 folder is created in the current directory, containing the file SRR8956151.sra.
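When several runs are needed, the same prefetch call can be repeated over a list. The sketch below assumes a plain-text file with one accession per line; the function name download_list and the file name accessions.txt are hypothetical, and only the basic `prefetch ACCESSION` form shown above is used.

```shell
# Download every accession listed in a text file, one per line.
# Blank lines and lines starting with '#' are skipped.
# Returns non-zero if any download fails, but keeps going through the list.
download_list() {
    list_file=$1
    failed=0
    while IFS= read -r acc || [ -n "$acc" ]; do
        case $acc in
            ''|'#'*) continue ;;
        esac
        # Redirect stdin so the command cannot consume the rest of the list.
        if ! prefetch "$acc" </dev/null; then
            echo "failed: $acc" >&2
            failed=1
        fi
    done < "$list_file"
    return "$failed"
}

# Example (accessions.txt is a hypothetical file name):
# download_list accessions.txt
```

Each accession should end up in its own folder, as with the single download above.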
Convert to a fastq file:
(biosoft) zhangziheng@jsj305-2:~/software$ fastq-dump SRR8956151
Read 14532081 spots for SRR8956151
Written 14532081 spots for SRR8956151
The file SRR8956151.fastq is then generated in the same directory.
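The plain call above writes a single uncompressed file. For paired-end runs it is common to split the mates into separate files and compress the output. The sketch below wraps fastq-dump with its `--split-3`, `--gzip` and `--outdir` options. The function name to_fastq, the FASTQ_DUMP variable and the default output folder fastq are assumptions for illustration; whether SRR8956151 is paired-end is not stated in this note.

```shell
# Hypothetical override: point FASTQ_DUMP at a specific binary if several
# SRA Toolkit versions are installed; defaults to whatever is on PATH.
FASTQ_DUMP=${FASTQ_DUMP:-fastq-dump}

# Convert one accession to gzip-compressed FASTQ in a chosen directory.
# --split-3 writes paired reads to <acc>_1.fastq.gz and <acc>_2.fastq.gz,
# and any unpaired reads to <acc>.fastq.gz.
to_fastq() {
    acc=$1
    outdir=${2:-fastq}
    mkdir -p "$outdir" || return 1
    "$FASTQ_DUMP" --split-3 --gzip --outdir "$outdir" "$acc"
}

# Example (run from the directory where prefetch created SRR8956151/):
# to_fastq SRR8956151 fastq
```

For a single-end run, `--split-3` should still produce one file, so the wrapper can be used without knowing the layout in advance.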
Installing Aspera
(biosoft) zhangziheng@jsj305-2:~/software$ wget https://download.asperasoft.com/download/sw/connect/3.9.1/ibm-aspera-connect-3.9.1.171801-linux-g2.12-64.tar.gz
--2023-08-07 11:38:37-- https://download.asperasoft.com/download/sw/connect/3.9.1/ibm-aspera-connect-3.9.1.171801-linux-g2.12-64.tar.gz
Resolving download.asperasoft.com (download.asperasoft.com)... 184.72.56.59
Connecting to download.asperasoft.com (download.asperasoft.com)|184.72.56.59|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 37588991 (36M) [application/x-gzip]
Saving to: ‘ibm-aspera-connect-3.9.1.171801-linux-g2.12-64.tar.gz’
ibm-aspera-connect-3.9.1.171801-linux-g2.12-6 100%[================================================================================================>] 35.85M 4.10MB/s in 8.8s
2023-08-07 11:38:48 (4.07 MB/s) - ‘ibm-aspera-connect-3.9.1.171801-linux-g2.12-64.tar.gz’ saved [37588991/37588991]
(biosoft) zhangziheng@jsj305-2:~/software$ tar zxvf ibm-aspera-connect-3.9.1.171801-linux-g2.12-64.tar.gz
ibm-aspera-connect-3.9.1.171801-linux-g2.12-64.sh
(biosoft) zhangziheng@jsj305-2:~/software$ bash ibm-aspera-connect-3.9.1.171801-linux-g2.12-64.sh
Installing IBM Aspera Connect
Deploying IBM Aspera Connect (/home/zhangziheng/.aspera/connect) for the current user only.
Install complete.
(biosoft) zhangziheng@jsj305-2:~/software$ echo 'export PATH=$PATH:~/.aspera/connect/bin' >> ~/.bash_profile
(biosoft) zhangziheng@jsj305-2:~/software$ source ~/.bash_profile
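Both installs in this note append an export line to ~/.bash_profile with `>>`, so repeating the steps would add the same line again. As a small sketch, the helper below appends a line only when that exact line is not already present. The name append_once is made up for illustration.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    line=$1
    file=$2
    touch "$file" || return 1
    # -x: match whole lines, -F: treat the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example: the Aspera PATH line from above, safe to run more than once.
# append_once 'export PATH=$PATH:$HOME/.aspera/connect/bin' ~/.bash_profile
```

The single quotes keep $PATH and $HOME unexpanded in the profile, so they are resolved each time the profile is sourced.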