[lijing@master ~]$ cd lijing202110
[lijing@master lijing202110]$ mkdir kraken2
[lijing@master lijing202110]$ cd kraken2
[lijing@master kraken2]$ conda install -y kraken2
## Download the source with wget and unzip it
[lijing@master kraken2]$ wget https://github.com/DerrickWood/kraken2/archive/master.zip
[lijing@master kraken2]$ unzip master.zip
[lijing@master kraken2]$ ll
total 224
drwxrwxr-x 2 lijing lijing 10 Oct 19 11:06 kraken2
drwxrwxr-x 6 lijing lijing 241 May 10 14:11 kraken2-master
-rw-rw-r-- 1 lijing lijing 227378 Oct 19 11:01 master.zip
[lijing@master kraken2]$ cd kraken2
[lijing@master kraken2]$ ll
total 0
[lijing@master kraken2]$ cd ..
[lijing@master kraken2]$ cd kraken2-master
[lijing@master kraken2-master]$ ll
total 32
-rw-rw-r-- 1 lijing lijing 5786 May 10 14:11 CHANGELOG.md
-rw-rw-r-- 1 lijing lijing 618 May 10 14:11 CMakeLists.txt
drwxrwxr-x 2 lijing lijing 310 May 10 14:11 data
drwxrwxr-x 2 lijing lijing 165 May 10 14:11 docs
-rwxr-xr-x 1 lijing lijing 1265 May 10 14:11 install_kraken2.sh
-rw-rw-r-- 1 lijing lijing 1084 May 10 14:11 LICENSE
-rw-rw-r-- 1 lijing lijing 2258 May 10 14:11 README.md
drwxrwxr-x 2 lijing lijing 4096 May 10 14:11 scripts
drwxrwxr-x 2 lijing lijing 4096 May 10 14:11 src
[lijing@master kraken2-master]$ ./install_kraken2.sh /home/lijing/lijing202110/kraken2/kraken2
Kraken 2 installation complete.
To make things easier for you, you may want to copy/symlink the following
files into a directory in your PATH:
/home/lijing/lijing202110/kraken2/kraken2/kraken2
/home/lijing/lijing202110/kraken2/kraken2/kraken2-build
/home/lijing/lijing202110/kraken2/kraken2/kraken2-inspect
[lijing@master kraken2-master]$ ll
Configure the PATH environment variable:
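The installer output above suggests putting the Kraken 2 programs on the PATH. A minimal sketch of one way to do that, assuming the install directory used above; the `KRAKEN2_DIR` variable name is only illustrative:

```shell
# Assumption: KRAKEN2_DIR is the directory that was passed to install_kraken2.sh.
KRAKEN2_DIR="${KRAKEN2_DIR:-/home/lijing/lijing202110/kraken2/kraken2}"

# Prepend the directory to PATH for the current shell session only.
export PATH="$KRAKEN2_DIR:$PATH"

# To make the change permanent, the same export line can be appended to ~/.bashrc
# (not done here):
#   echo 'export PATH="/home/lijing/lijing202110/kraken2/kraken2:$PATH"' >> ~/.bashrc

# Check whether the shell can now find the program.
command -v kraken2 || echo "kraken2 not found in $KRAKEN2_DIR"
```

Symlinking the three programs into a directory that is already on the PATH, as the installer message mentions, would work equally well.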
Installation complete:
Version:
Kraken2 notes:
/home/lijing/lijing202110/kraken2/kraken2
/home/lijing/lijing202110/kraken2/db/kraken2/20211018
/home/lijing/lijing202110/kraken2/db/kraken2/20211019
Create the standard database:
[lijing@master kraken2]$ mkdir -p /home/lijing/lijing202110/kraken2/db/kraken2/20211018
[lijing@master 20211018]$ kraken2-build --standard --threads 24 --db /home/lijing/lijing202110/kraken2/db/kraken2/20211018
This still failed:
Rebuild the standard database:
The output showed: rsync_from_ncbi.pl: unexpected FTP path (new server?) for https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/762/265/GCF_000762265.1_ASM76226v1
Fix for the problem above: https://github.com/DerrickWood/kraken2/issues/508
Download in the background:
[lijing@master kraken2]$ nohup kraken2-build --use-ftp --standard --threads 24 --db /home/lijing/lijing202110/kraken2/db/kraken2/standard &
Use jobs -l to check on the job from the same terminal session.
After the terminal has been closed, use ps and top to check on it.
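A small sketch of the background-job pattern described above, with the output sent to a named log and the PID saved so the job can be found again later. `sleep 30` is only a placeholder for the real kraken2-build command, and `LOG_FILE` and `JOB_PID` are illustrative names:

```shell
# Placeholder workload; replace "sleep 30" with the real kraken2-build command.
LOG_FILE="$(mktemp)"                 # hypothetical log location
nohup sleep 30 > "$LOG_FILE" 2>&1 &
JOB_PID=$!                           # PID of the background process

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$JOB_PID" 2>/dev/null; then
    echo "job $JOB_PID is still running, log: $LOG_FILE"
fi

# Interactive equivalents: ps -p "$JOB_PID"   or   tail -f "$LOG_FILE"
```

Saving the PID matters because jobs -l only lists jobs started from the current shell, while a PID can be checked from any later session.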
A log file appears in the directory where the command was run:
There was still a problem:
gzip: nucl_gb.accession2taxid.gz: invalid compressed data--format violated
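An error like this usually points to a truncated or corrupted download. As a hedged sketch, `gzip -t` can test an archive before a build is attempted; the `check_gz` helper and the file names below are made up for the demonstration and are not part of Kraken 2:

```shell
# check_gz FILE: succeeds only if FILE is a complete, valid gzip archive.
check_gz() {
    gzip -t "$1" 2>/dev/null
}

# Demonstration on throwaway files (hypothetical names, not the real taxonomy file).
WORK_DIR="$(mktemp -d)"
printf 'accession\ttaxid\n' | gzip -c > "$WORK_DIR/good.gz"
# Simulate an interrupted download by keeping only the first 10 bytes.
head -c 10 "$WORK_DIR/good.gz" > "$WORK_DIR/truncated.gz"

for f in "$WORK_DIR/good.gz" "$WORK_DIR/truncated.gz"; do
    if check_gz "$f"; then
        echo "OK       $f"
    else
        echo "CORRUPT  $f (delete it and download again)"
    fi
done
```

Running the same check on the downloaded nucl_gb.accession2taxid.gz would show whether it needs to be fetched again.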
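1. Because of the problem above, build a private database:
/home/lijing/lijing202110/kraken2/db/kraken2/private
Download the taxonomy annotation:
First download the fungi library:
Viral:
Build the index:
2. Building the non-standard "most" database failed:
3. Build the standard database: download the archaea, bacteria, human, UniVec_Core (vector) and viral libraries separately, then build the index:
Result: because the standard database is larger, more sequences can be classified; about 2% were classified.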
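The library-by-library build described in the steps above could be scripted roughly as follows. This is a sketch, not the exact commands that were run: `DB_DIR`, `THREADS`, `DRY_RUN` and the `run` and `build_private_db` helpers are illustrative names, and by default the script only prints the kraken2-build commands instead of executing them.

```shell
# Assumptions: DB_DIR, THREADS and DRY_RUN are illustrative variables.
DB_DIR="${DB_DIR:-/home/lijing/lijing202110/kraken2/db/kraken2/private}"
THREADS="${THREADS:-24}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; set to 0 to run them

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_private_db() {
    # 1. taxonomy, 2. one download per library, 3. build the index
    run kraken2-build --download-taxonomy --db "$DB_DIR" || return 1
    for lib in fungi viral; do
        run kraken2-build --download-library "$lib" --db "$DB_DIR" || return 1
    done
    run kraken2-build --build --threads "$THREADS" --db "$DB_DIR"
}

build_private_db
```

For the standard-style build, the library list in the loop would be archaea bacteria human UniVec_Core viral. Adding --use-ftp to the download commands may help if the rsync error shown earlier comes back.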
Test a FASTA file (27.6M) with the private database (which contains viral and fungi):
[lijing@master private]$ kraken2 --db /home/lijing/lijing202110/kraken2/db/kraken2/private /home/lijing/lijing202110/20211019test-data/NY_test.fa
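Test a FASTQ file (45.8M) with the private database:
Result: because the database is small, only a few sequences were classified.
Result: the run writes the classified sequences, the unclassified sequences, and a .kraken file.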
[lijing@master ~]$ kraken2 --db /home/lijing/lijing202110/kraken2/db/kraken2/private --fastq-input /home/lijing/lijing202110/20211019test-data/NY_test.fq --classified-out /home/lijing/lijing202110/20211019test-data/NY_test/output_classify --unclassified-out /home/lijing/lijing202110/20211019test-data/NY_test/output_unclassify> /home/lijing/lijing202110/20211019test-data/NY_test/NY_test.kraken
Output:
Output report: three files
Test the raw raw_fastq.gz file with the standard database:
(base) [lijing@master ~]$ kraken2 --threads 10 --db /home/lijing/lijing202110/kraken2/db/kraken2/standard --fastq-input --gzip-compressed /home/lijing/lijing202110/20211019test-data/NY_rawdata.fq.gz --classified-out /home/lijing/lijing202110/20211019test-data/NY_raw/output_classify --unclassified-out /home/lijing/lijing202110/20211019test-data/NY_raw/output_unclassify> /home/lijing/lijing202110/20211019test-data/NY_raw/NY_raw.kraken
Result: 77% of the reads were classified.
Report output:
(base) [lijing@master NY_raw]$ cat /home/lijing/lijing202110/20211019test-data/NY_raw/NY_raw.kraken | head -n 20
or (base) [lijing@master NY_raw]$ sed -n '1,20p' /home/lijing/lijing202110/20211019test-data/NY_raw/NY_raw.kraken
Both commands show the first 20 lines.
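Besides looking at the first lines, the classified fraction can be recomputed from the .kraken file itself, because the first tab-separated column of Kraken's standard output is C (classified) or U (unclassified). A minimal sketch on a made-up four-read sample; `summarize_kraken` is a hypothetical helper, and the read names and taxids are invented:

```shell
# summarize_kraken FILE: print classified / unclassified counts and percentage.
summarize_kraken() {
    awk -F'\t' '
        $1 == "C" { c++ }
        $1 == "U" { u++ }
        END {
            total = c + u
            if (total == 0) { print "no reads found"; exit 1 }
            printf "classified=%d unclassified=%d percent_classified=%.2f\n", c, u, 100 * c / total
        }
    ' "$1"
}

# Tiny made-up sample (hypothetical read names and taxids).
SAMPLE="$(mktemp)"
printf 'C\tread1\t10239\t150\t10239:116\n'  > "$SAMPLE"
printf 'U\tread2\t0\t150\t0:116\n'         >> "$SAMPLE"
printf 'C\tread3\t4751\t150\t4751:116\n'   >> "$SAMPLE"
printf 'U\tread4\t0\t150\t0:116\n'         >> "$SAMPLE"

summarize_kraken "$SAMPLE"
# prints: classified=2 unclassified=2 percent_classified=50.00
```

Pointing the function at NY_raw.kraken instead of the sample file should give a figure comparable to the percentage reported above.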
Other notes:
Uploading the test files:
Uploading with psftp failed:
Upload with FileZilla instead:
Open it and the files can be uploaded:
Convert FASTQ to FASTA:
awk '{if(NR%4 == 1){print ">" substr($0, 2)}}{if(NR%4 == 2){print}}' fastq > fasta
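A worked example of the same conversion on a two-read file, wrapped in a small function. It assumes every FASTQ record is exactly four lines (no wrapped sequences); `fastq_to_fasta` and the file names are illustrative:

```shell
# fastq_to_fasta IN OUT: keep the header (line 1 of each 4-line record, with the
# leading "@" replaced by ">") and the sequence (line 2 of each record).
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }
         NR % 4 == 2 { print }' "$1" > "$2"
}

# Two made-up reads.
FQ_DIR="$(mktemp -d)"
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n' > "$FQ_DIR/in.fq"

fastq_to_fasta "$FQ_DIR/in.fq" "$FQ_DIR/out.fa"
cat "$FQ_DIR/out.fa"
# prints:
# >read1
# ACGT
# >read2
# GGCC
```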
Failure cases:
If the download is interrupted midway, the files apparently cannot be downloaded:
Inspect the database:
[lijing@master ~]$ kraken2-inspect --db /home/lijing/lijing202110/kraken2/db/kraken2/private | head -5
Appendix:
Closing notes:
https://ccb.jhu.edu/software/kraken2/index.shtml?t=downloads
Check how much disk space is used:
[lijing@master standard]$ du -bsh /home/lijing/lijing202110
The finished standard database takes 199G:
[lijing@master ~]$ du -bsh /home/lijing/lijing202110/kraken2/db/kraken2/standard
199G /home/lijing/lijing202110/kraken2/db/kraken2/standard
Kill a background job:
To terminate a job, run kill -9 followed by its process ID (PID):
[lijing@master ~]$ kill -9 106825
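kill -9 gives the process no chance to clean up, so a gentler variant is sometimes preferable. A sketch that tries SIGTERM first and falls back to SIGKILL only if the process is still there after a few seconds; `stop_job` is a hypothetical helper and `sleep 300` stands in for the real job:

```shell
# stop_job PID: ask politely with SIGTERM, fall back to SIGKILL after a short wait.
stop_job() {
    pid="$1"
    kill -TERM "$pid" 2>/dev/null || return 0    # process already gone
    for i in 1 2 3 4 5; do
        kill -0 "$pid" 2>/dev/null || return 0   # exited after SIGTERM
        sleep 1
    done
    kill -KILL "$pid" 2>/dev/null                # same effect as kill -9
}

# Demonstration with a placeholder background job.
sleep 300 &
DEMO_PID=$!
stop_job "$DEMO_PID"
```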
To delete the folder, use:
[lijing@master ~]$ rm -rf /home/lijing/lijing202110/kraken2/db/kraken2/standard/library