COBRA: Installation and Usage

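This post introduces COBRA, a tool that improves the completeness and contiguity of viral genomes assembled from metagenomes. Using sequence analysis and coverage estimation tools such as bwa-mem2, samtools, and CoverM, COBRA improves the quality of viral genome assemblies.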


COBRA improves the completeness and contiguity of viral genomes assembled from metagenomes | Nature Microbiology

mkdir COBRA
cd COBRA
git clone https://github.com/linxingchen/cobra.git
cd cobra
python setup.py install
python cobra.py -h
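# Edit cobra.py:
# change `from Bio.SeqUtils import GC` to `from Bio.SeqUtils import gc_fraction`
# (recent Biopython releases removed GC; gc_fraction returns a fraction in 0-1 rather than a percentage, so any GC(...) calls in the script need the same update)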
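The import change can also be scripted instead of edited by hand. This is a minimal sketch, not part of COBRA: the helper name `patch_gc_import` is made up for illustration, and it is demonstrated on a throwaway file rather than the real cobra.py. It only rewrites the import line; any calls to GC(...) elsewhere in the file still have to be handled separately.

```shell
# Hypothetical helper: rewrite the Biopython GC import line in the given file.
# A backup copy with a .bak suffix is kept next to the original.
patch_gc_import() {
    local target="$1"
    sed -i.bak \
        's/^from Bio\.SeqUtils import GC[[:space:]]*$/from Bio.SeqUtils import gc_fraction/' \
        "$target"
}

# Demonstration on a temporary file, not on the real cobra.py.
demo_file="$(mktemp)"
printf 'import os\nfrom Bio.SeqUtils import GC\n' > "$demo_file"
patch_gc_import "$demo_file"
cat "$demo_file"
```

To apply it for real, the call would be `patch_gc_import cobra.py` from inside the cloned repository.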
# Dependencies
wget https://github.com/wwood/CoverM/releases/download/v0.6.1/coverm-x86_64-unknown-linux-musl-0.6.1.tar.gz
tar -xzvf coverm-x86_64-unknown-linux-musl-0.6.1.tar.gz
(full path)/coverm
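Before running the pipeline it can help to confirm that the tools are actually reachable. The sketch below is an illustration only: `missing_tools` is a made-up helper name, and it checks the PATH, whereas this post calls bwa-mem2 and coverm by full path, so adjust the names or paths to your own setup.

```shell
# Hypothetical helper: print each given command that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing="$(missing_tools bwa-mem2 samtools coverm)"
if [ -n "$missing" ]; then
    # Intentionally unquoted so the names print on one line.
    echo "Not on PATH:" $missing
else
    echo "All tools found."
fi
```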

Note that the latest version is now 1.2.3.

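# clean reads live in the starting directory
fp_path=$(pwd)
cd final_assembly
# full assemblies (contigs) live here
fasta_raw_path=$(pwd)
cd clean_TPM_Decontam_contig
cd Vir_result
# query contigs (*_split.fa) live here; outputs are written here too
# (fasta_out_path is not used below)
fasta_out_path=$(pwd)
software_path="/home/zhongpei/hard_disk_sda2/zhongpei/Software"

# assemblers (ruanjian = software; not used below) and sample-group prefixes (zubie = group)
ruanjian_list=("megahit" "idba-ud" "metaSPAdes")
zubie_list=("AF" "OS" "VS" "IC" "RC" "M" "P")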

# metaSPAdes assemblies: for each sample group, map reads, compute coverage, run COBRA
for zubie in "${zubie_list[@]}"; do
	for i in ${zubie}*_metaSPAdes_split.fa
	do
		# sample ID = file name without the _metaSPAdes_split.fa suffix
		num=${i%%_metaSPAdes_split.fa}
		# index the full assembly and map the clean paired-end reads to it
		${software_path}/bwa-mem2/bwa-mem2 index ${fasta_raw_path}/${num}_metaSPAdes_ALN_contigs_line_rn.fa
		${software_path}/bwa-mem2/bwa-mem2 mem -t 180 ${fasta_raw_path}/${num}_metaSPAdes_ALN_contigs_line_rn.fa ${fp_path}/${num}_clean_1.fastq ${fp_path}/${num}_clean_2.fastq > ${num}_metaSPAdes_aligen.sam
		# SAM -> BAM, then sort
		samtools view -Sb -@180 ${num}_metaSPAdes_aligen.sam > ${num}_metaSPAdes_aligen.bam
		samtools sort -@180 ${num}_metaSPAdes_aligen.bam -o ${num}_metaSPAdes_aligen_sort.bam
		# per-contig coverage; drop the header line so only data rows remain
		${software_path}/CoverM/coverm/coverm contig -b ${num}_metaSPAdes_aligen_sort.bam -p bwa-mem -t 180 --output-file ${num}_metaSPAdes_CoverM.txt
		sed -i '1d' ${num}_metaSPAdes_CoverM.txt
		# run COBRA on the query contigs (assembly k-mer range 21 to 121)
		${software_path}/COBRA/cobra/cobra.py -q ${i} -f ${fasta_raw_path}/${num}_metaSPAdes_ALN_contigs_line_rn.fa -a metaspades -mink 21 -maxk 121 -m ${num}_metaSPAdes_aligen_sort.bam -c ${num}_metaSPAdes_CoverM.txt -o ${num}_metaSPAdes_COBRA -t 180
		# clean up index files and intermediates
		rm ${fasta_raw_path}/*fa.*
		rm ${num}_metaSPAdes_aligen.sam
		rm *metaSPAdes*.bam
		rm *metaSPAdes*_CoverM.txt
	done
done
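Two shell details in the loop are easy to miss: the sample ID comes from stripping a fixed suffix with `${i%%suffix}`, and a glob that matches nothing is passed through as a literal string. The sketch below illustrates both on dummy files. The helper name `sample_id` and the file name `AF1_metaSPAdes_split.fa` are made up for the example, and the `[ -e ... ] || continue` guard is a suggestion that the loops in this post do not include.

```shell
# Hypothetical helper: strip a known suffix from a file name to get the sample ID.
sample_id() {
    local file="$1" suffix="$2"
    echo "${file%%"$suffix"}"
}

# Dummy working directory with one matching file for the AF group, none for OS.
workdir="$(mktemp -d)"
touch "$workdir/AF1_metaSPAdes_split.fa"

for f in "$workdir"/AF*_metaSPAdes_split.fa "$workdir"/OS*_metaSPAdes_split.fa; do
    # An unmatched glob stays as the literal pattern; skip it instead of processing it.
    [ -e "$f" ] || continue
    sample_id "$(basename "$f")" _metaSPAdes_split.fa
done
```

Without the guard, a group with no files would send a pattern such as `OS*_metaSPAdes_split.fa` through the whole pipeline as if it were a file name.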

# Same steps for the megahit assemblies (-a megahit, -maxk 141)
for zubie in "${zubie_list[@]}"; do
	for i in ${zubie}*_megahit_split.fa
	do	
		num=${i%%_megahit_split.fa}
		${software_path}/bwa-mem2/bwa-mem2 index ${fasta_raw_path}/${num}_megahit_ALN_contigs_line_rn.fa
		${software_path}/bwa-mem2/bwa-mem2 mem -t 180 ${fasta_raw_path}/${num}_megahit_ALN_contigs_line_rn.fa ${fp_path}/${num}_clean_1.fastq ${fp_path}/${num}_clean_2.fastq > ${num}_megahit_aligen.sam
		samtools view -Sb -@180 ${num}_megahit_aligen.sam > ${num}_megahit_aligen.bam
		samtools sort -@180 ${num}_megahit_aligen.bam -o ${num}_megahit_aligen_sort.bam
		${software_path}/CoverM/coverm/coverm contig -b ${num}_megahit_aligen_sort.bam -p bwa-mem -t 180 --output-file ${num}_megahit_CoverM.txt
		sed -i '1d' ${num}_megahit_CoverM.txt
		${software_path}/COBRA/cobra/cobra.py -q ${i} -f ${fasta_raw_path}/${num}_megahit_ALN_contigs_line_rn.fa -a megahit -mink 21 -maxk 141 -m ${num}_megahit_aligen_sort.bam -c ${num}_megahit_CoverM.txt -o ${num}_megahit_COBRA -t 180
		rm ${fasta_raw_path}/*fa.*
		rm ${num}_megahit_aligen.sam
		rm *megahit*.bam
		rm *megahit*_CoverM.txt
	done
done

# Same steps for the idba-ud assemblies (-a idba, -maxk 121)
for zubie in "${zubie_list[@]}"; do
	for i in ${zubie}*_idba-ud_split.fa
	do	
		num=${i%%_idba-ud_split.fa}
		${software_path}/bwa-mem2/bwa-mem2 index ${fasta_raw_path}/${num}_idba-ud_ALN_contigs_line_rn.fa
		${software_path}/bwa-mem2/bwa-mem2 mem -t 180 ${fasta_raw_path}/${num}_idba-ud_ALN_contigs_line_rn.fa ${fp_path}/${num}_clean_1.fastq ${fp_path}/${num}_clean_2.fastq > ${num}_idba-ud_aligen.sam
		samtools view -Sb -@180 ${num}_idba-ud_aligen.sam > ${num}_idba-ud_aligen.bam
		samtools sort -@180 ${num}_idba-ud_aligen.bam -o ${num}_idba-ud_aligen_sort.bam
		${software_path}/CoverM/coverm/coverm contig -b ${num}_idba-ud_aligen_sort.bam -p bwa-mem -t 180 --output-file ${num}_idba-ud_CoverM.txt
		sed -i '1d' ${num}_idba-ud_CoverM.txt
		${software_path}/COBRA/cobra/cobra.py -q ${i} -f ${fasta_raw_path}/${num}_idba-ud_ALN_contigs_line_rn.fa -a idba -mink 21 -maxk 121 -m ${num}_idba-ud_aligen_sort.bam -c ${num}_idba-ud_CoverM.txt -o ${num}_idba-ud_COBRA -t 180
		rm ${fasta_raw_path}/*fa.*
		rm ${num}_idba-ud_aligen.sam
		rm *idba-ud*.bam
		rm *idba-ud*_CoverM.txt
	done
done
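The three loops differ only in the assembler label in the file names, the value passed to `-a`, and `-maxk`. One way to avoid the repetition is to look those up from the label. The sketch below is a dry run that only prints the cobra.py command; the function names `cobra_params` and `print_cobra_cmd` and the sample ID `AF1` are made up for illustration, and directory prefixes such as `${fasta_raw_path}` are left out for brevity.

```shell
# Map the file-name label used in this post to the -a and -maxk values used above.
cobra_params() {
    case "$1" in
        metaSPAdes) echo "metaspades 121" ;;
        megahit)    echo "megahit 141" ;;
        idba-ud)    echo "idba 121" ;;
        *)          echo "unknown assembler label: $1" >&2; return 1 ;;
    esac
}

# Dry run: print (do not execute) the cobra.py call for one sample and label.
print_cobra_cmd() {
    local num="$1" label="$2" params assembler maxk
    params="$(cobra_params "$label")" || return 1
    assembler="${params% *}"   # text before the space
    maxk="${params#* }"        # text after the space
    echo "cobra.py -q ${num}_${label}_split.fa" \
         "-f ${num}_${label}_ALN_contigs_line_rn.fa" \
         "-a ${assembler} -mink 21 -maxk ${maxk}" \
         "-m ${num}_${label}_aligen_sort.bam" \
         "-c ${num}_${label}_CoverM.txt" \
         "-o ${num}_${label}_COBRA -t 180"
}

# AF1 is a hypothetical sample ID.
for label in metaSPAdes megahit idba-ud; do
    print_cobra_cmd AF1 "$label"
done
```

The same lookup could drive the bwa-mem2, samtools, and CoverM steps, so a single loop over labels and groups would replace the three copies.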
