Installation
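# Simplest route: install with conda (bwa is distributed through the bioconda channel), or download it from the official site and install it yourself
conda install -c bioconda bwa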
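Introduction
BWA stands for Burrows-Wheeler Alignment Tool. It is a software package for mapping low-divergence sequences against a large reference genome. It provides three algorithms:
- BWA-MEM: the recommended algorithm. It supports longer reads and split alignments. It is the newest of the three and is faster and more accurate; it also performs better than BWA-backtrack on 70bp-100bp Illumina reads.
- BWA-backtrack: designed for Illumina reads up to 100bp long.
- BWA-SW: for long reads, from 70bp to 1Mbp; it also supports split alignments. (I have not used this algorithm.)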
BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM.
The first algorithm is designed for Illumina sequence reads up to 100bp, while the rest two for longer sequences ranged from 70bp to 1Mbp.
BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.
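The read-length guidance above can be turned into a small decision helper. This is only a sketch: `pick_bwa_algo` is a hypothetical function name, and the 70bp cutoff is taken from the description above rather than from anything BWA enforces.

```shell
#!/bin/sh
# pick_bwa_algo: hypothetical helper, not part of BWA.
# Prints the bwa subcommand suggested by the read-length guidance above:
#   reads shorter than 70 bp -> "aln" (BWA-backtrack)
#   reads of 70 bp or longer -> "mem" (BWA-MEM)
pick_bwa_algo() {
    read_len=$1
    case $read_len in
        ''|*[!0-9]*)
            echo "pick_bwa_algo: read length must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "$read_len" -lt 70 ]; then
        echo aln
    else
        echo mem
    fi
}
```

For example, `pick_bwa_algo 36` prints `aln` and `pick_bwa_algo 150` prints `mem`.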
index
-a STR BWT construction algorithm: bwtsw, is or rb2 [auto]
Warning: '-a bwtsw' does not work for short genomes, while '-a is' and '-a div' do not work for long genomes.
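Without the -a option, bwa picks the algorithm automatically (auto).
For the human reference genome, build the index with the bwtsw algorithm: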
bwa index -a bwtsw hg19.fa
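Indexing a human genome takes a long time, so it can help to skip it when the index is already there. The sketch below assumes the five files that `bwa index` writes next to the FASTA (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`); the function names `index_is_complete` and `ensure_index` are made up for this example.

```shell
#!/bin/sh
# index_is_complete: hypothetical helper, not part of BWA.
# Succeeds only if all five index files exist next to the FASTA and are non-empty.
index_is_complete() {
    ref=$1
    for ext in amb ann bwt pac sa; do
        [ -s "$ref.$ext" ] || return 1
    done
    return 0
}

# ensure_index: build the index only when it is missing or incomplete.
# The optional second argument is passed to -a (e.g. bwtsw for a human genome);
# without it, bwa chooses the algorithm itself.
ensure_index() {
    ref=$1
    algo=${2:-}
    if index_is_complete "$ref"; then
        echo "index for $ref already present, skipping"
    elif [ -n "$algo" ]; then
        bwa index -a "$algo" "$ref"
    else
        bwa index "$ref"
    fi
}
```

Usage would be `ensure_index hg19.fa bwtsw`. The check only looks at file presence and size, so it will not notice an index built from a different version of the FASTA.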
BWA-MEM algorithm
Single-end reads
#bwa mem
bwa mem -t 5 hg19.nostrange.fa result9675589.fq |samtools sort -@ 5 -o result9675589.bam -
Paired-end reads
# each glob must match exactly one file, otherwise bwa mem receives extra arguments
bwa mem -t 20 hg19.nostrange.fa *_1.fq.gz *_2.fq.gz \
| samtools sort -@ 5 -o 6376-5.bam -
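When a directory holds several samples, the globs in the paired-end command would match more than one file each. A loop that pairs the mates explicitly avoids that. This is a minimal sketch assuming files are named `<sample>_1.fq.gz` and `<sample>_2.fq.gz`; the helper names (`sample_of`, `mate2_of`, `align_pairs`) are hypothetical, and the thread counts are simply copied from the command above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of BWA. They assume mate files are named
# <sample>_1.fq.gz and <sample>_2.fq.gz.

# sample_of: strip the directory and the _1.fq.gz suffix.
sample_of() {
    basename "$1" _1.fq.gz
}

# mate2_of: path of the second-mate file that belongs to a first-mate file.
mate2_of() {
    printf '%s_2.fq.gz\n' "${1%_1.fq.gz}"
}

# align_pairs REF DIR: align every *_1.fq.gz in DIR, writing one sorted BAM
# per sample into the current directory.
align_pairs() {
    ref=$1
    dir=$2
    for r1 in "$dir"/*_1.fq.gz; do
        [ -e "$r1" ] || continue            # glob matched nothing
        r2=$(mate2_of "$r1")
        if [ ! -e "$r2" ]; then
            echo "skipping $r1: missing mate $r2" >&2
            continue
        fi
        sample=$(sample_of "$r1")
        bwa mem -t 20 "$ref" "$r1" "$r2" \
            | samtools sort -@ 5 -o "$sample.bam" -
    done
}
```

Calling `align_pairs hg19.nostrange.fa ./fastq` would then produce files such as `6376-5.bam` for a pair named `6376-5_1.fq.gz` / `6376-5_2.fq.gz`.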
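BWA-backtrack algorithm
Short reads, paired-end alignment
# .sai: the intermediate file produced by aligning a FASTQ file (bwa aln); it is used to generate the final SAM output
# Build the index (the is algorithm is meant for short reference genomes)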
bwa index -a is ref.fna
# Find the SA coordinates of the reads
bwa aln ref.fna reads.1.fq.gz > reads1.sai
bwa aln ref.fna reads.2.fq.gz > reads2.sai
# Convert the SA coordinates into SAM output
bwa sampe ref.fna reads1.sai reads2.sai reads.1.fq.gz reads.2.fq.gz > all.sam
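The three steps above (two `bwa aln` runs, then `bwa sampe`) can be wrapped in one function. The sketch below is an assumption-laden convenience, not part of BWA: `run` and `backtrack_pe` are made-up names, `DRY_RUN` is a made-up variable, and the `-f` option of `bwa aln` and `bwa sampe` (write to a file instead of stdout) is used in place of shell redirection so that the commands can be printed before they are executed.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of BWA. With DRY_RUN=1 it only prints the
# commands; otherwise it executes them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backtrack_pe REF READS1 READS2 OUT_PREFIX
# Writes OUT_PREFIX.1.sai, OUT_PREFIX.2.sai and OUT_PREFIX.sam.
backtrack_pe() {
    ref=$1 r1=$2 r2=$3 out=$4
    # Step 1: find the SA coordinates for each mate file
    run bwa aln -f "$out.1.sai" "$ref" "$r1" || return 1
    run bwa aln -f "$out.2.sai" "$ref" "$r2" || return 1
    # Step 2: combine both .sai files with the reads into paired-end SAM
    run bwa sampe -f "$out.sam" "$ref" "$out.1.sai" "$out.2.sai" "$r1" "$r2"
}
```

Setting `DRY_RUN=1` and calling `backtrack_pe ref.fna reads.1.fq.gz reads.2.fq.gz all` prints the three commands, which makes it easy to check file names before starting a long run.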
References
See the three academic papers on BWA by Dr. Li H.