基因数据处理19之BWA匹配算法串产生、匹配、评价等整体流程

环境:

ubunut

BWA

samtools

wgsim

bwa.kit工具


1.数据下载:需要在bwa.kit下

bwa.kit/run-gen-ref hs38DH  

2.串产生:

hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH$ wgsim -N 1000 -1 10 hs38DH.fa <span style="font-family: Arial, Helvetica, sans-serif; font-size: 12px;">hs38DHSE1N10000L10F1.fq  </span> /dev/null 

将文件移到

<pre name="code" class="plain">~/cloud/adam/xubo/data/hs38DH/test20160415

 

3.匹配+评价

hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa aln ../hs38DH.fa hs38DHSE1N10000L10F1.fq >hs38DHSE1N10000L10F1.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.10 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 9935 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln ../hs38DH.fa hs38DHSE1N10000L10F1.fq
[main] Real time: 0.195 sec; CPU: 0.119 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa samse ../hs38DH.fa hs38DHSE1N10000L10F1.sai hs38DHSE1N10000L10F1.fq >hs38DHSE1N10000L10F1.sam
[bwa_aln_core] convert to sequence coordinate... 0.04 sec
[bwa_aln_core] refine gapped alignments... 0.00 sec
[bwa_aln_core] print alignments... 0.02 sec
[bwa_aln_core] 9935 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa samse ../hs38DH.fa hs38DHSE1N10000L10F1.sai hs38DHSE1N10000L10F1.fq
[main] Real time: 0.154 sec; CPU: 0.086 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ samtools flagstat hs38DHSE1N10000L10F1.sam
9935 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
9907 + 0 mapped (99.72% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa aln ../hs38DH.fa hs38DHSE1N10000L
hs38DHSE1N10000L10F1.fq   hs38DHSE1N10000L10F1.sai  hs38DHSE1N10000L10F1.sam  hs38DHSE1N10000L70F1.fq   
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa aln ../hs38DH.fa hs38DHSE1N10000L70F1.fq >hs38DHSE1
hs38DHSE1L10000F1.fq      hs38DHSE1N10000L10F1.fq   hs38DHSE1N10000L10F1.sai  hs38DHSE1N10000L10F1.sam  hs38DHSE1N10000L70F1.fq   
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa aln ../hs38DH.fa hs38DHSE1N10000L70F1.fq >hs38DHSE1N10000L70F1.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.72 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 9935 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln ../hs38DH.fa hs38DHSE1N10000L70F1.fq
[main] Real time: 0.779 sec; CPU: 0.734 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa samse ../hs38DH.fa hs38DHSE1N10000L70F1.sai hs38DHSE1N10000L70F1.fq >hs38DHSE1N10000L70F1.sam
[bwa_aln_core] convert to sequence coordinate... 0.04 sec
[bwa_aln_core] refine gapped alignments... 0.01 sec
[bwa_aln_core] print alignments... 0.03 sec
[bwa_aln_core] 9935 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa samse ../hs38DH.fa hs38DHSE1N10000L70F1.sai hs38DHSE1N10000L70F1.fq
[main] Real time: 0.207 sec; CPU: 0.109 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ samtools flagstat hs38DHSE1N10000L70F1.sam 
9935 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
9567 + 0 mapped (96.30% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

最后得出匹配的准确率在96%左右

评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值