BWA Debugging Notes

  • 1 Generate the read file with wgsim

./wgsim -N 100000 -1 100 -d 0 -S 11 -e 0 -r 10 xx.fa xx.fq /dev/null
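wgsim accepts a flag and its value with or without a space between them, which makes the command hard to read at a glance. The following is a minimal sketch that rebuilds the same command with every flag separated from its value and prints what each flag means. The flag meanings follow wgsim's usage text, the values are copied unchanged from the command above, and the names `WGSIM_OPTIONS` and `build_wgsim_command` are hypothetical helpers introduced only for this illustration.

```python
import shlex

# (flag, value, meaning). Values are copied as-is from the command in these notes.
WGSIM_OPTIONS = [
    ("-N", "100000", "number of read pairs"),
    ("-1", "100", "length of the first read"),
    ("-d", "0", "outer distance between the two ends"),
    ("-S", "11", "seed for the random generator"),
    ("-e", "0", "base error rate"),
    ("-r", "10", "rate of mutations"),
]


def build_wgsim_command(ref, out1, out2, wgsim="./wgsim"):
    """Return the wgsim argument list: flags first, then the three file arguments."""
    argv = [wgsim]
    for flag, value, _meaning in WGSIM_OPTIONS:
        argv += [flag, value]
    return argv + [ref, out1, out2]


cmd = build_wgsim_command("xx.fa", "xx.fq", "/dev/null")
print(shlex.join(cmd))
for flag, value, meaning in WGSIM_OPTIONS:
    print(f"{flag} {value}: {meaning}")
```

The list in `cmd` can be passed to `subprocess.run(cmd, check=True)` on a machine where wgsim is built. The second read file goes to /dev/null, so only xx.fq is kept.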

  • 2 Generate the index files with index

In Program arguments, enter index xx.fa; running it generates 5 files:

pac: the reference sequence packed with 2-bit encoding

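ann, amb: metadata files. ann records the name and length of each reference sequence; amb records the runs of ambiguous bases (such as N) in the reference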

bwt: contains the BWT array and the Occ (occurrence count) information

sa: the suffix array generated from the bwt file
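Before moving on to alignment it can help to confirm that all five index files were actually written. The following is a minimal sketch, assuming the index files sit next to the reference and are named by appending the suffix to the reference file name (xx.fa.pac, xx.fa.bwt and so on). `missing_index_files` is a hypothetical helper, not part of BWA.

```python
from pathlib import Path

# Suffixes that bwa index appends to the reference file name.
INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_index_files(reference):
    """Return the expected index files that do not exist for this reference."""
    ref = Path(reference)
    candidates = [ref.with_name(ref.name + suffix) for suffix in INDEX_SUFFIXES]
    return [str(path) for path in candidates if not path.is_file()]


missing = missing_index_files("xx.fa")
if missing:
    print("missing index files:", ", ".join(missing))
else:
    print("index complete")
```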

  • 3 Align with bwa mem

bwa mem -R '@RG\tID:foo\tSM:bar\tLB:library1' <ref.fa> <read1.fa> <read2.fa>  >  lane.sam

Example:

bwa mem -t 12 -o test.sam 15.fa 15.fq
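A quick way to sanity-check the resulting SAM file is to count mapped, unmapped and secondary records. The following is a minimal sketch that reads only the first six mandatory SAM columns and interprets the FLAG bits 0x4 (unmapped) and 0x100 (secondary) as defined in the SAM specification. `SamRecord`, `parse_sam` and the three demo records are made up for illustration and are not output from a real run.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str

    @property
    def is_unmapped(self):
        return bool(self.flag & 0x4)

    @property
    def is_secondary(self):
        return bool(self.flag & 0x100)


def parse_sam(lines):
    """Yield a SamRecord per alignment line, skipping '@' header lines."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        f = line.rstrip("\n").split("\t")
        yield SamRecord(f[0], int(f[1]), f[2], int(f[3]), int(f[4]), f[5])


# Hypothetical records, only to exercise the parser.
demo = [
    "@SQ\tSN:chr15\tLN:1000",
    "r1\t0\tchr15\t101\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t256\tchr15\t501\t0\t4M\t*\t0\t0\t*\t*",
]
records = list(parse_sam(demo))
print("unmapped:", sum(r.is_unmapped for r in records))
print("secondary:", sum(r.is_secondary for r in records))
```

For a real file, pass an open file handle instead of the list, for example `with open("test.sam") as fh: records = list(parse_sam(fh))`.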

        Parameter details:

-o FILE  SAM file to output results to [stdout]

-t INT  Number of alignment threads; by default each thread reads 10 million bp. [1]
-k INT  Minimum seed length. Matches shorter than INT will be missed. The alignment speed is usually insensitive to this value unless it significantly deviates from 20. [19]
-w INT  Band width. Essentially, gaps longer than INT will not be found. Note that the maximum gap length is also affected by the scoring matrix and the hit length, not solely determined by this option. [100]
-d INT  Off-diagonal X-dropoff (Z-dropoff). Stop extension when the difference between the best and the current extension score is above |i-j|*A+INT, where i and j are the current positions of the query and reference, respectively, and A is the matching score. Z-dropoff is similar to BLAST’s X-dropoff except that it doesn’t penalize gaps in one of the sequences in the alignment. Z-dropoff not only avoids unnecessary extension, but also reduces poor alignments inside a long good alignment. [100]
-r FLOAT  Trigger re-seeding for a MEM longer than minSeedLen*FLOAT. This is a key heuristic parameter for tuning the performance. Larger value yields fewer seeds, which leads to faster alignment speed but lower accuracy. [1.5]
-c INT  Discard a MEM if it has more than INT occurrences in the genome. This is an insensitive parameter. [10000]
-P  In the paired-end mode, perform SW to rescue missing hits only but do not try to find hits that fit a proper pair.
In other words, for paired-end reads SW is run only to rescue reads that failed to align; reads that already aligned are left as they are.
-A INT  Matching score. [1]
-B INT  Mismatch penalty. The sequence error rate is approximately: {.75 * exp[-log(4) * B/A]}. [4]
-O INT  Gap open penalty. [6]
-E INT  Gap extension penalty. A gap of length k costs O + k*E (i.e. -O is for opening a zero-length gap). [1]
-L INT  Clipping penalty. When performing SW extension, BWA-MEM keeps track of the best score reaching the end of query. If this score is larger than the best SW score minus the clipping penalty, clipping will not be applied. Note that in this case, the SAM AS tag reports the best SW score; clipping penalty is not deducted. [5]
-U INT  Penalty for an unpaired read pair. BWA-MEM scores an unpaired read pair as scoreRead1+scoreRead2-INT and scores a paired as scoreRead1+scoreRead2-insertPenalty. It compares these two scores to determine whether we should force pairing. [9]
-p  Assume the first input query file is interleaved paired-end FASTA/Q. See the command description for details.
-R STR  Complete read group header line. '\t' can be used in STR and will be converted to a TAB in the output SAM. The read group ID will be attached to every read in the output. An example is '@RG\tID:foo\tSM:bar'. [null]
-T INT  Don’t output alignment with score lower than INT. This option only affects output. [30]
-a  Output all found alignments for single-end or unpaired paired-end reads. These alignments will be flagged as secondary alignments.
-C  Append FASTA/Q comment to SAM output. This option can be used to transfer read meta information (e.g. barcode) to the SAM output. Note that the FASTA/Q comment (the string after a space in the header line) must conform to the SAM spec (e.g. BC:Z:CGTAC). Malformed comments lead to incorrect SAM output.
-H  Use hard clipping 'H' in the SAM output. This option may dramatically reduce the redundancy of output when mapping long contig or BAC sequences.
-M  Mark shorter split hits as secondary (for Picard compatibility).
-v INT  Control the verbose level of the output. This option has not been fully supported throughout BWA. Ideally, a value 0 for disabling all the output to stderr; 1 for outputting errors only; 2 for warnings and errors; 3 for all normal messages; 4 or higher for debugging. When this option takes value 4, the output is not SAM. [3]
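Several of the options above are defined by small formulas, and plugging in the defaults makes them easier to compare. The following is a minimal sketch that evaluates the gap cost from -O/-E, the approximate error rate from -A/-B, the Z-dropoff stop condition from -d, and the clipping rule from -L exactly as the descriptions state them. The function names are hypothetical and this is not BWA's own code.

```python
import math


def gap_cost(k, gap_open=6, gap_extend=1):
    """Cost of a gap of length k: O + k*E (defaults -O 6, -E 1)."""
    return gap_open + k * gap_extend


def approx_error_rate(match=1, mismatch=4):
    """Approximate sequence error rate: 0.75 * exp(-log(4) * B/A)."""
    return 0.75 * math.exp(-math.log(4) * mismatch / match)


def should_stop_extension(best, current, i, j, match=1, zdrop=100):
    """Z-dropoff: stop when best - current exceeds |i-j|*A + INT (default -d 100)."""
    return best - current > abs(i - j) * match + zdrop


def skips_clipping(end_score, best_sw, clip_penalty=5):
    """Clipping is not applied when the end-of-query score beats best SW minus -L."""
    return end_score > best_sw - clip_penalty


print("gap of length 3 costs", gap_cost(3))
print("error rate for A=1, B=4:", approx_error_rate())
print("stop extension:", should_stop_extension(best=150, current=40, i=10, j=10))
print("skip clipping:", skips_clipping(end_score=48, best_sw=50))
```

With the defaults, B/A = 4 gives 0.75 * 4^-4, roughly 0.3%. A larger -B relative to -A therefore corresponds to assuming a lower error rate.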

 

Reposted from: https://my.oschina.net/u/3732258/blog/1591512
