SAM/BAM ALIGNMENT FORMAT DESCRIPTION
The output of the ‘aln’ command is binary and designed for BWA use only.
BWA outputs the final alignment in the SAM (Sequence Alignment/Map) format. Each line consists of:
Col | Field | Description |
1 | QNAME | Query (pair) NAME |
2 | FLAG | bitwise FLAG |
3 | RNAME | Reference sequence NAME |
4 | POS | 1-based leftmost POSition/coordinate of clipped sequence |
5 | MAPQ | MAPping Quality (Phred-scaled) |
6 | CIGAR | extended CIGAR string |
7 | MRNM | Mate Reference sequence NaMe (‘=’ if same as RNAME) |
8 | MPOS | 1-based Mate POSition |
9 | ISIZE | Inferred insert SIZE |
10 | SEQ | query SEQuence on the same strand as the reference |
11 | QUAL | query QUALity (ASCII-33 gives the Phred base quality) |
12 | OPT | variable OPTional fields in the format TAG:VTYPE:VALUE |
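As a minimal sketch of how the columns above map onto a record, the following Python splits one SAM line into its 11 mandatory fields plus the optional TAG:VTYPE:VALUE fields. It assumes tab-separated columns, as in the SAM format; the names SamRecord, parse_sam_line and phred_qualities are hypothetical helpers, not part of BWA, and the example read is made up for illustration.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """One alignment line; field names follow the column table above."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    mrnm: str
    mpos: int
    isize: int
    seq: str
    qual: str
    tags: dict


def parse_sam_line(line: str) -> SamRecord:
    """Split a tab-separated SAM alignment line into typed fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError("expected at least 11 columns, got %d" % len(cols))
    tags = {}
    for field in cols[11:]:
        # Optional fields look like TAG:VTYPE:VALUE; VALUE may contain ':'.
        tag, vtype, value = field.split(":", 2)
        if vtype == "i":
            value = int(value)
        elif vtype == "f":
            value = float(value)
        tags[tag] = value  # other types are kept as strings
    return SamRecord(
        cols[0], int(cols[1]), cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], int(cols[7]), int(cols[8]), cols[9], cols[10], tags,
    )


def phred_qualities(qual: str) -> list:
    """Column 11: ASCII code minus 33 gives the Phred base quality."""
    return [ord(ch) - 33 for ch in qual]


# Made-up example line, for illustration only.
example = "read1\t99\tchr1\t100\t37\t4M\t=\t200\t104\tACGT\tIIII\tNM:i:0\tXT:A:U"
record = parse_sam_line(example)
```

Here record.pos is the 1-based leftmost coordinate from column 4, record.tags holds the optional fields keyed by tag, and phred_qualities(record.qual) converts column 11 to integer qualities.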
Each bit in the FLAG field is defined as:
Chr | Flag | Description |
p | 0x0001 | the read is paired in sequencing |
P | 0x0002 | the read is mapped in a proper pair |
u | 0x0004 | the query sequence itself is unmapped |
U | 0x0008 | the mate is unmapped |
r | 0x0010 | strand of the query (1 for reverse) |
R | 0x0020 | strand of the mate |
1 | 0x0040 | the read is the first read in a pair |
2 | 0x0080 | the read is the second read in a pair |
s | 0x0100 | the alignment is not primary |
f | 0x0200 | QC failure |
d | 0x0400 | optical or PCR duplicate |
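The bit definitions above can be checked with simple bitwise tests. The sketch below decodes a FLAG value into the one-character codes from the Chr column; the function name decode_flag and the table constant are hypothetical and exist only for this example.

```python
# (bit, one-character code, description), in the order of the table above.
FLAG_BITS = [
    (0x0001, "p", "the read is paired in sequencing"),
    (0x0002, "P", "the read is mapped in a proper pair"),
    (0x0004, "u", "the query sequence itself is unmapped"),
    (0x0008, "U", "the mate is unmapped"),
    (0x0010, "r", "strand of the query (1 for reverse)"),
    (0x0020, "R", "strand of the mate"),
    (0x0040, "1", "the read is the first read in a pair"),
    (0x0080, "2", "the read is the second read in a pair"),
    (0x0100, "s", "the alignment is not primary"),
    (0x0200, "f", "QC failure"),
    (0x0400, "d", "optical or PCR duplicate"),
]


def decode_flag(flag: int) -> str:
    """Return the codes of all bits that are set in FLAG."""
    return "".join(code for bit, code, _ in FLAG_BITS if flag & bit)


def describe_flag(flag: int) -> list:
    """Return the descriptions of all bits that are set in FLAG."""
    return [text for bit, _, text in FLAG_BITS if flag & bit]


print(decode_flag(99))   # 0x01 | 0x02 | 0x20 | 0x40 -> pPR1
print(decode_flag(147))  # 0x01 | 0x02 | 0x10 | 0x80 -> pPr2
```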
BWA generates the following optional fields. Tags starting with ‘X’ are specific to BWA.
Tag | Meaning |
NM | Edit distance |
MD | Mismatching positions/bases |
AS | Alignment score |
BC | Barcode sequence |
X0 | Number of best hits |
X1 | Number of suboptimal hits found by BWA |
XN | Number of ambiguous bases in the reference |
XM | Number of mismatches in the alignment |
XO | Number of gap opens |
XG | Number of gap extensions |
XT | Type: Unique/Repeat/N/Mate-sw |
XA | Alternative hits; format: (chr,pos,CIGAR,NM;)* |
XS | Suboptimal alignment score |
XF | Support from forward/reverse alignment |
XE | Number of supporting seeds |
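The XA value format (chr,pos,CIGAR,NM;)* can be unpacked with plain string splitting. The following is a sketch only: parse_xa is a hypothetical helper, the example value is made up, and it assumes the position is written with a leading '+' or '-' sign indicating the strand, which the table above does not spell out.

```python
def parse_xa(value: str) -> list:
    """Parse an XA value into (chr, signed_pos, cigar, nm) tuples.

    Assumption: pos carries a leading '+' or '-' for the strand, so the
    returned position is negative for reverse-strand hits.
    """
    hits = []
    for entry in value.split(";"):
        if not entry:
            continue  # the trailing ';' leaves an empty last entry
        # rsplit keeps any commas in the reference name intact.
        chrom, pos, cigar, nm = entry.rsplit(",", 3)
        hits.append((chrom, int(pos), cigar, int(nm)))
    return hits


# Made-up XA value, for illustration only.
alt_hits = parse_xa("chr1,+1234,36M,0;chr2,-5678,36M,1;")
for chrom, pos, cigar, nm in alt_hits:
    strand = "-" if pos < 0 else "+"
    print(chrom, abs(pos), strand, cigar, nm)
```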
Note that XO and XG are generated by BWT search, while the CIGAR string is generated by Smith-Waterman alignment.
These two tags may therefore be inconsistent with the CIGAR string. This is not a bug.
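To see where such a discrepancy occurs in practice, one can count the gaps implied by the CIGAR string and compare them with XO. This is a rough sketch: cigar_gaps is a hypothetical helper, the CIGAR string and XO value are made up, and because the exact counting convention for XG is not stated here, the code reports only the total number of gapped bases rather than claiming it should equal XG.

```python
import re

# One CIGAR operation: a length followed by an operation character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_gaps(cigar: str) -> tuple:
    """Return (gap_opens, gapped_bases) implied by a CIGAR string.

    Each I or D operation counts as one gap open; its length adds to the
    number of gapped bases.
    """
    opens = 0
    gapped_bases = 0
    for length, op in CIGAR_OP.findall(cigar):
        if op in "ID":
            opens += 1
            gapped_bases += int(length)
    return opens, gapped_bases


# Made-up values, for illustration only.
opens, gapped = cigar_gaps("20M2I10M1D4M")
xo_tag = 1  # hypothetical XO value read from the optional fields
if opens != xo_tag:
    print("XO=%d but CIGAR implies %d gap opens (expected, not a bug)"
          % (xo_tag, opens))
```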