Continued from the previous post on Chapter 7
Text Processing with Awk
Awk is built on two basic concepts: records and fields, and pattern-action pairs.
For each record (by default, each line), Awk assigns the entire record to the variable $0; the first field's value is assigned to $1, the second field's to $2, the third field's to $3, and so forth.
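To make the field variables concrete, here is a minimal sketch that feeds Awk a single tab-separated record in the same layout as example.bed and prints each variable with a label. The labels and the shell variable name `fields` are only for illustration.

```shell
# One tab-separated record, same three-column layout as example.bed
fields=$(printf 'chr1\t26\t39\n' | awk '{
    print "record=" $0   # the whole line
    print "first=" $1    # chromosome
    print "second=" $2   # start position
    print "third=" $3    # end position
}')
echo "$fields"
```

By default Awk splits fields on runs of spaces and tabs, which is why no separator has to be specified here.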
pattern { action };pattern { action };…
If we omit the pattern, Awk will run the action on all records. If we omit the action but specify a pattern, Awk will print all records that match the pattern.
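The two defaults can be compared side by side. The sketch below uses two records copied from example.bed; the variable names are placeholders chosen for this illustration.

```shell
# Two records in the same layout as example.bed
input=$(printf 'chr1\t26\t39\nchr3\t11\t28\n')

# Action without a pattern: the action runs on every record
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Pattern without an action: matching records are printed unchanged
pattern_only=$(printf '%s\n' "$input" | awk '$1 == "chr3"')

echo "$action_only"
echo "$pattern_only"
```

The first command prints the chromosome column for both records; the second prints only the chr3 record, in full.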
1. mimic cat
$ awk '{ print $0 }' example.bed
chr1 26 39
chr1 32 47
chr3 11 28
chr1 40 49
chr3 16 27
chr1 9 28
chr2 35 54
chr1 10 19
2. mimic cut
$ awk '{ print $2 "\t" $3 }' example.bed
26 39
32 47
11 28
40 49
16 27
9 28
35 54
10 19
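As an alternative to concatenating "\t" by hand, and not part of the original example, the output field separator variable OFS can be set so that a comma in print emits a tab. A minimal sketch on the first two records of example.bed:

```shell
# -v OFS='\t' sets the output field separator to a tab;
# the comma in print then inserts OFS between $2 and $3
cols=$(printf 'chr1\t26\t39\nchr1\t32\t47\n' | awk -v OFS='\t' '{ print $2, $3 }')
echo "$cols"
```

This scales better than manual concatenation when many columns are printed.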
3. output lines where the length of the feature (end position - start position) is greater than 18
$ awk '$3 - $2 > 18' example.bed
chr1 9 28
chr2 35 54
4. all lines on chromosome 1 with a length greater than 10
$ awk '$1 ~ /chr1/ && $3 - $2 > 10' example.bed
chr1 26 39
chr1 32 47
chr1 9 28
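One caveat with example 4: $1 ~ /chr1/ is an unanchored regular expression match, so on a file that also contained chr10 or chr11 it would keep those records too. example.bed has no such chromosomes, so the output above is unaffected. The sketch below uses made-up records (not taken from example.bed) to contrast the regex with an exact string comparison.

```shell
# Made-up records for illustration: chr1 and chr10
data=$(printf 'chr1\t9\t28\nchr10\t5\t30\n')

# Unanchored regex: "chr10" contains "chr1", so both records pass
loose=$(printf '%s\n' "$data" | awk '$1 ~ /chr1/ && $3 - $2 > 10')

# Exact string comparison: only chr1 passes
strict=$(printf '%s\n' "$data" | awk '$1 == "chr1" && $3 - $2 > 10')

echo "$loose"
echo "$strict"
```

Anchoring the regex as /^chr1$/ would have the same effect as the string comparison.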
5. add a gene length column for chr2 and chr3
$ awk '$1 ~ /chr2|chr3/ { print $0 "\t" $3 - $2 }' example.bed
chr3 11 28 17
chr3 16 27 11
chr2 35 54 19
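The expression in example 5 works without parentheses because Awk gives subtraction higher precedence than string concatenation, so $3 - $2 is computed before it is joined to the record. A small sketch on one record from example.bed checks that the explicit form gives the same result:

```shell
# Same record, with and without explicit parentheses around the subtraction
implicit=$(printf 'chr2\t35\t54\n' | awk '{ print $0 "\t" $3 - $2 }')
explicit=$(printf 'chr2\t35\t54\n' | awk '{ print $0 "\t" ($3 - $2) }')

echo "$implicit"
echo "$explicit"
```

Writing the parentheses is a reasonable habit, since it spares the reader from having to recall the precedence rules.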
6. Two special patterns: BEGIN and END
The BEGIN pattern specifies what to do before the first record is read in, and END spe