Continuing from the previous post: Chapter 7
The All-Powerful Grep
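- grep "pattern" files
--color=auto (highlights the matched text in the terminal)
By default, grep matches the pattern anywhere in a line, including inside longer words, and prints every matching line. Use **-w** for exact matching (constraining our matches to be words). The -v option inverts the match, printing only the lines that do not match.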
$ cat example.txt
bio
bioinfo
bioinformatics
computational biology
$ grep -v bioinfo example.txt
bio
computational biology
$ grep -v -w bioinfo example.txt
bio
bioinformatics
computational biology
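The difference between substring and whole-word matching is easier to see without -v. The sketch below recreates the example file in a temporary directory (the directory and variable names are illustrative assumptions, not part of the original notes) and runs the same pattern with and without -w.

```shell
# Recreate the four-line example file in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' bio bioinfo bioinformatics 'computational biology' > "$tmpdir/example.txt"

# Substring match: "bioinfo" is also found inside "bioinformatics".
substring_hits=$(grep bioinfo "$tmpdir/example.txt")

# Whole-word match: only the line where "bioinfo" stands alone as a word.
word_hits=$(grep -w bioinfo "$tmpdir/example.txt")

echo "without -w:"
echo "$substring_hits"
echo "with -w:"
echo "$word_hits"

rm -rf "$tmpdir"
```

Without -w two lines match (bioinfo and bioinformatics); with -w only bioinfo does, which is why -v -w keeps bioinformatics in the output above.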
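grep can also print lines of context around each match: context before (-B), context after (-A), and context before and after (-C). Each of these options takes the number of context lines to print, and grep separates non-adjacent groups of output with a -- line: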
$ grep -B1 "AGATCGG" contam.fastq | head -n 6
@DJB775P1:248:D0MDGACXX:7:1202:12362:49613
TGCTTACTCTGCGTTGATACCACTGCTTAGATCGGAAGAGCACACGTCTGAA
--
@DJB775P1:248:D0MDGACXX:7:1202:12782:49716
CTCTGCGTTGATACCACTGCTTACTCTGCGTTGATACCACTGCTTAGATCGG
--
$ grep -A2 "AGATCGG" contam.fastq | head -n 6
TGCTTACTCTGCGTTGATACCACTGCTTAGATCGGAAGAGCACACGTCTGAA
+
JJJJJIIJJJJJJHIHHHGHFFFFFFCEEEEEDBD?DDDDDDBDDDABDDCA
--
CTCTGCGTTGATACCACTGCTTACTCTGCGTTGATACCACTGCTTAGATCGG
+
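The context options can be tried on a small file without the real contam.fastq. The sketch below is only an illustration: toy.fastq and its three records are made-up data, chosen so that two of the sequence lines contain AGATCGG.

```shell
# Build a tiny, made-up FASTQ file (three records; not real sequencing data).
tmpdir=$(mktemp -d)
printf '%s\n' \
  '@read1' 'TTTTAGATCGGAAAA' '+' 'IIIIIIIIIIIIIII' \
  '@read2' 'CCCCCCCCCCCCCCC' '+' 'IIIIIIIIIIIIIII' \
  '@read3' 'GGGGAGATCGG' '+' 'IIIIIIIIIII' > "$tmpdir/toy.fastq"

# -B1: each matching sequence line plus the header line before it.
before_ctx=$(grep -B1 "AGATCGG" "$tmpdir/toy.fastq")

# -A2: each matching sequence line plus the "+" and quality lines after it.
after_ctx=$(grep -A2 "AGATCGG" "$tmpdir/toy.fastq")

# -C1: one line of context on each side of every match.
both_ctx=$(grep -C1 "AGATCGG" "$tmpdir/toy.fastq")

echo "$before_ctx"
echo "=== -A2 ==="
echo "$after_ctx"
echo "=== -C1 ==="
echo "$both_ctx"

rm -rf "$tmpdir"
```

Because read2 does not match, the two groups of output are not adjacent in the file, so each result contains a -- separator line between them.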
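By default, grep patterns are POSIX basic regular expressions (BRE), so a character class such as [13] works without extra options:
$ grep "Olfr141[13]" Mus_musculus.GRCm38.75_chr1_genes.txt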
ENSMUSG00000058904 Olfr1413
ENSMUSG00000062497 Olfr1411
grep allows us to turn on POSIX extended regular expressions (ERE) with the -E option, which supports alternation with |:
$ grep -E "(Olfr1413|Olfr1411)" Mus_musculus.GRCm38.75_chr1_genes.txt
ENSMUSG00000058904 Olfr1413
ENSMUSG00000062497 Olfr1411
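To check that the character-class pattern and the -E alternation select the same lines, the two can be compared on a small tab-delimited file. This is a minimal sketch: toy_genes.txt and the gene1 to gene3 identifiers are placeholders invented for the illustration, not entries from the real annotation file.

```shell
# Toy two-column, tab-delimited file (placeholder IDs, not real Ensembl IDs).
tmpdir=$(mktemp -d)
printf 'gene1\tOlfr1411\ngene2\tOlfr1412\ngene3\tOlfr1413\n' > "$tmpdir/toy_genes.txt"

# BRE character class: Olfr141 followed by either 1 or 3.
bre_hits=$(grep "Olfr141[13]" "$tmpdir/toy_genes.txt")

# ERE alternation: either of the two full names.
ere_hits=$(grep -E "(Olfr1413|Olfr1411)" "$tmpdir/toy_genes.txt")

echo "$bre_hits"

# Report whether the two approaches selected exactly the same lines.
if [ "$bre_hits" = "$ere_hits" ]; then
  echo "both patterns select the same lines"
fi

rm -rf "$tmpdir"
```

The character class is shorter when the names differ by a single character; alternation is the more general tool when they do not.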
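Counting matching lines: grep -c (note that GNU grep's default syntax does not interpret \t as a tab, so with GNU grep the pattern may need a literal tab character in its place)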
$ grep -c "\tOlfr" Mus_musculus.GRCm38.75_chr1_genes.txt
27
Alternatively, we could pipe the matching lines to wc -l:
$ grep "\tOlfr" Mus_musculus.GRCm38.75_chr1_genes.txt | wc -l
27
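The equivalence of grep -c and grep piped to wc -l can be checked on a toy file. In this sketch the file contents are invented for illustration, and the tab is stored in a shell variable with printf rather than written as \t, on the assumption that not every grep interprets \t in a pattern.

```shell
# Toy tab-delimited file; the last line has "Olfr" in the first column only.
tmpdir=$(mktemp -d)
printf 'gene1\tOlfr1411\ngene2\tOlfr1413\ngene3\tXkr4\nOlfr_like\tGm1\n' > "$tmpdir/toy_genes.txt"

# A literal tab character, built portably with printf.
tab=$(printf '\t')

# Count lines where "Olfr" starts the second column (directly after a tab).
count_c=$(grep -c "${tab}Olfr" "$tmpdir/toy_genes.txt")

# Same count via wc -l; tr strips the padding some wc versions add.
count_wc=$(grep "${tab}Olfr" "$tmpdir/toy_genes.txt" | wc -l | tr -d ' ')

# Without the tab anchor, the first-column "Olfr_like" line is counted too.
count_any=$(grep -c "Olfr" "$tmpdir/toy_genes.txt")

echo "grep -c: $count_c, wc -l: $count_wc, unanchored: $count_any"

rm -rf "$tmpdir"
```

Keep in mind that grep -c counts matching lines, not individual matches, so a line containing the pattern twice still counts once.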
Printing only the matching part of each line instead of the whole line: grep -o
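As a small illustration of -o, the sketch below extracts just the gene names from a toy tab-delimited file. The file contents and the pattern Olfr[0-9][0-9]* are assumptions made for this example; -o is not part of POSIX grep, but both GNU and BSD grep provide it.

```shell
# Toy tab-delimited file (placeholder IDs, invented for this example).
tmpdir=$(mktemp -d)
printf 'gene1\tOlfr1411\ngene2\tXkr4\ngene3\tOlfr1413\n' > "$tmpdir/toy_genes.txt"

# Default behaviour: print the whole matching line.
whole_lines=$(grep "Olfr[0-9][0-9]*" "$tmpdir/toy_genes.txt")

# -o: print only the text that matched, one match per output line.
only_matches=$(grep -o "Olfr[0-9][0-9]*" "$tmpdir/toy_genes.txt")

echo "whole lines:"
echo "$whole_lines"
echo "matches only:"
echo "$only_matches"

rm -rf "$tmpdir"
```

This makes -o handy for pulling a single field or identifier out of each line before passing it to tools such as sort or uniq.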