Table
Regular Expression | Usage |
---|---|
^ | Matches the beginning of a line |
$ | Matches the end of the line |
. | Matches any character except a newline |
\s | Matches whitespace |
\S | Matches any non-whitespace character |
* | Repeats a character zero or more times |
*? | Repeats a character zero or more times (non-greedy) |
+ | Repeats a character one or more times |
+? | Repeats a character one or more times (non-greedy) |
[aeiou] | Matches a single character in the listed set |
[^XYZ] | Matches a single character not in the listed set |
[a-z0-9] | The set of characters can include a range |
( | Indicates where string extraction is to start |
) | Indicates where string extraction is to end |
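
To make a few rows of the table concrete, here is a minimal sketch using Python's `re` module. The sample line is invented for illustration (it is not taken from `mbox-short.txt`), and the variable names are only placeholders.

```python
import re

# Hypothetical sample line, invented for illustration only.
line = 'From: alice@example.com to: bob@example.org'

# Greedy: .+ grabs as much as it can, so the match runs to the last colon.
greedy = re.findall('^F.+:', line)

# Non-greedy: .+? stops at the first colon it reaches.
lazy = re.findall('^F.+?:', line)

# \S+ matches a run of non-whitespace characters on each side of the @.
addresses = re.findall(r'\S+@\S+', line)

# Parentheses limit what is extracted: the @ must match but is not returned.
hosts = re.findall(r'@([a-z0-9.]+)', line)

print(greedy)
print(lazy)
print(addresses)
print(hosts)
```

The greedy and non-greedy patterns differ only by the `?`, yet the first returns almost the whole line and the second returns just `From:`.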
Module
- You must import the library with `import re` before using it.
- Use `re.search()` to see if a string matches a regular expression, similar to using the `find()` method for strings:

      # Using the string method find()
      hand = open('mbox-short.txt')
      for line in hand:
          if line.find('From:') >= 0:
              print(line)

      # The same search using re.search()
      import re
      hand = open('mbox-short.txt')
      for line in hand:
          line = line.rstrip()
          if re.search('From:', line):
              print(line)

- Use `re.findall()` to extract portions of a string that match your regular expression, similar to a combination of `find()` and slicing such as `var[5:10]`.
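
The comparison between `re.findall()` and `find()` plus slicing can be sketched as follows. The sample line and the variable names are assumptions made for illustration, not data from the course file.

```python
import re

# Hypothetical sample line, invented for illustration only.
line = 'From stephen@example.com Sat Jan  5 09:14:16 2008'

# find() plus slicing: locate the @, then the next space, and slice between.
at_pos = line.find('@')
space_pos = line.find(' ', at_pos)
host_by_slicing = line[at_pos + 1:space_pos]

# re.findall() with parentheses: one call returns a list of the captured parts.
host_by_regex = re.findall(r'@(\S+)', line)

print(host_by_slicing)
print(host_by_regex)
```

Note that `re.findall()` returns a list, so it also handles lines containing zero or several matches without extra bookkeeping.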
Examples
- Using `re.search()` like `startswith()`: we fine-tune what is matched by adding special characters to the string. Here `^` anchors the match to the beginning of the line:

      # Using the string method startswith()
      hand = open('mbox-short.txt')
      for line in hand:
          line = line.rstrip()
          if line.startswith('From:'):
              print(line)

      # The same test using re.search() with the ^ anchor
      import re
      hand = open('mbox-short.txt')
      for line in hand:
          line = line.rstrip()
          if re.search('^From:', line):
              print(line)

- Depending on how “clean” your data is and the purpose of your application, you may want to narrow your match down a bit.
- To match a literal `$`, escape it with a backslash: `\$`.
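
A minimal sketch of both points: narrowing a match and matching a literal dollar sign. The sample strings and variable names are invented for illustration.

```python
import re

# Hypothetical sample lines, invented for illustration only.
lines = ['From: alice@example.com', 'From: (no sender)']

# Broad pattern: any line that starts with "From:".
broad = [l for l in lines if re.search('^From:', l)]

# Narrower pattern: the line must also contain something shaped like an address.
narrow = [l for l in lines if re.search(r'^From: \S+@\S+', l)]

# Hypothetical sample text containing a dollar amount.
text = 'We just received $10.00 for cookies.'

# \$ matches a literal dollar sign; [0-9.]+ matches the digits and periods after it.
amounts = re.findall(r'\$[0-9.]+', text)

# An unescaped $ is an anchor: this matches only if the text ends with "cookies."
ends_with = re.search(r'cookies\.$', text) is not None

print(broad)
print(narrow)
print(amounts)
print(ends_with)
```

The raw-string prefix `r'...'` keeps Python from interpreting the backslash itself, so the regular expression engine receives `\$` unchanged.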
Matching and extracting data
Practical applications
- String parsing examples