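Regular Expressions
Let's look at the syntax of regular expressions.
First, import the re module:
import re
Now let's look at some of its methods.
Let's write some code to try out re.search: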
import re
text = 'abc123'  # named text rather than str, to avoid shadowing the built-in str
# Raw strings (r'...') pass backslashes such as \w and \d to the regex engine unchanged
search = re.search(r'abc', text)
print(search)
search = re.search(r'\w', text)
print(search)
search = re.search(r'\w\w', text)
print(search)
search = re.search(r'\w\w\w', text)
print(search)
search = re.search(r'\d', text)
print(search)
search = re.search(r'\d\d', text)
print(search)
search = re.search(r'\d\d\d', text)
print(search)
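Each printed line is a match object for the first match found:
Line 1 matches the literal string 'abc'.
Line 2 finds one word character (\w matches a letter, a digit, or an underscore): 'a'.
Line 3 finds two consecutive word characters: 'ab'.
Line 4 finds three consecutive word characters: 'abc'.
Line 5 finds one digit (\d): '1'.
Line 6 finds two consecutive digits: '12'.
Line 7 finds three consecutive digits: '123'.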
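Printing a match object only shows its summary. In practice you usually want the matched text or its position. The following is a minimal sketch on the same 'abc123' string; the variable names `m` and `missing` are only for illustration.

```python
import re

text = 'abc123'

# re.search returns a Match object for the first match, or None if nothing matches.
m = re.search(r'\d\d', text)
if m is not None:
    print(m.group())  # the matched text
    print(m.span())   # (start, end) indexes of the match inside text

# A pattern that does not occur anywhere yields None, so check before using the result.
missing = re.search(r'xyz', text)
print(missing)
```

Checking for None first avoids an AttributeError when calling group() on a failed search.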
Next, let's look at the re.findall method, which returns all non-overlapping matches as a list:
import re
text = 'abc123'  # named text rather than str, to avoid shadowing the built-in str
# Same patterns as before, now collecting every non-overlapping match
findall = re.findall(r'abc', text)
print(findall)
findall = re.findall(r'\w', text)
print(findall)
findall = re.findall(r'\w\w', text)
print(findall)
findall = re.findall(r'\w\w\w', text)
print(findall)
findall = re.findall(r'\d', text)
print(findall)
findall = re.findall(r'\d\d', text)
print(findall)
findall = re.findall(r'\d\d\d', text)
print(findall)
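The result is:
['abc']
['a', 'b', 'c', '1', '2', '3']
['ab', 'c1', '23']
['abc', '123']
['1', '2', '3']
['12']
['123']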
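The difference between the two methods can be checked side by side. This is a small sketch on the same string, showing that re.search stops at the first match while re.findall scans left to right and never reuses a character; the variable names are illustrative.

```python
import re

text = 'abc123'

# search stops at the first match; findall collects all non-overlapping matches.
first = re.search(r'\w\w', text).group()
pairs = re.findall(r'\w\w', text)
print(first)   # ab
print(pairs)   # ['ab', 'c1', '23']

# Matches never overlap: after '12' is consumed, the lone '3' cannot form another pair,
# so '23' is not reported.
digits = re.findall(r'\d\d', text)
print(digits)  # ['12']
```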
Here are some more examples.
import re
text = 'He was carefully disguised but captured quickly by police.'
print(re.findall('ly', text))
The result is:
['ly', 'ly']
So how do we match the complete words
carefully and quickly?
Suppose we change the pattern to this:
import re
text = 'He was carefully disguised but captured quickly by police.'
print(re.findall(r'\w\w\w\w\wly', text))
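The result is:
['refully', 'quickly']
The first word is still wrong: the pattern requires exactly five word characters before ly, so carefully is cut down to refully.
Let's change it again, using \w+ to match one or more word characters: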
import re
text = 'He was carefully disguised but captured quickly by police.'
print(re.findall(r'\w+ly', text))
The result is:
['carefully', 'quickly']
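One caveat: \w+ly also matches when "ly" appears in the middle of a longer word. A possible refinement, not part of the original example, is to anchor the pattern with the word boundary \b. The sentence containing "flying" below is made up purely to show the difference.

```python
import re

# Hypothetical sentence: 'flying' contains 'ly' but does not end with it.
text = 'He was carefully disguised but captured quickly by the flying squad.'

# Without boundaries, the pattern picks up the 'fly' inside 'flying'.
loose = re.findall(r'\w+ly', text)
print(loose)

# \b requires the match to start and end at a word boundary,
# so only whole words ending in 'ly' are returned.
strict = re.findall(r'\b\w+ly\b', text)
print(strict)
```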
Example
We want to find doubled letters in a text, that is, the same letter appearing twice in a row:
import re
text = 'mississippi'
print(re.findall(r'((?P<letter>[a-z])(?P=letter))', text))
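The result is:
[('ss', 's'), ('ss', 's'), ('pp', 'p')]
The named group can be replaced with a numbered backreference: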
import re
text = 'mississippi'
print(re.findall(r'(([a-z])\2)', text))
The result is:
[('ss', 's'), ('ss', 's'), ('pp', 'p')]
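The tuples appear because re.findall returns one tuple per match whenever the pattern contains more than one group. If only the doubled pairs are wanted, one option is re.finditer, sketched below with illustrative variable names.

```python
import re

text = 'mississippi'

# With two groups, findall returns (outer group, inner group) tuples.
tuples = re.findall(r'(([a-z])\2)', text)
print(tuples)

# With a single group, findall returns only that group: the repeated letter.
letters = re.findall(r'([a-z])\1', text)
print(letters)

# finditer yields Match objects; group(0) is the whole match, i.e. the doubled pair.
pairs = [m.group(0) for m in re.finditer(r'([a-z])\1', text)]
print(pairs)
```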
Another example
The text is:
He is taller than a building and stronger than a rock.
We want to find every occurrence of the form "ADJer than a X" in this text.
The code:
import re
text = 'He is taller than a building and stronger than a rock.'
rx = r'(\w+er)\sthan\sa\s(\w+)'  # group 1: word ending in "er", group 2: the noun after "than a"
print(re.findall(rx, text))
The result is:
[('taller', 'building'), ('stronger', 'rock')]
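When the two captured parts are used later, named groups can make the code easier to read. The sketch below is one way to write it; the group names `adj` and `noun` are hypothetical labels chosen here, not something the original pattern defines.

```python
import re

text = 'He is taller than a building and stronger than a rock.'

# Hypothetical group names 'adj' and 'noun' label the two captured parts.
rx = re.compile(r'(?P<adj>\w+er)\sthan\sa\s(?P<noun>\w+)')

# finditer yields one Match object per occurrence; groups are read by name.
comparisons = [(m.group('adj'), m.group('noun')) for m in rx.finditer(text)]
for adj, noun in comparisons:
    print(f'{adj} -> {noun}')
```

Note that \w+er matches any word ending in "er", so the pattern does not verify that the word is really a comparative adjective.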