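Your original code works as is. That said, you should use an HTML parser instead.
import re
p = re.compile(r'<span class="">(.*?)</span>', re.IGNORECASE)
z = '<span class="">foo</span>'
text = re.findall(p, z)
print(text)
Output:
['foo']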
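Edit
As Tim pointed out, re.DOTALL should be used; otherwise the following fails, because "." does not match a newline by default:
import re
p = re.compile(r'<span class="">(.*?)</span>', re.IGNORECASE|re.DOTALL)
z = '''<span class=""> a more
complicated foo</span>'''
text = re.findall(p, z)
print(text)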
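To make the effect of re.DOTALL concrete, here is a minimal sketch that runs the same pattern with and without the flag. The sample string and the variable names are assumptions chosen for illustration, not taken from the original question.

```python
import re

# Sample input assumed for illustration: a span whose text contains a newline.
html = '<span class=""> a more\ncomplicated foo</span>'
pattern = r'<span class="">(.*?)</span>'

# Without DOTALL, "." stops at the newline, so the pattern cannot reach </span>.
without_dotall = re.findall(pattern, html, re.IGNORECASE)

# With DOTALL, "." also matches "\n", so the whole span body is captured.
with_dotall = re.findall(pattern, html, re.IGNORECASE | re.DOTALL)

print(without_dotall)  # []
print(with_dotall)     # [' a more\ncomplicated foo']
```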
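Even then, nested spans fail, because the non-greedy match stops at the first closing tag:
import re
p = re.compile(r'<span class="">(.*?)</span>', re.IGNORECASE|re.DOTALL)
z = '''<span class=""> a more
complicated<span class="other">other</span>foo</span>'''
text = re.findall(p, z)
print(text)
Output (fail):
[' a more\ncomplicated<span class="other">other']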
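So use an HTML parser such as BeautifulSoup:
# BeautifulSoup 3 import; with the newer bs4 package this is "from bs4 import BeautifulSoup"
from BeautifulSoup import BeautifulSoup
z = '''<span class=""> a more
complicated<span class="other">other</span>foo</span>'''
soup = BeautifulSoup(z)
print(soup.findAll('span', {'class': ''}))
print(soup.findAll('span', {'class': 'other'}))
Output:
[<span class=""> a more
complicated<span class="other">other</span>foo</span>]
[<span class="other">other</span>]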
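If installing BeautifulSoup is not an option, the same idea can be sketched with the standard library's html.parser module. This is not the approach used above, just an alternative under stated assumptions: the class name SpanTextCollector and its attributes are hypothetical, and the sketch only tracks span nesting depth rather than matching on the class attribute.

```python
from html.parser import HTMLParser


class SpanTextCollector(HTMLParser):
    """Collect the text of each outermost <span>, including nested span text."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # number of <span> tags currently open
        self.parts = []     # text pieces of the span being read
        self.results = []   # finished text, one entry per outermost span

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # The outermost span just closed: store its accumulated text.
                self.results.append("".join(self.parts))
                self.parts = []

    def handle_data(self, data):
        # Keep text only while inside at least one span.
        if self.depth > 0:
            self.parts.append(data)


html = '<span class=""> a more\ncomplicated<span class="other">other</span>foo</span>'
collector = SpanTextCollector()
collector.feed(html)
collector.close()
print(collector.results)  # [' a more\ncomplicatedotherfoo']
```

Because the parser follows opening and closing tags instead of matching characters, the nested span no longer cuts the result short the way the non-greedy regex does.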