I'm compiling prices from a website using regex.
PriceFinder = re.compile('(?<=\n\s\\$)(\d*\.\d{2})(?=\<\/)|(?<=\"FF0000">\$)(\d*\.\d{2})(?=\<\/)')
Price = re.findall(PriceFinder, str(soup))
print Price
I'm getting the following result:
[('', '30.99'), ('', '30.99'), ('', '30.99'), ('34.99', ''), ('34.99', '')]
I would like to know what I have to change in my regex in order to obtain a flat list without any empty elements, like this:
['30.99','30.99','30.99','34.99','34.99']
Thanks
Solution
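OK, I've written my first Python script to answer this question. re.findall returns tuples when the pattern has more than one capture group, so the fix is to use a single capture group and put the two alternative prefixes in a non-capturing group: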
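#!/usr/bin/python
import re

# One capture group; the two alternative prefixes sit in a non-capturing group.
r = re.compile(r'(?:\n\s\$|"FF0000">\$)(\d*\.\d{2})(?=</)')
# Sample input; the surrounding markup is assumed for illustration.
p = re.findall(r, '<font color="FF0000">$30.99</font>\n $31.99</td>')
print(p)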
prints out ['30.99', '31.99']
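
For comparison, here is a minimal sketch (Python 3 syntax) that puts the two behaviours side by side: the original two-group pattern, a single-group variant that keeps the lookbehinds, and a fallback that flattens the tuples if the pattern cannot be changed. The sample string `html` and the variable names are made up for illustration; the real page markup may well differ.

```python
import re

# Hypothetical sample markup, assumed for illustration only.
html = '<font color="FF0000">$34.99</font>\n $30.99</td>'

# Two capture groups: findall returns one tuple per match,
# with '' for the group that did not take part in the match.
two_groups = re.compile(
    r'(?<=\n\s\$)(\d*\.\d{2})(?=</)|(?<="FF0000">\$)(\d*\.\d{2})(?=</)'
)
pairs = two_groups.findall(html)

# One capture group: findall returns a flat list of strings.
# The two lookbehinds are alternatives inside a non-capturing group.
one_group = re.compile(r'(?:(?<=\n\s\$)|(?<="FF0000">\$))(\d*\.\d{2})(?=</)')
flat = one_group.findall(html)

# If the pattern cannot be changed, keep the non-empty item of each tuple.
flattened = [a or b for a, b in pairs]

print(pairs)      # [('', '34.99'), ('30.99', '')]
print(flat)       # ['34.99', '30.99']
print(flattened)  # ['34.99', '30.99']
```

The list comprehension relies on exactly one group being non-empty per match, which holds here because the two alternatives are mutually exclusive.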