I wrote some code to parse HTML, but the result is not what I wanted:
import urllib2
from BeautifulSoup import BeautifulSoup

html = urllib2.urlopen('http://dummy').read()
soup = BeautifulSoup(html)
for definition in soup.findAll('span', {"class": 'd'}):
    definition = definition.renderContents()
    print "", definition
    for exampleofuse in soup.find('span', {"class": 'x'}):
        print "", exampleofuse, ""
    print ""
Is there a way to get the string whenever a span's class attribute is "d" or "x"?
The following HTML is what I want to parse:
calculated by adding several amounts together
an average rate
at an average speed of 100 km/h
typical or normal
average intelligence
20 pounds for dinner is average
Then, this is the result I want:
calculated by adding several amounts together
an average rate
at an average speed of 100 km/h
typical or normal
average intelligence
20 pounds for dinner is average
Solution
Yes. You can get all of the spans in the HTML, check each one for a class of "d" or "x", and print the ones that match.
Something like this might work (untested):
for span in soup.findAll('span'):
    # The span's own class attribute; None when the attribute is missing
    css_class = span.get('class')
    # span.string is None when the span holds more than a single text node
    if css_class in ('d', 'x') and span.string:
        print span.string
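If you are on Python 3 and would rather not depend on BeautifulSoup, the same idea (walk the spans, keep the ones whose class is "d" or "x") can be sketched with the standard library's html.parser. This is a different technique from the BeautifulSoup loop above, not a drop-in replacement. The names SpanTextCollector, extract_spans and SAMPLE_HTML are made up for illustration, and the sample markup is only an assumption about what the page looks like, since the markup in the question did not survive.

```python
from html.parser import HTMLParser


class SpanTextCollector(HTMLParser):
    """Collect the text of every <span> whose class is in `wanted`."""

    def __init__(self, wanted=("d", "x")):
        super().__init__()
        self.wanted = set(wanted)
        self._open = []      # one (matched_class_or_None, text_parts) per open <span>
        self._matched = []   # matching spans, in the order they were opened

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        # A class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        match = next((c for c in classes if c in self.wanted), None)
        entry = (match, [])
        self._open.append(entry)
        if match is not None:
            self._matched.append(entry)

    def handle_data(self, data):
        # Text belongs to every open span that matched, so text inside
        # nested tags such as <b> is kept as well.
        for match, parts in self._open:
            if match is not None:
                parts.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            self._open.pop()

    def results(self):
        return [(match, "".join(parts).strip()) for match, parts in self._matched]


def extract_spans(html, wanted=("d", "x")):
    """Return (class_name, text) pairs for the matching spans in `html`."""
    parser = SpanTextCollector(wanted)
    parser.feed(html)
    parser.close()
    return parser.results()


# Hypothetical markup: an assumption, not the real page from the question.
SAMPLE_HTML = """
<span class="d">calculated by adding several amounts together</span>
<span class="x">an average rate</span>
<span class="x">at an average speed of 100 km/h</span>
<span class="d">typical or normal</span>
<span class="x">average intelligence</span>
<span class="x">20 pounds for dinner is average</span>
"""

if __name__ == "__main__":
    for class_name, text in extract_spans(SAMPLE_HTML):
        print(text)
```

Each result keeps the class name next to the text, so you can tell definitions ("d") from examples ("x") if you want to format them differently.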