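If you have a chunk of HTML to parse with BeautifulSoup, this should be fairly straightforward. The general idea is to navigate to your table with the findChildren method, then read the text inside each cell through its string property.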
>>> from BeautifulSoup import BeautifulSoup
>>>
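>>> html = """
... <html>
... <body>
... <table>
... <th><td>column 1</td><td>column 2</td></th>
... <tr><td>value 1</td><td>value 2</td></tr>
... </table>
... </body>
... </html>
... """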
>>>
>>> soup = BeautifulSoup(html)
>>> tables = soup.findChildren('table')
>>>
>>> # This will get the first (and only) table. Your page may have more.
>>> my_table = tables[0]
>>>
>>> # You can find children with multiple tags by passing a list of strings
>>> rows = my_table.findChildren(['th', 'tr'])
>>>
>>> for row in rows:
...     cells = row.findChildren('td')
...     for cell in cells:
...         value = cell.string
...         print "The value in this cell is %s" % value
...
The value in this cell is column 1
The value in this cell is column 2
The value in this cell is value 1
The value in this cell is value 2
>>>
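
The session above uses the BeautifulSoup 3 import and the Python 2 print statement. For readers on Python 3, the same traversal might look roughly like the sketch below. It assumes the third-party bs4 package (BeautifulSoup 4) is installed and uses the built-in "html.parser" backend. The helper name table_rows and the sample markup, which puts the header cells in th elements inside a tr, are illustrative assumptions rather than part of the original answer.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 (bs4) is installed

# Illustrative markup: header cells are <th> elements inside a <tr>.
html = """
<html>
<body>
<table>
<tr><th>column 1</th><th>column 2</th></tr>
<tr><td>value 1</td><td>value 2</td></tr>
</table>
</body>
</html>
"""


def table_rows(markup):
    """Return the first table in `markup` as a list of rows of cell text.

    Hypothetical helper for illustration; returns [] when no table is found.
    """
    soup = BeautifulSoup(markup, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        # Passing a list matches any of the given tag names.
        cells = tr.find_all(["th", "td"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


for row in table_rows(html):
    print(row)
```

Here find_all is the BeautifulSoup 4 spelling of findChildren, and get_text(strip=True) is used instead of the string property because string is None whenever a cell contains more than one child node.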