Since I don't have access to the actual table you want to scrape, I used the following example:
Header1   Header2   Header3
Row 11    Row 12    Row 13
Row 21    Row 22    Row 23
Row 31    Row 32    Row 33

and scraped it with:

from bs4 import BeautifulSoup as BS
content = "..."  # placeholder: the HTML of that table
soup = BS(content, 'html5lib')
# One list of <td> cells per <tr> row
rows = [tr.findAll('td') for tr in soup.findAll('tr')]
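
This rows object is a list of lists:

[[<td>Header1</td>, <td>Header2</td>, <td>Header3</td>],
 [<td>Row 11</td>, <td>Row 12</td>, <td>Row 13</td>],
 [<td>Row 21</td>, <td>Row 22</td>, <td>Row 23</td>],
 [<td>Row 31</td>, <td>Row 32</td>, <td>Row 33</td>]]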
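
…which you can write to a file:

for it in rows:
    with open('result.csv', 'a') as f:
        # Strip the <td> tags from each cell and join the cells with commas
        f.write(", ".join(str(e).replace('<td>', '').replace('</td>', '') for e in it) + '\n')

The result looks like this:

Header1, Header2, Header3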
Row 11, Row 12, Row 13
Row 21, Row 22, Row 23
Row 31, Row 32, Row 33
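
As an alternative to joining strings by hand, here is a minimal sketch that writes the rows with Python's standard csv module, which quotes any cell that itself contains a comma. It assumes the cell text has already been extracted into plain strings, for example with something like [[td.get_text(strip=True) for td in tr.find_all('td')] for tr in soup.find_all('tr')]. The helper name rows_to_csv and the sample data are made up for illustration and are not part of the approach above.

```python
import csv
import io


def rows_to_csv(rows, out):
    """Write rows (each a list of cell strings) to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerows(rows)


# Hypothetical sample data: cell text already extracted from the <td> tags.
example_rows = [
    ["Header1", "Header2", "Header3"],
    ["Row 11", "Row 12", "Row 13"],
    ["Row 21", "Row 22", "Row 23"],
    ["Row 31", "Row 32", "Row 33"],
]

# Write to an in-memory buffer so the sketch runs without touching the disk.
buffer = io.StringIO()
rows_to_csv(example_rows, buffer)
print(buffer.getvalue())
```

To write to a real file instead, open it once outside the loop, e.g. with open('result.csv', 'w', newline='') as f, and pass f to rows_to_csv. Opening in 'w' mode avoids appending duplicate rows when the script is run again.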