python读取html的表格_解析HTML表格到Python列表?

I'd like to take an HTML table and parse through it to get a list of dictionaries. Each list element would be a dictionary corresponding to a row in the table.

If, for example, I had an HTML table with three columns (marked by header tags), "Event", "Start Date", and "End Date" and that table had 5 entries, I would like to parse through that table to get back a list of length 5 where each element is a dictionary with keys "Event", "Start Date", and "End Date".

Thanks for the help!

解决方案

You should use some HTML parsing library like lxml:

from lxml import etree

s = """

EventStart DateEnd Date
abc
def
ghi

"""

table = etree.HTML(s).find("body/table")

rows = iter(table)

headers = [col.text for col in next(rows)]

for row in rows:

values = [col.text for col in row]

print dict(zip(headers, values))

prints

{'End Date': 'c', 'Start Date': 'b', 'Event': 'a'}

{'End Date': 'f', 'Start Date': 'e', 'Event': 'd'}

{'End Date': 'i', 'Start Date': 'h', 'Event': 'g'}

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值