How to parse table tags in Python: parsing an HTML file that contains a table

Latest recommended article published on 2023-01-08 15:56:31

weixin_39722070

Tags: how to parse table tags in Python

I have an HTML file containing a table (it is a large one, so only sample code is given). I want to retrieve the values in the table. I tried the HTMLParser library from Python.

I started coding as shown below. Then I found that the attribute name "class" is the same as a reserved Python keyword, so it gives me an error.

    class MyHTMLParser(HTMLParser):
        def handle_starttag(self, tag, attrs):
            if tag == 'tr':
                for class in attrs:
                    if class == 'Table_row'

    p = MyHTMLParser()
    p.feed(ht)

Table from the HTML file (shown here as rendered):

STATION CODE	STATION NAME	SCHEDULED ARRIVAL	SCHEDULED DEPARTURE	ACTUAL/ EXPECTED ARRIVAL	ACTUAL/ EXPECTED DEPARTURE
TVC	ORIGON	Starting Station	05:00, 07 May 2011	Starting Station	05:00, 07 May 2011
TVP	NEY YORK	05:04, 07 May 2011	05:05, 07 May 2011	05:04, 07 May 2011	05:05, 07 May 2011

UPDATE

How could I get data between the tags?

Solution

Note that the documentation of the handle_starttag method states:

    The tag argument is the name of the tag converted to lower case. The attrs argument is a list of (name, value) pairs containing the attributes found inside the tag’s <> brackets.
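To make the quoted shape of attrs concrete, here is a minimal Python 3 sketch that records exactly what handle_starttag receives. The class name AttrDumper and the sample markup are made up for illustration and are not taken from the question's file.

```python
from html.parser import HTMLParser  # Python 3 location of HTMLParser


class AttrDumper(HTMLParser):
    """Hypothetical helper: records (tag, attrs) for every start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # tag arrives lower-cased; attrs is a list of (name, value) tuples
        self.seen.append((tag, attrs))


dumper = AttrDumper()
# Made-up markup; the upper-case TR shows the lower-casing of tag names.
dumper.feed('<TR class="Table_row" id="r1"><td>TVC</td></TR>')
print(dumper.seen)
# [('tr', [('class', 'Table_row'), ('id', 'r1')]), ('td', [])]
```

Because each entry is a (name, value) tuple, comparing an entry directly against a string such as 'Table_row' can never match; the pair has to be unpacked first.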

So, you're probably looking for something like:

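    from html.parser import HTMLParser  # Python 3; in Python 2 the module was named HTMLParser

    class MyHTMLParser(HTMLParser):
        def handle_starttag(self, tag, attrs):
            # attrs is a list of (name, value) pairs
            if tag == 'tr':
                for name, value in attrs:
                    if name == 'class':
                        print('Found class', value)

    p = MyHTMLParser()
    p.feed(ht)  # ht holds the HTML source as a string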

Prints:

    Found class Table_Heading
    Found class Table_row
    Found class alternat_table_row
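For the follow-up question about reading the data between the tags, one possible approach with the same standard-library parser is to override handle_data and handle_endtag as well. The sketch below is only an illustration under assumptions: the class name TableTextParser is hypothetical, and the sample markup is a guess at the table's structure, since the original HTML is not shown here.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Hypothetical helper: collects the text of each td/th cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag in ('td', 'th') and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Called with the text found between tags; keep it only inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('td', 'th') and self._cell is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Assumed sample markup, modelled on the rendered table and the class names above.
sample = """
<table>
<tr class="Table_Heading"><td>STATION CODE</td><td>STATION NAME</td></tr>
<tr class="Table_row"><td>TVC</td><td>ORIGON</td></tr>
</table>
"""

parser = TableTextParser()
parser.feed(sample)
parser.close()
print(parser.rows)
# [['STATION CODE', 'STATION NAME'], ['TVC', 'ORIGON']]
```

Text is buffered per cell because handle_data may be called more than once for a single cell, for example when the text contains character references.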

P.S. I also recommend BeautifulSoup for parsing HTML with Python.
