Python: Extracting Page Tags / Source (Simple Code)

Latest recommended article published on 2023-07-16 00:30:00

freefis

Article tags: python html import class

Article link: https://blog.csdn.net/freefis/article/details/1956638


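#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 2 script: fetch a page and print its source.

import urllib

# Download the raw bytes of the page.
sock = urllib.urlopen("http://www.bitunion.org/")
html = sock.read()
sock.close()

# Decode the bytes as GBK (the encoding this script assumes for the page).
html = unicode(html, "gbk")

print html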
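The script above only runs on Python 2, where `urllib.urlopen` and `unicode` exist. A minimal sketch of the same fetch-and-decode step for Python 3 might look like the following. The helper names `decode_page` and `fetch_page`, the `timeout` value, and the choice to replace undecodable bytes are assumptions made for illustration and are not part of the original post.

```python
from urllib.request import urlopen


def decode_page(raw: bytes, encoding: str = "gbk") -> str:
    """Decode raw page bytes; undecodable bytes are replaced, not raised."""
    return raw.decode(encoding, errors="replace")


def fetch_page(url: str, encoding: str = "gbk", timeout: float = 10.0) -> str:
    """Download `url` and return its source as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as sock:
        return decode_page(sock.read(), encoding)


# Example call (needs network access, so it is left commented out):
# print(fetch_page("http://www.bitunion.org/"))
```

Splitting decoding from fetching keeps the encoding logic usable on bytes obtained some other way, for example from a saved file.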

#!/usr/bin/env python
# Python 2 script: print the text inside the page's <title> tag.

from sgmllib import SGMLParser
import urllib

sock = urllib.urlopen("http://www.bitunion.org/")
html = sock.read()
sock.close()
html = unicode(html, "gbk")
# print html
s = html


class Parse(SGMLParser):
    def reset(self):
        # Counter is > 0 while the parser is inside <title>...</title>.
        self.found_title = 0
        SGMLParser.reset(self)

    def start_title(self, attrs):
        self.found_title += 1

    def end_title(self):
        self.found_title -= 1

    def handle_data(self, text):
        # Only print text that appears between <title> and </title>.
        if self.found_title > 0:
            print 'Title: %s' % text


p = Parse()
p.feed(s)
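The `sgmllib` module was removed in Python 3, so the parser above is Python 2 only. As a rough sketch, the same counter-based idea can be expressed with the standard library's `html.parser.HTMLParser`. The names `TitleParser`, `extract_title`, and the `sample` string are hypothetical and chosen here for illustration; unlike the original, this version collects the title text and returns it instead of printing from inside the parser.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside <title>...</title>."""

    def __init__(self):
        super().__init__()
        self.found_title = 0    # > 0 while inside a <title> element
        self.title_parts = []   # text chunks seen inside <title>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.found_title += 1

    def handle_endtag(self, tag):
        if tag == "title" and self.found_title > 0:
            self.found_title -= 1

    def handle_data(self, data):
        if self.found_title > 0:
            self.title_parts.append(data)


def extract_title(html: str) -> str:
    """Return the page title, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.title_parts).strip()


# A made-up page used only to demonstrate the parser.
sample = "<html><head><title>Example page</title></head><body><p>Hi</p></body></html>"
print("Title: %s" % extract_title(sample))  # prints "Title: Example page"
```

To parse a live page, the decoded source string from the fetch step can be passed to `extract_title` in place of `sample`.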

