Python Regex Matching on HTML: Fetching a Page with Python and Regex-Matching Its HTML Content – 摄影与挖洞

Latest recommended article published 2022-09-24 11:04:33

记录生活的蛋黄派



Tags: Python regex matching HTML

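# coding=utf-8
import re
import urllib.request


def getHtml(url):
    # Download the page and decode the bytes so the regex can run on text.
    with urllib.request.urlopen(url) as page:
        html = page.read().decode('utf-8', errors='replace')
    return html


def getlink(html):
    # Ask for a pattern at run time and return every match found in the page.
    reg = input('Please input Regular Expression:')
    linkre = re.compile(reg)
    linklist = linkre.findall(html)
    return linklist


address = input('Please input url http://')
html = getHtml('http://' + address)
res = getlink(html)

# Write one match per line to Result.txt and echo each match to the console.
with open('Result.txt', 'w', encoding='utf-8') as newfile:
    for i in res:
        # findall returns tuples when the pattern has several groups.
        line = i if isinstance(i, str) else '\t'.join(i)
        newfile.write(line + '\n')
        print(line)

print('Find', len(res), '\nOutput file : Result.txt')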
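The script above asks for the pattern at run time, so it never shows what a usable pattern looks like. As a minimal sketch, the snippet below applies one sample pattern to an in-memory HTML string instead of a downloaded page. The pattern `href="(.*?)"`, the helper name `extract_links`, and the sample markup are all illustrative assumptions and are not part of the original script. A pattern like this only suits simple, predictable markup.

```python
import re

# Hypothetical pattern: capture the value of every double-quoted href attribute.
LINK_PATTERN = re.compile(r'href="(.*?)"')


def extract_links(html):
    """Return the href values found in html, in document order."""
    return LINK_PATTERN.findall(html)


# Sample markup made up for this illustration.
sample = '<a href="http://example.com/a">A</a> <a href="http://example.com/b">B</a>'
links = extract_links(sample)
for link in links:
    print(link)
print('Find', len(links))
```

Typing the same pattern at the script's `Please input Regular Expression:` prompt should leave one href value per line in Result.txt.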
