Fetching and saving web page content, and extracting the content between two strings

yue492008824

Category: python

Article link: https://blog.csdn.net/yue492008824/article/details/21244291


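Fetching the web page content:

import urllib2  # Python 2 module; in Python 3 this is urllib.request
url = "http://www.w3cschool.cc/python/python-tutorial.html"
urlfile = urllib2.urlopen(url)  # open the URL
html = urlfile.read()  # read the raw page content
urlfile.close()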
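
The snippet above reads the page but does not show the saving step that the title mentions. A minimal sketch of one way to do it, assuming Python 3 (where `urllib2` became `urllib.request`); the helper name `save_page` and the output file name `page.html` are hypothetical, not from the original post:

```python
import urllib.request


def save_page(url, filename):
    """Download url and write the raw bytes to filename (hypothetical helper).

    Returns the number of bytes written.
    """
    # Read the whole response body as bytes
    with urllib.request.urlopen(url) as response:
        data = response.read()
    # Write in binary mode so no encoding has to be guessed
    with open(filename, "wb") as out:
        out.write(data)
    return len(data)


# Example call (needs network access):
# save_page("http://www.w3cschool.cc/python/python-tutorial.html", "page.html")
```

Writing the bytes unchanged avoids having to guess the page's encoding; decode only when the text is needed.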


Extracting the content between >>> and #:


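import re

def getlist(filename):
    myfile = open(filename)
    contents = myfile.read()
    # Non-greedy match of everything after ">>>" up to the next "#";
    # re.DOTALL lets "." match newlines as well
    mylist = re.findall(r"(?<=>>>).*?(?=#)", contents, re.DOTALL)
    myfile.close()
    return mylist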
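
To see how the pattern behaves, here is a small sketch that applies the same regular expression to an in-memory string instead of a file. The function name `extract_between` and the sample text are made up for illustration:

```python
import re


def extract_between(text):
    """Return every span that follows '>>>' and ends just before the next '#'."""
    # (?<=>>>) : lookbehind, the match must be preceded by ">>>"
    # .*?      : as few characters as possible (newlines included, due to DOTALL)
    # (?=#)    : lookahead, the match must be followed by "#"
    return re.findall(r"(?<=>>>).*?(?=#)", text, re.DOTALL)


sample = ">>> print('hello') # greet\n>>> x = 1\n>>> y = 2 # assign\n"
matches = extract_between(sample)
print(matches)
# [" print('hello') ", ' x = 1\n>>> y = 2 ']
```

The second line of the sample has no `#`, so with `re.DOTALL` the match runs on across the newline until the `#` on the third line. Neither delimiter is included in the results, and the surrounding spaces are kept; call `.strip()` on each item if they are not wanted.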