Latest Books Purchased by the Library

weixin_30612769

Original link: http://www.cnblogs.com/xieldy/p/6680742.html

Feel free to visit my new blog:
http://blog.xieldy.cn

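Last week I wrote a small crawler for practice. It automatically fetches the list of books most recently purchased by the Xidian library. The program is very simple, so here is the code: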

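#encoding=utf8
# Python 2 script: urllib.urlopen and the print statement do not exist in Python 3.
import urllib

def getHtml(url):
    # Download the page and return its raw contents.
    page = urllib.urlopen(url)
    html = page.read()
    return html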

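def content(html):
    # Collect every title, i.e. the text between ',t:"' and the next '"}'.
    titles=[]
    nextpart=html
    str1 = ',t:"'
    str2 = '"}'
    while True:
        # Skip ahead to just past the next start marker.
        nextpart = nextpart.partition(str1)[2]
        title, sep, rest = nextpart.partition(str2)
        if sep != str2:
            # No closing marker left: stop instead of appending an empty entry.
            break
        titles.append(title)
    return titles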
    
def main():
    # Fetch the library's new-book page and print every extracted title.
    html=getHtml("http://al.lib.xidian.edu.cn/cgi-bin/newbook.cgi?base=ALL&cls=ALL&date=180")
    a = content(html)
    print "Latest books purchased by the library:"
    for i in a:
        print i

main()
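
For readers on Python 3, the same idea could be sketched roughly as below. This is not the original script, only a minimal adaptation under a few assumptions: `urllib.request.urlopen` stands in for `urllib.urlopen`, a regular expression stands in for the `partition` loop, and the `utf-8` default encoding, the function names `get_html` and `extract_titles`, and the sample string are all hypothetical (the post does not show the page's real encoding or markup).

```python
import re
from urllib.request import urlopen

# Same markers the script above searches for: a title starts after ',t:"'
# and ends right before the next '"}'.
TITLE_PATTERN = re.compile(r',t:"(.*?)"\}', re.DOTALL)


def get_html(url, encoding="utf-8"):
    """Download a page and decode it. The encoding is an assumption."""
    with urlopen(url) as page:
        return page.read().decode(encoding, errors="replace")


def extract_titles(html):
    """Return every substring found between ',t:"' and the next '"}'."""
    return TITLE_PATTERN.findall(html)


if __name__ == "__main__":
    # Hypothetical sample that only mimics the ',t:"..."}' shape.
    sample = '[{id:1,t:"Book A"},{id:2,t:"Book B"}]'
    for title in extract_titles(sample):
        print(title)
```

To run it against the real page, pass the URL from `main()` to `get_html` and feed the result to `extract_titles`. If the titles come out garbled, the encoding argument is the first thing to change.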

Reposted from: https://www.cnblogs.com/xieldy/p/6680742.html