Scraping all the advanced tutorials from the Python Chinese developer community (pythontab.com) with Python and BeautifulSoup

Latest recommended article published on 2024-05-16 06:52:57

抉择命运


Tags: python

Article link: https://blog.csdn.net/u014245412/article/details/53410039

Without further ado, here is the code:

#coding=utf-8
from bs4 import BeautifulSoup
import urllib2

url = 'http://www.pythontab.com/html/pythonhexinbiancheng/index.html'
url_list = [url]
for i in range(2,19):
    url_list.append('http://www.pythontab.com/html/pythonhexinbiancheng/%s.html'%i)
source_list = []
for j in url_list:
    # Download one listing page and parse it
    request = urllib2.urlopen(j)
    html = request.read()
    suop = BeautifulSoup(html,'lxml')
    # Each entry is an <a> directly under an <li> inside #catlist;
    # the same element carries both the title text and the link
    anchors = suop.select('#catlist > li > a')
    for a in anchors:
        data = {
            "title" : a.get_text(),
            "link" : a.get('href')
        }
        source_list.append(data)
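# Visit the collected articles once, after all listing pages have been read
# (nested inside the loop above, earlier articles would be downloaded again for every page)
for l in source_list:
    request = urllib2.urlopen(l['link'])
    html = request.read()
    suop = BeautifulSoup(html,'lxml')
    # the rest of the listing is cut off in the source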
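
The listing above is Python 2 (`urllib2`) and passes each `href` straight to `urlopen`, which only works if the site emits absolute links. As a rough sketch of the same listing-page crawl in Python 3, the version below resolves every link against the page it came from. The function names `build_page_urls`, `extract_entries` and `crawl` are made up for illustration, `html.parser` is swapped in for `lxml` to avoid the extra dependency, and whether the site's links are actually relative is an assumption, not something the original post states.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Directory that holds index.html, 2.html, ..., 18.html (taken from the listing above)
BASE = 'http://www.pythontab.com/html/pythonhexinbiancheng/'


def build_page_urls(base=BASE, last_page=18):
    """Return index.html followed by 2.html .. last_page.html."""
    urls = [urljoin(base, 'index.html')]
    urls += [urljoin(base, '%d.html' % i) for i in range(2, last_page + 1)]
    return urls


def extract_entries(html, page_url):
    """Collect {'title', 'link'} dicts from one listing page."""
    # Imported here so the URL helper above works without bs4 installed
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, 'html.parser')
    entries = []
    for a in soup.select('#catlist > li > a'):
        href = a.get('href')
        if not href:
            continue  # skip anchors without a target
        entries.append({
            'title': a.get_text(strip=True),
            # urljoin leaves absolute links alone and resolves relative ones
            'link': urljoin(page_url, href),
        })
    return entries


def crawl():
    """Fetch every listing page and return all entries (needs network access)."""
    entries = []
    for page_url in build_page_urls():
        with urlopen(page_url) as resp:
            entries.extend(extract_entries(resp.read(), page_url))
    return entries
```

Calling `crawl()` returns one dict per article. Fetching each article body would then be a second loop over that list, as in the original code.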
