Scraping hyperlinks from a web page with Python

Parse the page with BeautifulSoup from the bs4 package

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen('https://blog.csdn.net/zzc15806/')  # fetch the page
bs = BeautifulSoup(html, 'html.parser')  # parse the HTML
hyperlink = bs.find_all('a')  # collect every <a> tag
for h in hyperlink:
    hh = h.get('href')
    print(hh)

The output is as follows:

https://blog.csdn.net/zzc15806
javascript:void(0);
https://blog.csdn.net/zzc15806?orderby=UpdateTime
https://blog.csdn.net/zzc15806?orderby=ViewCount
https://blog.csdn.net/zzc15806/rss/list
https://blog.csdn.net/yoyo_liyy/article/details/82762601
https://blog.csdn.net/yoyo_liyy/article/details/82762601
https://blog.csdn.net/zzc15806/article/details/84996039
https://blog.csdn.net/zzc15806/article/details/84996039
https://blog.csdn.net/zzc15806/article/details/84975709
https://blog.csdn.net/zzc15806/article/details/84975709
https://blog.csdn.net/zzc15806/article/details/84975539
https://blog.csdn.net/zzc15806/article/details/84975539
https://blog.csdn.net/zzc15806/article/details/84975137
https://blog.csdn.net/zzc15806/article/details/84975137
https://blog.csdn.net/zzc15806/article/details/84974458
https://blog.csdn.net/zzc15806/article/details/84974458
https://blog.csdn.net/zzc15806/article/details/84973370
https://blog.csdn.net/zzc15806/article/details/84973370
https://blog.csdn.net/zzc15806/article/details/84972108
https://blog.csdn.net/zzc15806/article/details/84972108
https://blog.csdn.net/zzc15806/article/details/84971215
https://blog.csdn.net/zzc15806/article/details/84971215
https://blog.csdn.net/zzc15806/article/details/84875070
https://blog.csdn.net/zzc15806/article/details/84875070
https://blog.csdn.net/zzc15806/article/details/84779131
https://blog.csdn.net/zzc15
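The output above contains a javascript:void(0); entry and lists most article URLs twice. As a minimal sketch of one way to tidy such a list, the helper below (clean_links is a hypothetical name, not part of bs4 or of the original post) takes the raw href values, resolves relative paths against a base URL, keeps only http/https links, and drops duplicates while preserving order. The sample list is illustrative and hand-picked from the output shown above, plus an assumed relative path and a None entry to stand in for an <a> tag without an href.

```python
from urllib.parse import urljoin, urlparse


def clean_links(hrefs, base_url):
    """Return absolute http(s) URLs from hrefs, de-duplicated in first-seen order.

    Hypothetical helper for illustration; not part of the original post.
    """
    seen = set()
    result = []
    for href in hrefs:
        if not href:  # h.get('href') returns None when the tag has no href
            continue
        absolute = urljoin(base_url, href.strip())  # resolve relative paths
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skip javascript:, mailto:, and similar pseudo-links
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# Illustrative sample; the relative path and None entry are assumptions.
sample = [
    "https://blog.csdn.net/zzc15806",
    "javascript:void(0);",
    None,
    "https://blog.csdn.net/zzc15806/article/details/84996039",
    "https://blog.csdn.net/zzc15806/article/details/84996039",
    "/zzc15806/rss/list",
]
links = clean_links(sample, "https://blog.csdn.net/zzc15806/")
for link in links:
    print(link)
```

With the scraping code above, the same helper could be fed the real values via clean_links([h.get('href') for h in hyperlink], 'https://blog.csdn.net/zzc15806/').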