Web Scraping Study (3): BeautifulSoup

Latest recommended article published on 2022-03-21 07:28:35

pySVN8A

Category: Python. Tags: BeautifulSoup

from bs4 import BeautifulSoup
from urllib.request import urlopen

# If the page contains Chinese text, decode the response bytes with decode()
html = urlopen("https://morvanzhou.github.io/static/scraping/basic-structure.html").read().decode('utf-8')
print(html)

# Parse the HTML with the lxml parser (requires the lxml package to be installed)
soup = BeautifulSoup(html, features='lxml')
print(soup.h1)

# '\n' prints a line break before the first <p> tag
print('\n', soup.p)

all_href = soup.find_all('a')
all_href = [l['href'] for l in all_href]

# Only the hyperlinks under <body> are collected
print('\n', all_href)
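
The same steps can be tried without a network connection. The following is a minimal sketch, not the original author's code: SAMPLE_HTML is a made-up stand-in for the downloaded page, extract_links is a hypothetical helper name, and the built-in 'html.parser' is swapped in for 'lxml' so that no extra package is needed. Passing href=True to find_all skips <a> tags that have no href attribute, which would otherwise raise a KeyError in the list comprehension.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
SAMPLE_HTML = """
<html>
<head><title>Sample page</title></head>
<body>
  <h1>Scraping demo</h1>
  <p>First paragraph with a <a href="https://example.com/a">link</a>.</p>
  <p>Second paragraph with an <a name="top">anchor without href</a>
     and <a href="/relative/b">another link</a>.</p>
</body>
</html>
"""


def extract_links(html):
    """Return the href value of every <a> tag that actually has one."""
    # Assumption: the standard-library parser is used here instead of lxml.
    soup = BeautifulSoup(html, features="html.parser")
    # href=True keeps only anchors that carry an href attribute.
    return [a["href"] for a in soup.find_all("a", href=True)]


soup = BeautifulSoup(SAMPLE_HTML, features="html.parser")
heading = soup.h1.get_text()          # text of the first <h1>
first_paragraph = soup.p.get_text()   # text of the first <p> only
links = extract_links(SAMPLE_HTML)

print(heading)
print(first_paragraph)
print(links)
```

Note that soup.h1 and soup.p return only the first matching tag, while find_all returns every match in document order.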
