Python Web Scraping Practice 1

Latest recommended article published 2024-07-16 19:26:54

kleine23fpts_zz

Categories: python, practice; Tags: python

Article link: https://blog.csdn.net/kleine23fpts_zz/article/details/80139545

Based on the book "Web Scraping with Python" (《python网络数据采集》) by Ryan Mitchell.

Chapter 2, Section 2.2.1

Use bs4 to read every piece of text on a web page that is displayed in green.

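from urllib.request import urlopen
from bs4 import BeautifulSoup

# Download the page and parse it with Python's built-in HTML parser.
html = urlopen("http://www.pythonscraping.com/pages/warandpeace.html")
bsObj = BeautifulSoup(html, "html.parser")

# Collect every <span> tag whose class attribute is "green".
nameList = bsObj.findAll("span", {"class": "green"})

# Print the text of each match with the tags stripped.
for name in nameList:
    print(name.get_text())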

urlopen opens a remote object over the network and returns a response object that can be read.
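Because urlopen talks to the network, the request can fail. The sketch below shows one possible way to wrap the call; the helper name fetch_page and the 10-second timeout are assumptions made for illustration and are not part of the original exercise.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_page(url, timeout=10):
    """Return the raw bytes behind `url`, or None if the request fails.

    fetch_page is a hypothetical helper; the timeout value is an assumption.
    """
    try:
        # urlopen returns a file-like response object; read() yields bytes.
        with urlopen(url, timeout=timeout) as response:
            return response.read()
    except HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        print("HTTP error:", err.code)
    except URLError as err:
        # The server could not be reached (bad host name, no connection, ...).
        print("URL error:", err.reason)
    return None
```

Calling fetch_page("http://www.pythonscraping.com/pages/warandpeace.html") should return the page as bytes, which can then be passed to BeautifulSoup in the same way as the html object in the exercise.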

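The findAll function takes six parameters in total; only the first two are used here. The first is the tag name, and the second is a dictionary of attributes and their expected values for that tag, here the class attribute with the value "green". In current versions of bs4 the same function is named find_all, and findAll is kept as an older alias.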
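To illustrate the remaining parameters without touching the network, here is a small sketch that runs find_all against an inline HTML string. SAMPLE_HTML is made-up markup that only imitates the structure of the War and Peace page, and the variable names are chosen for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
SAMPLE_HTML = """
<div>
  <span class="green">Anna Pavlovna</span>
  <span class="red">Well, Prince</span>
  <span class="green">Prince Vasili</span>
  <p><span class="green">the Empress</span></p>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Tag name plus attribute dictionary, as in the exercise.
green_names = [tag.get_text() for tag in soup.find_all("span", {"class": "green"})]

# limit: stop after the first match.
first_only = soup.find_all("span", {"class": "green"}, limit=1)

# recursive=False: look only at direct children of the <div>.
div = soup.find("div")
direct_children = div.find_all("span", {"class": "green"}, recursive=False)

# Keyword argument: class_ is used because class is a reserved word in Python.
by_keyword = soup.find_all("span", class_="green")

# A list of values matches tags carrying any one of them.
green_or_red = soup.find_all("span", {"class": ["green", "red"]})

# string: match on the text content instead of on the tag.
by_text = soup.find_all(string="the Empress")

print(green_names)
```

The nested span inside the p tag is found by the default recursive search but skipped when recursive=False is set, which is why direct_children holds one match fewer than green_names.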

get_text() strips all tags from the HTML it is called on and returns a string containing only the text.
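A short sketch of the difference between a tag and its text, again using made-up inline markup (SAMPLE_HTML is an assumption, not content from the real page):

```python
from bs4 import BeautifulSoup

# Hypothetical markup with a tag nested inside the green span.
SAMPLE_HTML = '<p>Then <span class="green">Prince <b>Vasili</b></span> replied.</p>'

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
span = soup.find("span", {"class": "green"})

# str() keeps the markup, including the nested <b> tag.
as_markup = str(span)

# get_text() on a tag returns only its text, nested tags included.
as_text = span.get_text()

# get_text() on the whole document returns all of its text.
whole_text = soup.get_text()

print(as_markup)
print(as_text)
print(whole_text)
```

Since get_text() discards the tag structure, it is usually best called last, after find_all has already located the tags of interest.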
