Python: crawling multiple HTML pages, scraping multiple pages in Python with BeautifulSoup

Latest recommended article published on 2022-07-28 12:26:48

MY SunBorn

Tags: python, scraping multiple HTML pages

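I have successfully written code that scrapes data from the first page. Now I need to add a loop to this code so that it also scrapes the next "n" pages. The code is below.

I would be grateful if someone could guide or help me in writing code that retrieves the data from the remaining pages.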

Thanks!

from bs4 import BeautifulSoup
import requests
import csv

# Download the first page of search results (the response body is HTML text).
html = requests.get('https://wsc.nmbe.ch/search?sFamily=Salticidae&fMt=begin&sGenus=&gMt=begin&sSpecies=&sMt=begin&multiPurpose=slsid&sMulti=&mMt=contain&searchSpec=s').text
soup = BeautifulSoup(html, 'lxml')

# Each species entry sits in a <div> with this exact inline style.
elements = soup.find_all('div', style="border-bottom: 1px solid #C0C0C0; padding: 10px 0;")
#print(elements)

# newline='' stops the csv module from writing blank rows on Windows.
with open('wsc_scrape.csv', 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['sp_name', 'species_author', 'status', 'family'])

    for element in elements:
        # The species name is the text of the first <i> tag.
        sp_name = element.i.text.strip()
        print(sp_name)

        status = element.find('span', class_=['success label', 'error label']).text.strip()
        print(status)

        # The text right after <i> has the form "author | family".
        author_family = element.i.next_sibling.strip().split('|')
        species_author = author_family[0].strip()
        family = author_family[1].strip()
        print(species_author)
        print(family)
        print()

        csv_writer.writerow([sp_name, species_author, status, family])
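One possible way to extend the script above to the next "n" pages is sketched here. It rests on an assumption the question does not confirm: that the site selects a results page through a `page` query parameter, with `page=1` being the first page. Check the "next page" link in a browser and adjust `page_url` if the real parameter differs. The helper names (`page_url`, `parse_rows`, `scrape`, `write_csv`) are made up for illustration, and `html.parser` is used in place of `lxml` only to avoid the extra dependency.

```python
import csv
import time
from urllib.parse import urlencode

import requests
from bs4 import BeautifulSoup

BASE_URL = 'https://wsc.nmbe.ch/search'
# Query parameters copied from the URL in the question.
SEARCH_PARAMS = {
    'sFamily': 'Salticidae', 'fMt': 'begin', 'sGenus': '', 'gMt': 'begin',
    'sSpecies': '', 'sMt': 'begin', 'multiPurpose': 'slsid', 'sMulti': '',
    'mMt': 'contain', 'searchSpec': 's',
}
ROW_STYLE = 'border-bottom: 1px solid #C0C0C0; padding: 10px 0;'


def page_url(page):
    """Build the URL for one results page.

    ASSUMPTION: pagination uses a 'page' query parameter (not confirmed).
    """
    return BASE_URL + '?' + urlencode({**SEARCH_PARAMS, 'page': page})


def parse_rows(html):
    """Extract [sp_name, species_author, status, family] from one page."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for element in soup.find_all('div', style=ROW_STYLE):
        sp_name = element.i.text.strip()
        status = element.find(
            'span', class_=['success label', 'error label']).text.strip()
        # The text right after <i> has the form "author | family".
        author_family = element.i.next_sibling.strip().split('|')
        rows.append([sp_name, author_family[0].strip(),
                     status, author_family[1].strip()])
    return rows


def fetch_html(url):
    """Download one page and fail loudly on HTTP errors."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def scrape(n_pages, fetch=fetch_html, delay=1.0):
    """Collect rows from pages 1..n_pages, stopping at the first empty page."""
    all_rows = []
    for page in range(1, n_pages + 1):
        rows = parse_rows(fetch(page_url(page)))
        if not rows:
            break  # no entries found, so assume we ran past the last page
        all_rows.extend(rows)
        time.sleep(delay)  # be polite to the server between requests
    return all_rows


def write_csv(rows, path='wsc_scrape.csv'):
    """Write the header plus all collected rows to a CSV file."""
    with open(path, 'w', newline='', encoding='utf-8') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(['sp_name', 'species_author', 'status', 'family'])
        writer.writerows(rows)
```

With these helpers, `write_csv(scrape(5))` would fetch up to five pages and write them to a single file. Passing the download function in as `fetch` keeps the parsing and looping logic easy to try against saved HTML without touching the network.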
