How to improve crawler data-fetching efficiency in the Python Scrapy framework

jim_lucky

Last modified: 2022-01-19 16:14:23

Category: Crawlers. Tags: python, crawler

First published: 2021-09-07 16:21:34

Article link: https://blog.csdn.net/jim_lucky/article/details/120160776

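For reference only.

Adjust the concurrency-related settings in settings.py (Scrapy is asynchronous, so these control concurrent requests rather than threads):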

# Configure maximum concurrent requests performed by Scrapy (default: 16)
CONCURRENT_REQUESTS = 100

# Configure a delay for requests for the same website (default: 0)
# See https://docs.scrapy.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
DOWNLOAD_DELAY = 0
# The download delay setting will honor only one of:
CONCURRENT_REQUESTS_PER_DOMAIN = 100
CONCURRENT_REQUESTS_PER_IP = 100
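
The comments in the settings above point to AutoThrottle. With both concurrency limits at 100 and no delay, a site may start rejecting or banning requests, so one option is to let Scrapy adapt the delay to the latency it observes. A minimal sketch for settings.py follows; the setting names are Scrapy's, but the values are illustrative assumptions, not recommendations from the original post.

```python
# settings.py (fragment) -- all values below are illustrative assumptions
AUTOTHROTTLE_ENABLED = True            # AutoThrottle is off by default
AUTOTHROTTLE_START_DELAY = 1           # initial download delay, in seconds
AUTOTHROTTLE_MAX_DELAY = 10            # upper bound on the delay under high latency
AUTOTHROTTLE_TARGET_CONCURRENCY = 8.0  # average parallel requests per remote site
AUTOTHROTTLE_DEBUG = False             # set to True to log throttling stats per response
```

With AutoThrottle enabled, DOWNLOAD_DELAY acts as the minimum delay, and CONCURRENT_REQUESTS_PER_DOMAIN (or CONCURRENT_REQUESTS_PER_IP) remains the upper bound on concurrency.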

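Note that Scrapy's built-in default for DOWNLOAD_DELAY is 0, as the comment in the settings says; the value 3 appears only as a commented-out example in the generated settings.py template.

CONCURRENT_REQUESTS defaults to 16, CONCURRENT_REQUESTS_PER_DOMAIN defaults to 8, and CONCURRENT_REQUESTS_PER_IP defaults to 0 (disabled). When CONCURRENT_REQUESTS_PER_IP is non-zero, it is used instead of CONCURRENT_REQUESTS_PER_DOMAIN and the limit applies per IP rather than per domain. Adjust these values to suit your own task.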
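
To make the interaction between these three settings concrete, here is a small sketch in plain Python. It does not import Scrapy; `SCRAPY_DEFAULTS` and `effective_limits` are hypothetical names introduced only for illustration, and the function is a simplified model of how the limits combine, not Scrapy's actual downloader code.

```python
# Scrapy's documented default values for the three concurrency settings.
SCRAPY_DEFAULTS = {
    "CONCURRENT_REQUESTS": 16,
    "CONCURRENT_REQUESTS_PER_DOMAIN": 8,
    "CONCURRENT_REQUESTS_PER_IP": 0,  # 0 means "disabled"
}


def effective_limits(overrides=None):
    """Return (scope, per_slot_limit) for a given set of setting overrides.

    Simplified model: a non-zero per-IP limit replaces the per-domain
    limit, and no single slot can exceed the global CONCURRENT_REQUESTS cap.
    """
    settings = {**SCRAPY_DEFAULTS, **(overrides or {})}
    total = settings["CONCURRENT_REQUESTS"]
    per_ip = settings["CONCURRENT_REQUESTS_PER_IP"]
    per_domain = settings["CONCURRENT_REQUESTS_PER_DOMAIN"]

    if per_ip:
        scope, per_slot = "ip", per_ip
    else:
        scope, per_slot = "domain", per_domain
    return scope, min(per_slot, total)


# Raising only the global cap leaves a single site limited to 8.
print(effective_limits({"CONCURRENT_REQUESTS": 100}))  # ('domain', 8)

# The configuration from this post: all three set to 100.
print(effective_limits({
    "CONCURRENT_REQUESTS": 100,
    "CONCURRENT_REQUESTS_PER_DOMAIN": 100,
    "CONCURRENT_REQUESTS_PER_IP": 100,
}))  # ('ip', 100)
```

The first call shows why raising CONCURRENT_REQUESTS alone does little when crawling a single site: the per-domain limit still applies, so it has to be raised as well.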
