Python Crawlers: Installing Splash and a Simple Example


彭世瑜

This is an original article by the author. Reposting is welcome; please credit the source.

Article link: https://blog.csdn.net/mouday/article/details/81625326

Installing Splash

1. Install Docker (see: Installing Docker on Mac)
2. Install Splash

docker pull scrapinghub/splash  # pull the Splash image

docker run -p 8050:8050 scrapinghub/splash  # run the container, publishing port 8050

To test it, open http://localhost:8050/ in a browser.
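
The same check can be scripted. The following is only a minimal sketch, assuming Splash is published on localhost:8050 as in the `docker run` command above; `splash_is_up` is a hypothetical helper name, not part of Splash, and it uses only the standard library so it runs without extra packages.

```python
import urllib.request


def splash_is_up(base_url="http://localhost:8050/", timeout=3):
    """Return True if an HTTP 200 response comes back from base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError,
        # so a refused connection or an error status ends up here.
        return False


if __name__ == "__main__":
    print("Splash reachable:", splash_is_up())
```

If this prints `False`, the container is probably not running or the port mapping differs from the one assumed here.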

Code Example

import requests
import time
from scrapy import Selector


def timer(func):
    def inner(*args):
        start = time.time()
        response = func(*args)
        print("time: %s" % (time.time() - start))
        return response
    return inner


@timer
def use_request(url):
    return requests.get(url)


@timer
def use_splash(url):
    splash_url = "http://localhost:8050/render.html"

    args = {
        "url": url,
        "timeout": 5,
        "images": 0  # Splash's parameter name is "images"; 0 skips image downloads
    }

    return requests.get(splash_url, params=args)


if __name__ == '__main__':

    url = "http://quotes.toscrape.com/js/"

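    # Plain HTTP request: the quotes are injected by JavaScript, so none are found
    r1 = use_request(url)
    sel1 = Selector(text=r1.text)
    text = sel1.css(".quote .text::text").extract_first()
    print(text)

    # Same page rendered by Splash: the JavaScript has run, so the quote is present
    r2 = use_splash(url)
    sel2 = Selector(text=r2.text)
    text = sel2.css(".quote .text::text").extract_first()
    print(text)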

"""
time: 0.632809877396
None

time: 0.685022830963
“The world as we have created it is a process of our thinking. 
    It cannot be changed without changing our thinking.”
"""
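
The plain request returns `None` because the quotes are inserted by JavaScript, while Splash returns the rendered HTML. On pages whose scripts take longer to finish, the `wait` argument of `render.html` (seconds to wait after the page loads) may help. The sketch below only builds the request URL so the parameters are easy to inspect; `build_render_url` and its default values are assumptions made for illustration, not something from the original example.

```python
from urllib.parse import urlencode

# Assumes Splash is listening on localhost:8050, as in the docker run command
SPLASH_RENDER_HTML = "http://localhost:8050/render.html"


def build_render_url(target_url, wait=0.5, timeout=5, images=0):
    """Build a render.html URL; wait and timeout are in seconds, images=0 skips images."""
    params = {
        "url": target_url,
        "wait": wait,
        "timeout": timeout,
        "images": images,
    }
    return SPLASH_RENDER_HTML + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_render_url("http://quotes.toscrape.com/js/"))
    # prints http://localhost:8050/render.html?url=http%3A%2F%2Fquotes.toscrape.com%2Fjs%2F&wait=0.5&timeout=5&images=0
```

The resulting URL can be fetched with `requests.get(...)` just like in `use_splash`; passing the same dictionary through `params=` gives an equivalent request.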