Debugging Scrapy code

Latest recommended article published 2023-09-12 16:25:59

chongaishi2879

Tags: python shell

Original link: https://my.oschina.net/sii/blog/655856

Taken from: Scrapy shell

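# Check whether the URL is one we want to inspect

def parse(self, response):
    if ".org" in response.url:
        from scrapy.shell import inspect_response    # debugging hook
        inspect_response(response, self)

When the spider reaches this call, the crawl pauses and a shell opens on the current response:

>>> response.url
'http://example.org'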

Test the extraction code:

>>> response.xpath('//h1[@class="fn"]')
[]

Open the response in a browser:

>>> view(response)
True

Finally, press Ctrl-D (Ctrl-Z on Windows) to exit the shell and resume the crawl:

>>> ^D
2014-01-23 17:50:03-0400 [myspider] DEBUG: Crawled (200) <GET http://example.net> (referer: None)
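
The guard used above, ".org" in response.url, is a plain substring test, so it also fires for URLs where ".org" appears only in the path or query string. If that opens the shell more often than intended, the check can be narrowed to the hostname. The following is a minimal sketch using only the standard library; host_has_suffix is a hypothetical helper name, not part of Scrapy.

```python
from urllib.parse import urlsplit


def host_has_suffix(url: str, suffix: str) -> bool:
    """Return True if the URL's hostname equals `suffix` or ends with '.' + suffix.

    Hypothetical helper for illustration; it is not a Scrapy API.
    """
    # urlsplit lowercases the hostname and strips any port number.
    host = urlsplit(url).hostname or ""
    suffix = suffix.lower().lstrip(".")
    return host == suffix or host.endswith("." + suffix)
```

Inside parse, the condition would then read: if host_has_suffix(response.url, ".org"), followed by the same inspect_response(response, self) call.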

Open the URL in a browser

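from scrapy.utils.response import open_in_browser

def parse(self, response):
    # response.body is bytes under Python 3, so compare against the decoded text
    if "item name" not in response.text:
        open_in_browser(response)    # open the page exactly as Scrapy received it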
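
On a machine with no desktop browser (a remote server, for example), open_in_browser has nothing to launch. One possible workaround is to write the raw body to a file and open it elsewhere. The sketch below assumes a hypothetical helper named dump_body; it is not part of Scrapy and uses only the standard library.

```python
import tempfile
from pathlib import Path


def dump_body(body: bytes, suffix: str = ".html") -> Path:
    """Write raw response bytes to a new temporary file and return its path.

    Hypothetical helper for illustration; it is not a Scrapy API.
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so it can be copied to another machine and opened there.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as fh:
        fh.write(body)
    return Path(fh.name)
```

In a spider callback this would be called as dump_body(response.body), logging the returned path so the file can be found afterwards.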

Reposted from: https://my.oschina.net/sii/blog/655856