Scrapy raises an error while scraping data
Traceback (most recent call last):
  File "C:\software\Python\Python35\lib\site-packages\twisted\internet\defer.py", line 1386, in _inlineCallbacks
    result = g.send(result)
  File "C:\software\Python\Python35\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
  File "C:\software\Python\Python35\lib\site-packages\twisted\internet\defer.py", line 1363, in returnValue
    raise _DefGen_Return(val)
twisted.internet.defer._DefGen_Return: <200 http://ios.jobbole.com/all-posts/page/2/>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\software\Python\Python35\lib\site-packages\scrapy\utils\defer.py", line 45, in mustbe_deferred
    result = f(*args, **kw)
  File "C:\software\Python\Python35\lib\site-packages\scrapy\core\spidermw.py", line 49, in process_spider_input
    return scrape_func(response, request, spider)
  File "C:\software\Python\Python35\lib\site-packages\scrapy\core\scraper.py", line 146, in call_spider
    dfd.addCallbacks(request.callback or spider.parse, request.errback)
  File "C:\software\Python\Python35\lib\site-packages\twisted\internet\defer.py", line 303, in addCallbacks
    assert callable(callback)
AssertionError
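After some thought, and going by the assert callable(callback) line, I guessed that the error came from the callback attached to a request: the assertion fails when that callback is not callable. So I checked the spider's source code: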
def parse(self, response):
    selector = Selector(response)
    # Get the links to the articles
    article_urls = selector.xpath('//a[@class="archive-title"]/@href').extract()
    for article_url in article_urls:
        yield Request(url=article_url, callback=self.parse_content)
    # Follow the link to the next page
    next_page_url = selector.xpath('//a[contains(@class, "next")]/@href').extract()
    if next_page_url:
        yield Request(url=next_page_url[0], callback="parse")#self.parse
    else:
        print("Already on the last page...........")
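The second Request was the one that never took effect, and it passes the string "parse" as its callback instead of a function, so I guessed that this was the problem. After changing
callback="parse" to callback=self.parse, the problem was solved.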
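To see why the string fails while the bound method works, here is a minimal sketch that runs without Scrapy or Twisted. DemoSpider and add_callback are hypothetical stand-ins invented for illustration; add_callback only imitates the assert callable(callback) check shown in the traceback and is not the library's actual code.

```python
class DemoSpider:
    """Hypothetical stand-in for a spider; it does not import Scrapy."""

    def parse(self, response):
        return "parsed " + response


def add_callback(callback):
    # Same kind of check as the `assert callable(callback)` line in the traceback.
    assert callable(callback)
    return callback


spider = DemoSpider()

# A string is only a name; it cannot be called.
string_is_callable = callable("parse")

# A bound method is callable, so it passes the check.
method_is_callable = callable(spider.parse)
accepted = add_callback(spider.parse)

# Passing the string reproduces the bare AssertionError.
try:
    add_callback("parse")
    string_rejected = False
except AssertionError:
    string_rejected = True

print(string_is_callable, method_is_callable, string_rejected)
```

Because the assert statement carries no message, the traceback ends in a bare AssertionError, which is why the failing line itself is the main clue here.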