error: nothing to repeat at position 0
Weird Scrapy pitfalls: escaping the regular expression in a CrawlSpider link-extraction rule
rules = ( Rule(LinkExtractor(allow=r'?start=\d+&filter='), callback='parse_item', follow=True), )
... (rest of the code omitted)
Running the spider gives:
Rule(LinkExtractor(allow=r'?start=\d+&filter='), callback='parse_item', follow=True),
File "/usr/local/lib/python3.7/site-packages/scrapy/linkextractors/lxmlhtml.py", line 116, in __init__
canonicalize=canonicalize, deny_extensions=deny_extensions)
File "/usr/local/lib/python3.7/site-packages/scrapy/linkextractors/__init__.py", line 57, in __init__
for x in arg_to_iter(allow)]
File "/usr/local/lib/python3.7/site-packages/scrapy/linkextractors/__init__.py", line 57, in <listcomp>
for x in arg_to_iter(allow)]
...
File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/sre_parse.py", line 651, in _parse
source.tell() - here + len(this))
re.error: nothing to repeat at position 0
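The error is raised by Python's re module, not by Scrapy itself: in a regular expression, ? is a quantifier, and at position 0 there is nothing before it to repeat.
So I escaped the ? in the expression: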
rules = ( Rule(LinkExtractor(allow=r'\?start=\d+&filter='), callback='parse_item', follow=True), )
That fixed it, and the spider now reaches the parse_item callback.
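
The failure can be reproduced without Scrapy, since the traceback shows the allow pattern ending up in Python's re parser. The following is only a minimal sketch using the standard re module: the URL is a made-up example, and the variable names are illustrative rather than anything taken from the spider.

```python
import re

# The pattern from the failing rule: a bare "?" at position 0 is a
# quantifier with nothing in front of it to quantify.
bad_pattern = r'?start=\d+&filter='
# The fixed pattern: "\?" matches a literal question mark.
good_pattern = r'\?start=\d+&filter='

try:
    re.compile(bad_pattern)
    bad_error = None
except re.error as exc:
    bad_error = str(exc)

# Hypothetical URL, used only to show that the escaped pattern matches.
url = 'https://example.com/list?start=25&filter='
matched = re.search(good_pattern, url) is not None

# When the literal part of a pattern comes from a plain string,
# re.escape() can do the escaping for you.
built_pattern = re.escape('?start=') + r'\d+'
built_matched = re.search(built_pattern, url) is not None

print(bad_error)
print(matched, built_matched)
```

Writing the backslash by hand is fine for a single character like this one; re.escape() is mostly useful when the literal text is longer or comes from somewhere else.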