Solution:
In settings.py, change
ROBOTSTXT_OBEY = True
to
ROBOTSTXT_OBEY = False
Details:
With ROBOTSTXT_OBEY = True in settings.py, run the following in a terminal:
scrapy shell "https://www.baidu.com/"
[scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.baidu.com/robots.txt> (referer: None)
[scrapy.downloadermiddlewares.robotstxt] DEBUG: Forbidden by robots.txt: <GET https://www.baidu.com/>
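The output above shows that the request was blocked: Forbidden by robots.txt.
After changing settings.py to ROBOTSTXT_OBEY = False, the request is no longer blocked and is then redirected: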
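To illustrate why a robots.txt rule leads to this kind of rejection, here is a minimal sketch using Python's standard-library urllib.robotparser. It is not Scrapy's own implementation (Scrapy uses its own robots.txt middleware and parser), and the robots.txt content below is a hypothetical example, not a copy of baidu.com's file. The helper name is_allowed is also made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only):
# one named crawler gets limited access, every other user agent is blocked.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the hypothetical rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A generic crawler falls under "User-agent: *" and is refused.
    print(is_allowed("Scrapy", "https://www.baidu.com/"))
    # The named crawler is only kept out of /private/.
    print(is_allowed("Baiduspider", "https://www.baidu.com/"))
```

With ROBOTSTXT_OBEY = True, Scrapy performs a comparable check before each request and drops requests that the rules disallow. With ROBOTSTXT_OBEY = False, the check is skipped.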
[scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (meta refresh) to <GET http://m.baidu.com/?cip=220.178.74.6&baiduid=39BFD0B47012C3EBFBFF2E5DD9CA5BA9&from=844b&vit=fps?from=844b&vit=fps&index=&ssi