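I've been learning Scrapy recently and ran into a pitfall while installing it, so here is a heads-up for everyone!
Before installing Scrapy, three dependencies need to be installed: lxml, Twisted, and pywin32.
At first I downloaded the .whl files for all three from https://www.lfd.uci.edu/~gohlke/pythonlibs/ and installed each one with pip install.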
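The wheel-install step above could be scripted roughly as follows. This is only a sketch under assumptions: the wheel file names are hypothetical placeholders (the real names depend on your Python version and architecture), the files are assumed to be in the current directory, and the helper names build_pip_command and install_wheels are made up for illustration.

```python
import subprocess
import sys


def build_pip_command(wheel_path):
    """Build the command that installs one local wheel with this interpreter's pip."""
    # "sys.executable -m pip" makes sure the wheel is installed into the same
    # Python installation that will later run Scrapy.
    return [sys.executable, "-m", "pip", "install", wheel_path]


def install_wheels(wheel_paths, dry_run=True):
    """Install each wheel in order; with dry_run=True only return the commands."""
    commands = [build_pip_command(path) for path in wheel_paths]
    if not dry_run:
        for command in commands:
            # check_call raises CalledProcessError if pip exits with an error.
            subprocess.check_call(command)
    return commands


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    wheels = [
        "lxml-EXAMPLE.whl",
        "Twisted-EXAMPLE.whl",
        "pywin32-EXAMPLE.whl",
    ]
    for command in install_wheels(wheels, dry_run=True):
        print(" ".join(command))
```

With dry_run=True the script only prints the commands it would run, so you can check them before installing anything.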
Once those were installed, I ran pip install scrapy to install Scrapy itself. After the "Successfully installed" message, typing scrapy in cmd also showed the following output:
Scrapy 1.4.0 - no active project
Usage:
scrapy <command> [options] [args]
Available commands:
bench Run quick benchmark test
fetch Fetch a URL using the Scrapy downloader
genspider Generate new spider using pre-defined templates
runspider Run a self-contained spider (without creating a project)
settings Get settings values
shell Interactive scraping console
startproject Create new project
version Print Scrapy version
view Open URL in browser, as seen by Scrapy
[ more ] More commands available when run from project directory
Use "scrapy <command> -h" to see more info about a command
I was ecstatic! It was finally installed! Then I excitedly typed scrapy shell https://www.baidu.com, and it blew up (I forgot to take a screenshot, so I can only show it as text):
File "c:\program files\python36\lib\site-packages\scrapy\downloadermiddlewares\retry.py", line 20, in <module>
from twisted.web.client import ResponseFailed
File "c:\program files\python36\lib\site-packages\twisted\web\client.py", line 42, in <module>
from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
File "c:\program files\python36\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
from twisted.internet.stdio import StandardIO, PipeAddress
File "c:\program files\python36\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
from twisted.internet import _win32stdio
File "c:\program files\python36\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
import win32api
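The traceback stops at import win32api, so one way to narrow things down is to try that import on its own, without Scrapy in the picture. Below is a minimal sketch; the helper name try_import is made up for illustration, and the module names are simply the ones that appear in this post.

```python
import importlib


def try_import(module_name):
    """Return (True, None) if the module imports, else (False, error message)."""
    try:
        importlib.import_module(module_name)
        return True, None
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so this also
        # covers the case where the module is not installed at all.
        return False, "{}: {}".format(type(exc).__name__, exc)


if __name__ == "__main__":
    # The three dependencies from this post; win32api is provided by pywin32.
    for name in ("lxml", "twisted", "win32api"):
        ok, error = try_import(name)
        print(name, "OK" if ok else "FAILED -> " + error)
```

If win32api is the only line that fails, the problem lies with the pywin32 installation rather than with Scrapy or Twisted.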
I searched on Baidu and found that it might be a version problem, so I reinstalled the dependencies, but the problem remained.
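To check the "version problem" theory, you can compare the tags in a wheel's file name with the running interpreter. The sketch below is rough and rests on assumptions: it assumes CPython on Windows, it only handles binary wheels named in the standard name-version-python-abi-platform.whl pattern, and both the helper names and the example file name are hypothetical.

```python
import struct
import sys


def interpreter_tags():
    """Return (python_tag, bits) for the running interpreter, e.g. ("cp36", 64)."""
    # Assumes CPython, whose wheel tag is "cp" + major + minor version.
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    bits = struct.calcsize("P") * 8  # pointer size in bits: 32 or 64
    return python_tag, bits


def wheel_matches(wheel_filename, python_tag, bits):
    """Rough check that a Windows binary wheel targets the given interpreter."""
    if not wheel_filename.endswith(".whl"):
        return False
    parts = wheel_filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        return False
    # The last three fields are the python tag, ABI tag and platform tag.
    wheel_python, _abi, wheel_platform = parts[-3:]
    expected_platform = "win_amd64" if bits == 64 else "win32"
    return python_tag in wheel_python.split(".") and wheel_platform == expected_platform


if __name__ == "__main__":
    tag, bits = interpreter_tags()
    print("interpreter:", tag, "{}-bit".format(bits))
    # Hypothetical file name, for illustration only.
    example = "example-1.0-cp36-cp36m-win_amd64.whl"
    print(example, "matches:", wheel_matches(example, tag, bits))
```

A mismatch here would point to the wrong wheel having been downloaded. A match means the cause has to be looked for elsewhere, which fits what happened in this post: reinstalling did not help.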
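Then came the mysterious part: I installed pywin32 a different way. I uninstalled the existing pywin32, downloaded a pywin32 .exe installer from the official pywin32 site, and installed it with that. Running scrapy shell http://www.baidu.com again successfully entered "interactive mode"!
I'm just getting started with Scrapy. If any expert comes across this post, I'd be very grateful if you could tell me the reason!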