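First, install the dependencies.
Some of these steps require root privileges.
The dependencies are python2.7, pip, lxml and openssl.
On my machine everything except openssl seemed to be missing, so I installed python2.7, pip and lxml in that order.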
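To see which dependencies are missing before installing anything, a small check like the one below can help. This is only a sketch: the helper name missing_modules is made up for illustration, and it assumes that importing the standard ssl module is a reasonable stand-in for "Python was built against openssl".

```python
import importlib
import sys


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both "not installed" and "installed but fails to load".
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python version: %s" % sys.version.split()[0])
    # "ssl" is an assumption: it only shows that this Python has OpenSSL support.
    print("Missing modules: %s" % missing_modules(["pip", "lxml", "ssl"]))
```

Run it with the interpreter you intend to use for scrapy, since each Python installation has its own site-packages.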
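python2.7 and pip are easy to install. Note, however, that python2.7 has to be installed before pip, probably because my existing python2.4 was too old.
Installing lxml caused some trouble. According to the reference pages listed at the end, libxml2 and libxslt must be installed first. For both, download the source, change into the source directory, and run the following three commands: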
./configure
make
make install
Only after that can lxml itself be installed:
pip install lxml
With that, lxml is installed successfully.
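A quick way to confirm that lxml really loads, and to see which libxml2 and libxslt versions it was compiled against versus which ones it picks up at run time, is sketched below. The helper name lxml_versions is hypothetical; the four version constants are attributes of lxml.etree.

```python
def lxml_versions():
    """Return lxml's compile-time and run-time library versions, or None if lxml cannot be loaded."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "libxml2 compiled": etree.LIBXML_COMPILED_VERSION,
        "libxml2 runtime": etree.LIBXML_VERSION,
        "libxslt compiled": etree.LIBXSLT_COMPILED_VERSION,
        "libxslt runtime": etree.LIBXSLT_VERSION,
    }


if __name__ == "__main__":
    versions = lxml_versions()
    if versions is None:
        print("lxml could not be imported")
    else:
        for label in sorted(versions):
            print("%s: %s" % (label, versions[label]))
```

If the compiled and runtime versions differ, lxml may be loading a different copy of the libraries than the one it was built with.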
Next, install scrapy. However, running pip install scrapy failed with the following error:
Package libffi was not found in the pkg-config search path.
Perhaps you should add the directory containing `libffi.pc' to the PKG_CONFIG_PATH environment variable
No package 'libffi' found
This error is also caused by a missing package, here the libffi development files. Install them with the following command:
yum install libffi-devel
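To check whether pkg-config can now find libffi before retrying the scrapy install, something like the following can be used. It is a sketch under the assumption that the pkg-config command is on the PATH; the helper name pkg_config_has is made up for illustration.

```python
import subprocess


def pkg_config_has(package):
    """Return True/False depending on whether pkg-config finds `package`, or None if pkg-config is unavailable."""
    try:
        # `pkg-config --exists NAME` prints nothing and exits with 0 only if the package is found.
        return subprocess.call(["pkg-config", "--exists", package]) == 0
    except OSError:
        return None


if __name__ == "__main__":
    print("libffi visible to pkg-config: %s" % pkg_config_has("libffi"))
```

A result of False after installing libffi-devel would suggest that libffi.pc sits in a directory outside PKG_CONFIG_PATH, as the error message hints.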
Then install scrapy again:
pip install scrapy
Lastly, verify that the installation succeeded. I ran into the following error:
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 7, in <module>
from scrapy.cmdline import execute
File "/usr/local/lib/python2.7/site-packages/scrapy/__init__.py", line 48, in <module>
from scrapy.spiders import Spider
File "/usr/local/lib/python2.7/site-packages/scrapy/spiders/__init__.py", line 10, in <module>
from scrapy.http import Request
File "/usr/local/lib/python2.7/site-packages/scrapy/http/__init__.py", line 11, in <module>
from scrapy.http.request.form import FormRequest
File "/usr/local/lib/python2.7/site-packages/scrapy/http/request/form.py", line 9, in <module>
import lxml.html
File "/usr/local/lib/python2.7/site-packages/lxml/html/__init__.py", line 42, in <module>
from lxml import etree
ImportError: /usr/local/lib/python2.7/site-packages/lxml/etree.so: undefined symbol: exsltMathXpathCtxtRegister
Solution:
export LD_LIBRARY_PATH=/usr/local/lib/:$LD_LIBRARY_PATH
After that, the verification passes.
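A likely explanation, though the notes above do not confirm it, is that the libxslt built from source was installed under /usr/local/lib (the default prefix for ./configure) while the dynamic linker kept finding an older system copy first. The export only affects the current shell session, so it can be useful to check whether the directory is actually on LD_LIBRARY_PATH in the environment where scrapy runs. The sketch below does that; the helper name path_contains is hypothetical.

```python
import os


def path_contains(path_value, directory):
    """Return True if `directory` is one of the entries in a PATH-style string."""
    wanted = os.path.normpath(directory)
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    # normpath removes trailing slashes, so "/usr/local/lib/" matches "/usr/local/lib".
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH=%s" % current)
    print("/usr/local/lib present: %s" % path_contains(current, "/usr/local/lib"))
```

If the check prints False in a new terminal, the export line probably needs to go into a shell startup file so that it applies to every session.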
References
1. http://scrapy-chs.readthedocs.org/zh_CN/latest/intro/install.html#intro-install
2. http://lxml.de/installation.html
3. http://blog.csdn.net/renyp8799/article/details/46915753