Installing and using pyspider

Installing on macOS:

1. Download https://bootstrap.pypa.io/get-pip.py, then run it with sudo to install pip.

cd into the download directory, then run:
sudo python get-pip.py

2. Install pyspider

sudo pip install pyspider  # install; to uninstall: pip uninstall pyspider

The install fails with:

Collecting pycurl (from pyspider)

  Downloading https://files.pythonhosted.org/packages/e8/e4/0dbb8735407189f00b33d84122b9be52c790c7c3b25286826f4e1bdb7bde/pycurl-7.43.0.2.tar.gz (214kB)

    100% |████████████████████████████████| 215kB 12kB/s 

    Complete output from command python setup.py egg_info:

Using curl-config (libcurl 7.30.0)

    Traceback (most recent call last):

      File "<string>", line 1, in <module>

      File "/private/tmp/pip-install-uJlOEi/pycurl/setup.py", line 913, in <module>

        ext = get_extension(sys.argv, split_extension_source=split_extension_source)

      File "/private/tmp/pip-install-uJlOEi/pycurl/setup.py", line 582, in get_extension

        ext_config = ExtensionConfiguration(argv)

      File "/private/tmp/pip-install-uJlOEi/pycurl/setup.py", line 99, in __init__

        self.configure()

      File "/private/tmp/pip-install-uJlOEi/pycurl/setup.py", line 316, in configure_unix

        specify the SSL backend manually.''')

    __main__.ConfigurationError: Curl is configured to use SSL, but we have not been able to determine which SSL backend it is using. Please see PycURL documentation for how to specify the SSL backend manually.

    

    ----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /private/tmp/pip-install-uJlOEi/pycurl/

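Fix (see https://blog.csdn.net/mr_yang__/article/details/81109348):

sudo python -m pip install --upgrade --force-reinstall pip  # force a reinstall and upgrade of pip
sudo pip install pycurl==7.43.0  # pin the pycurl version. My guess: pip downloaded 7.43.0.2 but the build was configured against libcurl 7.30.0 (see "Using curl-config (libcurl 7.30.0)"), although the error text itself is about SSL backend detection
# once that succeeds, install pyspider again
sudo pip install pyspider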
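
The error message above asks for the SSL backend to be specified manually. PycURL's install notes describe a PYCURL_SSL_LIBRARY environment variable for this. The sketch below only builds the pip command and environment without running anything. It is not the fix used in this post, the helper name pycurl_install_plan is made up for illustration, and which backend value is correct depends on how the local libcurl was built.

```python
import os
import sys

# Backend names listed in the PycURL install notes for PYCURL_SSL_LIBRARY.
KNOWN_BACKENDS = ("openssl", "gnutls", "nss")


def pycurl_install_plan(ssl_backend, version=None, base_env=None):
    """Return (command, env) for installing pycurl with an explicit SSL backend.

    Nothing is executed here; hand the result to subprocess.check_call
    once the plan looks right for your machine.
    """
    if ssl_backend not in KNOWN_BACKENDS:
        raise ValueError("unknown SSL backend: %r" % (ssl_backend,))
    env = dict(os.environ if base_env is None else base_env)
    env["PYCURL_SSL_LIBRARY"] = ssl_backend
    requirement = "pycurl" if version is None else "pycurl==%s" % version
    # --no-cache-dir makes pip rebuild instead of reusing a cached wheel
    command = [sys.executable, "-m", "pip", "install", "--no-cache-dir", requirement]
    return command, env


if __name__ == "__main__":
    cmd, env = pycurl_install_plan("openssl", version="7.43.0")
    print(" ".join(cmd))
    print("PYCURL_SSL_LIBRARY=" + env["PYCURL_SSL_LIBRARY"])
```

To actually run the install, something like subprocess.check_call(cmd, env=env) would do it, with sudo added if the system Python is the target.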

3. Start pyspider

zdjdeMacBook-Pro:~ zdj$ pyspider
Traceback (most recent call last):
  File "/usr/local/bin/pyspider", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2603, in <module>
    working_set.require(__requires__)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 666, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 565, in resolve
    raise DistributionNotFound(req)  # XXX put more info here
pkg_resources.DistributionNotFound: pyquery
zdjdeMacBook-Pro:~ zdj$ pyspider -all
Traceback (most recent call last):
  File "/usr/local/bin/pyspider", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2603, in <module>
    working_set.require(__requires__)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 666, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 565, in resolve
    raise DistributionNotFound(req)  # XXX put more info here
pkg_resources.DistributionNotFound: pyquery

Fix (see https://blog.csdn.net/patrickzheng/article/details/73478082):

sudo pip install -U setuptools
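
The traceback above means the packaging metadata for pyquery could not be found. A quick way to see which distributions the current interpreter can find is sketched below. It assumes Python 3.8 or newer (importlib.metadata does not exist on the Python 2.7 used in this post), and the helper name missing_distributions is made up for illustration.

```python
from importlib import metadata  # standard library from Python 3.8


def missing_distributions(names):
    """Return the entries of `names` that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Packages that show up in the errors in this post
    print(missing_distributions(["pyspider", "pyquery", "pycurl", "lxml"]))
```

An empty list means every named package is visible to this interpreter; anything listed is missing from the environment that runs the script.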

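4. Quit pyspider with Control+C

If pyspider did not exit cleanly, the next start fails with a port-in-use error. To fix it:

lsof -i:5000  # show the process that is using port 5000
kill <PID>    # stop that process; on Linux, kill -9 <PID> force-kills it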
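
The same check can be done from Python before starting pyspider again. This is a minimal sketch, assuming the web UI listens on local port 5000 as in the lsof command above; the helper name port_in_use is made up for illustration.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


if __name__ == "__main__":
    print("port 5000 in use:", port_in_use(5000))
```

It only tells you that the port is taken, not by which process, so lsof is still needed to find the PID to kill.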

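5. Install and configure PhantomJS on macOS

1. Download the macOS build from the official site (http://phantomjs.org/download.html).
2. Move the extracted phantomjs-2.1.1-macosx folder to a directory of your choice.
3. In a terminal, run vim ~/.bash_profile and add the following line (i to insert, esc to leave insert mode, :wq to save and quit):
export PATH=/Applications/phantomjs-2.1.1-macosx/bin:$PATH
(replace '/Applications/phantomjs-2.1.1-macosx/bin' with your own path)
4. Run source ~/.bash_profile in the terminal (most online tutorials leave this step out).
5. Run phantomjs --version to check that the configuration works.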
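
To confirm from Python that the PATH change took effect, a small lookup like the one below can help. It is a sketch assuming Python 3.3 or newer for shutil.which; the helper name find_on_path and its extra_dir parameter are made up for illustration.

```python
import os
import shutil


def find_on_path(program, extra_dir=None):
    """Return the full path of `program`, or None if it cannot be found.

    `extra_dir` is searched first, like the export PATH=...:$PATH line.
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path = extra_dir + os.pathsep + search_path
    return shutil.which(program, path=search_path)


if __name__ == "__main__":
    print(find_on_path("phantomjs"))
```

If this prints None in a fresh terminal, the export line was not picked up, which usually means .bash_profile was not sourced.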

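Installing on Linux:

The system Python is 2.6 (run python to check the version). Reports online say the pyspider web UI does not work on 2.6, but I wanted to try anyway. The 2.6 install ended in failure, and pyspider only installed successfully under 2.7. Details below:

1. Install pip

yum install python-pip  # pulls in setuptools automatically if it is missing. Do not upgrade setuptools separately afterwards: newer releases require Python 2.7 or later, and upgrading breaks the pip command (details below)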
===================================================================================================================================================================================
Installing:
 python-pip                                       noarch                                7.1.0-1.el6                                      epel                                1.5 M
Installing for dependencies:
 python-setuptools                                noarch                                0.6.10-4.el6_9                                   os                                  336 k

2. Install pyspider

# pip install pyspider  # fails during install: the libxslt development files are missing

 ** make sure the development packages of libxml2 and libxslt are installed **
    
    Using build configuration of libxslt
.
.
.
creating build/temp.linux-x86_64-2.6/src/lxml
    gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -DCYTHON_CLINE_IN_TRACEBACK=0 -Isrc -Isrc/lxml/includes -I/usr/include/python2.6 -c src/lxml/etree.c -o build/temp.linux-x86_64-2.6/src/lxml/etree.o -w
    src/lxml/etree.c:98:20: error: Python.h: No such file or directory
    src/lxml/etree.c:100:6: error: #error Python headers needed to compile C extensions, please install development version of Python.
    Compile failed: command 'gcc' failed with exit status 1
    creating tmp
    cc -I/usr/include/libxml2 -c /tmp/xmlXPathInitKTUiCW.c -o tmp/xmlXPathInitKTUiCW.o
    cc tmp/xmlXPathInitKTUiCW.o -lxml2 -o a.out
    error: command 'gcc' failed with exit status 1
    
    ----------------------------------------
Command "/usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build-s1dXin/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-opr2KI-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-s1dXin/lxml

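# pip install --upgrade pip  # at first I assumed pip was too old and upgraded it; after the upgrade pip stopped working entirely, so do not upgrade blindly
# yum remove python-pip  # uninstall pip
# yum install python-pip  # after reinstalling, pip works again
# yum install gcc libffi-devel python-devel openssl-devel -y  # not sure whether this had any effect; afterwards pip install pyspider failed with a new compile error

# yum install libxslt-devel  # after this, installing pyspider no longer reports errors
# pip install pyspider  # pyspider installs successfully, but I celebrated too early

# pyspider  # fails to start; at first I assumed pyquery was too old and went down the upgrade path...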

Traceback (most recent call last):
  File "/usr/bin/pyspider", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: pyquery<1.3.0

# pip install pyquery==1.3.0  # tried 1.3 and 1.4; neither helped

# pip install -U setuptools  # a mistake: this upgraded setuptools to 40.6.3, which Python 2.6 does not support (it needs 2.7 or later). pip stopped working as a result, so setuptools could not be removed with pip uninstall setuptools either

You are using pip version 7.1.0, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
Collecting setuptools
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)",)': /packages/37/06/754589caf971b0d2d48f151c2586f62902d93dc908e2fd9b9b9f6aa3c9dd/setuptools-40.6.3-py2.py3-none-any.whl
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)",)': /packages/37/06/754589caf971b0d2d48f151c2586f62902d93dc908e2fd9b9b9f6aa3c9dd/setuptools-40.6.3-py2.py3-none-any.whl
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading https://files.pythonhosted.org/packages/37/06/754589caf971b0d2d48f151c2586f62902d93dc908e2fd9b9b9f6aa3c9dd/setuptools-40.6.3-py2.py3-none-any.whl (573kB)
    100% |████████████████████████████████| 573kB 669kB/s 
Installing collected packages: setuptools
  Found existing installation: setuptools 0.6rc11
    DEPRECATION: Uninstalling a distutils installed project (setuptools) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
    Uninstalling setuptools-0.6rc11:
      Successfully uninstalled setuptools-0.6rc11
Successfully installed setuptools-40.6.3

# pyspider  # running pip now fails with the same error
Traceback (most recent call last):
  File "/usr/bin/pyspider", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources/__init__.py", line 957, in <module>
    class Environment:
  File "/usr/lib/python2.6/site-packages/pkg_resources/__init__.py", line 961, in Environment
    self, search_path=None, platform=get_supported_platform(),
  File "/usr/lib/python2.6/site-packages/pkg_resources/__init__.py", line 188, in get_supported_platform
    plat = get_build_platform()
  File "/usr/lib/python2.6/site-packages/pkg_resources/__init__.py", line 391, in get_build_platform
    from sysconfig import get_platform
ImportError: No module named sysconfig

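# yum remove python-pip
# yum install python-pip  # reinstalling pip did not help; it still would not run. I found no way to downgrade setuptools, so the Python 2.6 environment was left broken with no working pip, and I had to install Python 2.7 instead.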
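
The breakage above could have been avoided by checking the interpreter version before upgrading setuptools. The sketch below encodes only the rule of thumb from this post (Python 2.7 or later); it does not check the exact requirements of any particular setuptools release, and the helper name meets_minimum_python is made up for illustration.

```python
import sys


def meets_minimum_python(version_info=None, minimum=(2, 7)):
    """Return True when the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum_python():
        print("OK to run: pip install -U setuptools")
    else:
        print("Python is older than 2.7; leave setuptools alone")
```

The function uses nothing beyond sys, so it also runs on the old interpreter it is meant to guard.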

3. Since pyspider could not be installed in step 2, set up a Python 2.7 environment (see https://www.cnblogs.com/Yiutto/p/5962906.html)

# yum install git  # install git
# git clone https://github.com/yyuu/pyenv.git ~/.pyenv
# echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
# echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
# echo 'eval "$(pyenv init -)"' >> ~/.bashrc
# exec $SHELL -l

# pyenv install --list  # list the versions available for install
bash: pyenv: command not found
# source ~/.bashrc  # if the pyenv command is not found, run this first

Install the build dependencies
# yum install readline readline-devel readline-static
# yum install openssl openssl-devel openssl-static
# yum install sqlite-devel
# yum install bzip2-devel bzip2-libs

Install Python 2.7.10
# pyenv install 2.7.10 -v 
/tmp/python-build.20181228230316.15719 ~
Downloading Python-2.7.10.tar.xz...
-> https://www.python.org/ftp/python/2.7.10/Python-2.7.10.tar.xz
curl: (35) SSL connect error
error: failed to download Python-2.7.10.tar.xz

BUILD FAILED (CentOS release 6.5 (Final) using python-build 1.2.8-12-g775a4b6)


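This is an SSL error. Update nss, then run the pyenv install command again:
# yum update nss

# pyenv rehash  # rebuild pyenv's shims; note that after installing a module with pip you may need to run pyenv rehash again
# pyenv versions  # list the installed Python versions
# pyenv local 2.7.10  # switch to Python 2.7.10
# python  # confirm which Python version is in use
# pip install pyspider  # install pyspider under Python 2.7.10
# pyenv rehash  # needed after pyspider installs; otherwise the pyspider command is not available
# pyspider  # pyspider starts successfully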

Installing PyMySQL

sudo pip install PyMySQL

References:

https://blog.csdn.net/shaoyingchendsg/article/details/77897261

https://blog.csdn.net/weixin_37947156/article/details/76495144

http://www.runoob.com/python3/python3-mysql.html

https://blog.csdn.net/lmb1612977696/article/details/78166180

https://mp.weixin.qq.com/s?__biz=MjM5MjAwODM4MA==&mid=2650711750&idx=1&sn=0c489bca4dc83186803210e4b2eda1b8&chksm=bea6d71589d15e03009e787ec3a0c22eaf05eb0a87157d0d2201041f2ef82473d79c651ddb4a&mpshare=1&scene=1&srcid=0104TQm2BbdPEMlRX0v8duAf&key=04eb54b857de55b35b46752fdb9f6328fd9bc4590f0bef713e3ffb85430fda17e77960410b7afc5aefbed86d73ed235ede4f7fe90620deba8226277b9948641ccfaf6b330069c9ed64db62f2ae7e2568&ascene=0&uin=NzQzMjA1MDM3&devicetype=iMac+MacBookPro11%2C1+OSX+OSX+10.9.5+build(13F34)&version=11020012&lang=zh_CN&pass_ticket=brRNQvztqkGS%2Boyu%2Bl3VGtQGhSBEYSVuqO9gesdU3nMqtJVncgjWi9xORwpOvqJ4
