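The Raspberry Pi draws little power and is well suited to running for long stretches, which makes it a good platform for a web crawler. I had an idle Raspberry Pi 3 at home, so this time I set out to install and configure Python 3.6 and the related dependency packages to build a crawler environment.
The Python version and packages to install are: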
Python 3.6.2
lxml
BeautifulSoup4
Installing Python 3.6
I won't go into detail about installing Python 3 from source; it basically comes down to the following steps:
[crayon-5fc116051cdab696886142inline="true"]# wget https://www.python.org/ftp/python/3.6.2/Python-3.6.2.tgz
# tar xvf Python-3.6.2.tgz
# cd Python-3.6.2
# ./configure
# make && make install
[/crayon]
pip fails with the "ssl module in Python is not available" error
If you use pip to install the required lxml and BeautifulSoup4 packages at this point, you run into the following odd error.
[crayon-5fc116051cdb3373305640inline="true"]# /usr/local/bin/pip3.6 install lxml
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
[/crayon]
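1. After searching around online, the cause turned out to be that Raspbian, the Debian-based system running on the Raspberry Pi, does not have the openssl packages installed by default. Run the following command to install them.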
[crayon-5fc116051cdb6543192976inline="true"]# apt-get install openssl libssl-dev
[/crayon]
2. Python has to be compiled with the ssl module enabled. Open Modules/Setup and edit the lines around line 204 so that they look like this.
[crayon-5fc116051cdb9200282066inline="true"]# Socket module helper for socket(2)
_socket socketmodule.c
# Socket module helper for SSL support; you must comment out the other
# socket line above, and possibly edit the SSL variable:
#SSL=/usr/local/ssl
_ssl _ssl.c \
	-DUSE_SSL -I$(SSL)/include -I$(SSL)/include/openssl \
	-L$(SSL)/lib -lssl -lcrypto
[/crayon]
3. Rebuild and reinstall.
[crayon-5fc116051cdbd646291656inline="true"]# make && make install
[/crayon]
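After the rebuild, a quick check from inside the new interpreter can confirm whether the ssl module now imports, before trying pip again. The following is only a minimal sketch: the helper name `ssl_status` is hypothetical and not part of the original steps, and it assumes the script is run with the freshly built Python 3.6.

```python
import importlib


def ssl_status():
    """Return the OpenSSL version string if the ssl module imports, else None."""
    try:
        ssl = importlib.import_module("ssl")
    except ImportError:
        # This is the situation that makes pip refuse HTTPS package indexes.
        return None
    return ssl.OPENSSL_VERSION


if __name__ == "__main__":
    version = ssl_status()
    if version is None:
        print("ssl module is NOT available; rebuild Python with SSL support")
    else:
        print("ssl module is available:", version)
```

If the script reports that the module is missing, the Modules/Setup edit or the OpenSSL development package is the first thing to re-check.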
Errors when installing lxml with pip
The first error was "Could not find function xmlCheckVersion in library libxml2". Clearly the libxml2 packages were not installed; the following command takes care of that.
[crayon-5fc116051cdc0674768484inline="true"]# apt-get install libxml2 libxml2-dev
[/crayon]
Installing again produced a different error:
src/lxml/includes/etree_defs.h:14:31: fatal error: libxml/xmlversion.h: No such file or directory
The following command resolves it by telling the compiler where the libxml2 headers live:
[crayon-5fc116051cdc2245670571inline="true"]# export C_INCLUDE_PATH=/usr/include/libxml2/
[/crayon]
Installing lxml once more hit yet another error:
src/lxml/includes/etree_defs.h:23:32: fatal error: libxslt/xsltconfig.h: No such file or directory
It looks like the libxslt development package is missing, so install it:
[crayon-5fc116051cdc7286065064inline="true"]# apt-get install libxslt1-dev
[/crayon]
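Both compiler errors above come down to a header file that the build cannot find. Instead of discovering them one failed install at a time, the headers can be checked up front. The sketch below is only an illustration: the helper `missing_headers` and the list of include directories are assumptions, not something lxml or pip provides, and the directories may need adjusting on other systems.

```python
from pathlib import Path

# Headers named in the two compiler errors above.
REQUIRED_HEADERS = ("libxml/xmlversion.h", "libxslt/xsltconfig.h")

# Assumed search locations; adjust these for your own system.
DEFAULT_INCLUDE_DIRS = (
    "/usr/include",
    "/usr/include/libxml2",
    "/usr/local/include",
)


def missing_headers(headers=REQUIRED_HEADERS, include_dirs=DEFAULT_INCLUDE_DIRS):
    """Return the headers that are not found under any of the include dirs."""
    missing = []
    for header in headers:
        found = any((Path(d) / header).is_file() for d in include_dirs)
        if not found:
            missing.append(header)
    return missing


if __name__ == "__main__":
    for header in missing_headers():
        print("missing header:", header)
```

An empty result suggests that the development packages are in place; a header that exists only under a non-default directory still has to be exposed to the compiler, for example through C_INCLUDE_PATH.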
This time lxml installed successfully, with no further errors.
Installing BeautifulSoup4
BeautifulSoup4 installed smoothly, without any errors.
[crayon-5fc116051cdcc158374422inline="true"]# /usr/local/bin/pip3 install bs4
[/crayon]
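Verifying the installation
Finally, start the Python interpreter and check that both packages import correctly. Everything was OK, and the basic crawler environment is ready.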
[crayon-5fc116051cdce631372157inline="true"]import lxml
from bs4 import BeautifulSoup
[/crayon]
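The same check can also be scripted, which is handy when the Pi is set up over SSH or rebuilt later. This is a rough sketch rather than part of the original procedure: `check_packages` is a hypothetical helper that only asks whether each top-level package can be located, without importing it.

```python
import importlib.util


def check_packages(names=("lxml", "bs4")):
    """Map each top-level package name to True if it can be located for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'OK' if found else 'MISSING'}")
```

Locating a package is a weaker guarantee than importing it, so the interactive import above remains the final confirmation.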