References
Official documentation 1.0
Official documentation 0.24
Installation
Python 2
pip install scrapy
Python 3
sudo apt-get install python-dev python-pip libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev
sudo apt-get install python3 python3-dev
wget https:
tar -jxvf Twisted-17.1.0.tar.bz2
cd Twisted-17.1.0
python setup.py install
pip install scrapy
Basic steps
# Go to the project's spiders directory (itcast is the spider name)
scrapy genspider itcast "allowed domain"
scrapy check itcast
scrapy crawl itcast
scrapy crawl itcast -o file.json
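The -o file.json option writes the scraped items to a JSON feed. Below is a minimal sketch for inspecting that file afterwards. It assumes the crawl finished cleanly and the file holds a single JSON array. load_items is a hypothetical helper name, not part of Scrapy.

```python
import json
from pathlib import Path


def load_items(path):
    """Read a JSON feed and return its items as a list of dicts.

    Assumption: the file contains one JSON array, as produced by a
    single completed crawl that exported with -o file.json.
    """
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)
```

For example, items = load_items("file.json") followed by len(items) gives a quick count of what the spider scraped.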
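Using logging
import logging

# Log to cataline.log, overwriting the file on each run.
# With level=logging.ERROR, the warning() and info() calls below are filtered out;
# lower the level (for example to logging.INFO) to record them.
logging.basicConfig(
    level=logging.ERROR,
    format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
    datefmt='%a, %d %b %Y %H:%M:%S',
    filename='cataline.log',
    filemode='w')
logger = logging.getLogger('eric')
logger.warning("++++++++++++++++++++++++++")
logger.info("----------------------------")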
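The level setting decides which messages are recorded. The sketch below shows that threshold in isolation. It writes to an in-memory stream instead of cataline.log so the result can be inspected directly. The logger name eric_demo and the shortened format string are assumptions made for the illustration.

```python
import io
import logging

# Capture log output in memory instead of a file.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("eric_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False  # keep the demo independent of the root logger

# Threshold at ERROR: WARNING is below it and is dropped.
logger.setLevel(logging.ERROR)
logger.warning("dropped")
logger.error("kept")

# Threshold lowered to INFO: INFO messages are now recorded.
logger.setLevel(logging.INFO)
logger.info("now kept")

captured = stream.getvalue()
print(captured)
```

Only the error message and the later info message reach the stream. The warning sent while the level was ERROR does not.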
Scrapy and MongoDB
References
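A common way to connect the two is an item pipeline that inserts each scraped item into a collection. The following is a minimal sketch, not a definitive implementation. It assumes pymongo is installed and uses the pipeline hooks from_crawler, open_spider, close_spider and process_item. The setting names MONGO_URI and MONGO_DATABASE, the collection name items, and the client_factory parameter (added only so the class can be exercised without a running server) are all assumptions.

```python
class MongoPipeline:
    """Item pipeline sketch: store every scraped item in MongoDB."""

    collection_name = "items"  # assumed collection name

    def __init__(self, mongo_uri, mongo_db, client_factory=None):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db
        # Hypothetical hook: lets a test inject a stand-in client.
        self._client_factory = client_factory
        self.client = None
        self.db = None

    @classmethod
    def from_crawler(cls, crawler):
        # MONGO_URI and MONGO_DATABASE are project-defined settings (assumed names).
        return cls(
            mongo_uri=crawler.settings.get("MONGO_URI", "mongodb://localhost:27017"),
            mongo_db=crawler.settings.get("MONGO_DATABASE", "scrapy_items"),
        )

    def open_spider(self, spider):
        if self._client_factory is None:
            import pymongo  # imported lazily so the class loads without pymongo

            self._client_factory = pymongo.MongoClient
        self.client = self._client_factory(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # Convert to a plain dict so Item objects and dicts are both accepted.
        self.db[self.collection_name].insert_one(dict(item))
        return item
```

To enable a pipeline like this, its import path would be added to ITEM_PIPELINES in settings.py, for example {"myproject.pipelines.MongoPipeline": 300}, where myproject is a hypothetical project name.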