Web Crawling with crawlers

Latest recommended article published 2024-10-16 17:48:16

庭少

Category: 自己琢磨 (Figuring It Out Myself) · Tags: crawler, source code

Article link: https://blog.csdn.net/dwx953571268/article/details/53402754

Crawler source code:

https://github.com/dwx953571268/crawlers/tree/master/crawl3/crawl3

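I read a WeChat article and, on a whim (I'm a hot-blooded young man, after all), decided it was time to have a go. For the record: I am not an "old driver".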
http://mp.weixin.qq.com/s?__biz=MzA3NTEzMTUwNA==&mid=2651081164&idx=1&sn=a5fffffbc10195ece7d74b14827e1577&scene=0#wechat_redirect

1. Download crawlers with git
https://github.com/dwx953571268/crawlers/tree/master/crawl3/crawl3/spiders

I wanted to use rosi.py, but running it failed with an error: scrapy could not be imported.
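A quick way to confirm the cause is to ask Python whether the module can be found at all. The following is a minimal sketch, assuming Python 3.4+ (for importlib.util.find_spec); the helper name missing_modules is hypothetical and not part of the crawlers repository.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    missing = []
    for name in names:
        # find_spec returns None when no importable module has this name
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # scrapy is the module rosi.py needs; an empty list means it is installed
    print(missing_modules(["scrapy"]))
```

If scrapy shows up in the returned list, the spider itself is probably fine and the framework simply needs to be installed, which is what the next step covers.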

2. Setting up the Scrapy crawling framework on Windows
References:
*a* http://blog.csdn.net/playstudy/article/details/17296473
*b* http://www.tuicool.com/articles/ayyUver (the main guide I followed)
*c* http://www.cnblogs.com/txw1958/archive/2012/07/12/scrapy_installation_introduce.html (note: the easy* linked there is 32-bit; you must follow a above)
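Since the 32-bit versus 64-bit distinction matters when picking installers, it can help to check which kind of Python interpreter is actually running. This is a minimal sketch using only the standard library; the function name python_bitness is hypothetical.

```python
import struct


def python_bitness():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    # "P" is the struct format code for a C pointer (void *)
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("This Python is %d-bit" % python_bitness())
```

The value describes the interpreter, not the operating system: a 32-bit Python on 64-bit Windows still reports 32, and installers should match the interpreter.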

1) easy_install
References: http://jingyan.baidu.com/article/b907e627e78fe146e7891c25.html
http://blog.csdn.net/dreamzml/article/details/8847879

After installation, a Scripts folder is created under the Python27 folder, containing easy_install.exe.
2) Install lxml

……

2

After installing as described, I got an error: no PIL module!
Use pillow instead: http://www.programgo.com/article/33435442222/
First install pip (easy_install.exe pip),
then run easy_install pillow
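Pillow is installed under the package name pillow but is still imported as PIL, which is why it can stand in for the missing module. The sketch below checks that the Image module can be loaded; it also falls back to the old top-level `import Image` layout used by the original PIL. The helper name load_image_module is hypothetical.

```python
def load_image_module():
    """Return the PIL Image module, or None if neither Pillow nor PIL is installed."""
    try:
        # Pillow (and package-style PIL installs) expose Image under the PIL package
        from PIL import Image
    except ImportError:
        try:
            # Legacy PIL installs exposed Image as a top-level module
            import Image
        except ImportError:
            return None
    return Image


if __name__ == "__main__":
    module = load_image_module()
    print("PIL/Pillow available" if module is not None else "PIL/Pillow missing")
```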

3. scrapy crawl rosi succeeded. The output was saved to the d:/data/rosi folder, which is not a path I set myself.
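If that path turns out to be hardcoded in the spider, one way to make it configurable is to read it from an environment variable and fall back to the current location. This is a minimal sketch that assumes nothing about how rosi.py is actually written; ROSI_OUTPUT_DIR and resolve_output_dir are hypothetical names.

```python
import os


def resolve_output_dir(default="d:/data/rosi", env=None):
    """Return the output directory, preferring the ROSI_OUTPUT_DIR variable."""
    if env is None:
        env = os.environ
    # An unset or empty variable falls back to the default path
    return env.get("ROSI_OUTPUT_DIR") or default


if __name__ == "__main__":
    print(resolve_output_dir())
```

With something like this in place, setting ROSI_OUTPUT_DIR before running scrapy crawl rosi would redirect the output without editing the spider again.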