Scrapy framework
小主早安
Scraping all the text information yuke
import scrapy
import json

class QiubaiSpider(scrapy.Spider):
    name = 'qiubai'
    start_urls = ['https://mp.weixin.qq.com/s/n7E2PCcXppbLMXtR_Um-FA']

    def parse(self, response):
        # div_list = response.xpath(
        #     '//*[@id=".
Original · 2021-01-10 11:02:29 · 112 views · 0 comments
Scraping the text inside a WeChat mini program -- yuke
import scrapy
import json

class QiubaiSpider(scrapy.Spider):
    name = 'qiubai'
    start_urls = ['https://mp.weixin.qq.com/s/n7E2PCcXppbLMXtR_Um-FA']

    def parse(self, response):
        div_list = response.xpath(
            '//*[@id="js_content"]/s
Original · 2021-01-09 18:58:44 · 1535 views · 0 comments
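The excerpt above is cut off right after the `js_content` selector, so the rest of the author's `parse` method is unknown. As a rough illustration of the underlying idea (collect every text node nested under one container element), here is a minimal sketch that uses only the standard library rather than Scrapy. The class name `ContainerTextExtractor`, the helper `extract_text`, and the sample HTML are all hypothetical and not taken from the post.

```python
from html.parser import HTMLParser


class ContainerTextExtractor(HTMLParser):
    """Collect the text nodes nested inside the element with a given id."""

    # Void elements never get a closing tag, so they must not change the depth.
    VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # 0 means "currently outside the target element"
        self.chunks = []    # stripped, non-empty text nodes in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth and text:
            self.chunks.append(text)


def extract_text(html, target_id="js_content"):
    """Return the text nodes found under the element whose id is target_id."""
    parser = ContainerTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical page fragment, only for demonstration.
SAMPLE_HTML = """
<div id="header">ignored</div>
<div id="js_content">
  <section><p>first line<br>second line</p></section>
  <p><span>third line</span></p>
</div>
<div id="footer">ignored too</div>
"""

lines = extract_text(SAMPLE_HTML)
print(lines)
```

Inside a Scrapy spider, the comparable call would probably be `response.xpath('//*[@id="js_content"]//text()').getall()`, where the double slash before `text()` picks up text at any nesting depth.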
Scraping data with Scrapy when the data sits at /html/body/div[4]/div[1]/div[1]/div[2]/p[1]
This differs from the data I scraped before.
import scrapy
import json

# Run with: scrapy crawl qiubai
class QiubaiSpider(scrapy.Spider):
    name = 'qiubai'
    start_urls = ['http://www.XXXXX']

    def parse(self, response):
        id_list = []
        a = 0
        # /html/body/div[4]/d
Original · 2020-11-30 09:56:06 · 3959 views · 0 comments
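An absolute path such as `/html/body/div[4]/div[1]/div[1]/div[2]/p[1]` walks the tree one child at a time, and each bracketed number is a 1-based position among siblings with the same tag. The sketch below shows that reading on a small, well-formed document using the standard library's `xml.etree.ElementTree`, which supports this positional subset of XPath. The markup in `DOC` is invented for the example and is not the page from the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup shaped so the path from the title resolves.
DOC = """<html><body>
<div>a</div><div>b</div><div>c</div>
<div>
  <div>
    <div>
      <div>skip</div>
      <div><p>first paragraph</p><p>second paragraph</p></div>
    </div>
  </div>
</div>
</body></html>"""

root = ET.fromstring(DOC)  # root is the <html> element

# The leading /html is the root itself, so the search starts at ./body.
node = root.find("./body/div[4]/div[1]/div[1]/div[2]/p[1]")
text = node.text
print(text)

# Changing only the last index selects the sibling paragraph.
second = root.find("./body/div[4]/div[1]/div[1]/div[2]/p[2]").text
```

In a spider, the same path would most likely be passed as `response.xpath('/html/body/div[4]/div[1]/div[1]/div[2]/p[1]/text()').get()`. Positional paths like this tend to break when the page layout shifts, so an id or class based selector is usually more robust when one is available.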
Scraping files with Scrapy
import scrapy
import json

# Run with: scrapy crawl qiubai
class QiubaiSpider(scrapy.Spider):
    name = 'qiubai'
    start_urls = ['https://www.baidu.com']

    def parse(self, response):
        id_list = []
        a = 0
        li_list = response.xpath('//*
Original · 2020-11-02 16:13:18 · 795 views · 0 comments
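Re-running a Scrapy crawl fails with ValueError: port should be of type int
I re-ran the Scrapy crawl and hit yet another error, which is exhausting. The spider prints 开始爬虫 ("crawl started") and 结束爬虫 ("crawl finished") before the error appears:
C:\Users\Administrator\PycharmProjects\pythonProject\scrapy框架\qiubaiPro>scrapy crawl qiubai
开始爬虫。。。。。。
结束爬虫!
2020-11-02 11:24:18 [scrapy.core.engine] ERROR: Scraper close failure
Traceback (most recent call last):
  File "d:\program
Original · 2020-11-02 13:05:51 · 8918 views · 5 comments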
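The traceback is cut off before the frame that raises the error, so the actual cause in this project is not shown. One common source of this exact message is a database client such as PyMySQL being handed the port as a string (for example `'3306'` read from a settings file) instead of an integer. Assuming that is the situation, a minimal sketch of normalizing the value before connecting could look like this. The helper `normalize_port` and the setting names `DB_HOST` and `DB_PORT` are hypothetical.

```python
def normalize_port(value, default=3306):
    """Return the port as an int, accepting an int or a numeric string.

    The default of 3306 is MySQL's standard port; using it here is an
    assumption about the database involved.
    """
    if value is None:
        return default
    if isinstance(value, bool):
        # bool is a subclass of int, but True/False is never a sensible port.
        raise ValueError("port must be a number, not a bool")
    port = int(value)  # raises ValueError for non-numeric strings
    if not 0 < port < 65536:
        raise ValueError("port out of range: {}".format(port))
    return port


# Hypothetical settings, with the port stored as a string by mistake.
settings = {"DB_HOST": "127.0.0.1", "DB_PORT": "3306"}

connect_kwargs = {
    "host": settings["DB_HOST"],
    "port": normalize_port(settings.get("DB_PORT")),
}
print(connect_kwargs)
```

The resulting `connect_kwargs` could then be passed to the client's connect call. Converting once at the point where settings are read keeps the rest of the pipeline free of type checks.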
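Scraping author names and post content from 糗图 (Qiutu)
After running it, this is the result of an afternoon's work. Below is the version written by an expert.
Original · 2020-09-28 16:04:21 · 121 views · 0 comments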
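Creating my first Scrapy file
Everything was in fact already installed, but for some reason I was still told the package could not be imported, so I ran pip install scrapy again inside PyCharm. Now it shows this message instead, which is maddening. I then ran python -m pip install --upgrade pip and tried pip install scrapy once more, which failed. Because the message says C++ 14.0 is required, I downloaded 14.0 from the page below: https://visualstudio.microsoft.com/zh-hans/visual-cpp-build-tools/ I do not know what it is, but in any case it is needed.
Original · 2020-09-28 14:03:46 · 321 views · 0 comments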
Error when installing Twisted-20.3.0-cp38-cp38-win_amd64
C:\Users\Administrator>pip install Twisted-20.3.0-cp38-none-any.whl
Processing c:\users\administrator\twisted-20.3.0-cp38-none-any.whl
ERROR: Exception:
Traceback (most recent call last):
File "d:\program files (x86)\python38\lib\site-packages\pip_inter
Original · 2020-09-27 17:47:01 · 1742 views · 0 comments
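The title names the wheel as `Twisted-20.3.0-cp38-cp38-win_amd64`, while the command installs `Twisted-20.3.0-cp38-none-any.whl`. pip reads compatibility information from the tags in a wheel's filename, so the two names describe different targets. Whether that mismatch caused the error here cannot be told from the truncated traceback. The sketch below only splits a wheel filename into its standard fields so the two names can be compared. The function `parse_wheel_filename` is a hypothetical helper, and the `.whl` suffix on the first name is assumed.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its fields.

    Layout: name-version[-build]-python_tag-abi_tag-platform_tag.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: {}".format(filename))
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of fields: {}".format(filename))
    return {
        "name": name,
        "version": version,
        "build": build,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


# Name from the post title (".whl" suffix assumed) and name from the command.
from_title = parse_wheel_filename("Twisted-20.3.0-cp38-cp38-win_amd64.whl")
from_command = parse_wheel_filename("Twisted-20.3.0-cp38-none-any.whl")

# Fields whose values differ between the two filenames.
differing = sorted(k for k in from_title if from_title[k] != from_command[k])
print(differing)
```

The two names agree on the package, version, and Python tag, and differ in the ABI and platform tags: the first claims a CPython 3.8 ABI on 64-bit Windows, the second claims to be ABI-independent and platform-independent.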