Web Crawlers
weixin_45958231
Scrapy error when scraping stock data
stock.py in the stockscrapy project:

    import scrapy
    import re

    class StockSpider(scrapy.Spider):
        name = 'stock'
        #allowed_domains = ['http://quote.eastmoney.com']
        start_urls = ['http://quote.eastmoney.com/stock_list.html']

        def parse(self, response):

Original · 2020-05-22 09:35:49 · 183 views · 0 comments
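Scrapy crawler 403 error
Scrapy fails with a 403 error because the IP has been blocked; a user-agent needs to be added:

    scrapy shell 'xxx_url' -s USER_AGENT='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'

The user-agent also needs to be added in settings.py. Reference: original article...

Original · 2020-05-21 17:55:10 · 373 views · 0 comments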
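As a minimal sketch of the settings.py change mentioned above: Scrapy reads the `USER_AGENT` setting from the project's settings module, so the same string passed to `scrapy shell` with `-s` can be set there once. The string below is copied from the command in the excerpt; whether it is still accepted by a given site is an assumption.

```python
# settings.py (sketch): project-wide user-agent for every request.
# The value is the one used in the scrapy shell command above; any
# current browser user-agent string could be substituted.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
)
```

If only one spider needs the header, the same key can instead go in that spider's `custom_settings` dictionary.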
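urllib3 + request + bs4 + pandas: fetching book information (beginner notes)
Contents: fixing the warning that SSL verification is required; fixing an error when writing to a file; the difference between dump and loads in json.

urllib3

    import urllib3
    urllib3.disable_warnings()
    http = urllib3.PoolManager()  # create an instance
    rq = http.request('GET', url='https://edu.tipdm.org/')
    print(rq.data...

Original · 2020-03-19 13:53:09 · 340 views · 0 comments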
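The difference between `dump` and `loads` listed above can be sketched with the standard library alone: `json.dump` serializes an object into a file-like object, while `json.loads` parses a JSON string back into Python objects. The `book` record is a hypothetical example, not data from the article, and an in-memory `io.StringIO` stands in for a real file.

```python
import io
import json

# Hypothetical record standing in for one scraped book.
book = {"title": "Python 入门", "price": 39.9}

# json.dump writes JSON text to a file-like object.
# ensure_ascii=False keeps non-ASCII characters readable in the output.
buffer = io.StringIO()
json.dump(book, buffer, ensure_ascii=False)
text = buffer.getvalue()

# json.loads parses a JSON string back into a Python object.
restored = json.loads(text)
```

The counterparts `json.dumps` (object to string) and `json.load` (file-like object to object) complete the set of four.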
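Fetching news with selenium and pandas
Contents: selenium options; getting attributes with selenium. Use get_attribute to read an attribute value and text to read an element's text content.

    from selenium.webdriver import Chrome
    from selenium.webdriver.chrome.options import Options
    import pandas as pd

    options = Options()
    # optio...

Original · 2020-03-19 05:57:11 · 268 views · 0 comments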
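The distinction between `get_attribute` and `text` can be illustrated without launching a browser. In the sketch below, `FakeElement` is a hypothetical stand-in that mimics only the two members of a selenium element used here, and `element_to_record` is an illustrative helper, not code from the article.

```python
def element_to_record(element):
    """Collect the visible text and the href attribute of one element."""
    return {
        "title": element.text,                  # visible text content
        "link": element.get_attribute("href"),  # value of an HTML attribute
    }


class FakeElement:
    """Hypothetical stand-in for a selenium element (text + get_attribute)."""

    def __init__(self, text, attrs):
        self.text = text
        self._attrs = attrs

    def get_attribute(self, name):
        # Return None for a missing attribute.
        return self._attrs.get(name)


sample = FakeElement("Example headline", {"href": "https://example.com/news/1"})
record = element_to_record(sample)
```

With a real driver, the elements returned by `driver.find_elements(...)` could be passed through the same helper, and the resulting list of dictionaries handed to `pd.DataFrame`.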