weixin_43891121 - CSDN Blog

Original: Using urlencode from Python's urllib.parse

params = { 'aid': '24', 'app_name': 'web_search', 'offset': '0', 'format': 'json', 'keyword': '%E9%87%91%E6%AF%9B', 'autoload': 'true', 'count': '20', 'en_qc': '1', ...

2019-03-10 14:09:25 3179
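
As a rough illustration of what `urlencode` does with a parameter dictionary like the one in the excerpt, here is a minimal sketch. It uses only a few of the listed keys, passes the keyword as raw text rather than the already-encoded form, and the base URL `https://example.com/search` is a placeholder assumption, not the address used in the post.

```python
from urllib.parse import urlencode

# A shortened parameter dict; the keyword is given as raw text here.
params = {
    'aid': '24',
    'app_name': 'web_search',
    'offset': '0',
    'format': 'json',
    'keyword': '金毛',
}

# urlencode percent-encodes each value (UTF-8 by default) and joins pairs with '&'.
query = urlencode(params)

# Placeholder base URL, assumed for illustration only.
url = 'https://example.com/search?' + query

# Passing a value that is already percent-encoded encodes the '%' signs again.
double_encoded = urlencode({'keyword': '%E9%87%91%E6%AF%9B'})

print(url)
print(double_encoded)
```

If the value in the dictionary is already percent-encoded, as in the excerpt, `urlencode` escapes it a second time, so it may be safer to store the raw text and let `urlencode` do the encoding once.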

Translation: Running Selenium with Headless Chrome

Selenium no longer supports PhantomJS, so Headless Chrome can now be used in its place. Code source: https://intoli.com/blog/running-selenium-with-headless-chrome/#setup Code: from selenium import webdriver options = webdriver.ChromeOption...

2019-03-02 11:37:10 503

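Original: Optimizing a Python crawler: faster runs, a progress bar, and error messages

1. Speed up the crawler: the code contains r.encoding = r.apparent_encoding, so every request analyzes the page content to guess its encoding, which is slow. Find out the page's encoding once up front and set r.encoding = 'utf-8' directly; this saves a good deal of time. 2. Show a progress bar: when crawling stock information, display the progress with: print('\rCurrent progress: {:.2f}%'.format(co...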

2019-02-28 09:35:27 1656 1
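
The progress-bar and error-message ideas from this excerpt can be sketched as follows. The excerpt is truncated, so the percentage formula (`count * 100 / total`), the helper names `format_progress` and `crawl`, and the `item.upper()` stand-in for the real fetch-and-parse step are all assumptions made for illustration.

```python
def format_progress(count, total):
    # Leading '\r' moves the cursor back to the start of the line,
    # so each update overwrites the previous one.
    percent = count * 100 / total
    return '\rCurrent progress: {:.2f}%'.format(percent)


def crawl(items):
    # Process every item, report failures, and keep the progress line updated.
    results = []
    total = len(items)
    for count, item in enumerate(items, start=1):
        try:
            # Stand-in for the real request and parsing work (hypothetical).
            results.append(item.upper())
        except Exception as exc:
            # Print the error on its own line instead of silently skipping it.
            print('\nFailed on {!r}: {}'.format(item, exc))
        print(format_progress(count, total), end='')
    print()
    return results
```

Printing with `end=''` keeps the cursor on the same line, which is what lets the `\r` prefix redraw the percentage in place.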

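Original: A difference between .string and .text in BeautifulSoup

While parsing a page with BeautifulSoup today, I ran into .string returning None. The source to be parsed is: <a class="bets-name" href="/stock/sh601766.html">中国中车(<span>601766</span>)</a> The following code is used to get the string inside the tag: soup = Beau...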

2019-02-27 20:28:33 5826 1
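
The behaviour described in this excerpt can be reproduced with a short sketch. It assumes the `bs4` package is installed and uses the built-in `html.parser`; the variable names are illustrative and not taken from the post.

```python
from bs4 import BeautifulSoup

html = '<a class="bets-name" href="/stock/sh601766.html">中国中车(<span>601766</span>)</a>'
soup = BeautifulSoup(html, 'html.parser')
a = soup.find('a')

# The <a> tag has several children (text, <span>, text), so .string is None.
string_value = a.string

# .text concatenates all text found inside the tag, including nested tags.
text_value = a.text

# The <span> has exactly one text child, so .string works there.
span_string = a.span.string

print(string_value, text_value, span_string)
```

In short, `.string` only returns a value when a tag has a single child string (or a single child tag that itself does), while `.text` gathers every string beneath the tag.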
