Web Crawlers
落幕之前
Selenium: locating elements and web elements
Locate one element
1. id
driver.find_element(By.ID, "cheese")
cheese = driver.find_element(By.ID, "cheese")
cheddar = cheese.find_element(By.ID, "cheddar")
2. CSS locators
cheddar = driver.find_element(By.CSS_SELECTOR, "#cheese #cheddar")
Locating multipl…
Original · 2021-11-28 14:37:17 · 168 views · 0 comments
Selenium: switching to frames and iframes
frames: deprecated
iframes: still commonly used
Example:
<div id="modal">
  <iframe id="buttonframe" name="myframe" src="https://seleniumhq.github.io">
    <button>Click here</button>
  </iframe>
</div>
Method 1: using webele…
Original · 2021-11-27 23:03:00 · 138 views · 0 comments
Notes on 《python网络爬虫入门实践》, chp3: Scraping static web pages (part 2). Example: Douban Movie Top 250
import requests
from bs4 import BeautifulSoup
def get_movies():
    Headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'
                      ' Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.10…
Original · 2021-11-24 11:50:08 · 524 views · 0 comments
Notes on 《python网络爬虫入门实践》, chp3: Scraping static web pages (part 1)
Getting the response content
import requests
r = requests.get("http://www.baidu.com")
print("Text encoding", r.encoding)
print("Response status code", r.status_code)
print("Response body as a string", r.text)
Customizing Requests
Passing URL parameters
import requests
keydict = {'key1': 'value1', 'key2': 'value2'}
r = requests.get("http:…
Original · 2021-11-23 19:53:19 · 421 views · 0 comments
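To see what passing URL parameters does, it can help to build the request without sending it. The sketch below is a minimal illustration: the `httpbin.org` URL is an assumption (the excerpt's URL is cut off), and `requests.Request(...).prepare()` is used instead of `requests.get` so that no network access is needed.

```python
import requests

# Hypothetical endpoint, used only to show how the query string is built.
url = "http://httpbin.org/get"
key_dict = {"key1": "value1", "key2": "value2"}

# Prepare the request without sending it; requests URL-encodes the dict
# and appends it to the URL as a query string.
prepared = requests.Request("GET", url, params=key_dict).prepare()
print(prepared.url)  # http://httpbin.org/get?key1=value1&key2=value2
```

Calling `requests.get(url, params=key_dict)` would send the same URL; after a real request it is available as `r.url`.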
Reading notes on 《python网络爬虫从入门到实践》, chp2: A first simple crawler
Environment: Win10
IDE: PyCharm
1. Fetch the page
import requests
link = "http://www.santostang.com"
r = requests.get(link)
print(r)
print(r.text)
2. Extract the required data
import requests
from bs4 import BeautifulSoup
link = "http://www.santostang.com"
r = reques…
Original · 2021-11-22 16:33:07 · 217 views · 0 comments
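The extraction step is cut off in the excerpt, so the following is only a sketch of how BeautifulSoup is typically used once the page text is available. The inline HTML, the `post-title` class, and the variable names are hypothetical stand-ins for the fetched `r.text`; the real page's structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a fetched page (r.text in the excerpt).
html = """
<html><body>
  <h1 class="post-title"><a href="/post-1">First post</a></h1>
  <h1 class="post-title"><a href="/post-2">Second post</a></h1>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
# find() returns the first matching tag; .a is the first <a> inside it.
first_title = soup.find("h1", class_="post-title").a.text.strip()
# find_all() returns every match, used here to collect the link targets.
links = [h1.a["href"] for h1 in soup.find_all("h1", class_="post-title")]
print(first_title)
print(links)
```

Parsing an inline string keeps the example independent of the network; with a live page, `html` would be replaced by `r.text`.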