Notes
小妖怪_
This author is lazy and hasn't left anything behind…
Web-scraping notes: Selenium, part 2

from selenium import webdriver
from time import sleep
from lxml import etree

url = 'http://125.35.6.84:81/xk/'
bro = webdriver.Chrome(executable_path='./chromedriver')
bro.get(url)
page_text_list = []  # page source of every page
sleep(2)
# capture the page source of the current page
page_text = bro…

Original · 2020-06-17 09:47:54 · 167 views · 0 comments
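The preview cuts off before the pagination loop. A minimal sketch of the whole pattern, assuming the site paginates via a "next page" button (the button XPath below is hypothetical and must be taken from the real page); the lxml parsing step is shown on a stand-in page since it works the same on any captured source:

```python
from lxml import etree

def collect_page_sources(bro, n_pages, next_btn_xpath, wait=2.0):
    """Collect page_source from n_pages, clicking a 'next' button in between.
    next_btn_xpath is an assumed locator -- adjust it to the real site."""
    from time import sleep
    sources = []
    for _ in range(n_pages):
        sleep(wait)  # crude fixed wait; WebDriverWait would be more robust
        sources.append(bro.page_source)
        bro.find_element_by_xpath(next_btn_xpath).click()
    return sources

# Parsing each captured source, demonstrated on a stand-in page:
sample = ('<html><body><ul><li><a href="/a">Firm A</a></li>'
          '<li><a href="/b">Firm B</a></li></ul></body></html>')
tree = etree.HTML(sample)
names = tree.xpath('//ul/li/a/text()')  # ['Firm A', 'Firm B']
```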
Web-scraping notes: Selenium, part 1

# put the chromedriver file next to this script
from selenium import webdriver
from time import sleep

# instantiate a browser object from the browser driver
bro = webdriver.Chrome(executable_path='./chromedriver')
# request the target site
bro.get('https://www.jd.com/')
search_text = bro.find_element_by_xpath('//*[@id="key"]')
searc…

Original · 2020-06-16 14:23:24 · 122 views · 0 comments
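The preview cuts off at `searc…`; the remaining steps are presumably `send_keys` on the search box and a click on the search button. Note that `find_element_by_xpath` was removed in Selenium 4 in favour of `find_element(By.XPATH, ...)`. A hedged sketch of the full flow (the search-button XPath is an assumption; running it needs a local chromedriver and a real browser, so it is wrapped in a function that is not executed here):

```python
def search_jd(keyword, driver_path='./chromedriver'):
    """Open jd.com, type keyword into the search box, submit, return the HTML."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from time import sleep
    bro = webdriver.Chrome(driver_path)  # Selenium 4 prefers a Service object
    bro.get('https://www.jd.com/')
    box = bro.find_element(By.XPATH, '//*[@id="key"]')  # search box from the post
    box.send_keys(keyword)
    # the button locator is an assumption -- inspect the page for the real one
    bro.find_element(By.XPATH, '//*[@id="search"]//button').click()
    sleep(2)  # let results render
    html = bro.page_source
    bro.quit()
    return html
```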
Web-scraping notes: XPath practice

from lxml import etree
import requests
import os

dirName = 'Girlslib'
if not os.path.exists(dirName):
    os.mkdir(dirName)
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '…

Original · 2020-05-26 22:05:48 · 251 views · 0 comments
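The pattern behind this exercise is: create the download directory, fetch the list page, pull image `src` attributes out with XPath, then save each image. The directory and parsing steps run offline below on a stand-in page (the XPath and `example.com` base URL are assumptions; the real ones depend on the site's markup):

```python
import os
from urllib.parse import urljoin
from lxml import etree

dirName = 'Girlslib'
os.makedirs(dirName, exist_ok=True)  # one call replaces the exists()/mkdir() pair

# stand-in for a fetched list page; the real XPath must match the live markup
sample = ('<div class="list"><img src="/img/1.jpg" alt="one">'
          '<img src="/img/2.jpg" alt="two"></div>')
tree = etree.HTML(sample)
srcs = tree.xpath('//div[@class="list"]/img/@src')        # relative paths
img_urls = [urljoin('http://example.com/', s) for s in srcs]
save_paths = [os.path.join(dirName, os.path.basename(u)) for u in img_urls]
# each URL would then be fetched with requests.get(u, headers=headers).content
```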
Web-scraping notes: downloading a novel with BeautifulSoup

import requests
from bs4 import BeautifulSoup

fp = open('./sanguo.txt', 'w', encoding='utf-8')
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'
}
# get the chapter url…

Original · 2020-05-24 19:35:51 · 254 views · 0 comments
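The flow here is: fetch the catalogue page, select the chapter links, fetch each chapter, and append title plus text to sanguo.txt. The selection and write steps are sketched below on a stand-in catalogue; the `.book-mulu > ul > li > a` selector and `example.com` host are assumptions to be checked against the live page:

```python
from bs4 import BeautifulSoup

catalogue = '''
<div class="book-mulu"><ul>
  <li><a href="/chapter/1.html">Chapter 1</a></li>
  <li><a href="/chapter/2.html">Chapter 2</a></li>
</ul></div>'''
soup = BeautifulSoup(catalogue, 'html.parser')
# selector is an assumption based on this kind of catalogue markup
chapters = [(a.string, 'http://example.com' + a['href'])
            for a in soup.select('.book-mulu > ul > li > a')]

with open('./sanguo.txt', 'w', encoding='utf-8') as fp:
    for title, url in chapters:
        # detail = requests.get(url, headers=headers).text  # then parse body text
        fp.write(title + '\n')
```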
Web-scraping notes: BeautifulSoup

Original · 2020-05-23 17:51:02 · 114 views · 0 comments
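This entry's preview is empty; a minimal refresher of the BeautifulSoup accessors the later posts rely on, run against an inline sample document:

```python
from bs4 import BeautifulSoup

html = '<div id="main"><p class="intro">hello</p><a href="/next">more</a></div>'
soup = BeautifulSoup(html, 'html.parser')

first_p = soup.find('p', class_='intro')      # first matching tag
text = first_p.text                           # tag text -> 'hello'
link = soup.select_one('#main > a')['href']   # CSS selector -> '/next'
child_tags = [t.name for t in soup.div.children]  # direct children of <div>
```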
Web-scraping notes: downloading 'campus belle' photos (uglier than my wife)

import requests
import re
import urllib
import os

dirName = 'Libs'
if not os.path.exists(dirName):
    os.mkdir(dirName)
# http://www.521609.com/uploads/allimg/111019/11046303404-1-lp.jpg
url = 'http://www.521609.com/qingchunmeinv/'
headers = {
    'use…

Original · 2020-05-23 17:18:51 · 238 views · 0 comments
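This snippet pairs requests with `re` rather than a parser, so the image paths are presumably pulled out with `re.findall` and joined onto the site host. A sketch on a stand-in list item (the capture-group layout of the pattern is an assumption about the site's markup):

```python
import re
from urllib.parse import urljoin

page = ('<li><a href="/qingchunmeinv/1.html">'
        '<img src="/uploads/allimg/111019/11046303404-1-lp.jpg"></a></li>')
# pattern mirrors a typical <li>...<img src="..."> list item -- an assumption
ex = r'<li>.*?<img src="(.*?)".*?</li>'
srcs = re.findall(ex, page, re.S)  # re.S lets .*? span line breaks
full_urls = [urljoin('http://www.521609.com/', s) for s in srcs]
```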
Web-scraping notes: two ways to download an image

import requests
import urllib

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'
}
img_url = 'https://t8.baidu.com/it/u=1484500186,1503043093&fm=79&app…

Original · 2020-05-23 17:15:59 · 304 views · 0 comments
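The two methods the title refers to are writing `requests.get(...).content` bytes to disk versus `urllib.request.urlretrieve`. One practical difference: `urlretrieve` takes no headers argument, so a custom User-Agent has to go through an installed opener. Both are sketched as functions (not executed here, since they need the network):

```python
import urllib.request

def download_with_requests(img_url, path, headers):
    """Method 1: fetch raw bytes with requests and write them in binary mode."""
    import requests
    data = requests.get(img_url, headers=headers).content
    with open(path, 'wb') as f:
        f.write(data)

def download_with_urllib(img_url, path, user_agent):
    """Method 2: urlretrieve; the UA must be set on a globally installed opener."""
    opener = urllib.request.build_opener()
    opener.addheaders = [('User-Agent', user_agent)]
    urllib.request.install_opener(opener)
    urllib.request.urlretrieve(img_url, filename=path)
```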
Web-scraping exercise: cosmetics production-licence information platform

# url = 'http://125.35.6.84:81/xk/'
# Request URL: http://125.35.6.84:81/xk/itownet/portalAction.do?method=getXkzsById
# id: ff83aff95c5541cdab5ca6e847514f88
import requests

url = 'http://125.35.6.84:81/xk/itownet/portalAction.do?method=getXkzsList'
he…

Original · 2020-05-21 23:27:42 · 2803 views · 0 comments
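The comments show the site's two-step ajax design: `getXkzsList` returns a paged company list, and `getXkzsById` returns one company's detail by id. The payload builders below are offline-testable; the field names are taken from the XHR record this kind of page submits and should be verified in the browser's Network tab:

```python
def list_payload(page, keyword=''):
    # field names as observed for getXkzsList -- verify in devtools
    return {
        'on': 'true',
        'page': str(page),
        'pageSize': '15',
        'productName': keyword,
        'conditionType': '1',
        'applyname': '',
        'applysn': '',
    }

def detail_payload(company_id):
    # e.g. 'ff83aff95c5541cdab5ca6e847514f88' from the comment above
    return {'id': company_id}

# network usage sketch:
# ids = [d['ID'] for d in
#        requests.post(url, data=list_payload(1), headers=headers).json()['list']]
```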
Web-scraping notes: KFC restaurant lookup

import requests

# url = 'http://www.kfc.com.cn/kfccda/storelist/index.aspx'
url = 'http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword'
for page in range(1, 9):
    data = {
        'cname': '',
        'pid': '',
        'keyword': '北京',
…

Original · 2020-05-20 23:33:31 · 921 views · 0 comments
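The `GetStoreList.ashx` endpoint takes URL-encoded form data, and the page index presumably drives the loop in the truncated snippet. A payload builder for all eight pages, testable offline (the `pageIndex`/`pageSize` field names are assumptions modelled on the form the page submits):

```python
def kfc_payload(page, keyword='北京', page_size=10):
    # field names modelled on the page's form data -- verify in devtools
    return {
        'cname': '',
        'pid': '',
        'keyword': keyword,
        'pageIndex': str(page),
        'pageSize': str(page_size),
    }

payloads = [kfc_payload(p) for p in range(1, 9)]
# each would be sent as: requests.post(url, data=payload, headers=headers).text
```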
Web-scraping notes: Douban movie rankings

import requests

url = 'https://movie.douban.com/typerank?type_name=%E5%8A%A8%E4%BD%9C&type=5&interval_id=100:90&action='
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.39…

Original · 2020-05-20 17:16:01 · 346 views · 0 comments
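The typerank page renders its list from an ajax endpoint rather than the URL above; `/j/chart/top_list` is the endpoint such pages commonly call (verify in the Network tab), and its query parameters mirror the page URL's `type` and `interval_id`. Building the request URL is testable offline:

```python
from urllib.parse import urlencode

# endpoint name is an assumption -- confirm it in the browser's Network tab
ajax_url = 'https://movie.douban.com/j/chart/top_list'
params = {
    'type': '5',             # carried over from the page URL (action movies)
    'interval_id': '100:90', # rating interval, also from the page URL
    'action': '',
    'start': '0',            # offset of the first movie to return
    'limit': '20',           # how many movies per request
}
full_url = ajax_url + '?' + urlencode(params)
# network usage: requests.get(ajax_url, params=params, headers=headers).json()
```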
Web-scraping notes: the requests module

import requests

keyWord = input('enter a key word: ')
params = {
    'query': keyWord
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}
url = 'https…

Original · 2020-05-20 15:28:07 · 133 views · 0 comments
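The point of the `params` dict is that requests URL-encodes it and appends it to the URL for you. Since the preview cuts off the real endpoint, a stand-in URL is used below; `Request(...).prepare()` builds the final request without sending anything, so the encoding is visible offline:

```python
import requests

params = {'query': 'python'}
headers = {'User-Agent': 'Mozilla/5.0'}   # stand-in UA string
url = 'https://example.com/web'           # stand-in: the post's URL is truncated

# .prepare() assembles the request (URL + query string) without sending it
prepared = requests.Request('GET', url, params=params, headers=headers).prepare()
# the live call would be: requests.get(url, params=params, headers=headers)
```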