Original: Downloading dynamically loaded content from Baidu Wenku
import time,random
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
url = 'https://wenku.baidu.com/view/46dcaa1c9a6648d7c1c708a1284ac850ad0204d4.html#'
options = webdriver.ChromeOptions()
options.add_argument("'User-Agent':'M
2020-10-14 21:54:29 121
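The excerpt above passes a header-style string to `options.add_argument`. Chrome's command line takes the user agent as a `--user-agent=` switch instead, so a minimal sketch of building that argument might look like the following. The helper name `chrome_user_agent_arg` and the sample user-agent string are made up for illustration and are not from the post.

```python
def chrome_user_agent_arg(user_agent: str) -> str:
    """Return the Chrome command-line switch that overrides the user agent."""
    return f"--user-agent={user_agent}"


# Hypothetical sample value, not the truncated string from the excerpt.
SAMPLE_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"
arg = chrome_user_agent_arg(SAMPLE_UA)
print(arg)

# With Selenium installed, the argument would be passed like this:
#   options = webdriver.ChromeOptions()
#   options.add_argument(arg)
#   browser = webdriver.Chrome(options=options)
```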
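Original: scrapy notes
scrapy startproject name  # create a project
scrapy genspider name www.xxx.com  # create a spider
scrapy genspider -t crawl name www.xxx.com  # create a spider for whole-site crawling
scrapy crawl name  # run the spider
Downloader Middlewares  # downloader middleware; can set proxy IPs, user agents, and other download extensions
Spider Middlewares  # can define requests and intercept or modify responses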
2020-10-07 09:54:16 90
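The scrapy notes above say that a downloader middleware can set proxy IPs and user agents. A rough sketch of such a middleware is shown below, assuming the usual `process_request(request, spider)` hook and the `request.meta['proxy']` key that Scrapy's built-in proxy handling reads. The class name, the user-agent strings, and the proxy address are all hypothetical, and a stand-in request object is used so the snippet runs without Scrapy installed.

```python
import random
from types import SimpleNamespace

# Hypothetical values for illustration only.
USER_AGENTS = [
    "ExampleBrowser/1.0 (X11; Linux x86_64)",
    "ExampleBrowser/2.0 (Windows NT 10.0; Win64; x64)",
]
PROXIES = ["http://127.0.0.1:8888"]


class RandomUaProxyMiddleware:
    """Downloader middleware sketch: pick a user agent and a proxy per request."""

    def process_request(self, request, spider):
        # Overwrite the User-Agent header on the outgoing request.
        request.headers["User-Agent"] = random.choice(USER_AGENTS)
        # Scrapy routes the request through the proxy named in meta['proxy'].
        request.meta["proxy"] = random.choice(PROXIES)
        # Returning None lets Scrapy continue processing the request.
        return None


# Stand-in for a scrapy Request, exposing only headers and meta.
fake_request = SimpleNamespace(headers={}, meta={})
RandomUaProxyMiddleware().process_request(fake_request, spider=None)
print(fake_request.headers, fake_request.meta)
```

In a real project the class would be enabled through the `DOWNLOADER_MIDDLEWARES` setting in `settings.py`, keyed by its import path.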
Original: Multitasking with coroutines
import time
import asyncio
import aiohttp
start = time.time()
urls = ['https://www.baidu.com/','https://www.kugou.com/','https://www.bootcss.com/']
async def get_page(url):
    async with aiohttp.ClientSession() as session:
        # get() and post() both have
2020-09-20 16:31:56 94
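The excerpt above is cut off before the coroutines are scheduled. As a sketch of how several coroutines can run concurrently, the snippet below uses `asyncio.create_task` and `asyncio.gather`. It swaps the `aiohttp` request for `asyncio.sleep` so it runs offline with the standard library only; `fake_get_page`, `fetch_all`, and the delay value are assumptions for illustration, not the post's code.

```python
import asyncio
import time


async def fake_get_page(url: str, delay: float = 0.05) -> str:
    """Stand-in for a network request: wait briefly, then return a label."""
    await asyncio.sleep(delay)
    return f"fetched {url}"


async def fetch_all(urls):
    # Schedule every coroutine as a task so they all wait concurrently.
    tasks = [asyncio.create_task(fake_get_page(u)) for u in urls]
    # gather() returns the results in the same order as the tasks.
    return await asyncio.gather(*tasks)


urls = ["https://www.baidu.com/", "https://www.kugou.com/", "https://www.bootcss.com/"]
start = time.time()
results = asyncio.run(fetch_all(urls))
print(results)
print("elapsed:", time.time() - start)
```

Because the waits overlap, the total time should be close to a single delay rather than the sum of all three.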
Original: POST requests with requests
import time
import random
import requests
import json
url = 'http://scxk.nmpa.gov.cn:81/xk/itownet/portalAction.do?method=getXkzsList'
headers = { 'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84
2020-09-19 22:27:52 164
Original: Chaojiying sample code
#!/usr/bin/env python
# coding:utf-8
import requests
from hashlib import md5
class Chaojiying_Client(object):
    def __init__(self, username, password, soft_id):
        self.username = username
        password = password.encode('utf8')
        self.passwo
2020-09-17 22:14:21 454
Original: Captcha recognition with Chaojiying
import requests
from lxml import etree
from chaojiying import Chaojiying_Client
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'}
url = 'https://so.gushiwen.cn/us
2020-09-16 22:57:33 801
Original: Scraping elevator company names and URLs
import requests
from bs4 import BeautifulSoup
import lxml
def downpagedata(n):
    url = 'https://www.maigoo.com/brand/list_1111.html?maxpage=&tabnum=&sort=&defaultids=&start=&thirdaction=&subaction=resultlist&action=searchli
2020-08-21 21:20:29 101
Original: Scraping PDFs
import time
import requests
from bs4 import BeautifulSoup
from lxml import etree
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
hd = "User-Agent: Mozilla/5.0 (Win...
2020-08-21 21:20:11 302
Original: requests, bs4
import requests
url_re = "http://www.baidu.com"
hd = {'User-Agent' : 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'}
kv ...
2020-08-21 21:19:56 111
Original: seelight automation
"""Enter the starting value and step size, run the experiment, and collect the data."""
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import time
class SeptucExp(object):
    def __init__(self):
        self = self
    def login(url,user,pw):
        browser..
2020-08-20 11:50:59 537