Simulating login in Selenium with known cookies; downloading a PDF with Selenium, requests, or a Session; getting HTML elements with Selenium; getting HTML elements with JavaScript

Simulating login in Selenium with known cookies

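Methods and tools to know beforehand
(1) An online URL decoder
(2) An online JSON parser
(3) The browser developer tools
(4) Sending requests with requests
(5) Sending requests with a Session
(6) Simulating login with Selenium
(7) Selenium's get_cookies method. Points to note:
The cookies returned by get_cookies right after Selenium logs in to the site's home page are not the same as the cookies you get after clicking the download-PDF button.
Selenium cookies carry extra parameters such as domain and path, so they must be converted before requests can use them. One way is to iterate over the Selenium cookies and keep only the name/value pairs, producing a cookie dict or a cookie string. Another is to use a helper such as cookiejar_from_dict, which turns a name/value dict into a cookie jar that requests accepts.
(8) When sending a request, the cookie can be passed as a "Cookie": "cookie value" pair in headers;
it can also be a dict passed directly to requests or a Session as the cookies parameter.
(9) How to download a PDF in three ways: with requests, with a Session, and with Selenium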
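
Points (7) to (9) can be sketched roughly as follows. This is a minimal sketch, not the original author's code: the helper names (selenium_cookies_to_dict, cookie_header, build_session, download_pdf) and the sample cookie values are made up for illustration, and it assumes the requests package is installed when build_session is called.

```python
def selenium_cookies_to_dict(selenium_cookies):
    """Keep only the name/value pair of each Selenium cookie."""
    return {c["name"]: c["value"] for c in selenium_cookies}


def cookie_header(cookie_dict):
    """Join name/value pairs into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookie_dict.items())


def build_session(selenium_cookies):
    """Return a requests.Session carrying the converted cookies."""
    # Imported here so the two helpers above work without requests installed.
    import requests
    from requests.utils import cookiejar_from_dict

    session = requests.Session()
    session.cookies = cookiejar_from_dict(selenium_cookies_to_dict(selenium_cookies))
    return session


def download_pdf(session, pdf_url, target_path):
    """Stream the response to disk; raises on an HTTP error status."""
    response = session.get(pdf_url, stream=True, timeout=30)
    response.raise_for_status()
    with open(target_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)


# Made-up sample in the shape returned by driver.get_cookies().
sample_cookies = [
    {"domain": "example.com", "name": "loginFlag", "path": "/", "value": "true"},
    {"domain": "example.com", "name": "USER_SEQ", "path": "/", "value": "42"},
]
cookie_dict = selenium_cookies_to_dict(sample_cookies)
headers = {"Cookie": cookie_header(cookie_dict)}
```

Either form can be handed to requests: headers goes in as headers=headers, while cookie_dict goes in as cookies=cookie_dict. Note that a jar built from a plain dict no longer records domain or path, so the session would send these cookies to whatever host it contacts.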

import json
from selenium import webdriver

option = webdriver.ChromeOptions()  # Create the Chrome options object
"""
(1) Add startup arguments
option.add_argument()
(2) Add extensions
option.add_extension()
option.add_encoded_extension()
(3) Add experimental options
option.add_experimental_option()
"""

"""
Commonly used arguments:
option.add_argument("--headless")  # Run Chrome headless, with no visible window
option.add_argument('--disable-javascript')  # Disable JavaScript
option.add_argument('--no-sandbox')  # Disable the Chrome sandbox (often needed when running as root, e.g. in a container)
option.binary_location = r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"  # Point to a specific browser binary
option.add_argument('lang=zh_CN.UTF-8')  # Set the browser locale to zh_CN with UTF-8
option.add_argument('--hide-scrollbars')  # Hide scrollbars, useful on some unusual pages
option.add_argument('blink-settings=imagesEnabled=false')  # Do not load images
option.add_argument('--window-size=1980,1060')  # Set the window size as width,height
option.add_argument("--proxy-server=http://xxx.xxx.xx.xx:xxxx")  # Use a proxy
option.add_argument('--ignore-certificate-errors')  # Ignore HTTPS certificate errors
"""

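"""
Many sites now try to detect Selenium. Scripts on the page can read window.navigator.webdriver to tell whether the browser is under automation.
It returns one of two results:
(1) undefined (false in recent Chrome versions): the browser is not driven by Selenium
(2) true: the browser is driven by Selenium
One way to avoid detection is to start Chrome without the automation switch ("developer mode"):
option.add_experimental_option("excludeSwitches", ['enable-automation'])
Newer Chrome versions may additionally need option.add_argument('--disable-blink-features=AutomationControlled').
"""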
option.add_experimental_option("excludeSwitches", ['enable-automation'])

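"""
Chrome file downloads: set the download directory and suppress dialogs
(1) download.default_directory: the download directory
(2) profile.default_content_settings.popups: set to 0 to block all pop-up windows
(3) download.prompt_for_download: set to False to suppress the "save as" dialog; use together with download.directory_upgrade
(4) download.directory_upgrade: set to True; use together with download.prompt_for_download
(5) plugins.always_open_pdf_externally: set to True to download PDFs instead of opening them in the built-in viewer
(6) safebrowsing.enabled: set to True to suppress the security warning shown for some file types (e.g. XML files flagged as potentially harmful)
"""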
prefs = {
    'download.default_directory': "D:\\python_project\\test",
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "plugins.always_open_pdf_externally": True,
    "safebrowsing.enabled": True
}
option.add_experimental_option('prefs', prefs)

driver = webdriver.Chrome(options=option)  # Create the browser with the options above (the chrome_options keyword is deprecated); for IE: driver = webdriver.Ie()
# driver.set_window_size(width=1000, height=800, windowHandle="current")  # Set the browser width and height
# driver.maximize_window()  # Maximize the browser window
# driver.refresh()  # Refresh the current page

url = 'https://etax.guangxi.chinatax.gov.cn:9723/web/dzswj/taxclient/main.html'
pdf_url = "https://wsbs.guangxi.chinatax.gov.cn:7006/download.sword?ctrl=SB702SbdyCtrl_printByPzxh&pzxh=10014522000024692257&sbuuid=3B1B74B15F10288A130A980ACE7682D9&format=PDF&skssqq=2022-07-01 00:00:00&skssqz=2022-07-31 00:00:00&yzpzzlDm=BDA0610606"
driver.get(url)
driver.delete_all_cookies()  # Delete all existing cookies
# with open('cookies_fofa.json', 'r', encoding='utf-8') as f:
#     listCookies = json.loads(f.read())     # loads parses the JSON string into Python objects
from_driver_cookie = [
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'BSRMC', 'path': '//', 'secure': False,
     'value': '%E5%87%8C%E5%AA%9B%E5%AA%9B'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'BSR_YDDHHM', 'path': '//', 'secure': False,
     'value': '15928033684'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'CDDW', 'path': '//', 'secure': False,
     'value': '%7B%22GZFW%22%3A%22DXDL_CADL%22%7D'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'CUR_USERID', 'path': '//', 'secure': False,
     'value': '2104121020000001'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'DJXH', 'path': '//', 'secure': False,
     'value': '10214501000000519139'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'DJZCLX_DM', 'path': '//', 'secure': False,
     'value': '159'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'FDDBRMC', 'path': '//', 'secure': False,
     'value': '%E7%86%8A%E5%BE%B7%E8%B6%85'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'JCPT_USER', 'path': '//', 'secure': False,
     'value': '%7B%22loginTime%22%3A%2220220927160352%22%2C%22CS_DLLX%22%3A%22QYYH_DX%22%2C%22YHXXLY_1%22%3A%220%22%2C%22YHID%22%3A%222104121020000001%22%2C%22NSRMC%22%3A%22%E5%8D%97%E5%AE%81%E5%B7%B4%E8%BF%AA%E5%BE%B7%E6%94%BF%E5%AE%A0%E7%89%A9%E5%8C%BB%E9%99%A2%E6%9C%89%E9%99%90%E5%85%AC%E5%8F%B8%22%2C%22DLR%22%3A%7B%22CARDNUM%22%3A%22513123199401311824%22%2C%22PHONE%22%3A%2218668405489%22%2C%22NAME%22%3A%22%E9%9F%A9%E5%BA%94%E5%90%9B%22%2C%22ROLE%22%3A%224%22%2C%22SFZJLX_DM%22%3A%22201%22%7D%2C%22QYID%22%3A%222104121020000001%22%2C%22DL_YHSF%22%3A%22QY_DX_GPR%22%2C%22CZRY_DM%22%3A%222104121020000001%22%2C%22SCJYDLXDH%22%3A%2215307714151%22%2C%22isDefaultPwdOrCode%22%3Anull%2C%22ZZJG_MC%22%3A%22%E5%9B%BD%E5%AE%B6%E7%A8%8E%E5%8A%A1%E6%80%BB%E5%B1%80%E5%8D%97%E5%AE%81%E5%B8%82%E8%89%AF%E5%BA%86%E5%8C%BA%E7%A8%8E%E5%8A%A1%E5%B1%80%E5%A4%A7%E6%B2%99%E7%94%B0%E7%A8%8E%E5%8A%A1%E5%88%86%E5%B1%80%22%2C%22SCJYDZXZQHSZ_DM%22%3Anull%2C%22DLRK%22%3A%220%22%2C%22CS_DLMC%22%3A%2291450108MA5QD6NE80%22%2C%22useDskl%22%3A%22Y%22%2C%22SCJYDZ%22%3A%22%E5%8D%97%E5%AE%81%E5%B8%82%E8%89%AF%E5%BA%86%E5%8C%BA%E5%BE%B7%E6%94%BF%E8%B7%AF135%E5%8F%B7%22%2C%22NSRSBH%22%3A%2291450108MA5QD6NE80%22%2C%22BSRMC%22%3A%22%E5%87%8C%E5%AA%9B%E5%AA%9B%22%2C%22SWJGLX%22%3A%221%22%2C%22COL2%22%3A%22%22%2C%22COL1%22%3A%22%22%2C%22FDDBRMC%22%3A%22%E7%86%8A%E5%BE%B7%E8%B6%85%22%2C%22DLMC%22%3A%2291450108MA5QD6NE80%22%2C%22MAIN_ZGSWSKFJ_DM%22%3A%2214501083000%22%2C%22DJXH%22%3A%2210214501000000519139%22%2C%22YHXXLY%22%3A%221%22%2C%22DJZCLX_DM%22%3A%22159%22%2C%22ZGSWSKFJ_DM%22%3A%2214501083000%22%2C%22SSDABH%22%3A%2210214501000000251334%22%2C%22ZZJG_DM%22%3A%2214501083000%22%2C%22CZRY_MC%22%3A%22%E5%8D%97%E5%AE%81%E5%B7%B4%E8%BF%AA%E5%BE%B7%E6%94%BF%E5%AE%A0%E7%89%A9%E5%8C%BB%E9%99%A2%E6%9C%89%E9%99%90%E5%85%AC%E5%8F%B8%22%2C%22code%22%3A%221%22%2C%22ACCOUNTID%22%3A%221%22%2C%22NSRZT_DM%22%3A%2203%22%2C%22KZZTDJLX_DM%22%3A%221110%22%2C%22ZDSY_FLAG%22%3A%22N%22%2C%22SNSW_DM%22%3Anull%2C%22NSRDZDAH%22%3A%221
0214501000000519139%22%2C%22SJ_YHID%22%3A%222104121020000001%22%2C%22SHXYDM%22%3A%2291450108MA5QD6NE80%22%2C%22GS_NSRZT_DM%22%3A%2203%22%2C%22USERNAME%22%3Anull%2C%22BSR_YDDHHM%22%3A%2215928033684%22%2C%22YHLX_DM%22%3A%221%22%2C%22DQ_YHSF%22%3A%22QY%22%2C%22USERID%22%3A%222104121020000001%22%7D'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'NSRDZDAH', 'path': '//', 'secure': False,
     'value': '10214501000000519139'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'QX_USER', 'path': '//', 'secure': False,
     'value': '%7B%22loginTime%22%3A%2220220927160352%22%2C%22CS_DLLX%22%3A%22QYYH_DX%22%2C%22YHXXLY_1%22%3A%220%22%2C%22YHID%22%3A%222104121020000001%22%2C%22NSRMC%22%3A%22%E5%8D%97%E5%AE%81%E5%B7%B4%E8%BF%AA%E5%BE%B7%E6%94%BF%E5%AE%A0%E7%89%A9%E5%8C%BB%E9%99%A2%E6%9C%89%E9%99%90%E5%85%AC%E5%8F%B8%22%2C%22DLR%22%3A%7B%22CARDNUM%22%3A%22513123199401311824%22%2C%22PHONE%22%3A%2218668405489%22%2C%22NAME%22%3A%22%E9%9F%A9%E5%BA%94%E5%90%9B%22%2C%22ROLE%22%3A%224%22%2C%22SFZJLX_DM%22%3A%22201%22%7D%2C%22QYID%22%3A%222104121020000001%22%2C%22DL_YHSF%22%3A%22QY_DX_GPR%22%2C%22CZRY_DM%22%3A%222104121020000001%22%2C%22SCJYDLXDH%22%3A%2215307714151%22%2C%22isDefaultPwdOrCode%22%3Anull%2C%22ZZJG_MC%22%3A%22%E5%9B%BD%E5%AE%B6%E7%A8%8E%E5%8A%A1%E6%80%BB%E5%B1%80%E5%8D%97%E5%AE%81%E5%B8%82%E8%89%AF%E5%BA%86%E5%8C%BA%E7%A8%8E%E5%8A%A1%E5%B1%80%E5%A4%A7%E6%B2%99%E7%94%B0%E7%A8%8E%E5%8A%A1%E5%88%86%E5%B1%80%22%2C%22SCJYDZXZQHSZ_DM%22%3Anull%2C%22DLRK%22%3A%220%22%2C%22CS_DLMC%22%3A%2291450108MA5QD6NE80%22%2C%22useDskl%22%3A%22Y%22%2C%22SCJYDZ%22%3A%22%E5%8D%97%E5%AE%81%E5%B8%82%E8%89%AF%E5%BA%86%E5%8C%BA%E5%BE%B7%E6%94%BF%E8%B7%AF135%E5%8F%B7%22%2C%22NSRSBH%22%3A%2291450108MA5QD6NE80%22%2C%22BSRMC%22%3A%22%E5%87%8C%E5%AA%9B%E5%AA%9B%22%2C%22SWJGLX%22%3A%221%22%2C%22COL2%22%3A%22%22%2C%22COL1%22%3A%22%22%2C%22FDDBRMC%22%3A%22%E7%86%8A%E5%BE%B7%E8%B6%85%22%2C%22DLMC%22%3A%2291450108MA5QD6NE80%22%2C%22MAIN_ZGSWSKFJ_DM%22%3A%2214501083000%22%2C%22DJXH%22%3A%2210214501000000519139%22%2C%22YHXXLY%22%3A%221%22%2C%22DJZCLX_DM%22%3A%22159%22%2C%22ZGSWSKFJ_DM%22%3A%2214501083000%22%2C%22SSDABH%22%3A%2210214501000000251334%22%2C%22ZZJG_DM%22%3A%2214501083000%22%2C%22CZRY_MC%22%3A%22%E5%8D%97%E5%AE%81%E5%B7%B4%E8%BF%AA%E5%BE%B7%E6%94%BF%E5%AE%A0%E7%89%A9%E5%8C%BB%E9%99%A2%E6%9C%89%E9%99%90%E5%85%AC%E5%8F%B8%22%2C%22code%22%3A%221%22%2C%22ACCOUNTID%22%3A%221%22%2C%22NSRZT_DM%22%3A%2203%22%2C%22KZZTDJLX_DM%22%3A%221110%22%2C%22ZDSY_FLAG%22%3A%22N%22%2C%22SNSW_DM%22%3Anull%2C%22NSRDZDAH%22%3A%221
0214501000000519139%22%2C%22SJ_YHID%22%3A%222104121020000001%22%2C%22SHXYDM%22%3A%2291450108MA5QD6NE80%22%2C%22GS_NSRZT_DM%22%3A%2203%22%2C%22USERNAME%22%3Anull%2C%22BSR_YDDHHM%22%3A%2215928033684%22%2C%22YHLX_DM%22%3A%221%22%2C%22DQ_YHSF%22%3A%22QY%22%2C%22USERID%22%3A%222104121020000001%22%7D'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'SB_YJFJS_BYHDL', 'path': '//',
     'secure': False, 'value': '1'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'SCJYDZ', 'path': '//', 'secure': False,
     'value': '%E5%8D%97%E5%AE%81%E5%B8%82%E8%89%AF%E5%BA%86%E5%8C%BA%E5%BE%B7%E6%94%BF%E8%B7%AF135%E5%8F%B7'},
    {'domain': 'etax.guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'SF_cookie_21', 'path': '//',
     'secure': False, 'value': '60610349'},
    {'domain': 'etax.guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'SF_cookie_22', 'path': '//',
     'secure': False, 'value': '46121135'},
    {'domain': 'etax.guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'SF_cookie_24', 'path': '//',
     'secure': False, 'value': '11512033'},
    {'domain': 'etax.guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'SF_cookie_251', 'path': '//',
     'secure': False, 'value': '38731796'},
    {'domain': 'etax.guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'SF_cookie_252', 'path': '//',
     'secure': False, 'value': '18363739'},
    {'domain': 'etax.guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'SF_cookie_254', 'path': '//',
     'secure': False, 'value': '11009165'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'USER_CODE', 'path': '//', 'secure': False,
     'value': '91450108MA5QD6NE80'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'USER_SEQ', 'path': '//', 'secure': False,
     'value': '2104121020000001'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'WSBSFWTSESSIONID', 'path': '//',
     'secure': False, 'value': '35ae7fd3-820e-4f01-88ec-f4fdc697044c'},
    {'domain': 'guangxi.chinatax.gov.cn', 'expiry': 1666846620, 'httpOnly': False, 'name': 'isGuid', 'path': '//',
     'secure': False, 'value': '%7B%22index%22%3Atrue%2C%22main%22%3Atrue%7D'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'loginFlag', 'path': '//', 'secure': False,
     'value': 'true'},
    {'domain': 'etax.guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'loginTime', 'path': '//',
     'secure': False, 'value': '1664265832027'},
    {'domain': 'guangxi.chinatax.gov.cn', 'httpOnly': False, 'name': 'showFpdkJszcTips', 'path': '//',
     'secure': False, 'value': ''}]

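# from_driver_cookie was captured with Selenium after logging in through IE; here Chrome reuses it to simulate the login
for cookie in from_driver_cookie:
    driver.add_cookie({  # Same shape as the entries saved earlier in cookies_fofa.json
        'domain': cookie.get('domain'),
        'name': cookie.get('name'),
        'value': cookie.get('value'),
        'path': '/',  # 'expiry' is omitted, so each cookie is added as a session cookie
    })
driver.get(url)  # Reload the page now that the cookies are set
driver.find_element("xpath", '//a[text()="testsaveas.zip"]').click()  # "xpath" is the value of By.XPATH; find_element_by_xpath was removed in Selenium 4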

html = driver.page_source  # Get the page source
driver.close()  # Close the current window
driver.quit()  # Quit the browser
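
The script above closes the browser right after clicking the download link, so a slow download may be cut off. One possible way to guard against that is to poll the download directory before calling driver.close(). The sketch below assumes Chrome's convention of giving unfinished downloads a .crdownload suffix and assumes the directory is empty before the click; wait_for_download is a hypothetical helper, not part of Selenium.

```python
import os
import time


def wait_for_download(directory, timeout=30.0, poll=0.5):
    """Wait until `directory` holds at least one file and no *.crdownload partials.

    Returns the sorted list of finished file names, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(directory)
        finished = sorted(n for n in names if not n.endswith(".crdownload"))
        in_progress = any(n.endswith(".crdownload") for n in names)
        if finished and not in_progress:
            return finished
        if time.monotonic() >= deadline:
            raise TimeoutError(f"download in {directory!r} did not finish in {timeout}s")
        time.sleep(poll)
```

With the prefs shown earlier, the call would look like wait_for_download("D:\\python_project\\test"), placed between the click and driver.close().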

When downloading the PDF, the cookies returned by self.driver.get_cookies differ from those on the home page after login: there is one extra cookie, LESB_SESSION.
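
To see which cookies appear between the two stages, the two get_cookies results can be compared by name. This is only an illustrative sketch: added_cookie_names is a hypothetical helper, and the cookie values below are placeholders; only the names follow the observation above.

```python
def added_cookie_names(before, after):
    """Names present in `after` but not in `before` (both in get_cookies() shape)."""
    return {c["name"] for c in after} - {c["name"] for c in before}


# Placeholder values; in practice both lists come from driver.get_cookies().
home_page_cookies = [{"name": "loginFlag", "value": "true"}]
download_cookies = [
    {"name": "loginFlag", "value": "true"},
    {"name": "LESB_SESSION", "value": "placeholder"},
]
extra = added_cookie_names(home_page_cookies, download_cookies)
```

Any name that shows up in extra needs to be included when the cookies are handed to requests or a Session for the PDF request.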

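Three ways to get the value of an HTML element in JavaScript
Ways to get DOM elements in JS (8 in total):
By ID (getElementById)
By name attribute (getElementsByName)
By tag name (getElementsByTagName)
By class name (getElementsByClassName)
The html element (document.documentElement)
The body element (document.body)
A single element by selector (querySelector)
A group of elements by selector (querySelectorAll)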
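
These DOM lookups can also be run from Selenium through driver.execute_script. The following is a rough sketch under that assumption: build_lookup_script and find_with_js are hypothetical helper names, and the selector in the last line is just an example.

```python
import json

# DOM lookup methods from the list above that take a single string argument.
LOOKUP_METHODS = {
    "getElementById",
    "getElementsByName",
    "getElementsByTagName",
    "getElementsByClassName",
    "querySelector",
    "querySelectorAll",
}


def build_lookup_script(method, argument):
    """Return a JavaScript snippet that calls document.<method>(<argument>)."""
    if method not in LOOKUP_METHODS:
        raise ValueError(f"unsupported DOM method: {method}")
    # json.dumps yields a quoted, escaped JavaScript string literal.
    return f"return document.{method}({json.dumps(argument)});"


def find_with_js(driver, method, argument):
    """Run the lookup in the page; returned DOM elements come back as WebElement objects."""
    return driver.execute_script(build_lookup_script(method, argument))


script = build_lookup_script("querySelector", 'a[href$=".pdf"]')
```

document.documentElement and document.body take no argument, so they can be fetched directly with driver.execute_script("return document.body;").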
