https://www.cnblogs.com/dyfblog/p/10887940.html
1. Basic interceptor usage
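An interceptor applies to a single Page, that is, one tab in the browser, so interceptors have to be added again every time a new Page is created. Under the hood, an interceptor is simply a callback function registered for one of the page's events.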
For the list of available events, see pyppeteer.page.Page.Events.
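Since callbacks are bound to one Page, a small helper can make sure every new tab gets the same set of them. The following is only a minimal sketch: `new_page_with_handlers` and the `handlers` mapping are hypothetical names introduced for illustration, and the code relies on nothing beyond `browser.newPage()` and `page.on()`.

```python
async def new_page_with_handlers(browser, handlers):
    """Open a new tab and register every callback in `handlers` on it.

    `handlers` maps an event name (for example "request") to a callback.
    Hypothetical helper, not part of pyppeteer itself.
    """
    page = await browser.newPage()
    for event_name, callback in handlers.items():
        # Handlers registered here affect this tab only.
        page.on(event_name, callback)
    return page
```

Calling it as `page = await new_page_with_handlers(browser, {"response": on_response})` would replace a bare `browser.newPage()` call, where `on_response` stands for whatever callback you have defined.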
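Commonly used interceptors:
request: fired when the page issues a network request
response: fired when a network response is received
dialog: fired when the page opens a dialog such as an alert or a confirm box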
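The three events listed above could be wired up roughly as follows. This is a sketch under a few assumptions: `attach_handlers` and the `log` list are hypothetical names, request interception is assumed to be switched off (so the request callback only observes), and the dialog is always dismissed.

```python
import asyncio


def attach_handlers(page, log):
    """Record request/response/dialog events of one page into `log`."""

    def on_request(request):
        # Observation only: interception is assumed to be disabled here,
        # so the request is neither continued nor aborted by this callback.
        log.append(("request", request.method, request.url))

    def on_response(response):
        log.append(("response", response.status, response.url))

    async def on_dialog(dialog):
        log.append(("dialog", dialog.type, dialog.message))
        # Close the dialog so that the page is not left blocked.
        await dialog.dismiss()

    page.on("request", on_request)
    page.on("response", on_response)
    # Schedule the coroutine explicitly, so this does not depend on the
    # event emitter knowing how to run coroutine callbacks.
    page.on("dialog", lambda dialog: asyncio.ensure_future(on_dialog(dialog)))
```

Wrapping the coroutine in `asyncio.ensure_future` keeps the registered callback a plain function, which is a cautious choice when it is unclear how the installed event emitter treats `async def` callbacks.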
Using a request interceptor to modify a request:
# coding:utf8
import asyncio

from pyppeteer import launch
from pyppeteer.network_manager import Request

launch_args = {
    "headless": False,
    "args": [
        "--start-maximized",
        "--no-sandbox",
        "--disable-infobars",
        "--ignore-certificate-errors",
        "--log-level=3",
        "--enable-extensions",
        "--window-size=1920,1080",
        "--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36",
    ],
}


async def modify_url(request: Request):
    # Rewrite the URL of the Baidu home page request; pass everything else through.
    if request.url == "https://www.baidu.com/":
        await request.continue_({"url": "https://www.baidu.com/s?wd=ip&ie=utf-8"})
    else:
        # With interception enabled, every request must be continued explicitly.
        await request.continue_()


async def interception_test():
    # Launch the browser
    browser = await launch(**launch_args)
    # Open a new tab
    page = await browser.newPage()
    # Set the navigation timeout (in milliseconds)
    page.setDefaultNavigationTimeout(10 * 1000)
    # Set the viewport size
    await page.setViewport({"width": 1920, "height": 1040})
    # Enable request interception
    await page.setRequestInterception(True)
    # Register the interceptor
    # 1. Modify the URL of the request
    if 1:
        page.on("request", modify_url)
        await page.goto("https://www.baidu.com")
    await asyncio.sleep(10)