Python crawler: taking over a running browser for automation to avoid Zhihu's 1001

Latest recommended article published on 2023-06-23 05:31:05

ad_m1n


Tags: ctf, python, crawler

Article link: https://blog.csdn.net/hacker_zrq/article/details/119784839


First, add the path of the browser driver you are using to the system environment variables.
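A quick way to confirm that this step worked is to check whether the driver can be found on `PATH`. The sketch below uses only the standard library. The helper name `find_driver` is made up for this illustration, and `chromedriver` is assumed to be the driver in use.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable found on PATH, or None."""
    # shutil.which performs a PATH lookup (honouring PATHEXT on Windows)
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver is not on PATH; check the environment variable settings")
    else:
        print("chromedriver found at", path)
```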

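Then, in a cmd window opened with administrator privileges, start Chrome with a remote debugging port so that Selenium can attach to it later:

C:\Users\DELL>chrome.exe --remote-debugging-port=9222 --user-data-dir="C:\Users\DELL\Desktop\PrePare\知乎"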
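The same launch can be prepared from Python, and it is useful to check that the debugging port is actually listening before running the crawler. The following is a minimal sketch under stated assumptions: the helper names `build_chrome_command` and `is_port_open` are hypothetical, and `chrome.exe` is assumed to be reachable from the current shell, as in the command above.

```python
import socket


def build_chrome_command(chrome_path, port, user_data_dir):
    """Build the argument list for starting Chrome with remote debugging enabled."""
    return [
        chrome_path,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={user_data_dir}",
    ]


def is_port_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    cmd = build_chrome_command(
        "chrome.exe", 9222, r"C:\Users\DELL\Desktop\PrePare\知乎")
    print(cmd)
    # To actually start the browser, pass cmd to subprocess.Popen(cmd).
    print("debug port open:", is_port_open("127.0.0.1", 9222))
```

If `is_port_open` reports `False`, the crawler script will most likely fail to attach, so it is worth checking first.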

import json
from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By


class ZhiHu:
    def __init__(self):
        self.url = 'https://www.zhihu.com/'
        self.chrome_options = Options()
        # Attach to the Chrome instance started earlier; the port must match --remote-debugging-port
        self.chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
        # chromedriver is looked up on PATH, as configured in the first step
        self.browser = webdriver.Chrome(options=self.chrome_options)

    def get_start(self):
        self.browser.get(self.url)
        info = self.browser.get_cookies()  # get the cookies of the current session
        # print(info)
        with open("info.json", 'w', encoding='utf-8') as f:
            json.dump(info, f)
        # Locate the search box and type the search text
        sleep(2)
        text = self.browser.find_element(By.ID, 'Popover1-toggle')
        sleep(1)
        text.send_keys('what you want to search for')
        sleep(1)
        # Locate the search button and click it
        a_tag = self.browser.find_element(
            By.XPATH,
            '//*[@id="root"]/div/div[2]/header/div[1]/div[1]/div/form/div/div/label/button')
        a_tag.click()
        page_source = self.browser.page_source
        print(len(page_source))


if __name__ == '__main__':
    zhihu = ZhiHu()
    zhihu.get_start()
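
The script saves the session cookies to `info.json` but does not use them afterwards. One possible follow-up, sketched here as an assumption rather than as part of the original script, is to turn the saved list into a `Cookie` header value for a plain HTTP client. Selenium's `get_cookies()` returns a list of dicts with `name` and `value` keys. The helper names `cookies_to_header` and `load_cookie_header` are hypothetical.

```python
import json


def cookies_to_header(cookies):
    """Join Selenium-style cookie dicts into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def load_cookie_header(path="info.json"):
    """Read the cookie list written by get_start() and build the header value."""
    with open(path, encoding="utf-8") as f:
        return cookies_to_header(json.load(f))


if __name__ == "__main__":
    # Sample data in the shape returned by get_cookies(); the values are invented
    sample = [
        {"name": "a", "value": "1", "domain": ".zhihu.com"},
        {"name": "b", "value": "2", "domain": ".zhihu.com"},
    ]
    print(cookies_to_header(sample))  # a=1; b=2
```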
