How to Multithread with Selenium: Implementing Concurrency with Selenium

Latest recommended article published on 2024-04-18 12:03:40

HonoYoku



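Article tags: how to multithread with Selenium

Copyright notice: This is an original article by the author, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_31921223/article/details/111943239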


for loop and multithreading + Selenium

Example 1

for loop

# -*- coding: utf-8 -*-
"""
Datetime: 2019/6/22
Author: Zhang Yafei
Description:
"""
import functools
import time
from concurrent.futures import ThreadPoolExecutor

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument('--disable-gpu')


def timeit(func):
    """
    Decorator: print how long the wrapped function took to run.
    :param func: the function to time
    :return: the wrapped function
    """
    @functools.wraps(func)
    def inner(*args, **kwargs):
        start = time.time()
        ret = func(*args, **kwargs)
        elapsed = time.time() - start
        if elapsed < 60:
            print(f'Elapsed time:\t{round(elapsed, 2)} s')
        else:
            # Named minutes/seconds so the built-in min() is not shadowed
            minutes, seconds = divmod(elapsed, 60)
            print(f'Elapsed time:\t{round(minutes)} min\t{round(seconds, 2)} s')
        return ret
    return inner


class PolicyUrlDownload(object):
    """ Download policy-page URLs """

    def __init__(self, url, pages_num, output_file, a_xpath, headless: bool=True):
        # url is a template; format() fills in the page number (1..pages_num)
        self.url_list = [url.format(page) for page in range(1, pages_num+1)]
        self.output_file = output_file
        self.a_xpath = a_xpath
        if headless:
            self.driver = webdriver.Chrome(options=chrome_options)
        else:
            self.driver = webdriver.Chrome()

    def start(self, page, url):
        # Append one "page<TAB>href" line per link found on the page
        with open(self.output_file, mode='a', encoding='utf-8') as file:
            print(f"make request to {url}")
            self.driver.get(url)
            # find_elements_by_xpath was removed in newer Selenium 4 releases;
            # find_elements(By.XPATH, ...) is the current form
            titles = self.driver.find_elements(By.XPATH, self.a_xpath)
            for title in titles:
                href = title.get_attribute('href')
                file.write(f'{page}\t{href}\n')
            print(f'{url} download completed')

    def run(self):
        # Visit the pages one after another with the single shared driver
        for page, url in enumerate(self.url_list):
            self.start(page+1, url)
        # quit() ends the whole browser session, not just the current window
        self.driver.quit()


@timeit
def main(setting):
    policy_data = PolicyUrlDownload(**setting)
    policy_data.run()


if __name__ == '__main__':
    start_time = time.time()
    print('######################## Start download #########################')
    # Download page URLs for several configurations
    settings = [
        {
            'output_file': '药品供应保障综合的管理.txt',
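The listing breaks off here, before the multithreaded half named in the title. As a rough sketch only, and not the author's version, the same page loop could be handed to a `ThreadPoolExecutor` along the following lines. The names `fetch_links_with_selenium`, `crawl_pages`, the `fetch` parameter and `max_workers=4` are all hypothetical and do not come from the original. The sketch assumes each task gets its own browser, since a WebDriver instance is generally not safe to share across threads, and it guards the output file with a lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fetch_links_with_selenium(url, a_xpath):
    """Hypothetical helper: open a private headless Chrome and return the hrefs on one page."""
    # Imported lazily so the pool logic below can be tried without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By

    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)  # one driver per task, never shared
    try:
        driver.get(url)
        return [a.get_attribute("href") for a in driver.find_elements(By.XPATH, a_xpath)]
    finally:
        driver.quit()  # always release the browser, even if the page load fails


def crawl_pages(url_list, a_xpath, output_file, fetch=fetch_links_with_selenium, max_workers=4):
    """Fetch every page in a thread pool and append 'page<TAB>href' lines to output_file."""
    write_lock = threading.Lock()

    def worker(page, url):
        hrefs = fetch(url, a_xpath)
        with write_lock:  # serialize appends so lines from different threads do not interleave
            with open(output_file, mode="a", encoding="utf-8") as file:
                for href in hrefs:
                    file.write(f"{page}\t{href}\n")
        return len(hrefs)

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(worker, page, url)
                   for page, url in enumerate(url_list, start=1)]
        # result() re-raises any exception from the worker thread
        return sum(future.result() for future in futures)
```

Because pages finish in whatever order the threads complete, the lines in the output file are no longer sorted by page number; the page column is what lets them be re-sorted afterwards. Starting a browser per page is expensive, so a longer crawl would probably keep one driver per thread instead (for example via `threading.local`), which this sketch leaves out for brevity.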
