Reading image data with Python and speeding up data preprocessing

Kun Li

Article link: https://blog.csdn.net/u012193416/article/details/87531272

import cv2
import time
import glob

'''
Start from the list of files to process.
Use a for loop to handle the files one at a time, running the preprocessing in each iteration.
'''
# loop through all jpg files in the ./Airport folder
# resize each one to 600x600
start_time = time.time()
for image_filename in glob.glob('./Airport/*.jpg'):
    img = cv2.imread(image_filename)

    # resize the image
    img = cv2.resize(img, (600, 600))
print('time:', time.time() - start_time)
# about 2.667 s for 360 jpg files

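'''
Split the list of jpg files into 4 groups.
Run 4 separate instances of the Python interpreter.
Have each instance process one of the 4 groups.
Combine the results from the 4 processes into the final result list.
'''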
import concurrent.futures

start_time1 = time.time()


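def load_and_resize(image_filename):
    # read one image and return it resized to 600x600
    img = cv2.imread(image_filename)
    return cv2.resize(img, (600, 600))


# the __main__ guard is needed on platforms that start worker processes with
# spawn (Windows, macOS), where each worker re-imports this module
if __name__ == '__main__':
    # create a pool of processes; by default, one is created for each CPU in your machine
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # get the list of files to process (same folder as the for-loop version)
        image_files = glob.glob('./Airport/*.jpg')

        # executor.map() takes the function to run and a list of inputs; each
        # element of the list is a single input to the function. With 6 cores,
        # 6 items from the list are processed at the same time.
        # list() collects the resized images and re-raises any worker error.
        resized_images = list(executor.map(load_and_resize, image_files))

    print('acceleration time:', time.time() - start_time1)
    # the author reported about 0.139 s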
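
The four steps described in the comment above (split into groups, start separate worker processes, process one group per worker, combine the results) can also be spelled out by hand. The sketch below is only an illustration under stated assumptions: `split_into_groups`, `process_group`, and `process_in_parallel` are hypothetical helper names, and `process_group` upper-cases file names as a stand-in for the real read-and-resize work so the example runs without OpenCV.

```python
import concurrent.futures


def split_into_groups(items, n_groups):
    """Split items into n_groups contiguous groups of near-equal size."""
    size, extra = divmod(len(items), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        # the first `extra` groups take one additional item
        end = start + size + (1 if i < extra else 0)
        groups.append(items[start:end])
        start = end
    return groups


def process_group(filenames):
    """Hypothetical stand-in for preprocessing one group of files."""
    return [name.upper() for name in filenames]


def process_in_parallel(filenames, n_workers=4):
    """Process one group per worker process and flatten the results in order."""
    groups = split_into_groups(filenames, n_workers)
    with concurrent.futures.ProcessPoolExecutor(max_workers=n_workers) as executor:
        per_group = executor.map(process_group, groups)
        return [result for group in per_group for result in group]


if __name__ == "__main__":
    names = [f"img_{i}.jpg" for i in range(10)]
    print(process_in_parallel(names))
```

In practice `executor.map()` already distributes the items across the workers, so manual grouping like this is mainly useful for seeing what happens underneath.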

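The dataset contains 360 images in total. The first method, a plain for loop, takes about 2.667 s. The second method uses multiple cores in parallel:

with concurrent.futures.ProcessPoolExecutor() as executor:
    executor.map(load_and_resize, image_files)

It takes about 0.139 s, which by these two measurements is roughly 19 times faster than the first method. A gain that large is more than 6 cores alone would explain, so it is worth re-timing both versions on the same set of files before relying on the figure.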
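
One way to compare the two approaches on equal terms is to run both on exactly the same inputs and to collect every result before stopping the clock. The sketch below is a minimal illustration, not the author's benchmark: `busy_work` is a hypothetical CPU-bound stand-in for reading and resizing one image, and the `chunksize` value of 8 is an arbitrary assumption. `chunksize` is a real parameter of `executor.map()` that sends items to worker processes in batches.

```python
import concurrent.futures
import time


def busy_work(n):
    """Hypothetical CPU-bound stand-in for reading and resizing one image."""
    return sum(i * i for i in range(n))


def time_sequential(inputs):
    """Run busy_work on every input in a plain loop and time it."""
    start = time.perf_counter()
    results = [busy_work(n) for n in inputs]
    return results, time.perf_counter() - start


def time_parallel(inputs, chunksize=8):
    """Run busy_work on every input in a process pool and time it."""
    start = time.perf_counter()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # list() forces all results to be collected and re-raises worker errors
        results = list(executor.map(busy_work, inputs, chunksize=chunksize))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    inputs = [100_000] * 32  # identical workload for both versions
    seq_results, seq_time = time_sequential(inputs)
    par_results, par_time = time_parallel(inputs)
    assert seq_results == par_results
    print(f"sequential: {seq_time:.3f} s, parallel: {par_time:.3f} s")
```

For small workloads the cost of starting the worker processes can outweigh the gain, so the parallel version is not guaranteed to be faster.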
