A multithreading tutorial exercise: a directory holds 20 files of 62 MB each, about 1.21 GB in total, to be copied to another location.
Single-thread result
Elapsed time: 10.51 s. (It just occurred to me that single-threaded is effectively the same as a single process.)
Multi-thread result
One thread is spawned per file, so 20 files yield 20 child threads.
Elapsed time: 5.22~9.73 s. (The 20-process version finished in under 4 seconds; 20 threads needed nearly 10.)
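The one-thread-per-file pattern above can be sketched roughly as follows. This is a minimal illustration, not the script from the end of the post: `copy_one` is a hypothetical stand-in for the real per-file copy, and the file names are made up.

```python
import threading

done = []  # list.append is atomic under the GIL, so this is safe here

def copy_one(name):
    # placeholder for the real per-file copy logic
    done.append(name)

# hypothetical file names standing in for the 20 real files
names = [f"file{i}.bin" for i in range(20)]
threads = [threading.Thread(target=copy_one, args=(n,)) for n in names]
for t in threads:
    t.start()
for t in threads:
    t.join()  # block until every worker has finished
```

Without the `join` calls at the end, the main thread could reach its timing code (or exit) before the copies finish, which would understate the elapsed time.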
Thread pool result
executor = ThreadPoolExecutor()
Elapsed time: 0.21~5.32 s. (About 2 seconds slower to finish everything than the equally argument-free process pool.)
No idea why the last 8 runs suddenly jumped to 3, 4, 5 seconds. With no arguments, 10 worker threads took part. A process pool's size is tied to the core count; a thread pool's is not tied to it directly, and at first I couldn't tell where those 10 threads came from. (In fact, since Python 3.8 ThreadPoolExecutor defaults to min(32, os.cpu_count() + 4) workers, so 10 workers points to a 6-core machine.)
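The default worker count can be checked directly. Note that `_max_workers` is a private CPython attribute, read here purely for illustration; the documented formula is min(32, os.cpu_count() + 4):

```python
import os
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor()        # no max_workers argument
print(executor._max_workers)           # private attribute, for inspection only
print(min(32, os.cpu_count() + 4))     # the documented default; same number
executor.shutdown()
```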
executor = ThreadPoolExecutor(max_workers=1)
Thread 2936 elapsed: 0.28~0.58 s. (Somewhat faster than the process pool.)
executor = ThreadPoolExecutor(max_workers=2)
Elapsed time: 0.63~1.58 s. (Slower than the process pool.)
executor = ThreadPoolExecutor(max_workers=3)
Elapsed time: 0.27~1.73 s. (Faster than the process pool.)
executor = ThreadPoolExecutor(max_workers=4)
Elapsed time: 0.85~2.82 s. (About the same.)
This was my first contact with multiprocessing and multithreading. After a round of trials, the process pool and the thread pool seem similarly efficient, but plain multiprocessing beat plain multithreading here. That may just reflect the limits of this example; with so few samples, the resource-cost downside of multiprocessing never had a chance to show.
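Sweeping the pool sizes above can be automated with one timing loop. A sketch, with a hypothetical `busy_copy` dummy task standing in for the real file copy (so it runs anywhere), and `time.perf_counter` used because it is the recommended clock for interval timing:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def busy_copy(n):
    # stand-in for copy_file(): shuffle some bytes around n times
    buf = b"x" * 1024
    for _ in range(n):
        bytes(buf)
    return n

for workers in (1, 2, 3, 4):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # 20 dummy tasks, mirroring the 20 files in the experiment
        list(executor.map(busy_copy, [20000] * 20))
    elapsed = time.perf_counter() - start
    print(f"max_workers={workers}: {elapsed:.2f}s")
```

Running each configuration several times and keeping the spread, as the notes above do, matters: single runs of I/O-bound work are noisy.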
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def copy_file(source_dir, dest_dir, filename):
    thread_id = threading.current_thread().ident
    start_time = time.time()
    source_path = os.path.join(source_dir, filename)
    dest_path = os.path.join(dest_dir, filename)
    with open(source_path, "rb") as source_file:
        with open(dest_path, "wb") as dest_file:
            while True:
                data = source_file.read(1024)  # read the source 1 KB at a time
                if data:
                    dest_file.write(data)
                else:
                    break
    end_time = time.time()
    execution_time = end_time - start_time
    print(f"Thread {thread_id} elapsed: {round(execution_time, 2)}s.")


if __name__ == '__main__':
    source_dir = r"D:\6"
    dest_dir = r"C:\Users\xcxc\Desktop\six"
    try:
        os.mkdir(dest_dir)
    except FileExistsError:
        print("Destination folder already exists.")
    file_list = os.listdir(source_dir)

    # Sequential version:
    # for file_name in file_list:
    #     copy_file(source_dir, dest_dir, file_name)

    # One thread per file -- 20 files here, so 20 threads:
    # for file_name in file_list:
    #     sub_thread = threading.Thread(target=copy_file,
    #                                   args=(source_dir, dest_dir, file_name))
    #     sub_thread.start()

    func = partial(copy_file, source_dir, dest_dir)
    executor = ThreadPoolExecutor(4)
    executor.map(func, file_list)
    executor.shutdown(wait=True)