Processing file streams with multithreading in Python

码上bug笔记

Last modified: 2024-03-22 15:19:38

Tags: python, programming languages

First published: 2024-03-22 15:05:43

Article link: https://blog.csdn.net/m0_64388084/article/details/136940727

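While working on a project, I had a requirement to parse files uploaded in bulk from the frontend and store them in a database. I tried many approaches without success: processing a dozen or so files took about 7 s. In the end, plain `threading` solved it and brought the time down to about 1 s, roughly a 7x speedup. Below I walk through the approach in detail.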

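First, the route handler is declared with `async def` and uses `await`, because it involves asynchronous operations, in particular the file upload. Asynchronous I/O avoids blocking the whole application, so it can keep serving other requests while the uploaded files are being read. The line `await file.read()` reads the content of an uploaded file asynchronously: `await` suspends the handler until the read completes, so the next line runs only once the content is available.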

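import logging
from typing import List

from fastapi import File, UploadFile

logger = logging.getLogger(__name__)

# router, case_service and error_response are defined elsewhere in the project.
@router.post("/importThinks", summary="Import")
async def import_thinks(files: List[UploadFile] = File(..., description="Files"),
                        id: str = File(..., description="Case ID"),
                        creator: str = File(..., description="Uploader")):
    logger.info(f"Received parameters: {creator}{files}")
    try:
        processed_plans = []
        for file in files:
            # Read each upload into memory; await yields to the event loop meanwhile.
            file_content = await file.read()
            filename = file.filename.lower()
            processed_plans.append((filename, file_content))
        return case_service.import_thinks(processed_plans, id, creator)
    except Exception as e:
        logger.error(e)
        return error_response()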
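
One caveat about the handler above: `case_service.import_thinks` is an ordinary blocking call (it joins its worker threads before returning), so the event loop is held for the duration of the import. The following is a minimal sketch, assuming Python 3.9 or later, of how a blocking call can be handed to a worker thread with `asyncio.to_thread` so the loop stays free. `blocking_import` and `heartbeat` are hypothetical stand-ins, not part of the original project.

```python
import asyncio
import threading
import time


def blocking_import(items):
    """Hypothetical stand-in for a blocking service call."""
    time.sleep(0.2)  # simulates parsing and database writes
    return {"imported": len(items), "thread": threading.current_thread().name}


async def heartbeat(ticks):
    """Records three ticks to show the event loop keeps running meanwhile."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.perf_counter())


async def handler(items):
    # Run the blocking call in a worker thread and suspend this coroutine
    # until its result is ready.
    return await asyncio.to_thread(blocking_import, items)


async def main():
    ticks = []
    result, _ = await asyncio.gather(
        handler([("a.docx", b"fake bytes")]),
        heartbeat(ticks),
    )
    return result, ticks


result, ticks = asyncio.run(main())
print(result, len(ticks))
```

In the route handler this would amount to awaiting `asyncio.to_thread(case_service.import_thinks, processed_plans, id, creator)` instead of calling the service directly. Whether that matters in practice depends on how many concurrent requests the service has to handle.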

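Next, we implement the `import_thinks` method in the service layer. It takes the file list `files` (a list of `(filename, content)` tuples), the case ID `id`, and the uploader `creator` as parameters. Each file is handled in its own thread: every thread calls `process_plan` to perform the import and appends its result to the shared `data` list.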

    # Requires `import threading` and `import time` at module level.
    def import_thinks(self, files, id, creator):
        data = []      # shared result list; each worker appends one entry
        threads = []
        t1 = time.time()
        # Create and start one thread per file
        for filename, file_content in files:
            thread = threading.Thread(target=process_plan,
                                      args=(self, id, filename, creator, file_content, data))
            threads.append(thread)
            thread.start()

        # Wait for all threads to finish
        for thread in threads:
            thread.join()
        print("Elapsed:", time.time() - t1, "s")
        return success_response(data)
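
Starting one thread per file is fine for a dozen uploads, but the thread count grows with the batch size. As an alternative to the raw `threading.Thread` approach used here, a bounded pool from `concurrent.futures` caps the number of workers. This is only a sketch under assumptions: `handle_file` is a hypothetical worker that returns its result instead of appending to a shared list, and `max_workers=8` is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_file(filename, content):
    """Hypothetical per-file worker; returns its result to the caller."""
    ok = filename.endswith((".docx", ".xlsx")) and len(content) > 0
    return {"name": filename, "success": ok}


def import_files(files, max_workers=8):
    """Process (filename, content) pairs on a bounded pool of threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input files.
        return list(pool.map(lambda item: handle_file(*item), files))


files = [("a.docx", b"x"), ("b.xlsx", b"y"), ("c.txt", b"z")]
results = import_files(files)
print(results)
```

Because `pool.map` preserves input order, the response lists files in upload order, which the shared-list version does not guarantee.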

Finally, we implement the `process_plan` function.

import logging
from datetime import datetime

# word_dic and excel_dic are defined elsewhere in the project (not shown).
def process_plan(self, id, name, creator, file, data):
    plan = None
    # Pick a parser based on the (lower-cased) file extension
    if name.endswith(".docx"):
        plan = word_dic(file)
    elif name.endswith(".xlsx"):
        plan = excel_dic(file)
    if plan:
        plan["creator"] = creator
        plan["id"] = id
        plan["create_time"] = datetime.now().strftime("%Y-%m-%d")
        try:
            result = self.get_es.insert_one('plan', document=plan)
            data.append({"name": name, "success": result.get("result") == "created"})
        except Exception as e:
            logging.error(e)
            data.append({"name": name, "success": False})
    else:
        # Unsupported extension or empty parse result
        data.append({"name": name, "success": False})
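
The article does not profile where the 7 s went. A plausible reading is that most of it was spent waiting on I/O such as the database insert, which is the case where threads help in Python. The sketch below reproduces the thread-per-file pattern with a simulated wait so the effect can be observed in isolation. `fake_insert` and the 0.05 s delay are assumptions for illustration, not measurements from the project.

```python
import threading
import time


def fake_insert(name, results):
    """Hypothetical stand-in for parsing one file and writing it to the store."""
    time.sleep(0.05)  # simulated I/O wait; other threads can run meanwhile
    results.append({"name": name, "success": True})


def run_sequential(names):
    results = []
    start = time.perf_counter()
    for name in names:
        fake_insert(name, results)
    return results, time.perf_counter() - start


def run_threaded(names):
    results = []
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_insert, args=(n, results)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


names = [f"plan_{i}.docx" for i in range(10)]
seq_results, seq_time = run_sequential(names)
thr_results, thr_time = run_threaded(names)
print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```

Note that the threaded results arrive in completion order, so a caller that needs upload order should sort by name or index. For CPU-bound parsing the gain would be much smaller, since only one thread executes Python bytecode at a time.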