Common operations for handling binary files sent from the front end in Python Flask (to be updated)

王小希ww

Last modified 2022-10-19 01:52:04

First published 2022-10-16 02:26:25

Article link: https://blog.csdn.net/qq_33934427/article/details/127343352

I. Saving audio, video, and text files to local disk

Core code:

with open(file_path, "wb") as out_file:  # open for [w]riting as [b]inary
    out_file.write(buffer_video)

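Here the mode `wb` opens the file for writing in binary format: if the file already exists it is overwritten, and if it does not exist a new file is created. The examples below use `wb+`, which behaves the same way but also allows reading; for a write-only save, `wb` is sufficient.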

Note: a byte stream involves no character encoding, so `open()` in binary mode takes no `encoding` argument.
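The note above can be checked directly. The following is a minimal sketch using only the standard library; the helper names `roundtrip_bytes` and `binary_mode_rejects_encoding` are illustrative and not part of the original project.

```python
import os
import tempfile


def roundtrip_bytes(payload: bytes) -> bytes:
    """Write bytes to a temporary file in binary mode and read them back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as out_file:  # binary mode: no encoding argument
            out_file.write(payload)
        with open(path, "rb") as in_file:
            return in_file.read()
    finally:
        os.remove(path)


def binary_mode_rejects_encoding(path: str) -> bool:
    """Return True if open() refuses an encoding argument in binary mode."""
    try:
        open(path, "wb", encoding="utf-8")
    except ValueError:
        return True
    return False
```

The bytes come back unchanged, including values that are not valid text in any encoding, which is why uploaded audio and video should always be written in binary mode.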

1) Saving binary video

If the front end sends a video, write it out with `with open`; note the `wb+` mode:

fileStorage = request.files['videofile']  # uploaded video file
buffer_video = fileStorage.read()
filename = fileStorage.filename  # name of the uploaded file (taken from the same 'videofile' field)
# Save the binary video stream to a file first, then read it with OpenCV; see https://stackoverflow.com/questions/57865656/save-video-in-python-from-bytes
if not os.path.isdir(file_path):  # at this point file_path holds the target directory
    os.mkdir(file_path)  # create the directory
file_path = os.path.join(file_path, "temp." + filename.split(".")[-1])  # from here on, file_path is the file itself
with open(file_path, "wb+") as out_file:  # open for [w]riting as [b]inary
    out_file.write(buffer_video)
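The path handling in this step can be made a little more robust. The sketch below is one possible variant, not the author's code: `build_temp_path` and its `stem` parameter are hypothetical names. It uses `os.path.splitext`, which returns an empty extension for a name without a dot, whereas `filename.split(".")[-1]` returns the whole name.

```python
import os


def build_temp_path(save_dir: str, uploaded_name: str, stem: str = "temp") -> str:
    """Create save_dir if needed and return '<save_dir>/<stem><ext>'."""
    # makedirs also creates missing parent directories
    os.makedirs(save_dir, exist_ok=True)
    # basename drops any directory part (as understood by the host OS)
    # that the client may have placed in the uploaded file name
    ext = os.path.splitext(os.path.basename(uploaded_name))[1].lower()
    return os.path.join(save_dir, stem + ext)
```

Keeping a fixed stem and taking only the extension from the client means the uploaded name never decides where the file is written.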

2) Saving binary audio

If the front end sends audio, read the binary data from the upload and write it out with `with open` in the same way:

'''Speech to text'''
@speechB.route('/predict_text_from_audio', methods=['POST'])
def speech2word():
    if request.method == 'POST':  # the route is registered for POST only
        if not os.path.isdir(voice2text_save_path):  # create the directory
            os.mkdir(voice2text_save_path)
        fileStorage = request.files['audiofile']  # uploaded audio file
        buffer_data = fileStorage.read()
        filename = fileStorage.filename  # name of the uploaded file
        temp_path = os.path.join(voice2text_save_path, 'demo.' + filename.split(".")[-1])
        with open(temp_path, 'wb+') as f:
            f.write(buffer_data)  # write the binary data out as an audio file
        text = speech2word_Handler.predict_word_with_voice(temp_path)
        return text
    else:
        return jsonify({'code': 400, 'msg': 'Operation failed: please use the POST method'})
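An upload endpoint like this can be tried without a front end by using Flask's built-in test client. The following is a minimal, self-contained sketch: the route `/upload_audio`, the directory `SAVE_DIR`, and the helper `post_fake_audio` are assumptions made for illustration, and the speech recognition step is left out. It uses `FileStorage.save()`, which writes the upload to disk without a separate `read()` and `open()`.

```python
import io
import os
import tempfile

from flask import Flask, jsonify, request

app = Flask(__name__)
SAVE_DIR = tempfile.mkdtemp()  # hypothetical save directory for this sketch


@app.route("/upload_audio", methods=["POST"])
def upload_audio():
    storage = request.files.get("audiofile")  # None if the field is missing
    if storage is None or storage.filename == "":
        return jsonify({"code": 400, "msg": "missing audiofile"}), 400
    suffix = os.path.splitext(storage.filename)[1]
    temp_path = os.path.join(SAVE_DIR, "demo" + suffix)
    storage.save(temp_path)  # write the uploaded bytes to disk
    return jsonify({"code": 200, "size": os.path.getsize(temp_path)})


def post_fake_audio(payload: bytes, filename: str):
    """Send an in-memory file to the endpoint as multipart/form-data."""
    client = app.test_client()
    return client.post(
        "/upload_audio",
        data={"audiofile": (io.BytesIO(payload), filename)},
        content_type="multipart/form-data",
    )
```

Calling `post_fake_audio(b"RIFFdata", "clip.wav")` returns a response whose JSON body reports the number of bytes saved, which is a quick way to confirm that the field name and the save path are wired up correctly.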

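3) Saving binary text files

If the front end sends a document as a binary file (pdf, docx, txt, and so on), read the binary data from the upload and write it out with `with open` in the same way: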

'''Speech synthesis'''
@speechB.route('/predict_audio_from_text', methods=['POST'])
def speechSynthetic():
    if request.method == 'POST':  # return the audio file first / failing that, return an audio URL for the front end to access
        # Read `type` from the form data, falling back to the JSON body.
        # Check for None before calling int(), because int(None) raises TypeError.
        input_type = request.form.get("type")
        if input_type is None:
            input_type = request.json['type']
        input_type = int(input_type)

        if not os.path.isdir(text2voice_save_path):  # create nested directories
            os.makedirs(text2voice_save_path, mode=0o777)

        ret = True
        if input_type == 0:  # type=0: plain text string
            text = request.form.get('text')  # the text is wrapped in the form data
        elif input_type == 1:  # type=1: binary text file
            fileStorage = request.files['textfile']  # binary file
            buffer_data = fileStorage.read()
            filename = fileStorage.filename
            suffix = filename.split(".")[-1]
            filePath = os.path.join(text2voice_save_path, 'demo.' + suffix)
            save_file_from_byte(buffer_data, filePath)  # save the binary file
            text, ret = read_file(filePath)  # extract the text content (ret=False means parsing failed)
        else:  # unknown type: report an error instead of leaving `text` undefined
            text, ret = "Exception: `type` must be 0 or 1", False

        if ret:
            savePath = os.path.join(text2voice_save_path, 'demo.wav')
            wav_path = text2voice_Handler.handle_speech_2_voice(input_text=text, savePath=savePath)
            timeStamp = str(time.mktime(time.localtime(time.time())))
            data = "http://" + ip + ":" + port + "/get_audio?file_path=" + wav_path + "&timeStamp=" + timeStamp  # URL for accessing the file
            return jsonify({'data': data, 'error_flag': False})
        else:
            return jsonify({'data': text, 'error_flag': True})
    else:
        return jsonify({'code': 400, 'msg': 'Operation failed: please use the POST method'})
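The access URL in this handler is built by string concatenation, which can produce a malformed URL if `wav_path` contains spaces, `&`, or non-ASCII characters. A possible alternative, sketched here with the standard library only, is to let `urllib.parse.urlencode` escape the query parameters. The function name `build_audio_url` is hypothetical; the `/get_audio` endpoint and parameter names are taken from the code above.

```python
from urllib.parse import urlencode


def build_audio_url(ip: str, port: str, wav_path: str, time_stamp: str) -> str:
    """Build the /get_audio access URL with properly escaped query parameters."""
    # urlencode percent-encodes reserved characters and turns spaces into '+'
    query = urlencode({"file_path": wav_path, "timeStamp": time_stamp})
    return f"http://{ip}:{port}/get_audio?{query}"
```

On the receiving side, Flask's `request.args.get("file_path")` decodes the value again, so the endpoint that serves the audio would not need to change.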

Here `save_file_from_byte()` is the function that writes the file:

# Save a binary stream as a file
def save_file_from_byte(file_byte, filePath):
    with open(filePath, 'wb+') as f:
        f.write(file_byte)  # write the bytes to a local file
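`save_file_from_byte()` expects the whole upload to be in memory already. For large files, a possible alternative is to copy from a file-like object in chunks. The sketch below uses `shutil.copyfileobj` from the standard library; `save_stream_to_file` is a hypothetical helper, and in Flask the `stream` attribute of the uploaded `FileStorage` object would be the natural thing to pass in.

```python
import shutil


def save_stream_to_file(stream, file_path: str, chunk_size: int = 64 * 1024) -> int:
    """Copy a binary file-like object to file_path in chunks; return bytes written."""
    with open(file_path, "wb") as f:
        # copyfileobj reads at most chunk_size bytes at a time,
        # so memory use stays bounded regardless of the upload size
        shutil.copyfileobj(stream, f, chunk_size)
        return f.tell()
```

Any object with a binary `read()` method works here, for example `io.BytesIO(b"...")` in a quick local check.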

II. Reading the text file that was just saved

1) Reading txt

with open(filePath, encoding='utf-8') as f:  # the context manager closes the file handle
    text = f.read()
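Uploaded txt files are not always UTF-8; files created on Chinese-language Windows systems, for example, are often GBK-encoded, and reading them as UTF-8 raises `UnicodeDecodeError`. One possible way to handle this, sketched under the assumption that trying a short list of encodings in order is acceptable, is shown below. The helper name `read_text_with_fallback` and the default encoding list are illustrative choices.

```python
def read_text_with_fallback(file_path: str, encodings=("utf-8", "gbk")) -> str:
    """Decode a text file by trying each encoding in order."""
    with open(file_path, "rb") as f:  # read raw bytes once
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    raise ValueError("could not decode %s with any of %r" % (file_path, encodings))
```

The order matters: UTF-8 should come first because it rejects most byte sequences that are not valid UTF-8, while GBK accepts many byte sequences and would silently produce wrong characters for UTF-8 input.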

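2) Reading docx

Reference: reading Word document content with python_docx (https://blog.csdn.net/qq_38870145/article/details/124076591)

First run `pip install python_docx`, then use the following code: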

from docx import Document
text = ""  # accumulate the paragraph text
doc = Document(filePath)
for i in doc.paragraphs:
    text = text + str(i.text)
print(text)

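Note: this can read docx files, but it cannot read legacy doc files.

3) Reading pdf

Reference: a 脚本之家 article on reading PDF files with Python (https://www.jb51.net/article/258597.htm)

First run `pip install pdfplumber`, then use the following code: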

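import pdfplumber
text = ""  # accumulate the page text
with pdfplumber.open(filePath) as pdf:
    for page in pdf.pages:
        # extract_text() may return None for a page with no extractable text in some versions
        text = text + (page.extract_text() or "")
print(text)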

4) Complete code

import os
from docx import Document
import pdfplumber

# Read a file
def read_file(filePath):
    '''
    @param filePath: path of the file to read
    @return: (text, True) on success, (error message, False) for an unsupported file type
    '''
    # plain-text file types
    file_types = ['txt', 'md']
    file_type = filePath.split(".")[-1]
    text = ""
    if file_type == 'docx':  # see https://blog.csdn.net/qq_38870145/article/details/124076591
        doc = Document(filePath)
        for i in doc.paragraphs:
            text = text + str(i.text)
    elif file_type == 'pdf':  # see https://www.jb51.net/article/258597.htm
        with pdfplumber.open(filePath) as pdf:
            for page in pdf.pages:
                # extract_text() may return None for a page with no extractable text in some versions
                text = text + (page.extract_text() or "")
    elif file_type in file_types:
        with open(filePath, encoding='utf-8') as f:  # close the file handle after reading
            text = f.read()
    else:
        return "Exception: Only `*.pdf`, `*.docx`, `*.txt`, `*.md` files can be read", False
    return text, True

III. Converting mp3 to wav and setting the sample rate

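import os
import soundfile as sf
import librosa
file_path = 'demo.mp3'
y, s = librosa.load(file_path, sr=16000)  # load the audio and resample it to 16000 Hz
file_path = os.path.splitext(file_path)[0] + ".wav"  # splitext also works for paths that contain other dots
sf.write(file_path, y, 16000)  # write the file (mp3 converted to wav)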
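To confirm that a converted file really has the expected sample rate, the header can be inspected with the standard library `wave` module. This is a minimal sketch that assumes the wav file is PCM-encoded; `wav_info` and `write_silence` are hypothetical helper names, and `write_silence` exists only so that the check can be run without an mp3 at hand.

```python
import wave


def wav_info(path: str):
    """Return (sample_rate, channels, frame_count) read from a PCM wav header."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getnchannels(), wf.getnframes()


def write_silence(path: str, sample_rate: int = 16000, frames: int = 160) -> None:
    """Write a short mono 16-bit silent wav file, for trying out wav_info()."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * frames)
```

Dividing the frame count by the sample rate gives the duration in seconds, which is another quick sanity check after conversion.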