whisper + whisperx: ASR with alignment, and FunASR

长虹剑

Last modified 2024-11-15 00:32:42


License: CC 4.0 BY-SA

Tags: whisper, funasr, speech recognition

First published 2024-08-01 20:04:08

Article link: https://blog.csdn.net/hongmaodaxia/article/details/140856125

FunASR

Installation

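The simplest route is to install the dependencies from the requirements.txt that ships with SenseVoice.
If you want models to be downloaded into a specific directory, set this environment variable before running: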

export MODELSCOPE_CACHE=XXX
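
The same setting can also be applied from inside a Python script. The sketch below is a minimal illustration, assuming the cache location is read from the MODELSCOPE_CACHE environment variable as in the shell command above; the helper name `set_modelscope_cache` and the `./models` directory are hypothetical, not part of FunASR.

```python
import os
from pathlib import Path


def set_modelscope_cache(cache_dir: str) -> str:
    """Point MODELSCOPE_CACHE at cache_dir, creating the directory if needed."""
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    os.environ["MODELSCOPE_CACHE"] = str(path)
    return str(path)


# Call this before importing funasr so the variable is already set
# when models are first downloaded (hypothetical directory name):
# set_modelscope_cache("./models")
# from funasr import AutoModel
```

Setting the variable before the import is the conservative choice, since it works whether the library reads the variable at import time or at download time.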

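from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess

fwav = "audio.wav"  # placeholder: path to the input audio file

# ASR model plus voice activity detection and punctuation restoration
model = AutoModel(model="paraformer-zh", vad_model="fsmn-vad", punc_model="ct-punc")
res = model.generate(input=fwav, batch_size_s=300)

print(res[0]["text"])
# Optional post-processing of the transcript text:
# text = rich_transcription_postprocess(res[0]["text"])
# print(text)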
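
Before post-processing, it can help to look at what the result contains. The sketch below is only an illustration under an assumption: that each result entry is a dict with a `text` string and a `timestamp` list of `[start_ms, end_ms]` pairs, which is how the post-processing code in this article reads it. The helper `summarize_result` and the mock data are hypothetical and are not real model output.

```python
def summarize_result(res):
    """Return (text, number_of_timestamps, duration_in_seconds) for the first entry."""
    first = res[0]
    text = first["text"]
    stamps = first.get("timestamp", [])
    if not stamps:
        return text, 0, 0.0
    # Assumed unit: milliseconds, from the first start to the last end
    duration_s = (stamps[-1][1] - stamps[0][0]) / 1000.0
    return text, len(stamps), duration_s


# Mock data shaped like the assumed output (not produced by a model)
mock_res = [{
    "text": "你好，世界。",
    "timestamp": [[0, 200], [200, 400], [450, 700], [700, 950]],
}]
print(summarize_result(mock_res))
```

In this mock, the text has six characters but only four timestamps, because the punctuation marks carry no timestamp of their own. That mismatch is the reason the text and the timestamps need to be walked together during alignment.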

Preliminary alignment post-processing

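import re

def post_hd(result):
    # Extract the text and the timestamps
    text = result[0]['text']
    timestamps = result[0]['timestamp']
    word_timestamps = []

    # Index of the current timestamp
    index = 0

    # Regex matching a single Chinese character, an English word, or punctuation
    # (earlier attempts are kept for reference)
    #pattern = re.compile(r'[\u4e00-\u9fa5a-zA-Z]+')
    #pattern = re.compile(r'([\u4e00-\u9fa5])|([a-zA-Z]+)')
    #pattern = re.compile(r'([\u4e00-\u9fa5])|([a-zA-Z]+)|([，。！？；：])')
    # The dot is escaped so that it matches a literal period, not any character
    pattern = re.compile(r'([\u4e00-\u9fa5])|([a-zA-Z\']+\s?)|([，。！？；：]|,\s|\.\s)')

    # Find all matches in the text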
    matches = pattern.finditer(text.strip