How to Use a Large Model to Convert Speech to Text

冀辉

Last modified: 2024-12-20 10:43:59

Category: LLM. Tags: speech recognition, artificial intelligence

First published: 2024-12-20 10:42:10

Article link: https://blog.csdn.net/jihui8848/article/details/144605032

OpenAI provides a speech-to-text API: calling the transcription endpoint converts an audio file directly into text.

This example uses the OpenAI Python client to call a privately deployed Belle-whisper-large-v2-zh model through an OpenAI-compatible endpoint.

The test code is as follows:

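from openai import OpenAI

# Point the OpenAI client at the privately deployed, OpenAI-compatible server.
# 'EMPTY' is a placeholder key used for this local deployment.
client = OpenAI(
    base_url='http://127.0.0.1:9922/v1',
    api_key='EMPTY'
)

# List the models the server exposes, to find the id of the speech model.
models = client.models.list()

print(models)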

The output below lists the deployed models, which lets you confirm the exact name of the speech model.

SyncPage[Model](data=[
 Model(id='Belle-whisper-large-v2-zh', created=0, object='model', owned_by='xinference', 
 model_type='audio', address='0.0.0.0:36445', accelerators=['0'], 
 model_name='Belle-whisper-large-v2-zh', model_family='whisper', 
 model_revision='ec5bd5d78598545b7585814edde86dac2002b5b9', replica=1),
 Model(id='bge-reranker-large', created=0, object='model', owned_by='xinference', 
 model_type='rerank', address='0.0.0.0:46201', accelerators=['0'], type='normal', 
 model_name='bge-reranker-large', language=['en', 'zh'], model_revision='v0.0.1', replica=1),
 Model(id='bge-base-zh-v1.5', created=0, object='model', owned_by='xinference', 
 model_type='embedding', address='0.0.0.0:40537', accelerators=['0'], 
 model_name='bge-base-zh-v1.5', dimensions=768, max_tokens=512, language=['zh'], 
 model_revision='v0.0.1', replica=1)
], object='list')
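Rather than reading the model name off the printed list by eye, the audio model could also be picked out in code. The following is only a minimal sketch: it assumes each entry exposes the `id` and `model_type` attributes seen in the output above (`model_type` is an extra field returned by this particular server, not part of the standard OpenAI model schema), and `find_audio_model_ids` is a hypothetical helper name. Stand-in objects are used so the snippet runs without a server.

```python
from types import SimpleNamespace


def find_audio_model_ids(models):
    """Return the ids of all models whose model_type is 'audio'.

    Entries without a model_type attribute are skipped, since that field
    is a server-specific extra and may be missing elsewhere.
    """
    return [m.id for m in models if getattr(m, "model_type", None) == "audio"]


# Stand-in entries mirroring the three models in the output above.
sample_models = [
    SimpleNamespace(id="Belle-whisper-large-v2-zh", model_type="audio"),
    SimpleNamespace(id="bge-reranker-large", model_type="rerank"),
    SimpleNamespace(id="bge-base-zh-v1.5", model_type="embedding"),
]

print(find_audio_model_ids(sample_models))
```

Against the real server, the same helper would presumably be called as `find_audio_model_ids(client.models.list().data)`.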

Choose an audio file and submit its contents to the model.

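file_name = r'C:\Temp\四年级英语听力.mp3'

# Open the audio in binary mode; the with-block closes the file afterwards.
with open(file_name, 'rb') as audio_file:
    # Send the file to the transcription endpoint, using the model id found above.
    transcription = client.audio.transcriptions.create(
        model="Belle-whisper-large-v2-zh",
        file=audio_file
    )

# The recognized text is returned in the .text field.
print(transcription.text)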

The output is:

四年级英语听力部分ALookListenandChoose听音选图 writing he is a famous writer to Galway's brother is 
a policeman Galway's brother is a policeman three this is my classmate Li Yan she's good at reading books 
this is my classmate Li Yan she is good at reading books My uncle is a taxi driver. He drives well
听录音填写 I'm eleven She is twelve We are in the same class Her father is a teacher Her mother is 
a TV reporter of class two grade five.听录音用钩叉判断 I'm a new student I'm in class 2 five. 
Here is a picture of my family. This is my father. He's a writer. This is my mother. She's a singer. 
The girl is my sister. The boy is me. We love our father and mother and they love us. 
We are a happy family听力结束请同学们继续答题

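The result looks quite good: both the Chinese instructions and the English passages are recognized, although punctuation and spacing are inconsistent.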
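If several recordings need to be transcribed, the call shown earlier could be wrapped in a small helper. This is a sketch under stated assumptions, not a definitive implementation: `transcribe_file` is a hypothetical name, the default model id is the one from this deployment, and the client is passed in so that any object offering `audio.transcriptions.create(model=..., file=...)` will work.

```python
from pathlib import Path


def transcribe_file(client, path, model="Belle-whisper-large-v2-zh"):
    """Send one audio file to the transcription endpoint and return its text."""
    audio_path = Path(path)
    # Fail early with a clear message instead of an error from deep in the client.
    if not audio_path.is_file():
        raise FileNotFoundError(f"audio file not found: {audio_path}")
    # Binary mode is required; the with-block closes the file even if the call fails.
    with audio_path.open("rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text


# Example usage against the local deployment from this post (needs a running server):
# from openai import OpenAI
# client = OpenAI(base_url='http://127.0.0.1:9922/v1', api_key='EMPTY')
# print(transcribe_file(client, r'C:\Temp\四年级英语听力.mp3'))
```

Passing the client in as a parameter keeps the helper independent of how the server is configured, and makes it easy to try out with a stub object.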