Jetson 部署 Faster Whisper

Whisper

Whisper 是一种通用语音识别模型。它是在大量不同音频数据集上进行训练的,也是一个多任务模型,可以执行多语言语音识别、语音翻译和语言识别。

在这里插入图片描述
测试,用Chattts生成一段语音:四川美食确实以辣闻名,但也有不辣的选择。比如甜水面、赖汤圆、蛋烘糕、叶儿粑等,这些小吃口味温和,甜而不腻,也很受欢迎。

$ pip install -U openai-whisper
$ sudo apt update && sudo apt install ffmpeg
$ pip install setuptools-rust

$ whisper ../audio.wav --model tiny
100%|█████████████████████████████████████| 72.1M/72.1M [00:36<00:00, 2.08MiB/s]
/home/jetson/.local/lib/python3.8/site-packages/whisper/__init__.py:146: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(fp, map_location=device)
/home/jetson/.local/lib/python3.8/site-packages/whisper/transcribe.py:115: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Detected language: Chinese
[00:00.000 --> 00:03.680] 四川美時確實以辣文明 但以有不辣的選擇
[00:03.680 --> 00:07.200] 比如潛水面 賴湯圓 再轟高夜熱八等
[00:07.200 --> 00:11.560] 這些小市口維溫和 然後甜而不膩也很受歡迎

这个是CPU运行的😂,GPU都没带喘的。

Faster Whisper

fast-whisper 是使用 CTranslate2 重新实现 OpenAI 的 Whisper 模型,CTranslate2 是 Transformer 模型的快速推理引擎。

Funasr有个大问题,它的实时转录是CPU的,很慢,GPU的支持离线语音转文字,但又不能实时。找到了一个faster-whisper可以支持实时GPU转录,也支持中文。

安装使用

pip install faster-whisper
from faster_whisper import WhisperModel

model_size = "large-v3"

# Run on GPU with FP16
# model = WhisperModel(model_size, device="cuda", compute_type="float16")

# or run on GPU with INT8
model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

Yanjing-233

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值