SpeechT5 Text-to-Speech Model

1. Prepare the environment

# Create the environment

conda create -n speechT5 python=3.10 -y

conda activate speechT5

# Install PyTorch (CUDA 11.8 build)

conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia 

# Install the Python dependencies (the SpeechT5 tokenizer also needs sentencepiece)

pip install transformers sentencepiece

pip install soundfile
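
Before moving on, it can help to confirm that the packages installed above are visible from the active conda environment. The following is a minimal sketch using only the standard library; the package list is an assumption based on the commands in this section (sentencepiece is included because the SpeechT5 tokenizer is SentencePiece-based), and `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Packages this walkthrough relies on. The list is an assumption drawn from
# the install commands above, plus sentencepiece for the tokenizer.
REQUIRED = ["torch", "torchaudio", "transformers", "soundfile", "sentencepiece"]


def missing_packages(names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported missing, re-run the matching conda or pip command inside the activated speechT5 environment.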

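2. Download the fine-tuned model

# Download the fine-tuned model, together with the processor used for tokenization and feature extraction
# (the weights are stored with Git LFS, so run "git lfs install" once beforehand)

git clone https://huggingface.co/microsoft/speecht5_tts

# Download the vocoder, which converts the predicted log-mel spectrogram into an actual speech waveform

git clone https://huggingface.co/microsoft/speecht5_hifigan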
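
A common failure after cloning from the Hub is that Git LFS was not installed, so the weight files are small text pointers rather than real data, and loading the model later fails. The sketch below scans a cloned directory for such pointers by checking for the header line that Git LFS pointer files start with. The directory name in the usage part is an assumption, and `find_lfs_pointers` is a hypothetical helper.

```python
from pathlib import Path

# First line of a Git LFS pointer file.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(repo_dir):
    """Return names of files under repo_dir that are still LFS pointers."""
    pointers = []
    for path in sorted(Path(repo_dir).rglob("*")):
        # Skip Git's own metadata and anything that is not a regular file.
        if ".git" in path.parts or not path.is_file():
            continue
        with path.open("rb") as fh:
            if fh.read(len(LFS_HEADER)) == LFS_HEADER:
                pointers.append(path.name)
    return pointers


if __name__ == "__main__":
    # Assumed location; use the directory you actually cloned into.
    repo = Path("speecht5_tts")
    if repo.is_dir():
        print("Unresolved LFS pointers:", find_lfs_pointers(repo))
```

An empty list suggests the weights were downloaded in full. If files are listed, running `git lfs install` followed by `git lfs pull` inside the repository should fetch them.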

3. Test

from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Config
import torch
import soundfile as sf
from transformers import logging

# Set the logging level to debug
logging.set_verbosity_debug()

try:
    # Load the models from local paths
    processor = SpeechT5Processor.from_pretrained("/root/autodl-tmp/speecht5_tts", local_files_only=True)
    print("Processor loaded successfully.")
    
    model = SpeechT5ForTextToSpeech.from_pretrained("/root/autodl-tmp/speecht5_tts", local_files_only=True)
    print("TTS model loaded successfully.")
    
    # Point this at the directory where the speecht5_hifigan repository was cloned
    vocoder = SpeechT5HifiGan.from_pretrained("/root/autodl-tmp/microsoft/speecht5_hifigan", local_files_only=True)
    print("Vocoder loaded successfully.")
    
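    # Tokenize the input text
    inputs = processor(text="please go straight!", return_tensors="pt")
    
    # Read the speaker-embedding dimension from the model config
    config = SpeechT5Config.from_pretrained("/root/autodl-tmp/speecht5_tts", local_files_only=True)
    # A random vector is only a placeholder voice; a real x-vector speaker embedding gives more natural speech
    speaker_embeddings = torch.randn(1, config.speaker_embedding_dim)
    
    # Without a vocoder, generate_speech returns the log-mel spectrogram
    spectrogram = model.generate_speech(inputs["input_ids"], speaker_embeddings)
    
    # When a vocoder is passed to generate_speech, it outputs the speech waveform directly
    speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
    
    # Save the generated audio at a 16 kHz sample rate
    sf.write("tts_example.wav", speech.numpy(), samplerate=16000)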
    
    print("Speech synthesis completed successfully.")
except Exception as e:
    print(f"An error occurred: {e}")
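
Once the script has run, a quick way to check the result is to read back the header of the saved file. The sketch below uses Python's standard `wave` module and assumes that soundfile wrote tts_example.wav as 16-bit PCM, which is its usual default for WAV files; `wav_info` is a hypothetical helper.

```python
import os
import wave


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnframes() / rate


if __name__ == "__main__":
    # File name taken from the test script above.
    if os.path.exists("tts_example.wav"):
        rate, seconds = wav_info("tts_example.wav")
        print(f"{rate} Hz, {seconds:.2f} s")
```

A sample rate of 16000 Hz and a duration of roughly a second or two for a short sentence would indicate that synthesis and saving both worked.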
