In this repository, we survey three crucial areas that contribute to speech/audio large language models: (1) representation learning, (2) neural codecs, and (3) language models.
1.⚡ Speech Representation Models: These models focus on learning structural speech representations, which can then be quantized into discrete speech tokens, often referred to as semantic tokens.
2.⚡ Speech Neural Codec Models: These models are designed to learn speech and audio discrete tokens, often referred to as acoustic tokens, while maintaining reconstruction ability and low bitrate.
3.⚡ Speech Large Language Models: These models are trained on top of semantic and acoustic tokens in a language modeling approach. They demonstrate proficiency in a range of speech understanding and generation tasks.
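To make the tokenization step above concrete, here is a minimal sketch (assumed for illustration, not the implementation of any specific model) of how continuous frame embeddings can be quantized into discrete tokens: each frame is mapped to the index of its nearest codebook vector, as in vector quantization.

```python
# Hypothetical example: quantize continuous speech representations into
# discrete token ids via nearest-neighbor lookup in a codebook.
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes: 50 frames of 64-dim embeddings, codebook of 256 entries.
frames = rng.normal(size=(50, 64))     # continuous frame representations
codebook = rng.normal(size=(256, 64))  # learned codebook (random here)

# Squared Euclidean distance from every frame to every codebook entry.
dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)

# Each frame becomes the id of its closest codebook vector.
tokens = dists.argmin(axis=1)          # shape (50,), values in [0, 256)

print(tokens.shape)  # → (50,)
```

The resulting integer sequence is what a speech language model consumes in place of raw audio; real codecs refine this idea with learned encoders and residual (multi-stage) quantization.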