End-to-End Speech Recognition
TrainerNN
ASR/NMT
Semi-supervised and Unsupervised Training in Speech Recognition
Background: end-to-end speech recognition needs large amounts of paired speech-text data to perform well, yet such paired data is relatively scarce. Compared with labeled speech-text pairs, unlabeled speech is far more plentiful; moreover, ASR accuracy relies on language-model rescoring, and abundant plain-text data can be used to build the language model. Role of unsupervised and semi-supervised training: make full use of the unpaired data, e.g. through pretraining, to "strengthen" the whole model or parts of the network. Methods: unsupervised ... (original post, 2020-12-06)
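One standard semi-supervised recipe of the kind described above is pseudo-labeling (self-training): decode the unlabeled speech with the current model and keep only confident hypotheses as extra paired data. A minimal sketch; `model_decode`, the toy confidence score, and the threshold are hypothetical, not taken from the post:

```python
def pseudo_label(model_decode, unlabeled_audio, threshold=0.9):
    """Turn unlabeled audio into (audio, transcript) pairs via self-training.

    model_decode(audio) -> (hypothesis, confidence) stands in for
    beam-search decoding with the current ASR model.
    """
    selected = []
    for audio in unlabeled_audio:
        hypothesis, confidence = model_decode(audio)
        if confidence >= threshold:  # keep only confident transcriptions
            selected.append((audio, hypothesis))
    return selected
```

The selected pairs would then be mixed with the real paired data for another round of supervised training.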
【论文笔记】Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems
Title: Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems
Link: https://arxiv.org/pdf/2010.04284.pdf
Tags: Speech-to-intent, spoken language understanding, end-to-end systems, pre-trained text embedding, synthetic speech augmentation
Contributions ... (original post, 2020-10-25)
【论文笔记】ContextNet: Improving Convolutional Neural Networks for ASR with Global Context
Title: ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
Link: https://arxiv.org/pdf/2005.03191.pdf
Tags: Speech Recognition, CNN
Contributions; highlights and takeaways; the paper points out: ...; key points; experimental results ...
Ongoing record of papers and resources on end-to-end speech recognition: https://github.com/z... (original post, 2020-10-23)
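ContextNet injects global context into its convolution blocks with squeeze-and-excitation style gating: average the features over the whole utterance, pass the result through a small bottleneck, and use sigmoid gates to rescale each channel. A numpy sketch of the idea only; the shapes and the single-layer bottleneck here are illustrative, not the paper's exact module:

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """x: (T, C) frame features; w1: (B, C) and w2: (C, B) bottleneck weights."""
    s = x.mean(axis=0)                       # squeeze: global average over time
    h = np.maximum(0.0, w1 @ s)              # bottleneck with ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ h)))  # excite: per-channel gates in (0, 1)
    return x * gates                         # every frame sees utterance-level context
```

Because the gates are computed from the whole sequence, even a convolution with a limited receptive field sees utterance-level information.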
【论文笔记】Learn Spelling from Teachers: Transferring Knowledge from LM to Seq-to-Seq Speech Recognition
Title: Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition
Link: https://arxiv.org/pdf/1907.06017.pdf
Tags: knowledge distillation, external language model, end-to-end, sequence-to-sequence
Contributions: following the idea of knowledge distillation, during training a pretrained RNN-based LM serves as the "teacher" model ... (original post, 2020-10-22)
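A distillation objective of this kind can be written as an interpolation between the usual hard-label cross-entropy and a soft cross-entropy against the teacher LM's next-token distribution. A numpy sketch for a single token position; the function name and the interpolation weight `lam` are illustrative, so see the paper for the exact formulation:

```python
import numpy as np

def distill_loss(student_logp, teacher_prob, hard_label, lam=0.5):
    """student_logp: (V,) log-probs from the seq2seq decoder;
    teacher_prob: (V,) next-token distribution from the pretrained LM."""
    ce_hard = -student_logp[hard_label]              # ground-truth cross-entropy
    ce_soft = -(teacher_prob * student_logp).sum()   # match the teacher's distribution
    return (1.0 - lam) * ce_hard + lam * ce_soft
```

Unlike rescoring or fusion, the teacher LM is used only at training time, so decoding cost is unchanged.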
【论文笔记】Improving Transformer-based End-to-End Speech Recognition with CTC and LM Integration
Title: Improving Transformer-based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration
Link: http://www.isca-speech.org/archive/Interspeech_2019/abstracts/1938.html
Tags: Speech Recognition, Transformer, CTC, LM (original post, 2020-10-22)
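In joint decoding of this kind, each hypothesis is typically scored by interpolating the attention decoder, CTC, and external LM log-probabilities (shallow fusion). A small sketch; the default weights of 0.3 are placeholders, not values from the paper:

```python
def joint_score(att_logp, ctc_logp, lm_logp, ctc_weight=0.3, lm_weight=0.3):
    """Combined hypothesis score for joint CTC/attention decoding with an LM."""
    return ((1.0 - ctc_weight) * att_logp
            + ctc_weight * ctc_logp
            + lm_weight * lm_logp)
```

Beam search keeps the hypotheses with the highest combined score; the CTC term penalizes alignments the attention decoder alone might hallucinate.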