speech recognition
Yongqiang Cheng
Having chosen the distant horizon, press on through wind and rain - Yongqiang
end2end-asr-pytorch - PAD_TOKEN - SOS_TOKEN - EOS_TOKEN
1. end2end-asr-pytorch: https://github.com/gentaiscool/end2end-asr-pytorch and https://github.com/gentaiscool/end2end-asr-pytorch/blob/master/utils/constant.py 2. utils/constant.py: PAD_TOKEN = 0 SOS_..
Original · 2016-08-07 23:38:28 · 1328 views · 7 comments
Positional Encoding
1. Positional Encoding: Since our model contains no recurrence and no convolution, in order for the model to make use of the order of the sequence, we must inject some information about the relative or absolute position of the tok...
Original · 2020-06-12 22:16:10 · 5264 views · 2 comments
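The excerpt above is cut off before the encoding itself. As a rough sketch, assuming the sinusoidal scheme from "Attention Is All You Need", where even dimensions use a sine and odd dimensions a cosine of pos / 10000^(2i / d_model), the table could be built as follows. The function name and the requirement that d_model be even are assumptions made for illustration; they do not come from the post.

```python
import math


def sinusoidal_positional_encoding(max_len, d_model):
    """Return a max_len x d_model table of sinusoidal position encodings.

    Hypothetical helper for illustration; d_model is assumed to be even.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(max_len):
        row = [0.0] * d_model
        # i walks over the even dimensions; each (sin, cos) pair shares one
        # frequency, 1 / 10000^(i / d_model).
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row[i] = math.sin(angle)
            row[i + 1] = math.cos(angle)
        table.append(row)
    return table


# Example: encodings for 4 positions in an 8-dimensional model.
pe = sinusoidal_positional_encoding(4, 8)
```

Each row would be added to the token embedding at the same position, which is how order information reaches a model that has no recurrence or convolution.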
torchaudio - Python wave: comparing how audio data is read
1. torchaudio: an audio library for PyTorch. https://github.com/pytorch/audio Data manipulation and transformation for audio signal processing, powered by PyTorch. torchaudio: an audio library for PyTorch https://github.com...
Original · 2020-06-07 22:50:12 · 6804 views · 13 comments
end2end-asr-pytorch - audio processing - speech signal processing
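https://github.com/gentaiscool/end2end-asr-pytorch The sampling frequency (sampling rate) is the number of sound samples captured per second. The higher the sampling frequency, the better the sound quality and the more faithful the reproduction, at the cost of more resources. The sample size (bit depth, quantization precision) is the quantization of each sample's amplitude, a parameter that measures how the sound wave varies. The larger the value, the higher the resolution and the greater the capacity to reproduce sound. The sampled data records amplitude; the sampling precision depends on...
Original · 2020-06-07 22:14:13 · 1260 views · 0 comments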
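To make the resource cost concrete, here is a small sketch of how sampling frequency and bit depth translate into storage for uncompressed PCM audio. The function names, the channel-count parameter, and the example values (16 kHz, 16-bit, mono) are assumptions chosen for illustration and are not taken from the post.

```python
def quantization_levels(bits_per_sample):
    """Number of distinct amplitude values a sample can take."""
    return 2 ** bits_per_sample


def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size in bytes of uncompressed PCM audio (header not included)."""
    bytes_per_frame = (bits_per_sample // 8) * channels
    return sample_rate_hz * bytes_per_frame * seconds


# One second of 16 kHz, 16-bit mono speech versus CD-quality stereo.
speech_bytes = pcm_bytes(16000, 16, 1, 1)
cd_bytes = pcm_bytes(44100, 16, 2, 1)
```

Raising either the sampling frequency or the bit depth increases the size proportionally, which is the trade-off the excerpt describes.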
Speech signal processing - references
Fundamentals of Speech Signal Processing: http://staff.ustc.edu.cn/~zhling/Course_SSP/
Original · 2020-06-07 21:55:46 · 1820 views · 0 comments
end2end-asr-pytorch
https://github.com/gentaiscool/end2end-asr-pytorch End-to-End Automatic Speech Recognition on PyTorch. End-to-End Speech Recognition on Pytorch. Transformer-based Speech Recognition Model. end-to-end: adj. end to end, endpoint to endpoint; n. continuously. automatic sp...
Original · 2020-06-05 22:28:02 · 1253 views · 0 comments
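Reading WAV-format sound files
http://bigsec.net/b52/scipydoc/wave_pyaudio.html Python supports reading and writing WAV files; real-time sound input and output requires installing pyAudio, and pyMedia is used for MP3 decoding and playback. WAV is a sound file format developed by Microsoft, typically used to store uncompressed sound data (Pulse Code Modulation, PCM). A WAV file has three important parameters: the number of channels, the sampling frequency, and the quantization bit depth. Number of channels: mono or stereo...
Original · 2020-06-04 01:14:24 · 2295 views · 0 comments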
wave - Read and write WAV files
https://docs.python.org/3/library/wave.html The wave module provides a convenient interface to the WAV sound format. It does not support compression/decompression, but it does support mono/stereo...
Translated · 2020-06-04 00:02:59 · 1703 views · 0 comments
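A minimal sketch of the wave module in use: it writes a short silent clip to an in-memory buffer and reads back the channel count, sample width, and sampling frequency. The helper names, the in-memory buffer, and the 16 kHz, 16-bit mono settings are assumptions made for illustration; only the wave calls themselves come from the standard library.

```python
import io
import wave


def write_silence(target, n_frames=160, sample_rate=16000):
    """Write n_frames of 16-bit mono silence to a file path or file object."""
    with wave.open(target, "wb") as wav_out:
        wav_out.setnchannels(1)      # mono
        wav_out.setsampwidth(2)      # 2 bytes = 16-bit samples
        wav_out.setframerate(sample_rate)
        wav_out.writeframes(b"\x00\x00" * n_frames)


def read_wav_params(source):
    """Return the basic parameters of a WAV file path or file object."""
    with wave.open(source, "rb") as wav_in:
        return {
            "channels": wav_in.getnchannels(),
            "sample_width_bytes": wav_in.getsampwidth(),
            "sample_rate": wav_in.getframerate(),
            "n_frames": wav_in.getnframes(),
        }


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
write_silence(buffer)
buffer.seek(0)
params = read_wav_params(buffer)
```

Passing a file name instead of the buffer works the same way; the buffer is used here only so the example leaves no file behind.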
TIMIT dataset - The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus
DARPA: Defense Advanced Research Projects Agency (United States). ARPA: Advanced Research Projects Agency. acoustic /əˈkuːstɪk/: adj. relating to acoustics, sound, or hearing; n. an acoustic instrument, one that is not electrically amplified. phonetic /fəˈnetɪk/...
Translated · 2020-06-03 00:23:41 · 1601 views · 0 comments
tensorflow - tensor2tensor - v1.0.12
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research. tensor2tensor is a library that packages many models. https://github.com/tensorflow/ten...
Original · 2020-05-30 08:23:16 · 2120 views · 0 comments
Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. attention [ə'tenʃ(ə)n]: n. attention, notice, concern; int. attention (the command to stand at attention). attention mechanism. Computer Science (CS): ...
Translated · 2020-05-11 23:34:32 · 25375 views · 3 comments
KALDI - Kaldi
KALDI: http://www.kaldi-asr.org/ Documentation: http://kaldi-asr.org/doc/index.html Models: http://www.kaldi-asr.org/models.html Kaldi Chinese manual: https://shiweipku.gitbooks.io/chinese-doc-of-kaldi...
Translated · 2019-08-27 17:21:29 · 684 views · 0 comments