End-to-end
1. END-TO-END MULTI-SPEAKER SPEECH RECOGNITION WITH TRANSFORMER
key: transformer, overlapped speech recognition, neural beamforming, speech separation
2. STREAMING AUTOMATIC SPEECH RECOGNITION WITH THE TRANSFORMER MODEL
This paper describes how streaming decoding can be implemented for the Transformer by using time-restricted self-attention.
3. JOINT PHONEME-GRAPHEME MODEL FOR END-TO-END SPEECH RECOGNITION
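The time-restricted self-attention noted for paper 2 above can be illustrated with a small sketch. It assumes the restriction is a fixed window of past and future frames around each query frame; the names `time_restricted_mask`, `left_context`, and `right_context` are hypothetical and not taken from the paper, and the notes do not say how the paper chooses the window or handles the decoder side.

```python
import math


def time_restricted_mask(num_frames, left_context, right_context):
    """Return a num_frames x num_frames boolean mask.

    mask[t][s] is True when query frame t may attend to key frame s,
    i.e. when t - left_context <= s <= t + right_context.
    (Window sizes are illustrative assumptions, not values from the paper.)
    """
    return [
        [t - left_context <= s <= t + right_context for s in range(num_frames)]
        for t in range(num_frames)
    ]


def masked_softmax(scores, allowed):
    """Softmax over one row of attention scores; masked positions get weight 0."""
    kept = [x for x, ok in zip(scores, allowed) if ok]
    peak = max(kept)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) if ok else 0.0 for x, ok in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


# Frame 2 of a 5-frame utterance sees 2 past frames and 1 future frame.
mask = time_restricted_mask(num_frames=5, left_context=2, right_context=1)
weights = masked_softmax([0.0] * 5, mask[2])
```

Because each frame only needs a bounded number of future frames, the encoder can start producing output before the utterance ends, which is what makes streaming decoding possible. In a stacked encoder the effective lookahead grows with the number of layers, so the per-layer `right_context` would typically be kept small.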
Noisy Speech Recognition
- IMPROVED ROBUST ASR FOR SOCIAL ROBOTS IN PUBLIC SPACES
key: public space speech, signal-to-noise ratio
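This paper examines how public spaces and low signal-to-noise-ratio conditions affect speech recognition. It gives the reverberation time T60 for different acoustic environments, along with SNR estimates for those environments.
- One-Pass Single-Channel Noisy Speech Recognition Using a Combination of Noisy and Enhanced Features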
key: single-channel processing, feature combination, feature/sub-network-level combination, gating mechanism
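This paper presents a method for using enhanced (denoised) features jointly with the original features. Speech processed by speech enhancement sounds clearer to a human listener, but the signal distortion that enhancement introduces often lowers recognition accuracy in an ASR system. The paper fuses the enhanced features with the original signal features, which effectively improves the model's recognition accuracy.
- Deep Learning for Distant Speech Recognition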
- An Overview of Noise-Robust Automatic Speech Recognition
- SINGLE- AND TWO-CHANNEL NOISE REDUCTION FOR ROBUST SPEECH RECOGNITION
- INVESTIGATION OF MONAURAL FRONT-END PROCESSING FOR ROBUST ASR WITHOUT RETRAINING OR JOINT-TRAINING
- IMPROVING NOISE ROBUSTNESS OF AUTOMATIC SPEECH RECOGNITION VIA PARALLEL DATA AND TEACHER-STUDENT LEARNING
- Adversarial Feature-Mapping for Speech Enhancement
- Speech Denoising with Deep Feature Losses
- A FULLY CONVOLUTIONAL NEURAL NETWORK FOR SPEECH ENHANCEMENT
- Bridging the gap between monaural speech enhancement and recognition with distortion-independent acoustic modeling
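For the One-Pass Single-Channel paper listed above, the idea of combining noisy and enhanced features through a gating mechanism can be sketched as follows. This is a minimal illustration, not the paper's architecture: it assumes a per-dimension sigmoid gate with hypothetical parameters `w_noisy`, `w_enhanced`, and `bias`, which in a real system would be learned jointly with the acoustic model.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_combine(noisy, enhanced, w_noisy, w_enhanced, bias):
    """Blend two equal-length feature vectors with a per-dimension gate.

    The gate is computed from both inputs, so the mix can lean on the
    noisy features wherever enhancement has distorted the signal.
    (Diagonal gate weights are a simplifying assumption for this sketch.)
    """
    combined = []
    for n, e, wn, we, b in zip(noisy, enhanced, w_noisy, w_enhanced, bias):
        gate = sigmoid(wn * n + we * e + b)  # value in (0, 1)
        combined.append(gate * n + (1.0 - gate) * e)
    return combined


noisy = [1.0, 2.0, 3.0]
enhanced = [0.5, 2.5, 1.0]
zeros = [0.0, 0.0, 0.0]

# With all gate parameters at zero the gate is 0.5: a plain average.
blended = gated_combine(noisy, enhanced, zeros, zeros, zeros)

# A large positive bias pushes the gate toward 1: mostly noisy features.
mostly_noisy = gated_combine(noisy, enhanced, zeros, zeros, [20.0, 20.0, 20.0])
```

The output always lies between the two inputs in each dimension, so a gate of this form cannot do worse than picking the poorer of the two feature streams for that dimension.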
Code-Switching
- Code-Switched Language Models Using Neural Based Synthetic Data from Parallel Sentences