Speech Synthesis
- TTS Synthesis with Bidirectional LSTM based Recurrent Neural Networks
- WaveNet: A Generative Model for Raw Audio
- SampleRNN: An Unconditional End-to-End Neural Audio Generation Model
- Char2Wav: End-to-End Speech Synthesis
- Deep Voice: Real-time Neural Text-to-Speech
- Parallel WaveNet: Fast High-Fidelity Speech Synthesis
- Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under a Multi-task Learning Framework
- Tacotron: Towards End-to-End Speech Synthesis
- VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop
- Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
- Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
- Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning
- ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
- LPCNet: Improving Neural Speech Synthesis Through Linear Prediction
- Neural Speech Synthesis with Transformer Network
- Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
- Flow-TTS: A Non-Autoregressive Network for Text to Speech Based on Flow
- Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
- PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
Speech Recognition
- A Neural Probabilistic Language Model
- Recurrent Neural Network Based Language Model
- LSTM Neural Networks for Language Modeling
- Hybrid Speech Recognition with Deep Bidirectional LSTM
- Attention Is All You Need
- Improving Language Understanding by Generative Pre-Training
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
- Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency
- Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks
- Highway Long Short-Term Memory RNNs for Distant Speech Recognition