paper list

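A summary of the papers I have read so far. I thought I had read a lot, but it turns out to be only a handful; I am still a long way from my goal of reading a thousand papers.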

Front-end noise reduction
  1. DeLiang Wang 2018–Supervised Speech Separation Based on Deep Learning: An Overview
Vocoders
  1. WaveNet: A Generative Model for Raw Audio
  2. WaveGlow: A Flow-based Generative Network for Speech Synthesis
  3. FloWaveNet: A Generative Flow for Raw Audio
  4. LPCNET: IMPROVING NEURAL SPEECH SYNTHESIS THROUGH LINEAR PREDICTION
  5. WORLD vocoder: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications
  6. Harvest: A high-performance fundamental frequency estimator from speech signals
  7. 2018 ins: WaveNet Vocoder with Limited Training Data for Voice Conversion
Recognition
  1. x-vector: Deep Neural Network Embeddings for Text-Independent Speaker Verification
  2. [2019 ASRU] [fanzhiyun] SPEAKER-AWARE SPEECH-TRANSFORMER
  3. Language Identification with Deep Bottleneck Features
TTS
  1. Tacotron: Towards End-to-End Speech Synthesis
  2. Tacotron 2: Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
  3. 2017NIPS: Deep Voice 2: Multi-Speaker Neural Text-to-Speech
  4. Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
  5. GST–Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
  6. 2018NIPS: Neural Voice Cloning with a Few Samples
  7. Uncovering Latent Style Factors for Expressive Speech Synthesis
  8. Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis
voice conversion
  1. 2016ICME: Phonetic posteriorgrams for many-to-one voice conversion without parallel data training
  2. Non-parallel voice conversion using variational auto-encoders conditioned by phonetic PPGs
  3. 2019trans–Sequence-to-Sequence Acoustic Modeling for Voice Conversion
  4. 2019ins: A Vocoder-free WaveNet Voice Conversion with Non-Parallel Data
  5. 2018ins–Wavelet Analysis of Speaker Dependent and Independent Prosody for Voice Conversion
  6. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
  7. 2019icas–Cross-lingual Voice Conversion with Bilingual Phonetic PosteriorGram and Average Modeling
  8. Odyssey 2018: Average Modeling Approach to Voice Conversion with Non-Parallel Data
  9. trans: Voice conversion with SI-DNN and KL divergence based mapping without parallel training data
  10. Voice Conversion Across Arbitrary Speakers based on a Single Target-Speaker Utterance
  11. 2018trans, zhangjingxuan: Sequence-to-Sequence Acoustic Modeling for Voice Conversion
  12. 2018 icassp: Improving sequence-to-sequence voice conversion by adding text-supervision [zhangjingxuan]
  13. 2019trans: Non-Parallel Seq2Seq Voice Conversion with Disentangled Linguistic and Speaker Representations [zhangjingxuan]
  14. [2019ins] One-shot Voice Conversion with Global Speaker Embeddings
  15. 2019ins: Fast Learning for Non-Parallel Many-to-Many Voice Conversion with Residual Star-GAN
  16. Many-to-many Cross-lingual Voice Conversion with a Jointly Trained Speaker Embedding Network
  17. Mellotron: Multi-speaker expressive voice synthesis by conditioning on rhythm, pitch and global style
  18. [2019 interspeech] One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization
  19. [2019 interspeech] One-shot Voice Conversion with Disentangled Representations by Leveraging Phonetic Posteriorgrams
  20. [2019 ASRU] Zhou Y, Tian X, Emre Yılmaz, et al. A Modularized Neural Network with Language-specific Output Layers for Cross-lingual Voice Conversion
  21. [2020] Vocoder-free End-to-End Voice Conversion with Transformer Network
GAN
  1. [2019ASRU]-ON THE STUDY OF GENERATIVE ADVERSARIAL NETWORKS FOR CROSS-LINGUAL VOICE CONVERSION
  2. [2019 interspeech] Non-parallel Voice Conversion using Weighted Generative Adversarial Networks
  3. [2017][the original CycleGAN-VC paper] Parallel-data-free voice conversion using cycle-consistent adversarial networks
  4. [2018][IEEE SLT] StarGAN-VC: non-parallel many-to-many voice conversion with StarGAN
  5. [2019 interspeech] Fast Learning for Non-Parallel Many-to-Many Voice Conversion with Residual Star-GAN
Transformer architecture
  1. Attention Is All You Need
  2. FastSpeech: Fast, Robust and Controllable Text to Speech
  3. Neural Speech Synthesis with Transformer Network
Papers I got nothing from
  1. [2019 interspeech] Whether To Pretrain DNN or Not?: An Empirical Analysis for Voice Conversion
2020 icassp
  1. [VAE][one-shot] ONE-SHOT VOICE CONVERSION BY VECTOR QUANTIZATION
  2. [FHVAE] [emotional VC] MULTI-SPEAKER AND MULTI-DOMAIN EMOTIONAL VOICE CONVERSION USING FACTORIZED HIERARCHICAL VARIATIONAL AUTOENCODER
singing VC
  1. 2019 APSIPA: SINGAN: Singing Voice Conversion with Generative Adversarial Networks
  2. SINGING VOICE CONVERSION WITH NON-PARALLEL DATA