paper list

林林宋

Article link: https://blog.csdn.net/qq_40168949/article/details/103336307

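A summary of the papers I have read so far. I thought I had read a lot, but it turns out to be only a handful, and the goal of reading a thousand papers is still far away.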

Front-end denoising

DeLiang Wang 2018: Supervised Speech Separation Based on Deep Learning: An Overview

Vocoders

WaveNet: A Generative Model for Raw Audio
WaveGlow: A Flow-based Generative Network for Speech Synthesis
FloWaveNet: A Generative Flow for Raw Audio
LPCNET: IMPROVING NEURAL SPEECH SYNTHESIS THROUGH LINEAR PREDICTION
WORLD vocoder: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications
Harvest: A high-performance fundamental frequency estimator from speech signals
2018 Interspeech: WaveNet Vocoder with Limited Training Data for Voice Conversion

Recognition

x-vector: Deep Neural Network Embeddings for Text-Independent Speaker Verification
[2019 ASRU] [fanzhiyun] SPEAKER-AWARE SPEECH-TRANSFORMER
Language Identification with Deep Bottleneck Features

TTS

Tacotron: Towards End-to-End Speech Synthesis
Tacotron 2: Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
2017 NIPS: Deep Voice 2: Multi-Speaker Neural Text-to-Speech
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
GST–Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
2018 NIPS: Neural Voice Cloning with a Few Samples
Uncovering Latent Style Factors for Expressive Speech Synthesis
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis

voice conversion

2016 ICME: Phonetic posteriorgrams for many-to-one voice conversion without parallel data training
Non-parallel voice conversion using variational auto-encoders conditioned by phonetic PPGs
2019 trans: Sequence-to-Sequence Acoustic Modeling for Voice Conversion
2019 Interspeech: A Vocoder-free WaveNet Voice Conversion with Non-Parallel Data
2018 Interspeech: Wavelet Analysis of Speaker Dependent and Independent Prosody for Voice Conversion
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
2019 ICASSP: Cross-lingual Voice Conversion with Bilingual Phonetic PosteriorGram and Average Modeling
Odyssey 2018: Average Modeling Approach to Voice Conversion with Non-Parallel Data
trans: Voice conversion with SI-DNN and KL divergence based mapping without parallel training data
Voice Conversion Across Arbitrary Speakers based on a Single Target-Speaker Utterance
2018 trans, zhangjingxuan: Sequence-to-Sequence Acoustic Modeling for Voice Conversion
2018 icassp: Improving Sequence-to-Sequence Voice Conversion by Adding Text-Supervision [zhangjingxuan]
2019 trans: Non-Parallel Seq2Seq Voice Conversion with Disentangled Linguistic and Speaker Representations [zhangjingxuan]
[2019 Interspeech] One-shot Voice Conversion with Global Speaker Embeddings
2019 Interspeech: Fast Learning for Non-Parallel Many-to-Many Voice Conversion with Residual Star-GAN
Many-to-many Cross-lingual Voice Conversion with a Jointly Trained Speaker Embedding Network
Mellotron: Multi-speaker expressive voice synthesis by conditioning on rhythm, pitch and global style
[2019 interspeech] One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization
[2019 interspeech] One-shot Voice Conversion with Disentangled Representations by Leveraging Phonetic Posteriorgrams
[2019 ASRU] Zhou Y, Tian X, Yılmaz E, et al. A Modularized Neural Network with Language-specific Output Layers for Cross-lingual Voice Conversion. Accepted by ASRU 2019.
[2020] Vocoder-free End-to-End Voice Conversion with Transformer Network

GAN

[2019 ASRU] ON THE STUDY OF GENERATIVE ADVERSARIAL NETWORKS FOR CROSS-LINGUAL VOICE CONVERSION
[2019 interspeech] Non-parallel Voice Conversion using Weighted Generative Adversarial Networks
[2017][the original CycleGAN-VC paper] Parallel-data-free voice conversion using cycle-consistent adversarial networks
[2018][IEEE SLT] StarGAN-VC: Non-parallel many-to-many voice conversion with StarGAN
[2019 interspeech] Fast Learning for Non-Parallel Many-to-Many Voice Conversion with Residual Star-GAN

Transformer architecture

Attention Is All You Need
FastSpeech: Fast, Robust and Controllable Text to Speech
Neural Speech Synthesis with Transformer Network

Papers I got nothing out of

[2019 interspeech] Whether To Pretrain DNN or Not?: An Empirical Analysis for Voice Conversion

2020 icassp

[VAE][one-shot] ONE-SHOT VOICE CONVERSION BY VECTOR QUANTIZATION
[FHVAE] [emotional VC] MULTI-SPEAKER AND MULTI-DOMAIN EMOTIONAL VOICE CONVERSION USING FACTORIZED HIERARCHICAL VARIATIONAL AUTOENCODER

singing VC

2019 APSIPA: SINGAN: Singing Voice Conversion with Generative Adversarial Networks
SINGING VOICE CONVERSION WITH NON-PARALLEL DATA
