FallenDarkStar - CSDN Blog

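Original [Paper Study] ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation

Current speaker verification techniques rely on neural networks to extract speaker representations. The successful x-vector architecture is a time delay neural network (TDNN) that applies statistics pooling to project variable-length utterances into fixed-length speaker embeddings. Drawing on recent trends in face verification and related areas of computer vision, this paper proposes several enhancements to that architecture. First, the initial frame layers are restructured into 1-dimensional Res2Net modules with effective skip connections. As in SE-ResNet, squeeze-and-excitation (SE) blocks are introduced into these modules to explicitly model channel interdependencies. The SE block rescales the channels according to global properties of the recording, which expands the temporal context of the frame layers.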

2023-05-09 14:27:36 3613
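
The two mechanisms named in the ECAPA-TDNN summary above, statistics pooling and squeeze-and-excitation rescaling, can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the paper's implementation: the function names, the toy weights `w1` and `w2`, the omission of biases, and the use of the population standard deviation are all assumptions made here for brevity.

```python
import math


def statistics_pooling(frames):
    """Map a variable-length utterance to one fixed-length vector.

    frames: list of T frames, each a list of C floats (T may differ per utterance).
    Returns 2*C floats: per-channel means followed by per-channel standard
    deviations (population std, an assumption of this sketch).
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    means = [sum(f[c] for f in frames) / num_frames for c in range(num_channels)]
    stds = [
        math.sqrt(sum((f[c] - means[c]) ** 2 for f in frames) / num_frames)
        for c in range(num_channels)
    ]
    return means + stds


def se_rescale(frames, w1, w2):
    """Rescale channels using utterance-level (global) statistics.

    w1: bottleneck weights with shape (B, C); w2: expansion weights with shape (C, B).
    Both are hypothetical toy parameters; biases are left out.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    # Squeeze: average every channel over the whole utterance.
    z = [sum(f[c] for f in frames) / num_frames for c in range(num_channels)]
    # Excitation: ReLU bottleneck, then one sigmoid gate per channel.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply each frame channel-wise by its gate.
    return [[f[c] * gates[c] for c in range(num_channels)] for f in frames]


short_utt = [[1.0, 2.0], [3.0, 4.0]]                          # T = 2, C = 2
long_utt = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]   # T = 4, C = 2

pooled_short = statistics_pooling(short_utt)
pooled_long = statistics_pooling(long_utt)
print(len(pooled_short), len(pooled_long))  # both lengths are 2 * C

w1 = [[0.5, 0.5]]        # toy bottleneck, B = 1
w2 = [[1.0], [-1.0]]     # toy expansion back to C = 2
rescaled = se_rescale(short_utt, w1, w2)
print(rescaled)
```

Because the gates are computed from an average over the entire utterance, every frame is scaled by information from all other frames, which is the sense in which the SE block widens the temporal context.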

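Original [Paper Study] "Source Mixing and Separation Robust Audio Steganography"

Audio steganography hides secret information in a cover audio signal by making imperceptible modifications to the cover. Although previous work has addressed the robustness of hidden-message recovery against distortions introduced during transmission, it has not addressed robustness against aggressive editing such as mixing with other audio sources and source separation. In this work, we propose, for the first time, a steganography method that can embed information into an individual sound source within a mixture, such as an instrument track in a piece of music. To this end, we propose a time-domain model and curriculum learning to learn to decode the hidden message from the separated source. Experimental results show that the method successfully hides information in imperceptible perturbations, and that the message can be recovered correctly even after the source is mixed with other sources and then separated by a source separation algorithm. ...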

2022-07-20 14:00:43 1764 2

Original [Paper Study] "VQMIVC"

Paper study notes on "VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion". Contents: "VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement ...

2022-05-30 20:56:46 1648

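Original [Paper Study] "Who is Real Bob? Adversarial Attacks on Speaker Recognition Systems"

Speaker recognition (SR) is widely used in daily life as a biometric authentication or identification mechanism. The popularity of SR raises serious security concerns, as recent adversarial attacks have demonstrated. However, the impact of this threat in the practical black-box setting remains unexplored, because current attacks consider only the white-box setting.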

2022-05-19 10:54:27 1942

Original [Paper Study] "Adversarial Attacks on GMM i-vector based Speaker Verification Systems"

Paper study notes on "Adversarial Attacks on GMM i-vector based Speaker Verification Systems". Contents: Abstract; 1 Introduction; 2 Automatic Speaker Verification Systems; 2.1 Gaussian Mixture Model i-vector Extraction; 2.2 x-vector ...

2022-04-29 10:39:46 564

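Original [Paper Study] "Adversarial examples for generative models"

Paper study notes on "Adversarial examples for generative models". Contents: Abstract; 1 Introduction; 2 Related Work and Background; 2.1 Related Work on Adversarial Examples; 3 Problem Definition; 4 Attack Methods; 5 Evaluation; 6 Conclusion. Abstract: We explore ...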

2022-04-24 16:29:53 1032

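Original [Paper Study] "Practical Attacks on Voice Spoofing Countermeasures"

We develop the first practical attack on voice spoofing countermeasures (CMs) and show how a malicious actor can efficiently craft audio samples that bypass the strictest form of voice authentication. Previous work has focused mainly on non-proactive attacks, or on adversarial strategies against automatic speaker verification (ASV) that do not generate speech in the victim's voice. The consequences of our attack are far more severe: the samples we generate sound like the victim, which removes any chance of plausible deniability for the victim.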

2022-03-23 22:35:08 4683

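Original [Paper Study] "One-shot Voice Conversion by Separating Speaker and Content Representations with IN"

In recent years, voice conversion (VC) without parallel data has been achieved successfully in the multi-target scenario, in which a single model is trained to convert input speech to many different speakers. Such a model is limited, however, because it can only convert to speakers that appear in the training data, which narrows the scenarios where VC can be applied. In this paper, we propose a novel one-shot VC approach that performs VC using only one example utterance each from the source speaker and the target speaker, and neither speaker needs to have been seen during training. This is achieved by disentangling speaker and content representations with instance normalization (IN). Objective and subjective evaluations show that our model can generate speech similar to the target speaker.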

2022-02-26 19:01:30 3049
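
The instance normalization step mentioned in the one-shot VC summary above can be illustrated with a small plain-Python sketch. It is not the paper's model: the function name, the toy feature values, and the `eps` constant are assumptions, and the sketch shows only the normalization itself, without affine parameters. The idea it illustrates is that normalizing each channel over time removes per-channel offset and scale, which are global, utterance-level properties of the kind a speaker representation is meant to carry.

```python
import math


def instance_norm(features, eps=1e-5):
    """Normalize each channel of one utterance to zero mean and unit variance.

    features: list of C channels, each a list of T floats.
    eps is a small constant for numerical stability (value chosen here).
    """
    normalized = []
    for channel in features:
        mean = sum(channel) / len(channel)
        var = sum((x - mean) ** 2 for x in channel) / len(channel)
        normalized.append([(x - mean) / math.sqrt(var + eps) for x in channel])
    return normalized


# Toy features: 2 channels, 3 frames.
utt_a = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
# Same temporal pattern, but every channel is rescaled and shifted,
# standing in for a change in global (utterance-level) characteristics.
utt_b = [[2.0 * x + 5.0 for x in channel] for channel in utt_a]

norm_a = instance_norm(utt_a)
norm_b = instance_norm(utt_b)

# After normalization the two inputs are nearly identical: the global
# offset and scale are gone and only the temporal pattern remains.
max_diff = max(
    abs(x - y)
    for ch_a, ch_b in zip(norm_a, norm_b)
    for x, y in zip(ch_a, ch_b)
)
print(max_diff)
```

The small remaining difference comes only from `eps`; with `eps=0` the two normalized outputs would agree up to floating-point rounding.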

Original [Paper Study] "Tacotron: Towards End-to-End Speech Synthesis"

Paper study notes on "Tacotron: Towards End-to-End Speech Synthesis". Contents: Abstract; 1 Introduction; 2 Related Work; 3 Model Architecture; 3.1 CBHG Module; 3.2 Encoder; 3.3 Decoder; 3.4 Post-processing Net and Waveform Synthesis; 4 ...

2022-01-23 12:19:18 4737

Original [Paper Study] "Generalized End-to-End Loss for Speaker Verification"

Paper study notes on "Generalized End-to-End Loss for Speaker Verification". Contents: Abstract; 1 Introduction; 1.1 Background; 1.2 Tuple-based End-to-End Loss; 1.3 Overview; 2 Generalized End-to-End Model; 2.1 Training Method ...

2021-11-26 12:42:18 3268

Original [Paper Study] "“Hello, It’s Me”: Deep Learning-based Speech Synthesis Attacks in the Real World"

Paper study notes on "“Hello, It’s Me”: Deep Learning-based Speech Synthesis Attacks in the Real World". Contents: Abstract; 1 Introduction; 2 Background; 2.1 Voice-based User Identification; 2.2 Speech Syn...

2021-11-23 19:39:46 5511

Original [Paper Study] "Defending Your Voice: Adversarial Attack on Voice Conversion"

Paper study notes on "Defending Your Voice: Adversarial Attack on Voice Conversion". Contents: Abstract; 1 Introduction; 2 Related Work; 2.1 Voice Conversion; 2.2 Attacking and Defending Voice; 3 Methodology; 3.1 End-to-End Attack ...

2021-11-18 21:31:19 3810

Original [Paper Study] "A Overview of Spoof Speech Detection for Automatic Speaker Verification"

Paper study notes on "A Overview of Spoof Speech Detection for Automatic Speaker Verification". Contents: Abstract; 1 Introduction; 2 ASV Systems: Spoofing Attacks; 2.1 Twins; 2.2 Impersonation; 2. ...

2021-10-16 13:47:11 4662

Original [Paper Study] "MOSNet: Deep Learning-based Objective Assessment for Voice Conversion"

Paper study notes on "MOSNet: Deep Learning-based Objective Assessment for Voice Conversion". Contents: Abstract; 1 Introduction; 2 Voice Conversion Challenge Evaluation Data; 2.1 Voice Conversion Challenge 2018; 2.2 Data, Its Distribution, and Predictability ...

2021-09-27 20:23:40 3345

Original [Paper Study] "Building High-level Features Using Large Scale Unsupervised Learning"

Paper study notes on "Building High-level Features Using Large Scale Unsupervised Learning". Contents: Abstract; 1 Introduction; 2 Training Set Construction; 3 Algorithm; 3.1 Previous Work; 3.2 Architecture; 3 ...

2021-09-23 20:03:48 826

Original [Paper Study] "FastPitch: Parallel Text-to-speech with Pitch Prediction"

Paper study notes on "FastPitch: Parallel Text-to-speech with Pitch Prediction". Contents: Abstract; 1 Introduction; 2 Model Description; 2.1 Duration of Input Symbols; 2.2 Pitch of Input Symbols; 3 Experiments; 3.1 Setup ...

2021-08-18 21:40:21 2077 3

Original [Paper Study] "Neural Speech Synthesis with Transformer Network"

Paper study notes on "Neural Speech Synthesis with Transformer Network". Contents: Abstract; 1 Introduction; 2 Background; 2.1 Sequence-to-Sequence Models; 2.2 Tacotron2; 2.3 Transformer for NMT; 3 Using Transfor...

2021-08-12 17:39:11 1778

Original [Paper Study] "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks"

Paper study notes on "Parallel WaveGAN : A fast waveform generation model based on generative adversarial networks with Multi-Resolution Spectrogram". Contents: "Parallel WaveGAN : A fast waveform generation model based on generative adversarial networks with Multi-Resolution S...

2021-08-06 17:08:15 2084 3

Original [Paper Study] "On Prosody Modeling For ASR+TTS Based Voice Conversion"

Paper study notes on "On Prosody Modeling For ASR+TTS Based Voice Conversion". Contents: Abstract; 1 Introduction; 2 ASR+TTS Based Voice Conversion; 2.1 Overall Framework and Conversion Process; 2.2 Intermediate Representations; 2.3 Training; 3 ASR ...

2021-08-03 15:09:56 1307

Original [Paper Study] "Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis"

Paper study notes on "Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis". Contents: Abstract; 1 Introduction; 2 Multispeaker Speech Synthesis Model; 2.1 Speaker Encoder ...

2021-07-24 13:36:43 1293 1

Original [Paper Study] "FastSpeech 2: Fast and High-Quality End-to-End Text-to-Speech"

Paper study notes on "FastSpeech 2: Fast and High-Quality End-to-End Text-to-Speech". Contents: Abstract; 1 Introduction; 2 Method; 2.1 Motivation; 2.2 Model Overview; 2.3 Variance Adaptor; 2.4 Fa...

2021-07-20 18:56:36 3878 3

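Original [Paper Study] "A Survey on Neural Speech Synthesis"

Paper study notes on "A Survey on Neural Speech Synthesis". Contents: Abstract; 1 Introduction; 1.1 History of TTS Technology; 1.2 Organization of This Survey; 2 Key Components in TTS; 2.1 Main Taxonomy; 2.2 Text Analysis; 2.3 Text Analysis ...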

2021-07-19 22:35:40 6686 2

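Original [Hands-on Project] FastSpeech Code Walkthrough: eval.py

FastSpeech code walkthrough: eval.py. Contents: Introduction; Function walkthrough: get_DNN, synthesis, get_data, main. Introduction: This project is a text-to-speech project based on the FastSpeech model, implemented in PyTorch (project link). ...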

2021-06-30 16:54:03 606

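Original [Hands-on Project] FastSpeech Code Walkthrough: train.py

FastSpeech code walkthrough: train.py. Contents: Introduction; Function walkthrough: main. Introduction: This project is a text-to-speech project based on the FastSpeech model, implemented in PyTorch (project link). FastSpeech ...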

2021-06-30 15:28:31 882

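Original [Hands-on Project] FastSpeech Code Walkthrough: dataset.py

FastSpeech code walkthrough: dataset.py. Contents: Introduction; Function walkthrough: get_data_to_buffer, reprocess_tensor, collate_fn_tensor. Introduction: This project is a text-to-speech project based on the FastSpeech model, implemented in PyTorch ...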

2021-06-29 22:39:16 855 2

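Original [Hands-on Project] FastSpeech Code Walkthrough: preprocess.py

FastSpeech code walkthrough: preprocess.py. Contents: Introduction; Function walkthrough: preprocess_ljspeech, write_metadata. Introduction: This project is a text-to-speech project based on the FastSpeech model, implemented in PyTorch (project link). ...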

2021-06-29 11:56:10 1094

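Original [Hands-on Project] FastSpeech Code Walkthrough: ljspeech.py

FastSpeech code walkthrough: ljspeech.py. Contents: Introduction; Function walkthrough: build_from_path, _process_utterance. Introduction: This project is a text-to-speech project based on the FastSpeech model, implemented in PyTorch (project link). ...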

2021-06-29 00:58:27 1506

Original [Paper Study] "FastSpeech: Fast, Robust and Controllable Text to Speech"

Paper study notes on "FastSpeech: Fast, Robust and Controllable Text to Speech". Contents: Abstract; 1 Introduction; 2 Background; 3 FastSpeech; 3.1 Feed-Forward Transformer; 3.2 Length Regulator; 3.3 Duration ...

2021-06-27 12:10:26 3180 1

Original [Paper Study] "Replay attack detection with complementary high-resolution information using end-to-end DNN"

Paper study notes on "Replay attack detection with complementary high-resolution information using end-to-end DNN for the ASVspoof 2019 Challenge". Contents: "Replay attack detection with complementary high-resolution information using end-to-end DNN for the ASVspoof 2019 Challen...

2021-05-22 17:42:33 1889 4
