Single-Channel Speech Enhancement: Literature Summary

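Paper 1: A single-channel speech enhancement method based on attention-mechanism neural networks
Core contribution: proposes an attention-based single-channel speech enhancement method that focuses on the important speech components of the audio stream and pays correspondingly less attention to noise and interference.
Attention mechanism (attention-based):
1. The attention mechanism decides which parts of the whole input deserve more attention;
2. Features are extracted from those key parts to obtain the important information;
3. An RNN implicitly learns the weights of past input features when it predicts the enhanced frame, whereas the attention mechanism computes the correlation between each past frame and the current frame to be enhanced, and explicitly assigns weights to the past frames.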
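The explicit weighting in point 3 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the paper's model: the score is a plain dot product (the paper's scoring function is learned), the current frame is included among the attended frames, and the name `attend_to_past` and the random features are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend_to_past(frames, t):
    """Weight frames[0..t] by their correlation with frames[t].

    frames: array of shape (num_frames, feature_dim).
    Returns (weights, context). Assumption: dot-product scoring.
    """
    query = frames[t]            # the frame to be enhanced
    keys = frames[: t + 1]       # causal: only current and past frames
    scores = keys @ query        # correlation of each frame with the query
    weights = softmax(scores)    # explicit weights that sum to 1
    context = weights @ keys     # weighted summary of the history
    return weights, context

rng = np.random.default_rng(0)
frames = rng.standard_normal((6, 4))   # hypothetical feature frames
weights, context = attend_to_past(frames, t=5)
```

In a full system the context vector would be passed on to the generator together with the current frame; here it only shows where the explicit per-frame weights come from.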
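Algorithm categories:
1. Statistical algorithms: spectral subtraction, Wiener filtering, minimum mean-square error log-spectral amplitude (MMSE-LSA) estimation, etc.
2. Data-driven algorithms: non-negative matrix factorization (NMF), neural networks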
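As a concrete instance of the statistical family, here is a minimal sketch of power spectral subtraction. It assumes the magnitude spectrogram and a per-bin noise magnitude estimate are already available; the function name, the `floor` parameter, and the toy numbers are illustrative choices, not taken from any of the papers.

```python
import numpy as np

def spectral_subtraction(noisy_mag, noise_mag, floor=0.01):
    """Power spectral subtraction with a spectral floor.

    noisy_mag: magnitude spectrogram, shape (num_frames, num_bins).
    noise_mag: estimated noise magnitude per bin, shape (num_bins,).
    floor: fraction of the noisy power kept as a lower bound (assumption).
    """
    noisy_pow = noisy_mag ** 2
    clean_pow = noisy_pow - noise_mag ** 2          # subtract the noise power
    clean_pow = np.maximum(clean_pow, floor * noisy_pow)  # avoid negative power
    return np.sqrt(clean_pow)

noisy_mag = np.array([[3.0, 1.0], [5.0, 0.5]])  # toy data: 2 frames x 2 bins
noise_mag = np.array([1.0, 1.0])
enhanced = spectral_subtraction(noisy_mag, noise_mag)
```

The enhanced magnitude is usually combined with the phase of the noisy signal before the inverse transform.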
Network structure: an encoder (stacked or expanded), an attention mechanism, and a generator.
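Experimental results:
1. The attention-based LSTM consistently outperforms both the optimally modified log-spectral amplitude (OM-LSA) estimator and the plain LSTM; the stacked encoder is slightly better than the expanded encoder.
2. The causal local attention model performs comparably to, or even better than, the causal dynamic attention model, which indicates that speech enhancement does not need to consider a very long history.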
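The difference between the two attention variants in result 2 comes down to which past frames a given frame may attend to. The sketch below builds both masks under my reading of the terms: "dynamic" attends to all frames up to the current one, and "local" only to the most recent `window` frames. The function name and the window size are hypothetical.

```python
import numpy as np

def causal_mask(num_frames, window=None):
    """Boolean mask: entry [t, k] is True if frame t may attend to frame k.

    window=None -> causal dynamic attention (all frames k <= t).
    window=w    -> causal local attention (only the last w frames, k > t - w).
    """
    t = np.arange(num_frames)[:, None]
    k = np.arange(num_frames)[None, :]
    mask = k <= t                      # causal: never look at future frames
    if window is not None:
        mask &= (t - k) < window       # local: drop frames that are too old
    return mask

dynamic = causal_mask(5)            # lower-triangular mask
local = causal_mask(5, window=2)    # banded lower-triangular mask
```

With 5 frames the dynamic mask allows 15 frame pairs while the local mask with a window of 2 allows only 9, so the local variant has less history to score per frame.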

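Paper 2: Student-teacher learning for BLSTM-based speech enhancement
Core contribution: this paper proposes a student-teacher learning scheme for single-channel speech enhancement.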

