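Study Notes on Background Knowledge for Classic Echo Cancellation
Preface
Why keep coming back to echo cancellation again and again? The classic line "Because it's there." describes it best.
The first people to stand on the summit of the 8848.43 m Mount Everest may not have been the New Zealand mountaineer Edmund Hillary and the Nepalese Sherpa Tenzing Norgay, whose ascent on 29 May 1953 is the first on record, but the British mountaineer George Mallory, who died on 8 June 1924 during his second summit attempt. When a reporter asked him why he wanted to climb Mount Everest, Mallory answered "Because it's there!", a reply that still commands admiration. (from Baidu Baike)
Echo cancellation, or echo suppression, is the Everest of classic speech processing. Every attempt at the climb feels different and yields something new. These notes record the scattered background knowledge I have collected, in an attempt to build a body of knowledge, so that you are not at a loss when facing the various kinds of echo and can work with a clear target when you need to study the related terms or methods. This is a survey-style set of notes, so the concepts are fairly scattered.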
ERLE
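ERL is a term that stands for echo return loss. So what does adding "enhancement" mean? With the help of a search engine, I found that the article "10.ERLE,PESQ 回声消除评价指标" gives a formula. At first I thought this formula would be better named ERL. Later I read the English article echo-cancellation-part-1-the-basics-and-acoustic-echo-cancellation and finally understood the point: the formula applies to both quantities, and the difference lies in which node of the echo loop you observe.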
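The "same formula, different node" idea can be sketched in a few lines. This is a minimal illustration, not taken from either article: the function names, the pure-gain echo path, and the 10% residual are assumptions made only for the example. ERL compares the far-end signal with the echo picked up at the microphone, while ERLE compares that echo with what remains after the canceller.

```python
import numpy as np

def power_ratio_db(reference, observed, eps=1e-12):
    """Return 10*log10(mean power of reference / mean power of observed)."""
    p_ref = np.mean(np.square(reference))
    p_obs = np.mean(np.square(observed))
    return 10.0 * np.log10((p_ref + eps) / (p_obs + eps))

def erl_db(far_end, mic_echo):
    """Echo return loss: far-end signal versus the echo at the microphone."""
    return power_ratio_db(far_end, mic_echo)

def erle_db(mic_echo, residual):
    """ERL enhancement: echo at the microphone versus the canceller output."""
    return power_ratio_db(mic_echo, residual)

# Hypothetical signals: the echo path is modeled as a pure gain of 0.5,
# and the canceller is assumed to leave 10% of the echo amplitude.
rng = np.random.default_rng(0)
far_end = rng.standard_normal(16000)
mic_echo = 0.5 * far_end
residual = 0.1 * mic_echo

print(f"ERL  = {erl_db(far_end, mic_echo):.2f} dB")    # about 6.02 dB
print(f"ERLE = {erle_db(mic_echo, residual):.2f} dB")  # about 20.00 dB
```

Both numbers come from the same power-ratio helper; only the pair of signals fed into it changes.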
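As the echo-cancellation-part-1 article notes, the ITU's specifications require that linear echo be suppressed whenever the ERL is greater than 6 dB. Another discussion of the topic is pasted below as is, in the original English.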
Estimating Echo Return Loss and Echo Return Loss Enhancement
Acoustic echo cancellation (AEC) is a signal processing technique
that is used to achieve echo-free full-duplex communication in a
telecommunications system that has acoustic coupling between
the loudspeaker and microphone. The difficulty with AEC over line
echo cancellation (LEC) is the variability not only in the echo path,
but also in the implementation. For LEC systems, the coupling
resulting from the hybrid is relatively steady between implementations.
Whereas for AEC systems, the coupling between the loudspeaker and
microphone can vary significantly depending on the design of the loudspeaker
and microphone enclosure as well as the acoustics of the room in which
the device is deployed. Therefore, in order to achieve a ubiquitous solution
for an acoustic system, intelligent control of the adaptive filter is required
for the echo canceller as well as the post-filter.
In Variable Stepsize and Regularization Parameters for NLMS, it was shown that
the performance of the echo canceller can be improved with variable step-size control.
Optimum control of the step-size parameter is based on the convergence state of the
canceller. In Post Filtering for Residual Echo Control it was concluded that a post-filter
can be designed to reduce the residual echo from the linear adaptive filter. Optimum
control of the post-filter is based on the estimate of this residual, which in turn is
based on the convergence state of the canceller. In addition, systems which employ
the two-path method require an estimate of the convergence of the foreground and
background filters to decide which filter set is in the most beneficial state. From the
three examples above, it is clear that the ability to obtain a quick and accurate
estimate of convergence of the acoustic echo canceller is crucial to the performance of the entire system.
To obtain an estimate of convergence or the Echo Return Loss Enhancement (ERLE),
one must first estimate the coupling factor or the Echo Return Loss (ERL) of the
loudspeaker-microphone enclosure. An estimate of the ERL is required to determine
how much attenuation can be attributed to the echo path and how much can be attributed
to the echo canceller. The coupling factor determines the attenuation or possible gain in the path.
There are two main approaches to estimating the coupling factor of an echo canceller. The first
method is amplitude based while the second is cross-spectrum based. The amplitude based
method to estimate ERL is the average spectral energy of the near-end signal over the average
spectral energy of the far-end signal. This approach should only be updated during periods of
known far-end signal energy and should not be updated during periods of double-talk. In the
cross-spectrum based method, the far-end and near-end spectrum signals are multiplied and
summed over a long period of frames. Then it is normalized by the far-end signal energy.
This method is unaffected by double-talk of the near-end speaker and far-end speaker
as long as they are uncorrelated. The downside to this method is the echo path changes
are not followed accurately due to the long averaging period. However using a combination
of the two methods will allow for quick and accurate estimation of the ERL, and hence,
proper control of the entire echo cancellation system.
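The two ERL estimators described in the quoted passage can be sketched as follows. This is a rough illustration under stated assumptions, not VOCAL's implementation: the function names, the non-overlapping Hann-windowed framing, and the pure-gain echo path are all hypothetical choices made for the example.

```python
import numpy as np

def frame_spectra(signal, frame_len=256):
    """Split a signal into non-overlapping Hann-windowed frames and FFT them."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.fft.rfft(frames * np.hanning(frame_len), axis=1)

def erl_amplitude_db(far_spec, near_spec, eps=1e-12):
    """Amplitude based: average near-end energy over average far-end energy."""
    coupling = np.sum(np.abs(near_spec) ** 2) / (np.sum(np.abs(far_spec) ** 2) + eps)
    return -10.0 * np.log10(coupling + eps)

def erl_cross_spectrum_db(far_spec, near_spec, eps=1e-12):
    """Cross-spectrum based: far/near cross spectrum summed over frames,
    normalized by the far-end energy, then averaged over frequency bins."""
    cross = np.sum(np.conj(far_spec) * near_spec, axis=0)
    far_energy = np.sum(np.abs(far_spec) ** 2, axis=0)
    gain = cross / (far_energy + eps)  # per-bin coupling estimate
    coupling = np.sum(np.abs(gain) ** 2 * far_energy) / (np.sum(far_energy) + eps)
    return -10.0 * np.log10(coupling + eps)

# Hypothetical test signals: pure-gain echo path (0.5), uncorrelated near talker.
rng = np.random.default_rng(0)
far = rng.standard_normal(64000)
echo = 0.5 * far
near_talk = rng.standard_normal(64000)

far_spec = frame_spectra(far)
erl_amp_echo_only = erl_amplitude_db(far_spec, frame_spectra(echo))
erl_cross_echo_only = erl_cross_spectrum_db(far_spec, frame_spectra(echo))
erl_amp_double_talk = erl_amplitude_db(far_spec, frame_spectra(echo + near_talk))
erl_cross_double_talk = erl_cross_spectrum_db(far_spec, frame_spectra(echo + near_talk))

print(erl_amp_echo_only, erl_cross_echo_only)
print(erl_amp_double_talk, erl_cross_double_talk)
```

With echo only, both estimators should land near the true 6 dB. During double talk the amplitude-based figure is pulled down by the near-end speech, while the cross-spectrum figure stays close to 6 dB because the near talker is uncorrelated with the far end. The price is the long averaging window, which is why the passage suggests combining the two.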
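Linear echo and nonlinear echo
Linear echo and nonlinear echo are two terms that come up constantly in the echo cancellation field. The article "非线性声学回声消除技术" gives a very clear overview of how the two relate and at which stage each is introduced. You can also reason from effect back to cause: whatever the adaptive filter can cancel can be regarded as linear echo, and the part left over is generally called nonlinear echo. A lazy definition, admittedly...
Enclosure effects and echo delay
This topic is more about practice, but good experience and good design definitely help echo suppression, and there is still a lot to accumulate here.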
Time-domain adaptive filtering
I took some notes on this in "ANC 与 adaptive filter". I have not yet studied the more complex algorithms.
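For orientation, a time-domain adaptive echo canceller can be sketched with the NLMS algorithm mentioned in the quoted passage above. This is a minimal textbook version, not code from the notes referenced here: the function name, tap count, step size, and the three-tap echo path are assumptions chosen for the example.

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, num_taps=16, step=0.5, eps=1e-6):
    """Sample-by-sample NLMS: returns the residual signal and the final weights."""
    weights = np.zeros(num_taps)
    history = np.zeros(num_taps)  # most recent far-end samples, newest first
    error = np.zeros(len(mic))
    for n in range(len(mic)):
        history = np.roll(history, 1)
        history[0] = far_end[n]
        echo_estimate = weights @ history
        error[n] = mic[n] - echo_estimate
        # Normalized update: the step is scaled by the input energy in the window.
        weights += step * error[n] * history / (history @ history + eps)
    return error, weights

# Hypothetical echo path and noise-free microphone signal (echo only).
rng = np.random.default_rng(0)
far_end = rng.standard_normal(4000)
echo_path = np.array([0.5, -0.3, 0.1])
mic = np.convolve(far_end, echo_path)[: len(far_end)]

residual, weights = nlms_echo_canceller(far_end, mic)
print(np.round(weights[:3], 3))  # should approach the echo path taps
```

In a real system the update would be frozen or slowed during double talk, which is the role of the step-size control and double-talk detection discussed elsewhere in these notes.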
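Frequency-domain adaptive filtering
The algorithm used in webrtc's aec, described in "WebRtc AEC核心算法之一:频域自适应滤波", is said to be the most computationally efficient scheme. I may simply be ill-informed, but I really have not come across a better one.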
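To make the idea concrete, here is a sketch of a single-partition, constrained frequency-domain LMS filter using overlap-save. It is a textbook form of the technique, not the WebRTC code (which partitions the filter into several blocks); the function name, block length, step size, and smoothing factor are assumptions for the example.

```python
import numpy as np

def fdaf_echo_canceller(far_end, mic, block_len=64, step=0.5, forget=0.9, eps=1e-8):
    """Constrained overlap-save frequency-domain LMS with per-bin normalization."""
    n_blocks = min(len(far_end), len(mic)) // block_len
    W = np.zeros(2 * block_len, dtype=complex)  # filter in the frequency domain
    power = np.zeros(2 * block_len)             # smoothed far-end power per bin
    prev = np.zeros(block_len)                  # previous far-end block
    error = np.zeros(n_blocks * block_len)
    for b in range(n_blocks):
        sl = slice(b * block_len, (b + 1) * block_len)
        cur = far_end[sl]
        X = np.fft.fft(np.concatenate([prev, cur]))
        # Overlap-save: only the last block_len outputs are linear convolution.
        echo_estimate = np.fft.ifft(X * W).real[block_len:]
        e = mic[sl] - echo_estimate
        error[sl] = e
        E = np.fft.fft(np.concatenate([np.zeros(block_len), e]))
        inst = np.abs(X) ** 2
        power = inst if b == 0 else forget * power + (1.0 - forget) * inst
        grad = np.fft.ifft(np.conj(X) * E / (power + eps)).real
        grad[block_len:] = 0.0                  # gradient constraint
        W += step * np.fft.fft(grad)
        prev = cur
    return error, W

# Hypothetical echo path, shorter than one block, and an echo-only microphone signal.
rng = np.random.default_rng(0)
far_end = rng.standard_normal(12800)
echo_path = np.array([0.5, -0.3, 0.1])
mic = np.convolve(far_end, echo_path)[: len(far_end)]

residual, W = fdaf_echo_canceller(far_end, mic)
taps = np.fft.ifft(W).real[:64]
erle = 10.0 * np.log10(np.mean(mic[-3200:] ** 2) / (np.mean(residual[-3200:] ** 2) + 1e-20))
print(np.round(taps[:3], 3), erle)
```

The efficiency comes from replacing one convolution and one correlation per sample with a handful of FFTs per block. The per-bin normalization also helps with the colored spectrum of speech.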
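Dual-coupled filter
The article "非线性声学回声消除技术" mentions this method. speex also seems to use a two-filter scheme, but not in the dual-coupled form.
Double-talk detection
Double-talk scenarios are the hard part of echo cancellation: handled well they are a highlight, handled badly they are a disaster. Double Talk Detection (DTD) is the first hurdle in handling double talk. Far-end and near-end activity divide the echo cancellation scenarios into the following four quadrants: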
Scenario | near noise | near talk |
---|---|---|
far noise | Double noise: no filtering | FN-NT: no filtering |
far talk | FT-NN: adaptive filter updated | Double talk: filter applied with coefficient adaptation frozen |
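The article Double-Talk Detection in Echo Cancellation summarizes two classic kinds of DTD method: energy-comparison methods, exemplified by the Geigel Algorithm, and vector-comparison methods based on cross correlation. A third approach uses the dual-coupled filter: it tracks the situations in which the filter diverges and infers from them that double talk has occurred.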
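The Geigel test is simple enough to sketch directly: double talk is declared when the microphone sample exceeds a fraction of the largest recent far-end sample. The version below is a minimal illustration; the function name, window length, hangover length, and test signals are assumptions for the example. The threshold of 0.5 corresponds to assuming an ERL of at least 6 dB.

```python
import numpy as np

def geigel_dtd(far_end, mic, window=64, threshold=0.5, hangover=32):
    """Return a boolean array marking samples flagged as double talk."""
    flags = np.zeros(len(mic), dtype=bool)
    hold = 0
    for n in range(len(mic)):
        start = max(0, n - window + 1)
        far_peak = np.max(np.abs(far_end[start : n + 1]))
        if abs(mic[n]) > threshold * far_peak:
            flags[n] = True
            hold = hangover  # keep the decision for a while after a hit
        elif hold > 0:
            flags[n] = True
            hold -= 1
    return flags

# Hypothetical scenario: echo at -12 dB with a 10-sample delay,
# near-end talker active only in the second half.
rng = np.random.default_rng(0)
far_end = rng.standard_normal(8000)
echo = 0.25 * np.concatenate([np.zeros(10), far_end[:-10]])
near_talk = np.concatenate([np.zeros(4000), rng.standard_normal(4000)])
mic = echo + near_talk

flags = geigel_dtd(far_end, mic)
print(flags[:4000].mean(), flags[4000:].mean())
```

In terms of the four quadrants, the adaptive filter would be updated only while the far end is active and the flag is false. While the flag is true, the filter keeps running with its coefficients frozen.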
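Echo cancellation in the open-source world
The well-known webrtc and speex give many practitioners easy access to the core algorithms, but these are still not easy to understand. Below, some of the flows used in aecm are drawn as block diagrams to help you get oriented quickly when reading the code.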
AECM
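Relationship diagram of the main function calls:
Relationship diagram of buffer management:
The block diagram of the aecm energy calculation is as follows:
There are many analyses of this algorithm online. I recommend the two articles "WEBRTC-AECM算法浅析" and "LearningWebRTC: AECM". However, judging from comparative experiments and analysis, this algorithm may have some problems and leaves room for improvement.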
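WebRTC_AEC and speex
These two algorithms are probably the most compared and most ported pieces of code of their kind on the internet. They are discussed together because they share many technical characteristics. WebRTC_AEC comes from "On the implementation of a partitioned block frequency domain adaptive filter (PBFDAF) for long acoustic echo cancellation" by Jose M. Paez Borrallo et al., PBFDAF for short. speex comes from "Multidelay block frequency domain adaptive filter" by J.-S. Soo and K.K. Pang. Both therefore belong to the block frequency domain adaptive filter family.
By my own non-authoritative research, the roots are the following two papers. In 1978, Dentino et al. proposed the frequency-domain filter in "Adaptive filtering in the frequency domain". Shortly afterwards, in 1980, Ferrara's "Fast implementations of LMS adaptive filters" proposed a frequency-domain least-mean-square method that let the frequency-domain Wiener solution converge quickly.
A few years ago I forced my way through webrtc's aec and took notes, but I did not have the courage to read through speex. The article "【论文笔记之 MDF】Multidelay Block Frequency Domain Adaptive Filter" is written with real depth, and studying it together with the paper and the code should be very rewarding. "Speex 一个双声道回声消除的小demo" is a hands-on tutorial as well. Read it alongside "[简话语音识别] 语音前端信号处理——回声消除算法" for a faster route into practice.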
Webrtc_AEC3
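The newest and reportedly the best. I have not worked with it. From the articles I have read, it seems to have moved up to Kalman filtering. My first recommendation is "WEBRTC AEC3算法原理", though for now I have no motivation to study it.
Applications of the Kalman filter
"使用卡尔曼滤波器进行回声消除" provides a demo that looks very attractive.
References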
G.160 : Voice enhancement devices
Perceptual Echo Control and Delay Estimation
G.168 : Digital network echo cancellers
Transmission systems and media, digital systems and networks
Telephone transmission quality, telephone installations, local line networks
WEBRTC-AECM算法浅析
10.ERLE,PESQ 回声消除评价指标
非线性声学回声消除技术
针对回声消除应用的自适应算法评价标准研究
AEC声学回音消除技术
VOCAL Technologies > Echo Cancellation
Estimating Echo Return Loss and Echo Return Loss Enhancement
回声消除器及其测试方法的研究
EG 202 396-1 Background noise database
echo-cancellation-part-1-the-basics-and-acoustic-echo-cancellation
Q-Sys Acoustic Echo Cancellation (AEC)
硬货专栏 |深入浅出 WebRTC AEC(声学回声消除)
webrtc aec3效果对比aec与aecm
LearningWebRTC: AECM
信号处理-回声消除(AEC)的原理简述
realloc函数用法解释
WebRTC 的回声抵消(AEC、AECM)算法简介
“不对称波形”和全通滤波器