Summary of Dereverberation Techniques
Zhang Fan
Reverberation is caused by multipath propagation of sound waves in an enclosure. The received reverberant signal contains three components: the direct-path signal, early reflections, and late reflections. The two kinds of reflections affect perception differently. Early reflections produce coloration and, because they arrive shortly after the direct-path signal, can reinforce its perception. Late reflections, however, can be distinguished from the direct-path signal by the human auditory system and are heard as an “echo”. This late component is mostly undesirable from a speech intelligibility perspective, and it severely degrades the performance of ASR systems.
The reverberant signal can be modeled as the convolution of the clean signal with an acoustic impulse response (AIR), represented as an FIR filter. Dereverberation methods aim to recover the clean signal from the reverberant signals when, as in practice, the AIRs are unknown. Here we assume a SISO/SIMO framework: there is a single desired source, no competing sources, and one or multiple outputs.
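As a minimal sketch of this signal model, the Python snippet below builds a toy AIR and convolves it with a stand-in source. The AIR is an assumption for illustration (a unit direct path followed by exponentially decaying Gaussian noise, not a measured room response). The function names, the 50 ms early/late boundary, and the 0.5 s reverberation time are hypothetical choices, not values taken from the cited works.

```python
import numpy as np

def synthetic_air(fs=16000, t60=0.5, length_s=0.4, seed=0):
    """Toy AIR (assumption): unit direct path plus an exponentially decaying noise tail."""
    rng = np.random.default_rng(seed)
    n = int(fs * length_s)
    t = np.arange(n) / fs
    decay = 3.0 * np.log(10.0) / t60      # amplitude decay rate: energy falls 60 dB at t60
    h = 0.1 * rng.standard_normal(n) * np.exp(-decay * t)
    h[0] = 1.0                            # direct path
    return h

def split_air(h, fs=16000, early_ms=50.0):
    """Split an AIR into (direct + early) and late parts at a hypothetical boundary."""
    k = int(fs * early_ms / 1000.0)
    early = np.zeros_like(h)
    early[:k] = h[:k]
    late = h - early
    return early, late

fs = 16000
rng = np.random.default_rng(1)
s = rng.standard_normal(fs)               # 1 s of noise standing in for clean speech
h = synthetic_air(fs)
x = np.convolve(s, h)                     # reverberant observation: x = h * s

early, late = split_air(h, fs)
x_early = np.convolve(s, early)           # direct path and early reflections
x_late = np.convolve(s, late)             # late reverberation
```

Because convolution is linear, `x` equals `x_early + x_late`, which is the decomposition that the suppression-based methods below rely on.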
The most straightforward dereverberation method consists of two stages: blind channel identification and channel equalization. In the first stage, blind channel identification is usually carried out in a SIMO framework, where the AIRs are blindly estimated from multiple outputs[1][2][3]. Blind identification is not possible if the channels share common zeros, and its performance degrades in the presence of near-common zeros, which are not unusual in RTFs, especially for long reverberation times. A forced spectral diversity algorithm[5] employs undermodeling in combination with a spectral shaping filter to reduce the effect of near-common zeros. In the second stage, channel equalizers are designed to remove the convolutive effect of the channels from the microphone signals. MINT[1][3] can achieve perfect dereverberation given exact channel identification, provided certain conditions are satisfied. In practice, however, channel identification errors and additive noise are unavoidable, so equalizers need to be robust to these errors. Many approaches have been proposed to address this problem, such as weighted LS[4][12], regularized MINT[7], truncated MINT[7], relaxed multichannel LS (RMCLS)[7][10][13], channel shortening, and partial MINT (P-MINT)[7]. To reduce computational complexity, adaptive methods[6][11][14] and subband versions[8][10][15][16] have been proposed.
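The equalization stage can be illustrated with a small MINT-style sketch, assuming the channels are known exactly and share no common zeros. The two 4-tap channels and the helper names are made up for illustration. The optional `reg` term is a simple Tikhonov-style stand-in for the regularized variants mentioned above, not the exact formulation of [7].

```python
import numpy as np

def conv_matrix(h, n_cols):
    """Convolution matrix C such that C @ g equals np.convolve(h, g)."""
    C = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        C[j:j + len(h), j] = h
    return C

def mint(channels, filt_len, delay=0, reg=0.0):
    """Least-squares inverse filters g_m so that sum_m h_m * g_m approximates a delayed impulse."""
    H = np.hstack([conv_matrix(h, filt_len) for h in channels])
    d = np.zeros(H.shape[0])
    d[delay] = 1.0                        # target equalized impulse response
    if reg > 0.0:
        # Regularized normal equations (illustrative robustness term).
        g = np.linalg.solve(H.T @ H + reg * np.eye(H.shape[1]), H.T @ d)
    else:
        g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return np.split(g, len(channels))

# Toy two-channel system (hypothetical coefficients with no common zeros).
h1 = np.array([1.0, 0.5, 0.25, 0.1])
h2 = np.array([1.0, -0.4, 0.3, -0.2])

# For two channels of length L, filters of length L - 1 give a square system.
g1, g2 = mint([h1, h2], filt_len=len(h1) - 1)

# Equalized impulse response: ideally a unit impulse.
eir = np.convolve(h1, g1) + np.convolve(h2, g2)
```

With exact channels the equalized response `eir` is a unit impulse up to numerical precision. Replacing `h1` and `h2` by perturbed estimates when designing the filters is an easy way to observe the sensitivity to identification errors that motivates the robust variants.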
Spectral suppression methods are based on the assumption that early reverberation is uncorrelated with late reverberation, so that spectral subtraction[17][23] and the Wiener estimator[8], originally used to attenuate the noise component in noisy speech, can be used to attenuate late reverberation. Just as the noise PSD is the key parameter in speech enhancement, the key problem here is estimating the late reverberation PSD[17][24][28][29][30][31][34][35][36][38][39]. Spectral suppression based methods are robust to additive noise; however, only late reverberation is suppressed, and distortion is introduced at the same time[17]. These methods are therefore often combined with inverse filtering[18][21] or microphone arrays[20][25][27][26][36][37], serving as the second stage. When reverberation is modeled in the STFT domain as the convolution of the clean speech STFT coefficients with a convolutive transfer function (CTF), many well-known speech spectral estimators, such as the MAP estimator[26][33][39], can be applied.
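One way to picture the suppression step is the sketch below. It assumes an exponential-decay (statistical) room model, so that the late reverberant PSD is a delayed, attenuated copy of the reverberant-signal PSD, and it assumes the reverberation time `t60` is known. These modeling choices, the smoothing constant, and the gain floor are illustrative assumptions, not the estimator of any specific reference.

```python
import numpy as np

def late_reverb_gain(power, t60, hop_s, late_s=0.05, alpha=0.8, g_min=0.1, eps=1e-12):
    """Spectral-subtraction gains from a (frames, bins) power spectrogram |X(n, k)|^2.

    Assumption: late PSD(n) = exp(-2 * delta * T_late) * PSD_x(n - N_late),
    with delta = 3 ln(10) / t60 (exponential energy decay).
    """
    n_frames = power.shape[0]
    # Recursive smoothing as a rough PSD estimate of the reverberant signal.
    psd_x = np.empty_like(power, dtype=float)
    psd_x[0] = power[0]
    for n in range(1, n_frames):
        psd_x[n] = alpha * psd_x[n - 1] + (1.0 - alpha) * power[n]

    n_late = max(1, int(round(late_s / hop_s)))   # frames before "late" reverberation starts
    delta = 3.0 * np.log(10.0) / t60
    psd_late = np.zeros_like(psd_x)
    psd_late[n_late:] = np.exp(-2.0 * delta * n_late * hop_s) * psd_x[:-n_late]

    # Power spectral subtraction with a gain floor to limit distortion.
    gain_sq = 1.0 - psd_late / np.maximum(psd_x, eps)
    return np.sqrt(np.clip(gain_sq, g_min ** 2, 1.0))

# Usage on a stand-in power spectrogram (random data, not real speech).
power = np.abs(np.random.default_rng(0).standard_normal((100, 257))) ** 2
gains = late_reverb_gain(power, t60=0.6, hop_s=0.016)
```

The gains would be applied to the STFT coefficients of the reverberant signal before resynthesis. The floor `g_min` trades residual reverberation against the distortion noted above.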
Linear prediction based methods fall roughly into two kinds: LPC residual based and multi-channel linear prediction (MCLP) based. The LPC residual of the reverberant signal can be modeled as the convolution of the LPC residual of the clean signal with the channel impulse response. To de-emphasize the effect of the AIR on the residual signal, one idea is to perform a coarse channel estimation and then form a matched filter that is applied to the residual of the reverberant signal[40]. Another idea is to find a criterion, such as kurtosis, that distinguishes the two residual signals, so that adaptive filtering can be applied[41][47][48]. MCLP based methods, first formulated in the time domain[42][43][44][46][53], show that with long prediction filters the effect of reverberation on the speech residual can be fully removed. The extended form (which we call WPE)[45][49][50][66][65][55] is developed in the STFT domain, which remarkably reduces the computational complexity while retaining effective dereverberation performance. WPE has two sets of parameters to estimate: the prediction filter coefficients and the speech power spectrum. This may increase speech distortion, especially when the observation is short. To overcome this problem, [51][56][57][59][61][58][63][67] introduce constraints such as estimated speech log-spectral priors, sparsity, low-rank approximation of the speech, and estimated late reverberation power. To improve robustness to noise, WPE can be combined with beamforming, which performs better[62][64].
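The alternating estimation in WPE can be sketched for a single frequency bin as follows. This is a simplified illustration, not the implementation of any cited paper: it assumes the speech power is approximated by the instantaneous power of the current estimate, uses a small diagonal loading for numerical safety, and assumes there are more frames than `delay + taps`. The toy reverberation generator `ar_reverb` and all parameter values are hypothetical.

```python
import numpy as np

def wpe_bin(Y, taps=10, delay=3, iters=3, eps=1e-8):
    """Y: (channels, frames) complex STFT of one bin. Returns the estimate for channel 0."""
    D, T = Y.shape
    # Stack delayed observations: block k holds Y delayed by (delay + k) frames.
    Yt = np.zeros((D * taps, T), dtype=complex)
    for k in range(taps):
        shift = delay + k
        Yt[k * D:(k + 1) * D, shift:] = Y[:, :T - shift]
    d = Y[0].astype(complex)
    for _ in range(iters):
        lam = np.maximum(np.abs(d) ** 2, eps)     # speech power estimate per frame
        Yw = Yt / lam                             # weight each frame by 1 / lam
        R = Yw @ Yt.conj().T                      # weighted correlation matrix
        p = Yw @ Y[0].conj()                      # weighted cross-correlation vector
        g = np.linalg.solve(R + eps * np.eye(D * taps), p)
        d = Y[0] - g.conj() @ Yt                  # subtract the predicted late reverberation
    return d

def ar_reverb(s, coeffs, delay):
    """Toy per-bin reverberation: y[t] = s[t] + sum_k coeffs[k] * y[t - delay - k]."""
    y = np.array(s, dtype=complex)
    for t in range(len(y)):
        for k, c in enumerate(coeffs):
            if t - delay - k >= 0:
                y[t] += c * y[t - delay - k]
    return y

# Usage on a synthetic bursty source standing in for one bin of a speech STFT.
rng = np.random.default_rng(0)
T = 400
envelope = np.where((np.arange(T) // 20) % 2 == 0, 1.0, 0.01)
s = envelope * (rng.standard_normal(T) + 1j * rng.standard_normal(T))
y = ar_reverb(s, coeffs=[0.5, 0.3], delay=3)
d = wpe_bin(y[None, :], taps=2, delay=3)
```

The prediction delay keeps the direct path and early reflections out of the predicted component, so only the later part of the reverberation is subtracted. In a full system the same procedure would run independently in every frequency bin.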
Other dereverberation methods, such as cepstral processing[68][69][70][71], subband envelope estimation[72][73][74], spherical microphone array processing[76][77], harmonic structure based methods[78][79][80][81], and DNN based methods[84][85][86][87][88], still have their place.
References:
- 2005 - A blind channel identification-based two-stage approach to separation and dereverberation of speech signals in a reverberant environment
- 2005 - Blind dereverberation based on estimates of signal transmission channels without precise information on channel order speech processing applications
- 2006 - Speech Acquisition and Enhancement in a Reverberant, Cocktail-Party-Like Environment
- 2010 - A System-Identification-Error-Robust Method for equalization of multichannel acoustic systems
- 2011 - A Forced Spectral Diversity Algorithm for Speech Dereverberation in the Presence of Near-Common Zeros
- 2011 - Equalization of multichannel acoustic system using sub-systems for speech dereverberation
- 2012 - Robust partial multichannel equalization techniques for speech dereverberation
- 2013 - Computationally efficient single channel dereverberation based on complementary Wiener filter
- 2013 - Optimized Speech Dereverberation From Probabilistic Perspective for Time Varying Acoustic Transfer Function
- 2013 - Robust low-complexity multichannel equalization for dereverberation
- 2014 - Adaptive multichannel equalization applied to room acoustics exploiting the sparsity of target response
- 2014 - Joint dereverberation and noise reduction based on acoustic multichannel equalization
- 2014 - Robust Multichannel Dereverberation using Relaxed Multichannel Least Squares
- 2016 - An iterative method for equalization of multichannel acoustic systems robust to system identification errors
- 2016 - Robust sparsity-promoting acoustic multi-channel equalization for speech dereverberation
- 2018 - Multichannel Identification and Nonnegative Equalization for Dereverberation and Noise Reduction Based on Convolutive Transfer Function
- 2005 - Multi-channel speech dereverberation based on a statistical model of late reverberation
- 2006 - A two-stage algorithm for one-microphone reverberant speech enhancement
- 2006 - Spectral Subtraction Steered by Multi-Step Forward Linear Prediction For Single Channel Speech Dereverberation
- 2007 - Dual-Microphone Speech Dereverberation using a Reference Signal
- 2007 - Robust Speech Dereverberation Using Multichannel Blind Deconvolution With Spectral Subtraction
- 2010 - A blind subband-based dereverberation algorithm
- 2013 - A new cascaded spectral subtraction approach for binaural speech dereverberation and its application in source separation
- 2013 - Blind reverberation time estimation by intrinsic modeling of reverberant speech
- 2014 - Unbiased coherent-to-diffuse ratio estimation for dereverberation
- 2015 - A Bayesian approach to spatial filtering and diffuse power estimation for joint dereverberation and noise reduction
- 2015 - Coherent-to-Diffuse Power Ratio Estimation for Dereverberation
- 2015 - Multi-channel PSD estimators for speech dereverberation - A theoretical and experimental comparison
- 2016 - Late reverberation PSD estimation for single-channel dereverberation using relative convolutive transfer functions
- 2016 - Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise
- 2017 - Cramér-Rao Bound Analysis of Reverberation Level Estimators for Dereverberation and Noise Reduction
- 2017 - Joint Denoising and Dereverberation Using Exemplar-Based Sparse Representations and Decaying Norm Constraint
- 2017 - Single-Channel Online Enhancement of Speech Corrupted by Reverberation and Noise
- 2018 - Analysis of Eigenvalue Decomposition-Based Late Reverberation Power Spectral Density Estimation
- 2018 - Evaluation and Comparison of Late Reverberation Power Spectral Density Estimators
- 2018 - Joint Late Reverberation and Noise Power Spectral Density Estimation in a Spatially Homogeneous Noise Field
- 1998 - Analysis of noise reduction and dereverberation techniques based on microphone arrays with postfiltering
- 2016 - Estimation of Room Acoustic Parameters: The ACE Challenge
- 2018 - Blind Single-Channel Dereverberation Using a Recursive Maximum-Sparseness-Power-Prediction-Model
- 2001 - Microphone array speech dereverberation using coarse channel modeling
- 2001 - Speech dereverberation via maximum-kurtosis subband adaptive filtering
- 2006 - On the Use of Lime Dereverberation Algorithm in an Acoustic Environment With a Noise Source
- 2007 - Dereverberation and Denoising Using Multichannel Linear Prediction
- 2007 - Precise Dereverberation Using Multichannel Linear Prediction
- 2008 - Blind speech dereverberation with multi-channel linear prediction based on short time fourier transform representation
- 2008 - Speech Dereverberation Based on Maximum-Likelihood Estimation With Time-Varying Gaussian Source Model
- 2008 - Temporal selective dereverberation of noisy speech using one microphone
- 2009 - Enhancement of reverberant speech using the CELP postfilter
- 2009 - Integrated Speech Enhancement Method Using Noise Suppression and Dereverberation
- 2011 - Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization
- 2012 - Introduction of speech log-spectral priors into dereverberation based on Itakura-Saito distance minimization
- 2013 - Optimized Speech Dereverberation From Probabilistic Perspective for Time Varying Acoustic Transfer Function
- 2014 - Online Speech Dereverberation Algorithm Based on Adaptive Multichannel Linear Prediction
- 2014 - Single channel reverberation suppression based on sparse linear prediction
- 2014 - Speech dereverberation using weighted prediction error with Laplacian model of the desired signal
- 2015 - Multi-channel linear prediction-based speech dereverberation with low-rank power spectrogram approximation
- 2015 - Multi-Channel Linear Prediction-Based Speech Dereverberation With Sparse Priors
- 2016 - Partitioned block frequency domain Kalman filter for multi-channel linear prediction based blind speech dereverberation
- 2016 - Constrained multi-channel linear prediction for adaptive speech dereverberation
- 2016 - Partitioned block frequency domain Kalman filter for multi-channel linear prediction based blind speech dereverberation
- 2016 - Speech dereverberation using linear prediction with estimation of early speech spectral variance
- 2018 - Dereverberation with Differential Microphone Arrays and the Weighted-Prediction-Error Method
- 2018 - Frame-Online DNN-WPE Dereverberation
- 2018 - Joint Multi-Microphone Speech Dereverberation and Noise Reduction Using Integrated Sidelobe Cancellation and Linear Prediction
- 2018 - Linear Prediction-Based Online Dereverberation and Noise Reduction Using Alternating Kalman Filters
- 2018 - Online Speech Dereverberation Using RLS-WPE Based on a Full Spatial Correlation Matrix Integrated in a Speech Enhancement System
- 2018 - Speech Dereverberation Based on Convex Optimization Algorithms for Group Sparse Linear Prediction
- 1991 - Reverberant speech enhancement using cepstral processing
- 1993 - Source waveform recovery in a reverberant space by cepstrum dereverberation
- 1994 - Cepstrum based deconvolution for speech dereverberation
- 1996 - Cepstrum-based deconvolution for speech dereverberation
- 1991 - An approach to dereverberation using multi-microphone sub-band envelope estimation
- 2003 - A method based on the MTF concept for dereverberating the power envelope from the reverberant signal
- 2009 - Robust speech dereverberation based on non-negativity and sparse nature of speech spectrograms
- 2007 - An Acoustic MIMO Framework for Analyzing Microphone-Array Beamforming
- 2010 - Method for dereverberation and noise reduction using spherical microphone arrays
- 2013 - An informed spatial filter for dereverberation in the spherical harmonic domain
- 2003 - Blind dereverberation of single channel speech signal based on harmonic structure
- 2005 - Fast estimation of a precise dereverberation filter based on speech harmonicity
- 2005 - Harmonicity based dereverberation with maximum a posteriori estimation
- 2007 - Harmonicity-Based Blind Dereverberation for Single-Channel Speech Signals
- 2016 - Linear prediction based dereverberation for spherical microphone arrays
- 2014 - Speech dereverberation with convolutive transfer function approximation using map and variational deconvolution approaches
- 2015 - Learning Spectral Mapping for Speech Dereverberation and Denoising
- 2015 - Speech dereverberation using a learned speech model
- 2018 - Late Reverberation Suppression Using Recurrent Neural Networks with Long Short-Term Memory
- 2018 - Speech Dereverberation With Context-Aware Recurrent Neural Networks
- 2019 - Two-Stage Deep Learning for Noisy-Reverberant Speech Enhancement