Matlab v_melbankm Function Parameters Explained (in English, with Examples)

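I am using MATLAB R2019; once VOICEBOX is downloaded and added to the MATLAB path, it is ready to use. For instructions on downloading VOICEBOX, see this blog post.
Note that melbankm has been renamed v_melbankm. When I used this function today I did not know what the last few parameters meant, so I looked through the source file. The help text is reproduced below in its original English rather than translated. For a better explanation and a comparison with v_melcepst, see this blog post.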

Function description

v_melbankm  determine matrix for a mel/erb/bark-spaced filterbank  [X,MC,MN,MX]=V_MELBANKM(P,N,FS,FL,FH,W)

Inputs:

p   number of filters in filterbank or the filter spacing in k-mel/bark/erb [default = ceil(4.6*log10(fs))]
n   length of fft
fs  sample rate in Hz
fl  low end of the lowest filter as a fraction of fs [default = 0]
fh  high end of highest filter as a fraction of fs [default = 0.5]
w   any sensible combination of the following:
    'b' = bark scale instead of mel
    'e' = erb-rate scale
    'l' = log10 Hz frequency scale
    'f' = linear frequency scale
    'c' = fl/fh specify centre of low and high filters
    'h' = fl/fh are in Hz instead of fractions of fs
    'H' = fl/fh are in mel/erb/bark/log10
    't' = triangular shaped filters in mel/erb/bark domain [default]
    'n' = hanning shaped filters in mel/erb/bark domain
    'm' = hamming shaped filters in mel/erb/bark domain
    'z' = highest and lowest filters taper down to zero [default]
    'y' = lowest filter remains at 1 down to 0 frequency and highest filter remains at 1 up to nyquist frequency
    'u' = scale filters to sum to unity
    's' = single-sided: do not double filters to account for negative frequencies
    'g' = plot idealized filters [default if no output arguments present]

Note that the filter shape (triangular, hamming etc) is defined in the mel (or erb etc) domain.

Some people instead define an asymmetric triangular filter in the frequency domain.

If 'ty' or 'ny' is specified, the total power in the fft is preserved.
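As a small illustration of the defaults and flags listed above, the sketch below works out the default filter count and shows two calls that should describe the same band edges, once as fractions of fs and once in Hz using 'h'. The values fs = 16000 and n = 512 are arbitrary choices for this example, not values from the help text, and the v_melbankm calls only run if VOICEBOX is on the path.

```matlab
fs = 16000;                      % sample rate in Hz (assumed for illustration)
n  = 512;                        % FFT length (assumed for illustration)

% Default number of filters, as given in the help text for p
p = ceil(4.6*log10(fs));         % 4.6*log10(16000) is about 19.34, so p = 20

% Default band edges are fractions of fs; the same edges expressed in Hz
fl = 0;    fh = 0.5;
fl_hz = fl*fs;                   % 0 Hz
fh_hz = fh*fs;                   % 8000 Hz (the Nyquist frequency)

if exist('v_melbankm', 'file')   % only call VOICEBOX if it is installed
    x1 = v_melbankm(p, n, fs, fl, fh);              % edges as fractions of fs
    x2 = v_melbankm(p, n, fs, fl_hz, fh_hz, 'h');   % same edges given in Hz
    % Hanning-shaped filters on the bark scale, flat at both ends
    x3 = v_melbankm(p, n, fs, fl, fh, 'bny');
end
```

Because each call here requests an output, no plot is drawn; per the 'g' entry above, plotting is the default only when no output arguments are present.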

Outputs:
x   a sparse matrix containing the filterbank amplitudes
    If the mn and mx outputs are given then size(x)=[p,mx-mn+1]
    otherwise size(x)=[p,1+floor(n/2)]
    Note that the peak filter values equal 2 to account for the power
    in the negative FFT frequencies.

mc  the filterbank centre frequencies in mel/erb/bark
mn  the lowest fft bin with a non-zero coefficient
mx  the highest fft bin with a non-zero coefficient
    Note: you must specify both or neither of mn and mx.
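To make the two size rules concrete, the following sketch computes the expected column count for the call without mn/mx and, when VOICEBOX is available, checks both forms against the sizes stated in the help text. The values p = 20, n = 512 and fs = 16000 are assumed example values.

```matlab
p  = 20;  n = 512;  fs = 16000;          % example values (assumptions)

ncols_full = 1 + floor(n/2);             % 257 columns: FFT bins 0..n/2

if exist('v_melbankm', 'file')           % requires VOICEBOX on the path
    x = v_melbankm(p, n, fs);            % without mn/mx: full half-spectrum
    assert(isequal(size(x), [p, ncols_full]));

    [xs, mc, mn, mx] = v_melbankm(p, n, fs);   % with mn/mx: trimmed columns
    assert(isequal(size(xs), [p, mx - mn + 1]));
end
```

The trimmed form is the one to pair with f(mn:mx) when only the needed FFT bins are multiplied.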

Examples of use:

(a) Calculate the Mel-frequency Cepstral Coefficients

    f=v_rfft(s);                     % v_rfft() returns only 1+floor(n/2) coefficients
    x=v_melbankm(p,n,fs);            % n is the fft length, p is the number of filters wanted
    z=log(x*abs(f).^2);              % multiply x by the power spectrum
    c=dct(z);                        % take the DCT

(b) Calculate the Mel-frequency Cepstral Coefficients efficiently

    f=fft(s);                        % n is the fft length, p is the number of filters wanted
    [x,mc,na,nb]=v_melbankm(p,n,fs);   % na:nb gives the fft bins that are needed
    z=log(x*(f(na:nb).*conj(f(na:nb))));   % form the power spectrum first, then multiply by x
    c=dct(z);                        % take the DCT, as in (a)

(c) Plot the calculated filterbanks

   plot((0:floor(n/2))*fs/n,v_melbankm(p,n,fs)')   % fs=sample frequency

(d) Plot the idealized filterbanks (without output sampling)

   v_melbankm(p,n,fs);
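The examples above leave s, p, n and fs undefined, so here is a minimal end-to-end sketch of example (a) on a synthetic frame. The test signal, fs = 16000 and n = 512 are assumptions made for illustration. Plain fft followed by truncation stands in for v_rfft, and dct is assumed to come from the Signal Processing Toolbox, so the filterbank steps are skipped if either v_melbankm or dct is missing.

```matlab
fs = 16000;                               % sample rate in Hz (assumed)
n  = 512;                                 % frame length = FFT length (assumed)
t  = (0:n-1)'/fs;                         % time axis as a column vector
s  = sin(2*pi*440*t) + 0.01*randn(n,1);   % synthetic test frame: tone plus noise

p  = ceil(4.6*log10(fs));                 % default number of filters
f  = fft(s);
f  = f(1:1+floor(n/2));                   % keep the non-negative frequencies only
pw = abs(f).^2;                           % power spectrum, (1+floor(n/2)) x 1

if exist('v_melbankm', 'file') && exist('dct', 'file')
    x = v_melbankm(p, n, fs);             % p x (1+floor(n/2)) sparse matrix
    z = log(x*pw);                        % log filterbank energies, p x 1
    c = dct(z);                           % cepstral coefficients, p x 1
end
```

The small noise term keeps every filter output strictly positive, so the logarithm does not hit zero on a pure tone.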

References:
[1] S. S. Stevens, J. Volkmann, and E. B. Newman. A scale for the measurement
of the psychological magnitude pitch. J. Acoust. Soc. Amer., 8: 185-190, 1937.
[2] S. Davis and P. Mermelstein. Comparison of parametric representations for
monosyllabic word recognition in continuously spoken sentences.
IEEE Trans. Acoustics, Speech and Signal Processing, 28 (4): 357-366, Aug. 1980.
