A Working MFCC Program

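I found several MFCC programs on CSDN, debugged them bit by bit, and finally ended up with one that works, so I am posting it as the first blog post of my life. I do not yet fully understand the math behind MFCC extraction and will come back to it later. Below are the two MATLAB programs I put together, covering everything from recording to MFCC extraction: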

Main program:

%%  Recording of my own voice; the spoken digits are 95184726

%%  Recording setup
% Sampling rate Fs = 8000 Hz, bit depth BitPerSmpl = 16, two channels (stereo) ChanNum = 2
Fs = 8000;
BitPerSmpl = 16;
ChanNum = 2;
MyVoice = audiorecorder(Fs, BitPerSmpl, ChanNum);
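%%  Start recording (run this section on its own, then speak)
record(MyVoice);
%%  Stop recording (run this section once you have finished speaking)
% pause(MyVoice);
% resume(MyVoice);
stop(MyVoice);
%%  Play back the recording
% play(MyVoice);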
MyVoice = getaudiodata(MyVoice);
audiowrite('D:\MATLAB2016\MATLAB_project\MyVoiceMFCC\MyVoice.wav', MyVoice, Fs);
%%  Read the audio file back
MyVoice = audioread('D:\MATLAB2016\MATLAB_project\MyVoiceMFCC\MyVoice.wav');
MyVoiceInfo = audioinfo('D:\MATLAB2016\MATLAB_project\MyVoiceMFCC\MyVoice.wav');
sound(MyVoice, Fs);    % pass Fs explicitly; sound otherwise assumes 8192 Hz
%%  Inspect the spectrum
MyVoiceNfft = abs(rfft(MyVoice));
figure
plot(MyVoiceNfft);

%%  MFCC extraction
WinTime = 0.025;
[cepstra_left, aspectrum_left, pspectrum_left] = MFCC(MyVoice(:, 1), Fs, WinTime);
[cepstra_right, aspectrum_right, pspectrum_right] = MFCC(MyVoice(:, 2), Fs, WinTime);
figure
subplot(211)
plot(cepstra_left);
title('Mel cepstral coefficients, left channel');
subplot(212)
plot(cepstra_right);
title('Mel cepstral coefficients, right channel');

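Subroutine (save it as MFCC.m; enframe, rfft and melbankm are VOICEBOX toolbox functions, while hamming and dct need the Signal Processing Toolbox):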

function [cepstra, aspectrum, pspectrum] = MFCC(samples, sr, wintime, steptime, nfilts, numcep, preemph)
% [cepstra, aspectrum, pspectrum] = MFCC(samples, sr, wintime, steptime, nfilts, numcep, preemph)
%   - take power spectra of the STFT
%   - warp to a mel frequency scale
%   - take the DCT of the log-Mel-spectrum
%   - return the first <numcep> components
%   samples:            vector of signal samples
%   sr (16000):         sample rate in Hz
%   wintime (0.025):    window length in seconds
%   steptime (0.010):   step between successive windows in seconds
%   nfilts (40):        number of triangular mel filters to use
%   numcep (13):        number of cepstra to return
%   preemph (0.97):     pre-emphasis filter coefficient
 
if nargin < 2;      sr = 16000;               end
if nargin < 3;      wintime = 0.025;       end
if nargin < 4;      steptime = 0.010;     end
if nargin < 5;      nfilts = 40;                end     % 5th argument in the signature
if nargin < 6;      numcep = 13;           end     % 6th argument in the signature
if nargin < 7;      preemph = 0.97;       end
 
winpts = round(wintime*sr);
steppts = round(steptime*sr);
NFFT = 2^(ceil(log2(winpts)));
% figure
% subplot(211)
% plot(samples(:,1));
% title('Signal waveform');

samples = filter([1, -preemph], 1, samples);    % pre-emphasis: y(n) = x(n) - preemph*x(n-1)
% subplot(212)
% plot(samples(:,1));
% title('Signal waveform after pre-emphasis');
% subplot(212)
% plot(samples(:,2));
%   compute FFT power spectrum
%pspectrum = powspec(samples, sr, wintime, steptime, NFFT);
%pspectrum = abs(spectrogram(samples*32768, winpts, winpts - round(steptime*sr), NFFT)).^2;
 
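% Split into Hamming-windowed frames, one per row; tc holds the frame-centre times in samples.
% sr is not passed: in VOICEBOX's enframe the 4th argument is a mode string, not a sample rate.
[frame, tc] = enframe(samples, hamming(winpts), steppts);
pspectrum = abs(rfft(frame', NFFT)).^2;
%%  filterBank written for MFCCtest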
%   obtain mel filter bank
% [x,mc,mn,mx]=melbankm(p,n,fs,fl,fh,w)
% filterBank=melbankm(213, 2047, sr);
% figure
% subplot(211)
% plot((0:floor(2047/2))*sr/2047,melbankm(213, 2047, sr)') ;
% % filterBank = melFilterBank(NFFT, sr, nfilts);     %%%%    function this program originally called
% subplot(212)
% plot(pspectrum);
%   auditory spectrum
%%  Attempt at a general-purpose filterBank
filterBank = melbankm(nfilts, NFFT, sr);

%%
aspectrum = filterBank * pspectrum;
%%
%   apply DCT to convert aspectrum to mel cepstrum
logAspec = log(aspectrum);
cepstra = dct(logAspec);
cepstra = cepstra(1:numcep,:);
 
if nargout < 1
    [nf, nc] = size(cepstra);
    imh = imagesc(tc/sr, 1:nf, cepstra);
    axis('xy');
    xlabel('Time (s)');
    ylabel('Mel-cepstrum coefficient');
    map = (0:63)' / 63;
    colormap([map, map, map]);
    colorbar;
end
end
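
The frame sizes and the pre-emphasis step at the top of the function can be checked by hand. The following is a minimal sketch using only built-in functions; the 2-second signal length is a hypothetical value, and the frame count uses the usual formula for frames that fit completely inside the signal, which is assumed (not verified here) to match what enframe returns without padding.

```matlab
% Frame geometry for the settings used in the main program.
sr       = 8000;                        % sample rate in Hz
wintime  = 0.025;                       % window length in seconds
steptime = 0.010;                       % hop between windows in seconds

winpts  = round(wintime * sr);          % samples per window
steppts = round(steptime * sr);         % samples per hop
NFFT    = 2^ceil(log2(winpts));         % next power of two >= winpts
nbins   = NFFT/2 + 1;                   % one-sided spectrum bins per frame

nsamples = 2 * sr;                      % hypothetical 2-second recording
nframes  = floor((nsamples - winpts) / steppts) + 1;   % complete frames only

% Pre-emphasis: filter([1, -preemph], 1, x) computes y(n) = x(n) - preemph*x(n-1).
preemph = 0.97;
x       = [1; 2; 3; 4];
y       = filter([1, -preemph], 1, x);
yManual = x - preemph * [0; x(1:end-1)];   % same thing written out, with x(0) = 0

fprintf('winpts=%d steppts=%d NFFT=%d nbins=%d nframes=%d\n', ...
        winpts, steppts, NFFT, nbins, nframes);
```

With these values each frame is zero-padded from winpts to NFFT samples before the FFT, so pspectrum has nbins rows and one column per frame.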

Spectrum of my voice at an 8000 Hz sampling rate (the horizontal axis is the FFT bin index rather than a true frequency in Hz):

Figure: spectrum of my voice
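
To put hertz on the horizontal axis instead of a bin index, bin k of an N-point FFT can be mapped to k*Fs/N. Here is a minimal sketch of that mapping; the synthetic 440 Hz tone and the use of the built-in fft in place of rfft are stand-ins chosen for illustration, not what the main program does.

```matlab
% One-sided magnitude spectrum plotted against frequency in Hz.
Fs = 8000;                          % sampling rate in Hz, as in the main program
t  = (0:Fs-1)' / Fs;                % one second of sample times
x  = sin(2*pi*440*t);               % stand-in for one channel, e.g. MyVoice(:, 1)

N    = length(x);
X    = fft(x);                      % built-in FFT of the whole signal
half = floor(N/2) + 1;              % keep bins 0 .. Fs/2
mag  = abs(X(1:half));
f    = (0:half-1)' * Fs / N;        % bin k corresponds to k*Fs/N Hz

[~, kmax] = max(mag);
peakHz = f(kmax);                   % frequency of the strongest bin

plot(f, mag);
xlabel('Frequency (Hz)');
ylabel('Magnitude');
```

For a real recording, replacing x with one column of MyVoice should be enough; the axis then runs from 0 to Fs/2.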

MFCC coefficients:


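If any experts read this, I would appreciate some advice on how to apply this algorithm: is speech recognition done by cross-correlating the mel cepstral coefficients, or by some other operation? La la la.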
