A Working MFCC Program

Latest recommended article published on 2019-10-24 12:43:47

uyingmiaomiao

Category: MFCC. Tags: MATLAB, MFCC, speech recognition

Article link: https://blog.csdn.net/uyingmiaomiao/article/details/83388117

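I found several MFCC programs on CSDN, debugged them bit by bit, and finally ended up with one that works, so I am posting it as the first blog post of my life. I do not yet fully understand the mathematical derivation behind MFCC extraction and will come back to it later. Below are the two MATLAB programs I put together, covering everything from recording to MFCC extraction: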

Main program:

%%  Recording of my own voice; the spoken digits are 95184726

%%  Recording setup
% Sampling rate Fs = 8000 Hz, bit depth BitPerSmpl = 16, two channels ChanNum = 2
Fs = 8000;
BitPerSmpl = 16;
ChanNum = 2;
MyVoice = audiorecorder(Fs, BitPerSmpl, ChanNum);
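%%  Start recording (run this cell, speak, then run the next cell;
%   record() returns immediately, so running the whole script in one go captures little or no audio)
record(MyVoice);
%%  Stop recording
% pause(MyVoice);
% resume(MyVoice);
stop(MyVoice);
%%  Play back the recording
% play(MyVoice);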
MyVoice = getaudiodata(MyVoice);
audiowrite('D:\MATLAB2016\MATLAB_project\MyVoiceMFCC\MyVoice.wav', MyVoice, Fs);
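%%  Read the audio back
MyVoice = audioread('D:\MATLAB2016\MATLAB_project\MyVoiceMFCC\MyVoice.wav');
MyVoiceInfo = audioinfo('D:\MATLAB2016\MATLAB_project\MyVoiceMFCC\MyVoice.wav');
sound(MyVoice, Fs);   % pass Fs explicitly; sound() otherwise assumes 8192 Hz
%%  Inspect the spectrum
% rfft is not built into MATLAB; it comes from the VOICEBOX toolbox,
% as do enframe and melbankm used in the subroutine
MyVoiceNfft = abs(rfft(MyVoice));
figure
plot(MyVoiceNfft);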

%%  MFCC processing
WinTime = 0.025;
[cepstra_left, aspectrum_left, pspectrum_left] = MFCC(MyVoice(:, 1), Fs, WinTime);
[cepstra_right, aspectrum_right, pspectrum_right] = MFCC(MyVoice(:, 2), Fs, WinTime);
figure
subplot(211)
plot(cepstra_left);
title('Left-channel mel cepstral coefficients');
subplot(212)
plot(cepstra_right);
title('Right-channel mel cepstral coefficients');
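The main program passes only three arguments to MFCC, so the step time falls back on the subroutine's default of 0.010 s. As a rough check of the sizes involved, the arithmetic can be sketched as follows. The variable names `StepTime`, `nbins` and `nframes` are illustrative and do not appear in the original code, and the frame count uses the usual formula for framing without padding, which may differ slightly from what a particular framing function returns.

```matlab
Fs       = 8000;                    % sampling rate used in the main program
WinTime  = 0.025;                   % window length passed to MFCC
StepTime = 0.010;                   % assumed: the subroutine's default step time

winpts  = round(WinTime * Fs);      % 200 samples per frame
steppts = round(StepTime * Fs);     % 80 samples between frame starts
NFFT    = 2^ceil(log2(winpts));     % 256, the next power of two above 200
nbins   = NFFT/2 + 1;               % 129 bins in a one-sided spectrum

% Frames obtained from one second of audio when no padding is applied
nframes = 1 + floor((Fs - winpts) / steppts);   % 98
```

Each cepstra matrix should therefore have 13 rows (the default number of coefficients) and roughly 98 columns per second of recorded audio.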

Subroutine:

function [cepstra, aspectrum, pspectrum] = MFCC(samples, sr, wintime, steptime, nfilts, numcep, preemph)
% [cepstra, aspectrum, pspectrum] = MFCC(samples, sr, wintime, steptime, nfilts, numcep, preemph)
%   - take power spectra of the STFT
%   - warp to a mel frequency scale
%   - take the DCT of the log-Mel-spectrum
%   - return the first <numcep> components
%   samples:           vector of signal samples
%   sr:                sample rate in Hz
%   wintime (0.025):   window length in seconds
%   steptime (0.010):  step between successive windows in seconds
%   nfilts (40):       number of triangular filters to use
%   numcep (13):       number of cepstra to return
%   preemph (0.97):    pre-emphasis filter coefficient
 
if nargin < 2;      sr = 16000;           end
if nargin < 3;      wintime = 0.025;      end
if nargin < 4;      steptime = 0.010;     end
if nargin < 5;      nfilts = 40;          end   % nfilts is the fifth argument in the signature
if nargin < 6;      numcep = 13;          end   % numcep is the sixth argument in the signature
if nargin < 7;      preemph = 0.97;       end
 
winpts = round(wintime*sr);
steppts = round(steptime*sr);
NFFT = 2^(ceil(log2(winpts)));
% figure
% subplot(211)
% plot(samples(:,1)); 
% title('Signal waveform');

samples = filter([1, -preemph], 1, samples);
% subplot(212)
% plot(samples(:,1));
% title('Signal waveform after pre-emphasis');
% subplot(212)
% plot(samples(:,2));
%   compute FFT power spectrum
%pspectrum = powspec(samples, sr, wintime, steptime, NFFT);
%pspectrum = abs(spectrogram(samples*32768, winpts, winpts - round(steptime*sr), NFFT)).^2;
 
[frame, tc, ~] = enframe(samples, hamming(winpts), steppts,sr);
pspectrum = abs(rfft(frame', NFFT)).^2;
%%  filterBank written for MFCCtest
%   obtain mel filter bank
% [x,mc,mn,mx]=melbankm(p,n,fs,fl,fh,w)
% filterBank=melbankm(213, 2047, sr);
% figure
% subplot(211)
% plot((0:floor(2047/2))*sr/2047,melbankm(213, 2047, sr)') ;
% % filterBank = melFilterBank(NFFT, sr, nfilts);     %%%%    function originally called by this program
% subplot(212)
% plot(pspectrum);
%   auditory spectrum
%%  Attempt at a general-purpose filterBank
filterBank = melbankm(nfilts, NFFT, sr);

%%
aspectrum = filterBank * pspectrum;
%%
%   apply DCT to convert aspectrum to mel cepstrum
logAspec = log(aspectrum);
cepstra = dct(logAspec);
cepstra = cepstra(1:numcep,:);
 
if nargout < 1
    [nf, nc] = size(cepstra);
    imh = imagesc(tc/sr, 1:nf, cepstra);
    axis('xy');
    xlabel('Time (s)');
    ylabel('Mel-cepstrum coefficient');
    map = (0:63)' / 63;
    colormap([map, map, map]);
    colorbar;
end
end
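The subroutine depends on enframe, rfft and melbankm (from the VOICEBOX toolbox) and on hamming and dct (Signal Processing Toolbox). For readers who want to follow the individual steps without those dependencies, here is a minimal sketch of the same pipeline in base MATLAB. It is not the author's code: it runs on a synthetic 440 Hz tone rather than a recording, the names `hz2mel`, `mel2hz`, `edges`, `fb` and `D` are illustrative, and the triangular filter bank is a simple textbook construction that is not expected to reproduce melbankm's output numerically.

```matlab
% Parameters (same defaults as the subroutine, 8000 Hz as in the main program)
sr = 8000;  wintime = 0.025;  steptime = 0.010;
nfilts = 40;  numcep = 13;  preemph = 0.97;
x = sin(2*pi*440*(0:sr-1)'/sr);            % assumed test input: 1 s, 440 Hz tone

winpts  = round(wintime * sr);
steppts = round(steptime * sr);
NFFT    = 2^ceil(log2(winpts));

% 1) Pre-emphasis: y[n] = x[n] - preemph * x[n-1]
y = filter([1, -preemph], 1, x);

% 2) Split into overlapping frames (one per column) and apply a Hamming window
nframes = 1 + floor((numel(y) - winpts) / steppts);
idx     = bsxfun(@plus, (1:winpts)', (0:nframes-1) * steppts);
win     = 0.54 - 0.46 * cos(2*pi*(0:winpts-1)' / (winpts-1));
frames  = bsxfun(@times, y(idx), win);

% 3) Power spectrum, keeping only the non-negative frequency bins
spec      = fft(frames, NFFT);             % each column is zero-padded to NFFT
pspectrum = abs(spec(1:NFFT/2+1, :)).^2;

% 4) Triangular filters spaced evenly on the mel scale
hz2mel = @(f) 2595 * log10(1 + f/700);
mel2hz = @(m) 700 * (10.^(m/2595) - 1);
edges  = mel2hz(linspace(hz2mel(0), hz2mel(sr/2), nfilts + 2));  % band edges in Hz
binHz  = (0:NFFT/2) * sr / NFFT;           % frequency of each FFT bin in Hz
fb     = zeros(nfilts, NFFT/2 + 1);
for k = 1:nfilts
    rise = (binHz - edges(k))   / (edges(k+1) - edges(k));
    fall = (edges(k+2) - binHz) / (edges(k+2) - edges(k+1));
    fb(k, :) = max(0, min(rise, fall));    % triangle peaking at edges(k+1)
end
aspectrum = fb * pspectrum;                % nfilts x nframes

% 5) Log, then a DCT-II with orthonormal scaling, keeping the first numcep rows
logAspec = log(aspectrum + eps);           % eps guards against log(0)
D = cos(pi * (0:numcep-1)' * (2*(0:nfilts-1) + 1) / (2*nfilts));
D(1, :)     = D(1, :) / sqrt(nfilts);
D(2:end, :) = D(2:end, :) * sqrt(2/nfilts);
cepstra = D * logAspec;                    % numcep x nframes
```

Building the DCT as an explicit matrix keeps the sketch toolbox-free and computes only the numcep rows that are kept, whereas the subroutine computes the full DCT and then truncates it.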

Spectrum of my voice at an 8000 Hz sampling rate (the horizontal axis is not strictly frequency; it is the FFT bin index)

[Figure: spectrum of my voice]
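The horizontal axis of that plot is a bin index because plot was called with the magnitudes only. One possible way to get an axis in Hz is to build a frequency vector from the sampling rate and the signal length, as in the sketch below. It uses the built-in fft rather than rfft, and a synthetic 1 kHz tone stands in for the recording; the names `half`, `f`, `kmax` and `peakHz` are illustrative.

```matlab
Fs = 8000;                          % sampling rate in Hz
x  = sin(2*pi*1000*(0:Fs-1)'/Fs);   % assumed test input: 1 s, 1 kHz tone
N  = numel(x);

X    = abs(fft(x));
half = X(1:floor(N/2)+1);           % keep the non-negative frequencies only
f    = (0:floor(N/2))' * Fs / N;    % bin k corresponds to k*Fs/N Hz

figure
plot(f, half);
xlabel('Frequency (Hz)');
ylabel('Magnitude');

[~, kmax] = max(half);
peakHz = f(kmax);                   % frequency of the largest peak, 1000 Hz here
```

For the two-channel recording in the main program, the same frequency vector could be reused for each column of the spectrum.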