My first research notes on speech signal processing

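This is my first contact with this area. My background is in computer software, and I knew nothing about signal processing. Joining this lab meant taking in new knowledge and fresh ideas, and to be honest it was hard, at least for someone as slow as me. I spent a long time on the fundamentals and had to start almost everything from scratch. I began with Oppenheim's "Signals and Systems", Song Zhiyong's "Applications of MATLAB in Speech Signal Analysis and Synthesis", "Numerical Methods", "Signal Processing Tutorial", "Probability and Mathematical Statistics", "Introduction to Algorithms", Zhou Zhihua's "Machine Learning", Li Hang's "Statistical Learning Methods", and so on. Slowly I came to understand the tip of the iceberg of signal processing.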


Today I am writing a short summary of my first project:

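The speech signals we collect usually contain a lot of noise, so we often have to filter and denoise them, and at the same time cut the signal into segments.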
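As a rough illustration of the filtering step, here is a minimal sketch that band-passes a synthetic signal. The 44.1 kHz sampling rate and the 5-15 kHz pass band are taken from the commented-out butter call in these notes. The two test tones and the use of filtfilt are my own assumptions, and both butter and filtfilt need the Signal Processing Toolbox.

```matlab
% Sketch only: band-pass a synthetic two-tone signal.
fs = 44100;                                   % sampling rate in Hz
t  = (0:fs-1)'/fs;                            % one second of sample times
lowTone  = sin(2*pi*1000*t);                  % 1 kHz, outside the pass band
highTone = sin(2*pi*10000*t);                 % 10 kHz, inside the pass band
x = lowTone + highTone;

% Cutoffs are normalised by the Nyquist frequency fs/2, i.e. (f/fs)*2.
[b, a] = butter(3, [5000 15000]/(fs/2), 'bandpass');
y = filtfilt(b, a, x);                        % zero-phase (forward-backward) filtering

% If the filter works, y should be close to the 10 kHz tone alone.
residual = y - highTone;
fprintf('RMS of (filtered - 10 kHz tone): %.4f\n', sqrt(mean(residual.^2)));
```

filtfilt is used here instead of filter so that the peaks are not shifted in time, which matters when the next step cuts windows around peak positions.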

Next is my signal segmentation step, although the results are not very good.

The signal is cut according to its peaks.

function extract_middle_click()
% Cut a fixed-length window around every detected peak in each .wav file.
readFilePath='D:\data\tooth\Su\*.wav';
readPathStr='D:\data\tooth\Su\';
savePathStr='D:\data\tooth\dd\Su\';
halfWin=2000;                        % samples kept on each side of a peak
fileList=dir(readFilePath);
fileNum=length(fileList);
if ~exist(savePathStr,'dir')         % create the output folder once if it is missing
    mkdir(savePathStr);
end
for j=1:fileNum
    name=fileList(j).name;           % file name from the dir() result; speaker folders: Zhao-zhang, Syam, LWF, Su
    [~,varStr]=fileparts(name);      % file name without its extension
    fileName=strcat(readPathStr,name);   % full path of the input file
    data = audioread(fileName);

%     Optional band-pass before peak picking; cutoffs are normalised as (f/fs)*2 with fs = 44.1 kHz
%     [b,a]=butter(3,[5000/44100*2,15000/44100*2],'bandpass');
%     inputsignal = filter(b,a,data);

    event_index = identify_middle_click_index(data);
    disp([name ': peaks at samples ' num2str(event_index)]);
    for i=1:length(event_index)
        dataIndex = (event_index(i)-halfWin):(event_index(i)+halfWin);
        if dataIndex(1)<1 || dataIndex(end)>size(data,1)
            continue;                % skip windows that run past either end of the recording
        end
%       datarange = inputsignal(dataIndex);
        datarange = data(dataIndex); % linear indexing: first channel only if the file is stereo
        % datarange = datarange/max(abs(datarange));
        % Earlier experiment: keep 18800-19200 Hz, then notch out 19 kHz
        % [b,a]=butter(6,[0.8526,0.8707],'bandpass');
        % filterData=filter(b,a,datarange);
        % Fir = fir1(5000,[18985/44100*2,19015/44100*2],'stop');
        % outdata = filter(Fir,1,filterData);
        newStr=[savePathStr,varStr,'_',int2str(i),'.txt'];   % one output file per peak per recording
        dlmwrite(newStr,datarange);
        figure
        plot(datarange);
    end
end


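function [event_index] = identify_middle_click_index(inputsignal)   % returns the sample indices of the selected peaks
nf = 0.04;              % noise-floor threshold: read off the time-domain plot the level that real peaks usually exceed
span =20;               % moving-average span passed to findpeak
peakdistance = 4000;    % threshold: minimum gap (in samples) to the previous peak
peakdistance2=20000;    % threshold: maximum gap (in samples) to the previous peak
event_index = [];
[~,~,locs] = findpeak(inputsignal,nf,span); % find peaks
% locs holds the sample indices of the peaks
% (the second output of findpeak holds the peak values)
if isempty(locs)
    return;             % nothing above the noise floor
end
j=2;
event_index(1)=locs(1);
for i=2:length(locs)
    if (locs(i)-locs(i-1))>peakdistance && (locs(i)-locs(i-1))<peakdistance2
        event_index(j)=locs(i);
        j=j+1;
    end
end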
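
To make the spacing rule in identify_middle_click_index concrete, here is a small worked example. The peak locations are made up for illustration; the two thresholds are the ones used above. A peak is kept when its gap to the immediately preceding peak (not to the last kept one) lies strictly between the two thresholds, and the first peak is always kept.

```matlab
% Hypothetical peak locations in samples, for illustration only.
locs = [100 150 5000 9500 40000 45000];
peakdistance  = 4000;     % minimum gap to the previous peak
peakdistance2 = 20000;    % maximum gap to the previous peak

gaps = diff(locs);        % gap between each peak and the one before it
keep = [true, gaps > peakdistance & gaps < peakdistance2];   % first peak is always kept
event_index = locs(keep);
disp(event_index)         % keeps 100, 5000, 9500 and 45000
```

The peak at 150 is dropped because it is only 50 samples after the previous one, and the peak at 40000 is dropped because the gap of 30500 samples exceeds the upper threshold.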
Finding the peaks of each speech signal:
function [lined_data,peaks,locs] = findpeak(x,nf,span)

%Function used to get the peaks (local maxima) from the given data 
% [lined_data,peaks,locs] = findpeak(x,nf,span)
% lined_data => peaks in the locations 
% peaks => Just the peak values
% locs => locations at which peaks are occurring
% x => data for which peaks have to be obtained
% nf => Noise Floor
% span => span of the moving average required


x(x<nf)=nf;                                                                %Clamp everything below the noise floor to the noise floor magnitude



x_smoothed=smooth(x-min(x),span,'moving');                                   %smoothing the shifted current snapshot
%20 is decided based on the type of data that is taken. It is like a cutoff
%frequency for a LPF.This moving average actually helps the findpeaks()
%function defined in Matlab library to decide the peak more efficiently
%especially in the case of experimental results when there is randomness
%in the data obtained.

[peaks,locs]=findpeaks(x_smoothed);                                        %get the peaks from the data

lined_data=zeros(1,length(x));                                             %lined data will have peaks at locations
lined_data(locs)=peaks;                                                    
lined_data=lined_data+min(x);                                              %shifting it back to original values

peaks=peaks+min(x);                                                        %Shifting it back to its original values

end
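
The idea behind findpeak (clamp to a noise floor, smooth with a moving average, then take local maxima) can be tried without any toolbox. The sketch below is not the function above: it swaps in movmean (base MATLAB, R2016a or later) for smooth and a plain neighbour comparison for findpeaks. The synthetic signal, the odd span of 21 and the minHeight threshold are assumptions made for the example.

```matlab
% Synthetic test signal: three smooth bumps at known positions (made up).
n = (1:30000)';
centers = [5000 12000 20000];
x = zeros(size(n));
for c = centers
    x = x + exp(-((n - c).^2) / (2*50^2));   % bump of height 1 centred at c
end

nf = 0.04;                   % noise floor, as in the notes
span = 21;                   % odd span keeps the moving average centred
minHeight = 0.1;             % assumed: ignore anything barely above the floor

x(x < nf) = nf;              % clamp to the noise floor
s = movmean(x - min(x), span);

% A sample is a peak if it is higher than its left neighbour, not lower
% than its right neighbour, and above minHeight.
mid = s(2:end-1);
isPeak = mid > s(1:end-2) & mid >= s(3:end) & mid > minHeight;
locs = find(isPeak) + 1;     % +1 because mid starts at sample 2
disp(locs')
```

The minHeight test is there because flat stretches at the noise floor can otherwise produce tiny spurious maxima from rounding.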




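Next is the feature-extraction part:

1. Compute the MFCC features for each person.

2. Plot each person's MFCCs and look at them.

3. Run a correlation analysis on each person's MFCC features:

        A=corr(MFCC);

        A=corr(MFCC');

       corr works on columns, so the first call correlates the columns of MFCC with each other and the second correlates its rows. Plot A and inspect it.

4. The MFCC features of the different people did not end up in one mat file (mainly because I did not write the batch-processing code well),

      so all the MFCC features have to be put together:

      ① Double-click one person's mat file, whose variable is named MFCCs (this is the same as loading it),

         then define mfcc=MFCCs;

      ② Open another person's MFCC mat file, whose variable is also named MFCCs, and stack it underneath (extract_mfcc stores one recording per row):

              mfcc=[mfcc;MFCCs]

          ....

         In the end all the individual mat files are merged into one matrix.

         Finally save it with save 4.mat mfcc;

5. Now the correlation across all the MFCCs can be checked as a simple MFCC analysis:

       A=corr(MFCC')

6. Open all the features, that is, the merged mat file 4.mat, and add the labels 1, 2, 3, ... as the last column.


7. Feed everything into an SVM to train the model.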
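
Steps 4 to 6 can also be done in a script instead of by hand. The following is a minimal sketch under stated assumptions: featA and featB are random stand-ins for the MFCCs matrices of two people (hypothetical names and data), each with one recording per row and 13 coefficients per column, and corrcoef from base MATLAB replaces corr so that no toolbox is needed.

```matlab
% Stand-in feature matrices for two people (random data, illustration only).
rng(0);                                   % make the random stand-ins reproducible
featA = randn(5, 13);                     % person 1: 5 recordings x 13 coefficients
featB = randn(7, 13) + 1;                 % person 2: 7 recordings x 13 coefficients

% Step 4: stack the recordings of all people vertically.
features = [featA; featB];

% Step 6: one label per row, appended as the last (14th) column.
labels  = [1*ones(size(featA,1),1); 2*ones(size(featB,1),1)];
mfccAll = [features, labels];             % named mfccAll to avoid shadowing any mfcc function

% Step 5: correlation between the 13 coefficients across all recordings.
R = corrcoef(features);                   % 13 x 13, ones on the diagonal
imagesc(R); colorbar;
title('Correlation between MFCC coefficients');
```

Calling corrcoef(features') instead would give the 12 x 12 correlation between recordings, which is the row-wise view from step 5.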


Code for extracting the MFCC features

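function MFCCs = extract_mfcc()
% Compute one MFCC vector (averaged over frames) for every segmented .txt file.
filePath='D:\data\tooth\dd\Zhao-zhang\*.txt';
pathStr='D:\data\tooth\dd\Zhao-zhang\';
fileList=dir(filePath);
fileNum=length(fileList);
MFCCs = [];
hamming = @(N)(0.54-0.46*cos(2*pi*[0:N-1].'/(N-1)));   % Hamming window of length N
for i=1:fileNum
    name=fileList(i).name;
    fileName=strcat(pathStr,name);
    data=dlmread(fileName);
    % mfcc is assumed to be a third-party function on the path (this argument list is not the Audio Toolbox one)
    [ CC, FBE, frames ] = mfcc(data,44100,25,10,0.97,hamming,[5000,15000],20,13,22);
    MFCCs = [MFCCs,mean(CC,2)];      % average each coefficient over all frames
end
MFCCs = MFCCs';                      % one row per file, one column per coefficient
save('MFCCs.mat','MFCCs');           % save only the feature matrix, not the whole workspace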
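
Two small pieces of extract_mfcc are easy to check by hand: the Hamming window handle and the averaging over frames. The sketch below uses the same window formula as above; the tiny CC matrix is made up and only assumes the layout used in the function (one row per coefficient, one column per frame).

```matlab
% Same Hamming formula as in extract_mfcc, evaluated for a short window.
hammingWin = @(N) 0.54 - 0.46*cos(2*pi*(0:N-1).'/(N-1));
w = hammingWin(5);
disp(w')                      % approximately 0.08 0.54 1.00 0.54 0.08

% Made-up cepstral matrix: 2 coefficients (rows) x 3 frames (columns).
CC = [1 2 3;
      4 5 6];
perFile = mean(CC, 2);        % average over frames: one value per coefficient
disp(perFile')                % 2 and 5
```

mean(CC, 2) gives the same result as mean(CC')' when there are several frames, and it still averages along the frame dimension when a file yields only one frame.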



The SVM code is as follows:

function ac = ovoSVM()
% Data format: n*m matrix, n is the number of observations, m-1 is the number of feature dimensions,
% and the last column holds the labels corresponding to the observations.
S = load('mfcc.mat');            % load into a struct so the variable cannot clash with the mfcc function
feats = S.mfcc;
%[meas,species] = formatdata_svm();
labels = feats(:,14);
[~,~,labels] = unique(labels);   % # labels: 1/2/3
observations = feats(:,1:13);

data = zscore(observations);     % # scale features
%data = meas;
numInst = size(data,1);          % number of rows (observations)
%numLabels = max(labels);

% # split training/testing
idx = randperm(numInst);         % random permutation of the row indices (1 to 16 for my data)
numTrain = 8; 
%numTest = numInst - numTrain;
trainData = data(idx(1:numTrain),:);  testData = data(idx(numTrain+1:end),:);
trainLabel = labels(idx(1:numTrain)); testLabel = labels(idx(numTrain+1:end));

% model=svmtrain(trainLabel,trainData,'-c 24 -g 4.1');
% [prediction_decision_label,prediction_accuracy,dec_value]=svmpredict(testLabel,testData,model);
% [training_decision_label,training_accuracy,dec_value]=svmpredict(trainLabel,trainData,model);

% # grid search over c and g with 5-fold cross-validation (LIBSVM svmtrain)
bestcv = 0;
for log2c = -4:12
  for log2g = -8:4
%       for log2c = -1:3,
%   for log2g = -4:1,
    cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
    cv = svmtrain(trainLabel, trainData, cmd);
    if (cv >= bestcv)
      bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
    end
 %   fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', log2c, log2g, cv, bestc, bestg, bestcv);
  end
end
% # train one-against-one model
cmd2 = ['-c ', num2str(bestc), ' -g ',num2str(bestg), ' -b 1 '];
model = svmtrain(double(trainLabel), trainData, cmd2);
% # get probability estimates of test instances using the model
[pred,acc,preb] = svmpredict(double(testLabel), testData, model, '-b 1');
disp(pred);
ac = acc(1);
disp(['the accuracy is: ' num2str(ac) '%']);

CM=confusionmat(testLabel,pred);   % rows: ground truth, columns: prediction
imagesc(CM);
colormap(flipud(gray));  
axis xy;
xlabel('Prediction');      % x axis runs over the columns of CM
ylabel('Groundtruth');     % y axis runs over the rows of CM
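
Two details of the evaluation can be illustrated without LIBSVM or any toolbox. The sketch below is a variation rather than the code above: it scales the test set with the mean and standard deviation of the training set only (instead of calling zscore on all rows before the split), and it builds the confusion matrix with accumarray so the row/column convention is explicit. All numbers are made up.

```matlab
% Made-up features: 6 training rows and 4 test rows, 2 feature columns.
trainData = [1 10; 2 20; 3 30; 4 40; 5 50; 6 60];
testData  = [2 20; 4 40; 6 60; 8 80];

% Scale with training statistics only, then reuse them for the test set.
mu     = mean(trainData, 1);
sigma  = std(trainData, 0, 1);
trainZ = (trainData - mu) ./ sigma;      % implicit expansion needs R2016b or later
testZ  = (testData  - mu) ./ sigma;

% Made-up ground truth and predictions for 3 classes.
testLabel  = [1; 1; 2; 3];
pred       = [1; 2; 2; 3];
numClasses = 3;

% Confusion matrix: row = ground truth, column = prediction.
CM = accumarray([testLabel, pred], 1, [numClasses, numClasses]);
disp(CM)

imagesc(CM); colormap(flipud(gray)); axis xy;
xlabel('Prediction'); ylabel('Ground truth');
```

Here CM(1,2) is 1 because one sample of class 1 was predicted as class 2; with imagesc the columns run along the x axis, which is why the x axis is the prediction.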





