gmm ubm matlab: training on a custom sound source with gmm_ubm_artificial from the MSR Identity Toolbox

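This is a small-scale demonstration of how to use the Identity Toolbox for GMM-UBM (Gaussian mixture model, universal background model) speaker recognition. The task consists of training a UBM from background data, MAP-adapting speaker models using enrollment data, scoring the verification trials, and evaluating performance measures (such as the confusion matrix and the equal error rate). The whole process runs in memory, which suits small tasks; for large tasks or machines with limited memory, the models and parameters should be saved to disk.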

%{
This is a demo on how to use the Identity Toolbox for GMM-UBM based speaker
recognition. A small scale task has been designed using artificially
generated features for 20 speakers. Each speaker has 10 sessions
(channels) and each session is 1000 frames long (10 seconds assuming 10 ms
frame increments).

There are 4 steps involved:

 1. training a UBM from background data
 2. MAP adapting speaker models from the UBM using enrollment data
 3. scoring verification trials
 4. computing the performance measures (e.g., confusion matrix and EER)

Note: given the relatively small size of the task, we can load all the data
and models into memory. This, however, may not be practical for large scale
tasks (or on machines with a limited memory). In such cases, the parameters
should be saved to the disk.

Malcolm Slaney
Omid Sadjadi
Microsoft Research, Conversational Systems Research Center
%}

%%
% Step0: Set the parameters of the experiment
nSpeakers = 20;
nDims = 13;             % dimensionality of feature vectors
nMixtures = 32;         % How many mixtures used to generate data
nChannels = 10;         % Number of channels (sessions) per speaker
nFrames = 1000;         % Frames per session (10 seconds assuming 100 Hz)
nWorkers = 4;           % Number of parfor workers, if available
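
% Optional addition, not part of the original demo: seed MATLAB's random
% number generator so that the artificial features below are identical on
% every run, which makes results easier to compare. The seed value 0 is an
% arbitrary choice.
rng(0, 'twister');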

% Pick random centers for all the mixtures.
mixtureVariance = .10;  % within-mixture variance of the generated frames
channelVariance = .05;  % defined but never used below; channel offsets are scaled by a fixed .1
mixtureCenters = randn(nDims, nMixtures, nSpeakers);
channelCenters = randn(nDims, nMixtures, nSpeakers, nChannels)*.1;
trainSpeakerData = cell(nSpeakers, nChannels);
testSpeakerData = cell(nSpeakers, nChannels);
speakerID = zeros(nSpeakers, nChannels);

% Create the random data. Both training and testing data have the same
% layout: identical mixture and channel centers, independent noise draws.
for s=1:nSpeakers
    % Preallocate one full session (nFrames columns); every column is
    % overwritten for each channel below.
    trainSpeechData = zeros(nDims, nFrames);
    testSpeechData = zeros(nDims, nFrames);
    for c=1:nChannels
        for m=1:nMixtures
            % Create data from mixture m for speaker s
            frameIndices = m:nMixtures:nFrames;
            nMixFrames = length(frameIndices);
            trainSpeechData(:,frameIndices) = ...
                randn(nDims, nMixFrames)*sqrt(mixtureVariance) + ...
                repmat(mixtureCenters(:,m,s),1,nMixFrames) + ...
                repmat(channelCenters(:,m,s,c),1,nMixFrames);
            testSpeechData(:,frameIndices) = ...
                randn(nDims, nMixFrames)*sqrt(mixtureVariance) + ...
                repmat(mixtureCenters(:,m,s),1,nMixFrames) + ...
                repmat(channelCenters(:,m,s,c),1,nMixFrames);
        end
        trainSpeakerData{s, c} = trainSpeechData;
        testSpeakerData{s, c} = testSpeechData;
        speakerID(s,c) = s;                 % Keep track of who this is
    end
end
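
% Optional sanity check, added as a sketch rather than taken from the
% toolbox: the stride m:nMixtures:nFrames interleaves the mixtures, so the
% per-mixture frame counts should add up to nFrames, and every session
% should be an nDims-by-nFrames matrix.
framesPerMixture = arrayfun(@(m) numel(m:nMixtures:nFrames), 1:nMixtures);
assert(sum(framesPerMixture) == nFrames);
assert(all(cellfun(@(x) isequal(size(x), [nDims, nFrames]), trainSpeakerData(:))));
assert(all(cellfun(@(x) isequal(size(x), [nDims, nFrames]), testSpeakerData(:))));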

%%
% Step1: Create the universal background model from all the training speaker data
nmix = nMixtures;           % In this case, we know the # of mixtures needed
final_niter = 10;           % EM iterations at the final mixture count
ds_factor = 1;              % feature sub-sampling factor (1 keeps every frame)
ubm = gmm_em(trainSpeakerData(:), nmix, final_niter, ds_factor, nWorkers);
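
% The header note recommends saving parameters to disk for larger tasks. A
% minimal sketch, assuming nothing beyond MATLAB's built-in save and load;
% the file location and name are hypothetical choices.
ubmFile = fullfile(tempdir, 'ubm_artificial.mat');
save(ubmFile, 'ubm');
% In a later session the model could be restored with:
%   tmp = load(ubmFile); ubm = tmp.ubm;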

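%%
% Step2: Now adapt the UBM to each speaker to create GMM speaker model.
map_tau = 10.0;             % MAP relevance factor
config = 'mwv';             % adapt means (m), weights (w) and variances (v)
gmm = cell(nSpeakers, 1);
for s=1:nSpeakers
    % All sessions (channels) of speaker s serve as enrollment data
    gmm{s} = mapAdapt(trainSpeakerData(s, :), ubm, map_tau, config);
end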
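
% Sketch of the relevance-MAP mean update for a single mixture component,
% using toy numbers. This is the textbook form of the update; that mapAdapt
% follows it exactly is an assumption, and all *Toy names are illustrative.
tauToy = 10.0;                        % relevance factor, same role as map_tau
ubmMeanToy = zeros(3, 1);             % prior (UBM) mean of the component
framesToy = ones(3, 30);              % 30 enrollment frames, all assigned here
nToy = size(framesToy, 2);            % soft count of frames for the component
alphaToy = nToy / (nToy + tauToy);    % data weight: 30 / (30 + 10) = 0.75
adaptedMeanToy = alphaToy * mean(framesToy, 2) + (1 - alphaToy) * ubmMeanToy;
% More enrollment data (a larger nToy) pulls the mean further from the UBM;
% a larger tauToy keeps it closer to the UBM.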

%%
% Step3: Now calculate the score for each model versus each speaker's data.
% Generate a list that tests each model (first column) against all the
% testSpeakerData (second column).
trials = zeros(nSpeakers*nChannels*nSpeakers, 2);
answers = zeros(nSpeakers*nChannels*nSpeakers, 1);
for ix = 1 : nSpeakers
    % Rows b..e hold every test session scored against model ix
    b = (ix-1)*nSpeakers*nChannels + 1;
    e = b + nSpeakers*nChannels - 1;
    trials(b:e, :)  = [ix * ones(nSpeakers*nChannels, 1), (1:nSpeakers*nChannels)'];
    % Within that block, the nChannels sessions of speaker ix are target trials
    answers((ix-1)*nChannels+b : (ix-1)*nChannels+b+nChannels-1) = 1;
end
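
% Cross-check of the trial key, added as a sketch: the scoring call passes
% reshape(testSpeakerData', nSpeakers*nChannels, 1), which lists the test
% sessions speaker by speaker, so test index k belongs to speaker
% ceil(k / nChannels). A trial is a target trial exactly when that speaker
% matches the model index in the first column.
testSpeakerOfTrial = ceil(trials(:, 2) / nChannels);
assert(isequal(answers, double(trials(:, 1) == testSpeakerOfTrial)));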

gmmScores = score_gmm_trials(gmm, reshape(testSpeakerData', nSpeakers*nChannels,1), trials, ubm);
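
% Sketch of a verification score as an average per-frame log-likelihood
% ratio between a speaker model and the UBM, here with one-dimensional
% single Gaussians. That score_gmm_trials computes its scores this way is an
% assumption; the *Toy names and numbers are illustrative only.
logGaussToy = @(x, mu, v) -0.5 * (log(2 * pi * v) + (x - mu).^2 ./ v);
xToy = [0.9, 1.1, 1.0];                        % three test frames
llrToy = mean(logGaussToy(xToy, 1, 0.1) ...    % speaker model: mean 1
            - logGaussToy(xToy, 0, 0.1));      % background model: mean 0
% llrToy is positive here because the frames lie closer to the speaker mean.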

%%
% Step4: Now compute the EER and plot the DET curve and confusion matrix
imagesc(reshape(gmmScores, nSpeakers*nChannels, nSpeakers))
title('Speaker Verification Likelihood (GMM Model)');
ylabel('Test # (Channel x Speaker)'); xlabel('Model #');
axis xy; colorbar; drawnow;

figure
eer = compute_eer(gmmScores, answers, true);   % true: also plot the DET curve
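
% Sketch of what an equal error rate is, on six toy scores: sweep a
% threshold and find where the false-acceptance and false-rejection rates
% meet. This is not the toolbox implementation of compute_eer; all *Toy
% names are illustrative.
scoresToy = [2.0; 1.5; 0.3; -0.5; -1.0; 0.8];
labelsToy = [1; 1; 1; 0; 0; 0];                % 1 = target, 0 = impostor
thresholdsToy = sort(scoresToy);
farToy = arrayfun(@(t) mean(scoresToy(labelsToy == 0) >= t), thresholdsToy);
frrToy = arrayfun(@(t) mean(scoresToy(labelsToy == 1) <  t), thresholdsToy);
[~, kToy] = min(abs(farToy - frrToy));         % closest crossing point
eerToy = (farToy(kToy) + frrToy(kToy)) / 2;    % as a fraction, not a percentage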
