matab工具audiotool函数gaussmix注释版

最新推荐文章于 2021-07-26 20:13:28 发布
Datrilla
最新推荐文章于 2021-07-26 20:13:28 发布
阅读量874
点赞数
分类专栏： matlab 高斯 GMM 文章标签： matlab
本文链接：https://blog.csdn.net/u014646950/article/details/62041035
版权
matlab 同时被 3 个专栏收录
19 篇文章 0 订阅
订阅专栏
GMM
2 篇文章 0 订阅
订阅专栏
高斯
1 篇文章 0 订阅
订阅专栏
这个是我目前看到的比较完整的高斯，我还不会用。注释的可能有误或者不够专业。先放在这里。里面可能涉及到其他audiotool工具里面的函数，所以没办法单独使用。需要去下载完整的库
voicebox--wav文件

http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.zip
function [m,v,w,g,f,pp,gg]=gaussmix_note(x,c,l,m0,v0,w0,wx)
%GAUSSMIX fits a gaussian mixture pdf to a set of data observations [m,v,w,g,f]=(x,c,l,m0,v0,w0,wx)
%CHINESE note by Xu__Jiayu
%对于观测数据集的适应高斯混合分布密度函数
%使用样例Usage:
%    (1) [m,v,w]=gaussmix(x,[],[],k);创建k个高斯混合，diag对角协方差    % create GMM with k mixtures and diagonal covariances
%    (2) [m,v,w]=gaussmix(x,[],[],k,'v');创建k个高斯混合 ，方阵协方差（行列相等）   % create GMM with k mixtures and full covariances
%
% Inputs: n data values, k mixtures, p parameters, l loops
%%
%       X(n,p)待观测数据，n行，每行有p的属性   Input data vectors, one per row.
%%
%    	c(1)  归一化的最小方差（[]默认为1/n^2）   Minimum variance of normalized data (Use [] to take default value of 1/n^2)
%%
%        L  整数部分为迭代次数上限，小数部分为近似然估计阈值(类似精确度),默认100.0001     The integer portion of l gives a maximum loop count. The fractional portion gives
%              an optional stopping threshold. Iteration will cease if the increase in
%              log likelihood density per data point is less than this value. Thus l=10.001 will
%              stop after 10 iterations or when the increase in log likelihood falls below
%              0.001.
%              As a special case, if L=0, then the first three outputs are omitted.
%              Use [] to take default value of 100.0001
%%
%       M0 使用样例中的k，创建的k个高斯混合（或者传入的是已经分类好的数据的质心矩阵）
%           Number of mixtures required (or initial mixture means - see below)
%%
%       V0 模式设置 Initialization mode:
%       ******************** 'm'|'f'|'p'三选一，默认'f'
%                'm' MO传入的是已经分类好的数据的质心M0 contains the initial centres
%                'f'［默认］k个高斯混合质心从数据中抽取k个    Initialize with K randomly selected data points [default]
%                'p'随机分区抽取质心    Initialize with centroids and variances of random partitions
%       ******************** 'k'|'h'二选一,默认'h'        
%                'k'利用kmeans算法聚类分类    k-means algorithm ('kf' and 'kp' determine initialization)
%                'h'[默认]调合均值算法聚类分类    k-harmonic means algorithm ('hf' and 'hp' determine initialization) [default]
%       ********************'s'对数据不进行标准差=(sqrt(方差))缩放    do not scale data during initialization to have equal variances
%       ********************'v'方阵协方差（行列相等），当[]或没有设置为diag对角协方差full covariance matrices，
%       ********************v0不为字符串，为方差矩阵
%              Mode 'hf' [the default] generally gives the best results but 'f' is faster and often OK
%%
%       W0(k,1) 初始化k个混合高斯的权重，权重和需要归一化  Initial mixture weights, one per mixture. The weights should sum to unity.
%%
%       WX(n,1) 观测数据的权重 Data point weights
%%
%     Alternatively, initial values for M0, V0 and W0 can be given  explicitly:
%
%     M0(k,p) k个混合高斯的质心，每行代表一个 Initial mixture means, one row per mixture.
%     V0(k,p) k个混合高斯的方差（对角方差），每行代表一个 Initial mixture variances, one row per mixture.
%      or V0(p,p,k) k个混合高斯的的方差（方阵方差），每个矩阵代表一个 one full-covariance matrix per mixture
%     W0(k,1) 初始化k个混合高斯的权重，权重和需要归一化  Initial mixture weights, one per mixture. The weights should sum to unity.
%     WX(n,1) 观测数据的权重 Data point weights
%%
% Outputs: (Note that M, V and W are omitted if L==0)
%
%     M(k,p)  k个混合高斯的均值，每行一个 Mixture means, one row per mixture. (omitted if L==0)
%     V(k,p)  k个混合高斯的方差，每行一个 Mixture variances, one row per mixture. (omitted if L==0)
%       or V(p,p,k)k个混合高斯的方差，每个矩阵一个 if full covariance matrices in use (i.e. either 'v' option or V0(p,p,k) specified)
%     W(k,1)  k个混合高斯的权重，权重加和需要归一 Mixture weights, one per mixture. The weights will sum to unity. (omitted if L==0)
%     G       输入数据点的平均对数概率，拟合过程中归一化标准化确实部分。exp(g)Average log probability of the input data points.
%     F       表明拟合情况好坏，值越高效果越好（线性判断LDA也叫Linear Discriminant） Fisher's Discriminant measures how well the data divides into classes.
%              It is the ratio of the between-mixture variance to the average mixture variance: a
%              high value means the classes (mixtures) are well separated.
%     PP(n,1)  每个观测点的对数概率Log probability of each data point
%     GG(l+1,1) 从一开始到迭代结束的平均对数概率Average log probabilities at the beginning of each iteration and at the end
%%
% 这个拟合程序使用了很多初始化方法去创建初始的高斯的质心。并且使用EM（估算极大化）算法来改进高斯。
% 因为EM算法是一成不变的，初始化程序使用了随机数，当你对同一个数据使用了很多次将不会得到确切的答案
% The fitting procedure uses one of several initialization methods to create an initial guess
% for the mixture centres and then uses the EM (expectation-maximization) algorithm to refine
% the guess. Although the EM algorithm is deterministic, the initialization procedures use 
% random numbers and so the routine will not give identical answers if you call it multiple
% times with the same input data.

%  Bugs/Suggestions
%     (1) Allow processing in chunks by outputting/reinputting an array of sufficient statistics
%     (2) Other initialization options:
%              'l'    LBG algorithm
%              'm'    Move-means (dog-rabbit) algorithm
%     (3) Allow updating of weights-only, not means/variances

%      Copyright (C) Mike Brookes 2000-2009
%      Version: $Id: gaussmix.m 7784 2016-04-15 11:09:50Z dmb $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%   GNU General Public License for more details.
%
%   You can obtain a copy of the GNU General Public License from
%   http://www.gnu.org/copyleft/gpl.html or by writing to
%   Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

[n,p]=size(x);%获取样本点/观测点 有n个，每个有p个属性
wn=ones(n,1);%初始化每个观测点的权重均为1，列数组，n个1
mx0=sum(x,1)/n;%初始化观测点p个属性的均值,行向量，p个均值         % calculate mean and variance of input data in each dimension
vx0=sum(x.^2,1)/n-mx0.^2;%初始化观测点p个属性的方差，行向量，p个方差
sx0=sqrt(vx0);%初始化观测点p个属性的标准差，行向量，p个标准差
sx0(sx0==0)=1; %防止除以0值     % do not divide by zero when scaling
scaled=0;           % data is not yet scaled
memsize=voicebox('memsize');    % set memory size to use
%%
if isempty(c)%归一化最小方差设置
    c=1/n^2;
else
    c=c(1);         % just to prevent legacy code failing
end
fulliv=0;           % initial variance is not full
%%
if isempty(l)%迭代次数或精度设置
    l=100+1e-4;         % max loop count + stopping threshold
end
%%
%没有聚类分类v0且没有聚类质心m0且没有聚类权重w0  或者 v0是聚类分类而不是聚类方差
if nargin<5 || isempty(v0) || ischar(v0)             % no initial values specified for m0, v0, w0
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %  No initialvalues given, so we must use k-means or equivalent
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    if nargin<6
        if nargin<5 || isempty(v0)%
            v0='hf'; %默认采用k调和平均和对待观测数据聚类分析算法，默认数据点随机取ｋ个点作为聚类质心                % default initialization mode: hf
        end
        wx=wn;      %初始化观测点权重默认ones(n,1) % no data point weights
    else
        wx=w0(:);  %初始化为输入参数中的观测点权重                 % data point weights
    end
    %%%%%%
    if any(v0=='m')
        k=size(m0,1);%m0传入的值质心，计算需要几个高斯混合
    else
        k=m0;%传入的是数值，取需要随机取的高斯个数
    end
    %%%%%%
    fv=any(v0=='v');%如果k个高斯方差为方阵形式，那么fv为true==1           % full covariance matrices requested
    %%
         %begin待观测数据点比k个高斯的数目少，不需要迭代的情况
    if n<=k                        %观测数据点数小于高斯个数，为每个观测数据点设置一个高斯混合 % each data point can have its own mixture
        xs=(x-mx0(wn,:))./sx0(wn,:); %取得每个观测点的每个属性在对应属性的偏离比。目前wn都取值为1(这里作用把p个属性均值、标准差扩充和观测点矩阵一样的矩阵)，mx0各个属性均值（一行），sx0各个属性标准差（一行）       % scale the data
        m=xs(mod((1:k)-1,n)+1,:); %偏离比把n个数据多次匹配k个高斯 % just include all points several times
        v=zeros(k,p); %k个高斯方差先清0，后面将重新设置             % will be set to floor later
        w=zeros(k,1);%k个高斯权重先清零
        w(1:n)=1/n;%k个高斯把具有一个点的权重设置为1/n
        if l>0
            l=0.1;    %当前k个高斯要么有一个或没有观测数据，没有迭代的必要           % no point in iterating
        end
        %end待观测数据不需要迭代
    else
        %begin待观测数据比k个高斯数目多，必定需要迭代     % more points than mixtures
            %begin是否进行缩比例
        if any(v0=='s')
            xs=x;   %待观测数据不需要缩比例               % do not scale data during initialization
        else
            xs=(x-mx0(wn,:))./sx0(wn,:);  %待观测数据需要缩比例 else scale now
            if any(v0=='m')
                m=(m0-mx0(ones(k,1),:))./sx0(ones(k,1),:); %观测数据的均值同样需要缩比例 % scale specified means as well
            end
        end
            %end是否缩比例
        w=repmat(1/k,k,1); %初始化k个高斯权重一样均为1/k.为列向数组（k个1/k) Kx1                 % all mixtures equally likely
            %begin聚类分类--获得质心m为kxp】j为每个观测点的聚类类别nx1，置位1到k】e为每个。
        if any(v0=='k')   %有参数输入'k'--k均值算法                       % k-means initialization
            if any(v0=='m') %k均值算法+参数传入的质心
                [m,e,j]=v_kmeans(xs,k,m);
            elseif any(v0=='p')%k均值算法+随机分区抽取质心
                [m,e,j]=v_kmeans(xs,k,'p');
            else
                [m,e,j]=v_kmeans(xs,k,'f');%k均值算法+随机抽取观测点为质心
            end
        elseif any(v0=='h')  %有参数输入选择'h'--k调和均值算法                   % k-harmonic means initialization
            if any(v0=='m')     %k调和均值算法+参数传入质心
                [m,e,j]=kmeanhar(xs,k,[],4,m);
            else
                if any(v0=='p')%k调和均值算法+随机分区抽取质心
                    [m,e,j]=kmeanhar(xs,k,[],4,'p');
                else
                    [m,e,j]=kmeanhar(xs,k,[],4,'f');%k调和均值算法+待观测点随机抽取质心
                end
            end
        elseif any(v0=='p')  %聚类分类没有输入参数，选择抽取质心‘p’随机分区抽取质心 ，                 % Initialize using a random partition
            j=ceil(rand(n,1)*k);  %rand(n,1)抽取n个随机数在0到1的数。ceil只入不舍取整数              % allocate to random clusters
            j(rnsubset(k,n))=1:k;  %调整随机取得，是的至少每个聚类有一个数据点，rnsubset抽取k个1到n之间的正整数             % but force at least one point per cluster
            for i=1:k
                m(i,:)=mean(xs(j==i,:),1);%求每个聚类的均值－－质心
            end
        else %聚类分类没有输入参数，有传入聚类质心
            if any(v0=='m')
                m=m0;%参数传入的质心                           % use specified centres
            else
                m=xs(rnsubset(k,n),:);  %随机抽取数据点为质心        % Forgy initialization: sample k centres without replacement [default]
            end
            [e,j]=v_kmeans(xs,k,m,0);   %采用k均值聚类分类% find out the cluster allocation
        end
            %end聚类分类获得质心m
        if any(v0=='s')
            xs=(x-mx0(wn,:))./sx0(wn,:); %聚类分类以后，没有缩比例的进行缩比例     % scale data now if not done previously
        end
        v=zeros(k,p);%对角方差清零                   % diagonal covariances
        w=zeros(k,1);%权重清零（一列数组）
        for i=1:k%k个高斯（聚类）方差统计，权重统计
            ni=sum(j==i); %统计某个高斯（聚类）的观测点数              % number assigned to this centre
            w(i)=(ni+1)/(n+k); %统计某个高斯（聚类）的权重（n+k）=（观测点总+质心个数总）         % weight of this mixture
            if ni %某个高斯的方差
                v(i,:)=sum((xs(j==i,:)-repmat(m(i,:),ni,1)).^2,1)/ni;
            else
                v(i,:)=zeros(1,p);
            end
        end
    end
else%聚类方差v0 聚类质心m0 聚类权重w0
    %%%%%%%%%%%%%%%%%%%%%%%%
    % use initial values given as input parameters
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    [k,p]=size(m0);%聚类质心采用参数输入的，并有次确定高斯个数k，待观测数据属性p
    xs=(x-mx0(wn,:))./sx0(wn,:);  %缩比例        % scale the data
    m=(m0-mx0(ones(k,1),:))./sx0(ones(k,1),:);%求均值          % and the means
    v=v0;%聚类方差采用参数输入的
    w=w0;%聚类权值采用参数输入的
    fv=ndims(v)>2 || size(v,1)>k;   %根据输入方差样式判断是方阵方差还是对角方差                    % full covariance matrix is supplied
    if fv %方阵方差
        mk=eye(p)==0; %对角线为0，其他为1                                   % off-diagonal elements
        fulliv=any(v(repmat(mk,[1 1 k]))~=0);    %any查看是否存在非0或非false ，% check if any are non-zero
        if ~fulliv 
            v=reshape(v(repmat(~mk,[1 1 k])),p,k)'./repmat(sx0.^2,k,1); %方差中存在0，取对角方差，变成pXk的方差  % just pick out and scale the diagonal elements for now
        else
            v=v./repmat(sx0'*sx0,[1 1 k]); %方差中不存在0，方阵方差按照待观测数据方差进行缩比例            % scale the full covariance matrix
        end
    end
    if nargin<7
        wx=wn; %待观测数据没有设置权值，默认均设置为1            % no data point weights
    end
end
%%
%前面聚类质心m，聚类权重w，聚类方差v，均求得，接下来进行高斯拟合
if length(wx)~=n %观测数据个数和对应权重个数不一致错误
    error('%d datapoints but %d weights',n,length(wx));
end
lsx=sum(log(sx0));%待观测点标准差对数求和
xsw=xs.*repmat(wx,1,p); % 待观测点缩比例后的数据进行加权--加权后的新样本weighted data points
nwt=sum(wx);        %当前待观测点的数据权重总和 number of data points counting duplicates
%%
%对角方差PXK，的高斯
if ~fulliv          % initializing with diagonal covariance
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Diagonal Covariance matrices  %
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    v=max(v,c); %1/n^2 取聚类方差上限        % apply the lower bound
    xs2=xs.^2.*repmat(wx,1,p);  %待测元素平方*权重        % square and weight the data for variance calculations

    % If data size is large then do calculations in chunks
    %用于数据分块处理，防止数据过大，内存不足
    nb=min(n,max(1,floor(memsize/(8*p*k))));   %【待观测数据个数】 和  【一次可取最大观测数据个数】内存大小bit/（8bit*p个属性*k个聚类） % chunk size for testing data points
    nl=ceil(n/nb); %只入不舍。    总共要分的块数             % number of chunks
    jx0=n-(nl-1)*nb; %取第一个块中的观测数据个数 ，保证后面模块数据个数皆为nb个             % size of first chunk

    im=repmat(1:k,1,nb); im=im(:);%转为列
    th=(l-floor(l))*n;%迭代次数整数部分
    sd=(nargout > 3*(l~=0));%=1输出对数似然性值 % = 1 if we are outputting log likelihood values
    lp=floor(l)+sd; %多一次迭代需要估计最后的G值  % extra loop needed to calculate final G value

    lpx=zeros(1,n); %每个观测点的数据概率            % log probability of each data point
    wk=ones(k,1);%各个聚类抽取
    wp=ones(1,p);%各个属性抽取
    wnb=ones(1,nb);%各个模块抽取
    wnj=ones(1,jx0);%当前模块观测数据抽取

    % EM loop

    g=0;                           % dummy initial value for comparison
    gg=zeros(lp+1,1);
    ss=sd;                       % initialize stopping count (0 or 1)
    for j=1:lp
        %begin环境保护push
        g1=g; %第j轮之前的对数补                 % save previous log likelihood (2*pi factor omitted)
        m1=m; %第j轮聚类质心                      % save previous means, variances and weights
        v1=v; %第j轮聚类方差
        w1=w; %第j轮聚类权重
        %end环境保护
        %一个属性的一维正态分布f(x)=1 / ((2πv)^(1/2)) * e^(-(x-u)^2/(2v)).
        %这里多个加权系数、加权未归一化多p个属性的多维正态分布f(x)=w / ((2pπv)^(1/2)) * e^(-(x-u)^2/(2v)).
        vi=-0.5*v.^(-1); 
        %-1/(2v)   正态分布的指数部分的一部分          % data-independent scale factor in exponent
        lvm=log(w)-0.5*sum(log(v),2);  
        %(log(w/(v^(1/2))))正太分布的系数部分的部分加上权重的对数形式  log of external scale factor (excluding -0.5*p*log(2pi) term)

        % first do partial chunk (of length jx0)

        jx=jx0;%当前模块观测点数据个数
        ii=1:jx;                        % indices of data points in this chunk
        kk=repmat(ii,k,1); %扩展每个观测点对应k个聚类             % kk(jx,k): one row per data point, one column per mixture
        km=repmat(1:k,1,jx); %扩展每个聚类对应jx个观测点           % km(jx,k): one row per data point, one column per mixture
        py=reshape(sum((xs(kk(:),:)-m(km(:),:)).^2.*vi(km(:),:),2),k,jx)+lvm(:,wnj); % py(k,jx) pdf of each point with each mixture
        %(-(x-u)^2/(2v))+(log(w/(v^(1/2))))
        %当前块的观测数据到质心的欧拉距离平方
        %正态分布的指数部分+系数部分
        %py k x jx 每个k 对应的jx的距离
        mx=max(py,[],1); %按列获得每个观测数据相对于质心最大的               % mx(1,jx) find normalizing factor for each data point to prevent underflow when using exp()
        px=exp(py-mx(wk,:)); %-mx(wk,:)使得px取值在0到1，想当于峰值改变最大为1.将在lpx补上
        %exp([-(x-u)^2/(2v)]  +  [log(w/(v^(1/2)))] - max() )=
        %w/(v^*(1/2)) * exp([-(x-u)^2/(2v)] - max() )
        % find normalized probability of each mixture for each datapoint
        ps=sum(px,1); %每个数据点对于k个聚类的距离规格化和向量1 x jx                 % total normalized likelihood of each data point
        px=px./ps(wk,:);%归一化，每个聚类（高斯）概率总和为1  。在lpx中补上             % relative mixture probabilities for each data point (columns sum to 1)
        lpx(ii)=log(ps)+mx;%拟合正态分布，缺失的幅值,归一化缺失 
        %-------
        pk=px*wx(ii); %观测数据 权重 拟合一次后更新（多次混合的乘）                      % pk(k,1) effective number of data points for each mixture (could be zero due to underflow)
        sx=px*xsw(ii,:);%数据拟合一次后 加权的样本 更新
        sx2=px*xs2(ii,:);%数据拟合一次后 加权样本平方　更新
        for il=2:nl    %其他模块数据循环计算                 % process the data points in chunks
            ix=jx+1;%当前模块数据开始index
            jx=jx+nb;   %当前模块最大上限index                % increment upper limit
            ii=ix:jx;   %当前块的观测数据范围index                % indices of data points in this chunk
            kk=repmat(ii,k,1);
            py=reshape(sum((xs(kk(:),:)-m(im,:)).^2.*vi(im,:),2),k,nb)+lvm(:,wnb);
            mx=max(py,[],1);            % find normalizing factor for each data point to prevent underflow when using exp()
            px=exp(py-mx(wk,:));        % find normalized probability of each mixture for each datapoint
            ps=sum(px,1);               % total normalized likelihood of each data point
            px=px./ps(wk,:);            % relative mixture probabilities for each data point (columns sum to 1)
            lpx(ii)=log(ps)+mx;
            %-------
            pk=pk+px*wx(ii);                % pk(k,1) effective number of data points for each mixture (could be zero due to underflow)
            sx=sx+px*xsw(ii,:);
            sx2=sx2+px*xs2(ii,:);
        end
        g=lpx*wx;%对数补加权                       % total log probability summed over all data points
        gg(j)=g; %迭代次数的每个g保存                       % save log prob at each iteration
        w=pk/nwt;  %总观测数据权重 更新                   % normalize to get the weights
        if pk   %不存在0 ，防止除以０错误                       % if all elements of pk are non-zero
            m=sx./pk(:,wp); %根据属性比聚类（高斯）质心         % calculate mixture means
            v=sx2./pk(:,wp); %各个属性比更新聚类（高斯）           % and variances
        else
            wm=pk==0;  %找到观测点权重为0的                 % mask indicating mixtures with zero weights
            nz=sum(wm);  %统计个数            	% number of zero-weight mixtures
            [vv,mk]=sort(lpx);          % find the lowest probability data points
            m=zeros(k,p);               % initialize means and variances to zero (variances are floored later)
            v=m;%质心方差清0
            m(wm,:)=xs(mk(1:nz),:); 	% set zero-weight mixture means to worst-fitted data points
            w(wm)=1/n;               	% set these weights non-zero
            w=w*n/(n+nz);            	% normalize so the weights sum to unity
            wm=~wm;                 	% mask for non-zero weights
            m(wm,:)=sx(wm,:)./pk(wm,wp);  % recalculate means and variances for mixtures with a non-zero weight
            v(wm,:)=sx2(wm,:)./pk(wm,wp);
        end
        v=max(v-m.^2,c);    %聚类方差更新            % apply floor to variances
        if g-g1<=th && j>1  
            if ~ss, break; end %迭代结束 %  stop
            ss=ss-1;    %继续循环迭代   % stop next time
        end

    end%end EM loop
    if sd && ~fv  % sd 根据输出参数个数和是否为对角方差判断是否需要计算迭代前一轮的近似情况 f we need to calculate the final probabilities
        pp=lpx'-0.5*p*log(2*pi)-lsx;   % log of total probability of each data point
        gg=gg(1:j)/n-0.5*p*log(2*pi)-lsx;    % average log prob at each iteration
        g=gg(end);
        %     gg' % *** DEBUG ***
        m=m1;     %返回迭代之前的质心  % back up to previous iteration
        v=v1;
        w=w1;
        mm=sum(m,1)/k;
        f=(m(:)'*m(:)-k*mm(:)'*mm(:))/sum(v(:));
    end
    if ~fv　%根据输入是对角方差还是方阵方差进行调整输出方差
        m=m.*sx0(ones(k,1),:)+mx0(ones(k,1),:);	% unscale means
        v=v.*repmat(sx0.^2,k,1);                % and variances
    else%这里是计算对角方差，由于带入是方阵，把对角调整为[p,p,k]的方阵方差，只有对角线为非0值
        v1=v;
        v=zeros(p,p,k);
        mk=eye(p)==1;                           % mask for diagonal elements
        v(repmat(mk,[1 1 k]))=v1';              % set from v1
    end
end
%%
%方阵方差的高斯
if fv              % check if full covariance matrices were requested
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Full Covariance matrices  %
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    pl=p*(p+1)/2;
    lix=1:p^2;
    cix=repmat(1:p,p,1);
    rix=cix';
    lix(cix>rix)=[];                                        % index of lower triangular elements
    cix=cix(lix);                                           % index of lower triangular columns
    rix=rix(lix);                                           % index of lower triangular rows
    dix=find(rix==cix);
    lixi=zeros(p,p);
    lixi(lix)=1:pl;
    lixi=lixi';
    lixi(lix)=1:pl;                                        % reverse index to build full matrices
    v=reshape(v,p^2,k);
    v=v(lix,:)';                                            % lower triangular in rows

    % If data size is large then do calculations in chunks

    nb=min(n,max(1,floor(memsize/(24*p*k))));    % chunk size for testing data points
    nl=ceil(n/nb);                  % number of chunks
    jx0=n-(nl-1)*nb;                % size of first chunk
    %
    th=(l-floor(l))*n;
    sd=(nargout > 3*(l~=0)); % = 1 if we are outputting log likelihood values
    lp=floor(l)+sd;   % extra loop needed to calculate final G value
    %
    lpx=zeros(1,n);             % log probability of each data point
    wk=ones(k,1);
    wp=ones(1,p);
    wpl=ones(1,pl);             % 1 index for lower triangular matrix
    wnb=ones(1,nb);
    wnj=ones(1,jx0);

    % EM loop

    g=0;                        % dummy initial value for comparison
    gg=zeros(lp+1,1);
    ss=sd;                      % initialize stopping count (0 or 1)
    vi=zeros(p*k,p);            % stack of k inverse cov matrices each size p*p
    vim=zeros(p*k,1);       	% stack of k vectors of the form inv(v)*m
    mtk=vim;                  	% stack of k vectors of the form m
    lvm=zeros(k,1);
    wpk=repmat((1:p)',k,1);
    for j=1:lp
        g1=g;               	% save previous log likelihood (2*pi factor omitted)
        m1=m;                	% save previous means, variances and weights
        v1=v;
        w1=w;
        for ik=1:k

            % these lines added for debugging only
            %             vk=reshape(v(k,lixi),p,p);
            %             condk(ik)=cond(vk);
            %%%%%%%%%%%%%%%%%%%%
            [uvk,dvk]=eig(reshape(v(ik,lixi),p,p));	% convert lower triangular to full and find eigenvalues
            dvk=max(diag(dvk),c);                	% apply variance floor to eigenvalues
            vik=-0.5*uvk*diag(dvk.^(-1))*uvk';      % calculate inverse
            vi((ik-1)*p+(1:p),:)=vik;               % vi contains all mixture inverses stacked on top of each other
            vim((ik-1)*p+(1:p))=vik*m(ik,:)';       % vim contains vi*m for all mixtures stacked on top of each other
            mtk((ik-1)*p+(1:p))=m(ik,:)';           % mtk contains all mixture means stacked on top of each other
            lvm(ik)=log(w(ik))-0.5*sum(log(dvk));       % vm contains the weighted sqrt of det(vi) for each mixture
        end
        %
        %         % first do partial chunk
        %
        jx=jx0;
        ii=1:jx;
        xii=xs(ii,:).';
        py=reshape(sum(reshape((vi*xii-vim(:,wnj)).*(xii(wpk,:)-mtk(:,wnj)),p,jx*k),1),k,jx)+lvm(:,wnj);
        mx=max(py,[],1);                % find normalizing factor for each data point to prevent underflow when using exp()
        px=exp(py-mx(wk,:));            % find normalized probability of each mixture for each datapoint
        ps=sum(px,1);                   % total normalized likelihood of each data point
        px=px./ps(wk,:);                % relative mixture probabilities for each data point (columns sum to 1)
        lpx(ii)=log(ps)+mx;
        pk=px*wx(ii);                       % effective number of data points for each mixture (could be zero due to underflow)
        sx=px*xsw(ii,:);
        sx2=px*(xsw(ii,rix).*xs(ii,cix));	% accumulator for variance calculation (lower tri cov matrix as a row)
        for il=2:nl
            ix=jx+1;
            jx=jx+nb;        % increment upper limit
            ii=ix:jx;
            xii=xs(ii,:).';
            py=reshape(sum(reshape((vi*xii-vim(:,wnb)).*(xii(wpk,:)-mtk(:,wnb)),p,nb*k),1),k,nb)+lvm(:,wnb);
            mx=max(py,[],1);                % find normalizing factor for each data point to prevent underflow when using exp()
            px=exp(py-mx(wk,:));            % find normalized probability of each mixture for each datapoint
            ps=sum(px,1);                   % total normalized likelihood of each data point
            px=px./ps(wk,:);                % relative mixture probabilities for each data point (columns sum to 1)
            lpx(ii)=log(ps)+mx;
            pk=pk+px*wx(ii);                    % effective number of data points for each mixture (could be zero due to underflow)
            sx=sx+px*xsw(ii,:);             % accumulator for mean calculation
            sx2=sx2+px*(xsw(ii,rix).*xs(ii,cix));	% accumulator for variance calculation
        end
        g=lpx*wx;                   % total log probability summed over all data points
        gg(j)=g;                    % save convergence history
        w=pk/nwt;               	% w(k,1) normalize to get the column of weights
        if pk                       % if all elements of pk are non-zero
            m=sx./pk(:,wp);         % find mean and mean square
            v=sx2./pk(:,wpl);
        else
            wm=pk==0;                       % mask indicating mixtures with zero weights
            nz=sum(wm);                  % number of zero-weight mixtures
            [vv,mk]=sort(lpx);             % find the lowest probability data points
            m=zeros(k,p);                   % initialize means and variances to zero (variances are floored later)
            v=zeros(k,pl);
            m(wm,:)=xs(mk(1:nz),:);                % set zero-weight mixture means to worst-fitted data points
            w(wm)=1/n;                      % set these weights non-zero
            w=w*n/(n+nz);                   % normalize so the weights sum to unity
            wm=~wm;                         % mask for non-zero weights
            m(wm,:)=sx(wm,:)./pk(wm,wp);  % recalculate means and variances for mixtures with a non-zero weight
            v(wm,:)=sx2(wm,:)./pk(wm,wpl);
        end
        v=v-m(:,cix).*m(:,rix);                 % subtract off mean squared
        if g-g1<=th && j>1
            if ~ss, break; end  %  stop
            ss=ss-1;       % stop next time
        end
    end
    if sd  % we need to calculate the final probabilities
        pp=lpx'-0.5*p*log(2*pi)-lsx;   % log of total probability of each data point
        gg=gg(1:j)/nwt-0.5*p*log(2*pi)-lsx;    % average log prob at each iteration
        g=gg(end);
        %             gg' % *** DEBUG ONLY ***
        m=m1;                                           % back up to previous iteration
        v=zeros(p,p,k);                                 % reserve spave for k full covariance matrices
        trv=0;                                          % sum of variance matrix traces
        for ik=1:k                                      % loop for each mixture to apply variance floor
            [uvk,dvk]=eig(reshape(v1(ik,lixi),p,p));	% convert lower triangular to full and find eigenvectors
            dvk=max(diag(dvk),c);                       % apply variance floor to eigenvalues
            v(:,:,ik)=uvk*diag(dvk)*uvk';               % reconstitute full matrix
            trv=trv+sum(dvk);                           % add trace to the sum
        end
        w=w1;
        mm=sum(m,1)/k;
        f=(m(:)'*m(:)-k*mm(:)'*mm(:))/trv;
    else
        v1=v;                                           % lower triangular form
        v=zeros(p,p,k);                                 % reserve spave for k full covariance matrices
        for ik=1:k                                      % loop for each mixture to apply variance floor
            [uvk,dvk,]=eig(reshape(v1(ik,lixi),p,p));	% convert lower triangular to full and find eigenvectors
            dvk=max(diag(dvk),c);                       % apply variance floor
            v(:,:,ik)=uvk*diag(dvk)*uvk';               % reconstitute full matrix
        end
    end
    m=m.*sx0(ones(k,1),:)+mx0(ones(k,1),:);  % unscale means
    v=v.*repmat(sx0'*sx0,[1 1 k]);
end
if l==0         % suppress the first three output arguments if l==0
    m=g;
    v=f;
    w=pp;
end