LDA Algorithm: MATLAB Implementation

This post uses LDA as a classifier and runs experiments in MATLAB.

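The projection matrix W is built following classical LDA theory by the LDA function below, which also returns the mean of each class after projection into (k-1) dimensions.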
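For reference, the quantities that the LDA function accumulates can be written as follows. This is a sketch of the textbook definitions, not a transcription of the code: the symbols $C_i$, $n_i$, $\mu_i$ and $\mu$ are notation assumed here, with samples as column vectors and $\mu$ the mean of all samples (the code below instead takes the unweighted mean of the class means).

```latex
S_W = \sum_{i=1}^{k} \sum_{x \in C_i} (x - \mu_i)(x - \mu_i)^{T},
\qquad
S_B = \sum_{i=1}^{k} n_i \, (\mu_i - \mu)(\mu_i - \mu)^{T}

W^{*} = \arg\max_{W} \frac{\left| W^{T} S_B W \right|}{\left| W^{T} S_W W \right|},
\qquad
S_B \, w_j = \lambda_j \, S_W \, w_j , \quad j = 1, \dots, k-1
```

The columns of $W$ are the generalized eigenvectors belonging to the $k-1$ largest eigenvalues $\lambda_j$; $S_B$ has rank at most $k-1$, which is why the projection has at most $k-1$ useful dimensions.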

The code of LDA.m is as follows:

    function [W,centers] = LDA(Input,Target)
    % Input:   n * d matrix, each row is a sample
    % Target:  n * 1 matrix, each entry is a class label
    % W:       d * (k-1) matrix that projects samples to (k-1) dimensions
    % centers: k * (k-1) matrix, the mean of each class after projection

    % Initialization
    [n, dim] = size(Input);
    ClassLabel = unique(Target);
    k = length(ClassLabel);

    nGroup = NaN(k,1);          % number of samples in each class
    GroupMean = NaN(k,dim);     % mean of each class
    W = NaN(dim,k-1);           % final projection matrix
    centers = zeros(k,k-1);     % class means after projection
    SB = zeros(dim,dim);        % between-class scatter matrix
    SW = zeros(dim,dim);        % within-class scatter matrix

    % Compute the within-class and between-class scatter matrices
    for i = 1:k
        group = (Target == ClassLabel(i));
        nGroup(i) = sum(double(group));
        GroupMean(i,:) = mean(Input(group,:), 1);   % column-wise mean, also safe for a single sample
        tmp = zeros(dim,dim);
        for j = 1:n
            if group(j) == 1
                t = Input(j,:) - GroupMean(i,:);
                tmp = tmp + t' * t;
            end
        end
        SW = SW + tmp;
    end
    m = mean(GroupMean, 1);     % unweighted mean of the class means
    for i = 1:k
        tmp = GroupMean(i,:) - m;
        SB = SB + nGroup(i) * tmp' * tmp;
    end

    % % W is formed by the eigenvectors of v that belong to its k-1 largest eigenvalues
    % v = inv(SW) * SB;
    % [evec,eval] = eig(v);
    % [x,d] = cdf2rdf(evec,eval);
    % W = v(:,1:k-1);

    % W can also be obtained through SVD.
    % The SVD of K = (Hb,Hw)' can be turned into an SVD of Ht;
    % P is then recovered from K, U and sigmak.
    % [P,sigmak,U] = svd(K,'econ'); => [U,sigmak,V] = svd(Ht,0);
    [U,sigmak,V] = svd(SW,0);
    t = rank(SW);
    R = sigmak(1:t,1:t);
    P = SB' * U(:,1:t) * inv(R);
    [Q,sigmaa,W] = svd(P(1:k,1:t));
    Y = U(:,1:t) * inv(R) * W;
    W = Y(:,1:k-1);

    % Compute the class centers after projection
    for i = 1:k
        group = (Target == ClassLabel(i));
        centers(i,:) = mean(Input(group,:) * W, 1);
    end
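The eigenvalue route that is commented out in LDA.m can be sketched in a more standard form. The snippet below is only an illustration under stated assumptions: it assumes SW and SB have already been computed and that SW is non-singular, and the toy matrices are hypothetical values chosen to make the block runnable. It uses the generalized form eig(SB, SW) rather than inv(SW) * SB, and sorts the eigenvalues explicitly, since eig does not promise any particular order here.

```matlab
% Sketch: classical LDA projection via a generalized eigenproblem.
% The toy values below are hypothetical and only make the block runnable.
SW = [2 0.5; 0.5 1];          % toy within-class scatter (symmetric positive definite)
SB = [4 2; 2 1];              % toy between-class scatter (rank 1, as for k = 2)
k  = 2;                       % number of classes

% Solve SB*v = lambda*SW*v without forming inv(SW) explicitly.
[evec, evalMat] = eig(SB, SW);

% Sort the eigenvalues in descending order and keep the top k-1 eigenvectors.
[lambda, order] = sort(real(diag(evalMat)), 'descend');
W = real(evec(:, order(1:k-1)));   % d-by-(k-1) projection matrix
```

If SW is singular, which can happen when some features are constant, this sketch does not apply as written; a small ridge term added to SW is one common workaround, but that is a choice outside the original post.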

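LDA is used here as a two-class classifier, so it has to be extended to multi-class problems. Two common approaches are one-vs-all, which trains k classifiers (I am not sure how their outputs should be combined), and one-vs-one, which trains a classifier for every pair of classes and ends up with k(k-1)/2 binary classifiers. This post takes the latter approach to train on the samples and obtain the model, model. In the code, model is a struct array.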
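To make the k(k-1)/2 count concrete, the class pairs used by the one-vs-one scheme can be enumerated directly. This is a small sketch with a hypothetical k = 4; the variable names pairs and numClassifiers are illustrative and do not appear in the original code.

```matlab
% Sketch: enumerate the class pairs of a one-vs-one scheme.
k = 4;                              % hypothetical number of classes
pairs = nchoosek(1:k, 2);           % each row is one (a, b) pair with a < b
numClassifiers = size(pairs, 1);    % equals k*(k-1)/2
```

The rows of pairs come out in the same order as a nested loop over i = 1:k-1 and j = i+1:k, so row t corresponds to the t-th binary classifier.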

The training function, LDATraining.m:

    function [model,k,ClassLabel] = LDATraining(input,target)
    % input:      n * d matrix, representing samples
    % target:     n * 1 matrix, class labels
    % model:      struct array (see the code below)
    % k:          total number of classes
    % ClassLabel: the label of each class

    model = struct;
    [n, dim] = size(input);
    ClassLabel = unique(target);
    k = length(ClassLabel);

    t = 1;
    for i = 1:k-1
        for j = i+1:k
            % Record which pair of classes this binary classifier separates
            model(t).a = i;
            model(t).b = j;
            g1 = (target == ClassLabel(i));
            g2 = (target == ClassLabel(j));
            tmp1 = input(g1,:);
            tmp2 = input(g2,:);
            in = [tmp1;tmp2];
            % Relabel: 0 for class a (first block of rows), 1 for class b
            out = ones(size(in,1),1);
            out(1:size(tmp1,1)) = 0;
            % tmp3 = target(g1);
            % tmp4 = target(g2);
            % tmp3 = repmat(tmp3,length(tmp3),1);
            % tmp4 = repmat(tmp4,length(tmp4),1);
            % out = [tmp3;tmp4];
            [w, m] = LDA(in,out);
            model(t).W = w;
            model(t).means = m;
            t = t + 1;
        end
    end

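At prediction time, the model produced during training is used to make k(k-1)/2 predictions, and the class that receives the most votes is chosen as the result. For each binary classifier, the test sample is projected with W and assigned to whichever of the two projected class means is closer (I am not sure whether there is a better way).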
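The nearest-projected-mean rule for a single binary classifier can also be written for all samples at once. The sketch below uses hypothetical toy values for X, w and m and assumes a one-dimensional projection, as is the case for a two-class LDA; the name voteForA is illustrative.

```matlab
% Sketch: binary decision by comparing distances to the two projected means.
% All values are hypothetical toy data.
X = [0.1 0.2; 2.0 1.9; 0.9 1.2];    % 3 samples, 2 features
w = [1; 1];                         % d-by-1 projection vector
m = [0.5; 3.5];                     % projected means: class a (row 1), class b (row 2)

z = X * w;                                  % n-by-1 projected samples
voteForA = abs(z - m(1)) < abs(z - m(2));   % true where the mean of class a is closer
```

With a one-dimensional projection this is the same as thresholding z at the midpoint (m(1) + m(2)) / 2, which ignores how spread out each class is after projection.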

The prediction function, LDATesting.m:

    function target = LDATesting(input,k,model,ClassLabel)
    % input:      n * d matrix, representing samples
    % target:     n * 1 matrix, predicted class labels
    % model:      struct array produced by LDATraining
    % k:          total number of classes
    % ClassLabel: the label of each class

    [n, dim] = size(input);
    s = zeros(n,k);             % s(i,c) counts the votes of sample i for class c
    target = zeros(n,1);

    % Let every pairwise classifier vote on every sample
    for j = 1:k*(k-1)/2
        a = model(j).a;
        b = model(j).b;
        w = model(j).W;
        m = model(j).means;
        for i = 1:n
            sample = input(i,:);
            tmp = sample * w;
            % Vote for the class whose projected mean is closer
            if norm(tmp - m(1,:)) < norm(tmp - m(2,:))
                s(i,a) = s(i,a) + 1;
            else
                s(i,b) = s(i,b) + 1;
            end
        end
    end

    % Pick the class with the most votes (the first one wins a tie)
    for i = 1:n
        pos = 1;
        maxV = 0;
        for j = 1:k
            if s(i,j) > maxV
                maxV = s(i,j);
                pos = j;
            end
        end
        target(i) = ClassLabel(pos);
    end
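The vote-counting loop at the end of LDATesting can be expressed more compactly with max. This is a sketch with a hypothetical vote matrix s and hypothetical labels; it keeps the same tie-breaking as the loop, because max returns the index of the first maximum in each row.

```matlab
% Sketch: majority vote over the pairwise classifiers.
% s and ClassLabel are hypothetical toy values.
s = [3 1 2; 0 2 2; 1 1 1];      % vote counts: 3 samples (rows), 3 classes (columns)
ClassLabel = [10; 20; 30];      % label of each class
[~, pos] = max(s, [], 2);       % index of the first maximum in each row
target = ClassLabel(pos);       % n-by-1 predicted labels
```

In this toy example the second and third samples are ties, and both resolve to the lowest class index among the tied classes.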

Example code:

  
    function target = test(in,out,t)
    % in:  training samples, out: training labels, t: samples to classify
    [model,k,ClassLabel] = LDATraining(in,out);
    target = LDATesting(t,k,model,ClassLabel);

I tested this on the USPS dataset and the results were not good: accuracy was only around 39%, whereas KNN reaches about 90% on the same dataset. Embarrassing!
