# LDA Algorithm: MATLAB Implementation

The projection matrix W is generated following classical LDA theory, as in the LDA function below, which also returns the (k-1)-dimensional class means after projection.

The code for LDA.m is as follows:


```matlab
function [W, centers] = LDA(Input, Target)
% Input:    n*d matrix, each row is a sample
% Target:   n*1 matrix, each entry is a class label
% W:        d*(k-1) matrix, to project samples to (k-1) dimensions
% centers:  k*(k-1) matrix, the class means after projection

% Initialization
[n, dim] = size(Input);
ClassLabel = unique(Target);
k = length(ClassLabel);

nGroup = NaN(k, 1);         % group count
GroupMean = NaN(k, dim);    % the mean of each class
W = NaN(dim, k-1);          % the final transform matrix
centers = zeros(k, k-1);    % the class means after projection
SB = zeros(dim, dim);       % between-class scatter matrix
SW = zeros(dim, dim);       % within-class scatter matrix

% Compute the within-class and between-class scatter matrices
for i = 1:k
    group = (Target == ClassLabel(i));
    nGroup(i) = sum(double(group));
    GroupMean(i, :) = mean(Input(group, :));
    tmp = zeros(dim, dim);
    for j = 1:n
        if group(j) == 1
            t = Input(j, :) - GroupMean(i, :);
            tmp = tmp + t' * t;
        end
    end
    SW = SW + tmp;
end
m = mean(GroupMean);
for i = 1:k
    tmp = GroupMean(i, :) - m;
    SB = SB + nGroup(i) * (tmp' * tmp);
end

% The transform W consists of the eigenvectors of v corresponding to
% its k-1 largest eigenvalues:
% v = inv(SW) * SB;
% [evec, eval] = eig(v);
% [x, d] = cdf2rdf(evec, eval);
% W = evec(:, 1:k-1);       % after sorting by eigenvalue

% W can also be obtained via SVD:
% the SVD of K = (Hb, Hw)' can be reduced to an SVD of Ht;
% P is then recovered from K, U and sigmak
% [P, sigmak, U] = svd(K, 'econ');  =>  [U, sigmak, V] = svd(Ht, 0);
[U, sigmak, V] = svd(SW, 0);
t = rank(SW);
R = sigmak(1:t, 1:t);
P = SB' * U(:, 1:t) * inv(R);
[Q, sigmaa, W] = svd(P(1:k, 1:t));
Y(:, 1:t) = U(:, 1:t) * inv(R) * W;
W = Y(:, 1:k-1);

% Compute the projected class centers
for i = 1:k
    group = (Target == ClassLabel(i));
    centers(i, :) = mean(Input(group, :) * W);
end
```
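As a cross-check of the classical route that LDA.m leaves commented out (eigenvectors of inv(SW)*SB with the largest eigenvalues), here is a minimal NumPy sketch. All names are illustrative, and it follows the MATLAB code's convention of centering SB on the mean of the class means:

```python
import numpy as np

def lda_projection(X, y):
    """Fisher LDA sketch: d-dim samples -> (k-1)-dim projection W and
    the projected class means, via eigenvectors of pinv(SW) @ SB."""
    classes = np.unique(y)
    k, d = len(classes), X.shape[1]
    SW = np.zeros((d, d))              # within-class scatter
    SB = np.zeros((d, d))              # between-class scatter
    class_means = np.array([X[y == c].mean(axis=0) for c in classes])
    m = class_means.mean(axis=0)       # mean of class means, as in LDA.m
    for idx, c in enumerate(classes):
        Xc = X[y == c]
        diff = Xc - class_means[idx]
        SW += diff.T @ diff
        b = (class_means[idx] - m).reshape(-1, 1)
        SB += len(Xc) * b @ b.T
    evals, evecs = np.linalg.eig(np.linalg.pinv(SW) @ SB)
    order = np.argsort(-evals.real)    # largest eigenvalues first
    W = evecs[:, order[:k - 1]].real   # d x (k-1)
    centers = class_means @ W          # k x (k-1) projected means
    return W, centers
```

Projecting a test sample `x` with `x @ W` and picking the nearest row of `centers` reproduces the nearest-mean decision used later in LDATesting.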



LDA is inherently a two-class classifier, so it must be extended to the multiclass case. Two common strategies are one-vs-all, which trains k classifiers (it is unclear how best to combine their outputs), and one-vs-one, which trains a binary classifier for every pair of classes, giving k(k-1)/2 classifiers in total. This article uses the latter to train the model `model`; in the code, `model` is a struct array.


```matlab
function [model, k, ClassLabel] = LDATraining(input, target)
% input:        n*d matrix, representing samples
% target:       n*1 matrix, class labels
% model:        struct array (see code below)
% k:            the total number of classes
% ClassLabel:   the class name of each class

model = struct;
[n, dim] = size(input);
ClassLabel = unique(target);
k = length(ClassLabel);

t = 1;
for i = 1:k-1
    for j = i+1:k
        model(t).a = i;
        model(t).b = j;
        g1 = (target == ClassLabel(i));
        g2 = (target == ClassLabel(j));
        tmp1 = input(g1, :);
        tmp2 = input(g2, :);
        in = [tmp1; tmp2];
        out = ones(size(in, 1), 1);
        out(1:size(tmp1, 1)) = 0;
%         tmp3 = target(g1);
%         tmp4 = target(g2);
%         tmp3 = repmat(tmp3, length(tmp3), 1);
%         tmp4 = repmat(tmp4, length(tmp4), 1);
%         out = [tmp3; tmp4];
        [w, m] = LDA(in, out);
        model(t).W = w;
        model(t).means = m;
        t = t + 1;
    end
end
```
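The double loop above visits every unordered class pair exactly once. A quick way to sanity-check the resulting model count (a hypothetical 4-class example, names purely illustrative):

```python
from itertools import combinations

k = 4                                    # number of classes (example)
pairs = list(combinations(range(1, k + 1), 2))
# one binary LDA model per pair -> k*(k-1)/2 models
assert len(pairs) == k * (k - 1) // 2    # 6 models for 4 classes
```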


At prediction time, the trained models make k(k-1)/2 pairwise predictions, and the class that receives the most votes is chosen as the result. Each binary classifier projects the test sample with W and votes for the class whose projected mean is nearer (perhaps there is a better way?).


```matlab
function target = LDATesting(input, k, model, ClassLabel)
% input:        n*d matrix, representing samples
% target:       n*1 matrix, class labels
% model:        struct array (see code above)
% k:            the total number of classes
% ClassLabel:   the class name of each class

[n, dim] = size(input);
s = zeros(n, k);
target = zeros(n, 1);

for j = 1:k*(k-1)/2
    a = model(j).a;
    b = model(j).b;
    w = model(j).W;
    m = model(j).means;
    for i = 1:n
        sample = input(i, :);
        tmp = sample * w;
        if norm(tmp - m(1, :)) < norm(tmp - m(2, :))
            s(i, a) = s(i, a) + 1;
        else
            s(i, b) = s(i, b) + 1;
        end
    end
end

for i = 1:n
    pos = 1;
    maxV = 0;
    for j = 1:k
        if s(i, j) > maxV
            maxV = s(i, j);
            pos = j;
        end
    end
    target(i) = ClassLabel(pos);
end
```
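The second loop in LDATesting is a plain majority vote over the pairwise decisions. A minimal NumPy sketch of that voting step (function name and data are illustrative):

```python
import numpy as np

def majority_vote(votes, k):
    """votes: the 0-based class index each of the k*(k-1)/2 binary
    models voted for; returns the class with the most votes
    (ties break toward the lower index, as in LDATesting)."""
    counts = np.bincount(np.asarray(votes), minlength=k)
    return int(np.argmax(counts))
```

For example, with k = 3 classes and pairwise winners `[2, 2, 0]`, class 2 wins with two votes.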



A simple driver function ties training and testing together:

```matlab
function target = test(in, out, t)
[model, k, ClassLabel] = LDATraining(in, out);
target = LDATesting(t, k, model, ClassLabel);
```


The method was tested on the USPS data set, with disappointing results: only about 39% accuracy, whereas KNN can reach over 90% accuracy on the same data set.
