Spectral Clustering with Normalized Cut (NCut)

Tizzy477

Last modified: 2024-09-11 16:30:56

Category: clustering algorithms. Tags: clustering, matlab, machine learning, spectral clustering

First published: 2023-05-03 21:53:48

Article link: https://blog.csdn.net/Tonyslp/article/details/130477690

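Implementation steps for normalized spectral clustering

Build the normalized Laplacian matrix L (L_sym in the code) from the adjacency matrix W.
Compute the k smallest eigenvalues of L and the corresponding eigenvectors.
Cluster the rows of the resulting eigenvector matrix.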
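
For reference, the first step relies on the standard definitions below, written as a short sketch. It assumes W is symmetric with nonnegative entries and that every node has positive degree, so that D^{-1/2} exists.

```latex
D_{ii} = \sum_{j} W_{ij}, \qquad
L = D - W, \qquad
L_{\mathrm{sym}} = D^{-1/2}\, L\, D^{-1/2} = I - D^{-1/2}\, W\, D^{-1/2}
```

Both factors carry the exponent -1/2, which is what keeps L_sym symmetric.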

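Three ways to perform the final clustering

Kmeans: sensitive to initialization; running it several times and keeping the best result mitigates this.
Discretize: less sensitive to initialization.
cluster_qr: extracts the clusters directly from the eigenvectors, with no iteration and no tuning parameters; [1] reports better performance and quality than the first two.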
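
The "run Kmeans several times" advice can be sketched as follows. This is a minimal illustration, not the author's code: the toy matrix W and the variable names are made up for the example, and kmeans with its 'Replicates' option is assumed to be available from the Statistics and Machine Learning Toolbox.

```matlab
% Toy graph (hypothetical data): two dense blocks joined by weak links.
rng(0);                                   % fix the seed so the run is repeatable
n = 20; k = 2;
W = 0.01*ones(2*n);                       % weak background similarity
W(1:n,1:n) = 1; W(n+1:end,n+1:end) = 1;   % strong similarity inside each block
W(1:2*n+1:end) = 0;                       % zero the diagonal (no self-loops)

d = sum(W,2);                             % node degrees
Dinv = diag(1./sqrt(d));                  % D^(-1/2)
L_sym = eye(2*n) - Dinv*W*Dinv;           % normalized Laplacian
L_sym = (L_sym + L_sym')/2;               % remove round-off asymmetry

[Vfull, E] = eig(L_sym);                  % a dense solver is fine at this size
[~, order] = sort(diag(E));
V = Vfull(:, order(1:k));                 % eigenvectors of the k smallest eigenvalues

% k-means on the rows of V; 'Replicates' reruns it from several random
% initializations and keeps the run with the lowest total within-cluster distance.
labels = kmeans(V, k, 'Replicates', 10);
```

The label numbers themselves are arbitrary; only the grouping of the rows is meaningful.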

MATLAB code

dim = length(W);        % dimension of the adjacency matrix W, which is assumed to be precomputed

D = zeros(dim,dim);        % initialize the degree matrix D to all zeros
L = zeros(dim,dim);        % initialize the Laplacian matrix L
L_sym = zeros(dim,dim);        % initialize the normalized Laplacian matrix L_sym

for row = 1:dim

    D(row,row) = sum(W(row,:));        % diagonal entry of D = sum of the corresponding row of W

end

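L = D-W;        % unnormalized Laplacian matrix L
L_sym = D^(-1/2)*L*D^(-1/2);        % normalized Laplacian L_sym; both factors are D^(-1/2) (requires positive degrees)
L_sym = (L_sym+L_sym')/2;        % remove round-off asymmetry so that L_sym is exactly symmetric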

[V,~] = eigs(L_sym,k,'smallestabs');        % eigenvectors of the k smallest eigenvalues of L_sym (k = number of clusters, set beforehand); eigs returns eigenvectors with unit 2-norm

Clu_V = clusterQR_random(V,4);        % call clusterQR_random with oversampling factor gamma = 4

[~, Clu] = max(abs(Clu_V),[],2);        % cluster label of each row = column index of its largest absolute entry in Clu_V
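
A quick sanity check on a normalized Laplacian of the form D^(-1/2)*(D-W)*D^(-1/2) can catch construction mistakes before the eigensolver runs. The sketch below uses a made-up 4-node matrix W; it checks that the eigenvalues lie in [0, 2] and that D^(1/2)*1 is an eigenvector for eigenvalue 0, two standard properties of this matrix.

```matlab
% Small symmetric similarity matrix (hypothetical example data).
W = [0 1 1   0;
     1 0 1   0;
     1 1 0   0.1;
     0 0 0.1 0];

d = sum(W,2);                          % node degrees, all positive here
Dinvsqrt = diag(1./sqrt(d));           % D^(-1/2)
L_sym = Dinvsqrt*(diag(d) - W)*Dinvsqrt;
L_sym = (L_sym + L_sym')/2;            % remove round-off asymmetry

lambda = eig(L_sym);                   % eigenvalues of the symmetric matrix
v0 = sqrt(d)/norm(sqrt(d));            % D^(1/2)*1, normalized to unit length
residual = norm(L_sym*v0);             % should be zero up to round-off

fprintf('min eig = %.2e, max eig = %.4f, residual = %.2e\n', ...
        min(lambda), max(lambda), residual);
```

If the minimum eigenvalue is clearly negative or the residual is not close to zero, the exponents or the degree matrix are the first things to check.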

The clusterQR_random function:

function [U, piv] = clusterQR_random(U,gamma)

% U is N x k and columns are the eigenvectors to be used
%
% U returns the cluster assignment vectors, generically,
% clustering is done by taking location of the max absolute entry in each
% row as the cluster assignment.
%
% piv encodes which columns of UU^T were picked by the QRCP
%
% gamma is the oversampling factor, i.e. gamma*k*log(k) columns are used

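% Note: needs k >= 2 (log(1) = 0 gives count = 0) and at least k distinct sampled rows.
k = size(U,2);                          % number of eigenvectors (clusters)
NN = size(U,1);                         % number of data points
count = min(ceil(gamma*k*log(k)),NN);   % number of rows to sample
rho = sum(U'.^2);                       % squared row norms of U, used as sampling weights
rho = rho/sum(rho);                     % normalize to a probability distribution
rhosum = cumsum(rho);                   % cumulative distribution

[~, I] = histc(rand(1,count),[0 rhosum]);   % draw row indices with probability rho
I = unique(I);                              % drop repeated draws

[~, ~, idx] = qr(U(I,:)',0);            % column-pivoted QR on the sampled rows
idx = idx(1:k);                         % keep the first k pivots
piv = I(idx);                           % map the pivots back to row indices of U
[Ut, ~, Vt] = svd(U(piv,:)',0);         % SVD of the k x k pivot block
U = U*(Ut*Vt');                         % rotate U by the orthogonal polar factor Ut*Vt'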
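
To see what the QR and SVD steps do, it may help to run them without the random sampling on a tiny input. The sketch below is an illustration under stated assumptions, not the routine above: V is a hand-built stand-in for an eigenvector matrix (two indicator-like columns mixed by a rotation, with theta chosen arbitrarily), and the pivoted QR is applied to all rows of V rather than to a sampled subset.

```matlab
% Stand-in eigenvector matrix (hypothetical): cluster indicators times a rotation.
n = 6; k = 2;
V = zeros(2*n, k);
V(1:n,1) = 1/sqrt(n);                  % indicator of cluster 1, unit norm
V(n+1:end,2) = 1/sqrt(n);              % indicator of cluster 2, unit norm
theta = 0.7;                           % arbitrary rotation angle
R = [cos(theta) -sin(theta); sin(theta) cos(theta)];
V = V*R;                               % eigenvectors are only determined up to such a rotation

[~, ~, piv] = qr(V', 0);               % column-pivoted QR of V'; piv is a permutation vector
piv = piv(1:k);                        % k representative rows, one per cluster here
[Ut, ~, Vt] = svd(V(piv,:)', 0);       % SVD of the k x k pivot block
Clu_V = V*(Ut*Vt');                    % undo the rotation with the polar factor
[~, Clu] = max(abs(Clu_V), [], 2);     % label = column of the largest absolute entry
```

In this idealized case each row of Clu_V has a single dominant entry, so the first n rows receive one label and the remaining n rows the other.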

[1] Anil Damle, Victor Minden, Lexing Ying, Simple, direct and efficient multi-way spectral clustering, Information and Inference: A Journal of the IMA, Volume 8, Issue 1, March 2019, Pages 181–203