Background
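The cluster labels produced by k-means are arbitrary: the same partition can come back with its label numbers permuted, so label vectors from different runs cannot be compared directly. To measure how similar several clustering results are, build an adjacency matrix for each result, in which w_ij = 1 when data points i and j are assigned to the same cluster, and compute the Dice coefficient between the adjacency matrices.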
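Written out, the measure described above looks roughly as follows. The notation is an assumption introduced here for illustration: c_a(i) is the label of point i in result a, |W| counts the nonzero entries of W, and ∘ is the element-wise product. The diagonal is left at zero.

```latex
\[
W^{(a)}_{ij} =
\begin{cases}
1, & i \neq j \ \text{and}\ c_a(i) = c_a(j) \\
0, & \text{otherwise}
\end{cases}
\qquad a \in \{\mathrm{test}, \mathrm{retest}\}
\]
\[
\mathrm{Dice} =
\frac{2\,\bigl|W^{(\mathrm{test})} \circ W^{(\mathrm{retest})}\bigr|}
     {\bigl|W^{(\mathrm{test})}\bigr| + \bigl|W^{(\mathrm{retest})}\bigr|}
\]
```

As a small made-up example with four points, take test labels (1, 1, 2, 2) and retest labels (2, 2, 1, 2). The test matrix has 4 nonzero entries (pairs 1-2 and 3-4, counted symmetrically), the retest matrix has 6 (pairs 1-2, 1-4 and 2-4), and they share 2 (pair 1-2), so Dice = 2·2 / (4 + 6) = 0.4.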
MATLAB code
W_test = zeros(n,n);    % initialize the adjacency matrices; n is the number of data points
W_retest = zeros(n,n);
for i = 1:n             % fill in the adjacency matrices
    for j = i+1:n
        if cluster_test(j) == cluster_test(i)       % rule: w_ij = 1 if the two points are in the same cluster
            W_test(i,j) = 1;
            W_test(j,i) = 1;
        end
        if cluster_retest(j) == cluster_retest(i)
            W_retest(i,j) = 1;
            W_retest(j,i) = 1;
        end
    end
end
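% intersection: pairs placed in the same cluster by both results
W_inter = W_test .* W_retest;
% Dice coefficient
n_test = nnz(W_test);       % number of nonzero entries
n_retest = nnz(W_retest);
n_inter = nnz(W_inter);
Dice = 2*n_inter / (n_test+n_retest);   % NaN (0/0) if neither result groups any pair together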
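The double loop above takes O(n^2) scalar comparisons in interpreted code. A minimal vectorized sketch of the same computation is given below. It assumes MATLAB R2016b or later for implicit expansion (on older versions `bsxfun(@eq, c1, c1')` should do the same job). The label vectors are made-up example data, and the explicit guard for the 0/0 case is an addition that is not in the original code.

```matlab
% Hypothetical example labels for four data points (not from the original post)
cluster_test   = [1 1 2 2];
cluster_retest = [2 2 1 2];
n = numel(cluster_test);

% Force column vectors so that c == c' expands to an n-by-n comparison
c1 = cluster_test(:);
c2 = cluster_retest(:);

% Logical adjacency matrices: true where two distinct points share a cluster
W_test   = (c1 == c1') & ~eye(n);
W_retest = (c2 == c2') & ~eye(n);

% Count the nonzero entries of each matrix and of their intersection
n_inter = nnz(W_test & W_retest);
n_total = nnz(W_test) + nnz(W_retest);

if n_total == 0
    Dice = NaN;                  % no pair is grouped in either result: undefined
else
    Dice = 2*n_inter / n_total;  % 0.4 for the example labels above
end
```

Because only co-membership enters the matrices, relabeling the clusters (for example swapping labels 1 and 2 in `cluster_test`) leaves `Dice` unchanged, which is the point of the adjacency-matrix approach.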