K-Means Clustering Algorithm

mitedu

Category: Machine Learning. Tags: algorithm, matrix, vector, distance

Article link: https://blog.csdn.net/mitedu/article/details/4037845

Here are a few k-means clustering implementations I wrote:

Version 1: the algorithm terminates after a fixed number of iterations

% K-Means Clustering
% idx: the assignment vector (cluster index of each data row)
% center: the cluster centers (pass [] to initialize them from random data rows)
% data: the original data matrix, an n-by-dim matrix
% k: the number of clusters
% maxIter: the maximum number of iterations
% By Rong-Hua Li

function [idx , center] = Kmeans (data, center, k, maxIter)
[n, dim] = size (data);

% initialize the assignment vector, whose i-th element is the
% cluster index assigned to the i-th row of the data
idx = zeros (n, 1);

% if no initial centers are given, pick k random rows of the data matrix
% as the initial cluster centers
if isempty (center)
    prek = randperm (n);
    center = data (sort (prek (1:k)), :);
end

temp = zeros (k, 1);
for iter = 1 : maxIter
    % assign each point to its nearest center (Euclidean distance, 2-norm)
    for i = 1 : n
        for j = 1 : k
            temp(j) = norm (data (i, :) - center (j, :));
        end
        % min returns the index of the first minimum, so ties go to the lowest index
        [~, idx(i)] = min (temp);
    end
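    % update each center to the mean of the points assigned to it
    for i = 1 : k
        members = find (idx == i);
        if ~isempty (members)
            % averaging along dimension 1 also handles a single-row cluster
            center (i, :) = mean (data (members, :), 1);
        end
        % an empty cluster keeps its previous center instead of becoming NaN
    end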
end
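The nested loops in the assignment step above can also be written without loops. The following is only a minimal sketch, not part of the original functions: the toy data and the variable name `d2` are made up for illustration, and it relies on the identity ||x - c||^2 = ||x||^2 - 2*x*c' + ||c||^2.

```matlab
% Toy data: two well-separated groups in 2-D (hypothetical, for illustration only)
data   = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
center = [0 0; 10 10];

% Squared Euclidean distance from every point to every center (n-by-k matrix),
% built from ||x||^2 + ||c||^2 - 2*x*c'
d2 = bsxfun (@plus, sum (data .^ 2, 2), sum (center .^ 2, 2)') - 2 * (data * center');

% Index of the nearest center for each row; ties go to the lowest index
[~, idx] = min (d2, [], 2);
disp (idx');
```

Because the square root is monotonic, comparing squared distances gives the same assignment as comparing the 2-norms computed in the loop version, as long as rounding error does not flip a near-tie.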

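Version 2: the algorithm terminates when the change in the centers between two successive iterations is small enough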

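% K-Means2 Clustering
% idx: the assignment vector (cluster index of each data row)
% center: the cluster centers (initial centers must be supplied; there is no random initialization here)
% iters: the number of iterations performed
% data: the original data matrix, an n-by-dim matrix
% k: the number of clusters
% epso: the tolerance on the change of the centers between two iterations
% By Rong-Hua Li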

function [idx , center, iters] = Kmeans2 (data, center, k, epso)
[n, dim] = size (data);

% initialize the assignment vector, whose i-th element is the
% cluster index assigned to the i-th row of the data
idx = zeros (n, 1);

temp = zeros (k, 1);
iters = 0;
while 1 == 1
    iters = iters + 1;
    % assign each point to its nearest center (Euclidean distance, 2-norm)
    for i = 1 : n
        for j = 1 : k
            temp(j) = norm (data (i, :) - center (j, :));
        end
        % min returns the index of the first minimum, so ties go to the lowest index
        [~, idx(i)] = min (temp);
    end
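    % compute the updated centers in center2, starting from the current ones
    center2 = center;
    for i = 1 : k
        members = find (idx == i);
        if ~isempty (members)
            % averaging along dimension 1 also handles a single-row cluster
            center2 (i, :) = mean (data (members, :), 1);
        end
        % an empty cluster keeps its previous center instead of becoming NaN
    end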
    % change between successive centers, measured by the matrix 2-norm;
    % alternative measure: dis = norm (center2 - center, inf);
    dis = norm (center2 - center);
    if (dis <= epso)
        break;
    end
    center = center2;
end
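Note that `norm` applied to a matrix returns the matrix 2-norm (the largest singular value), not a per-center distance. The sketch below compares it with a few other ways of measuring the change in the centers. The numbers and the names `shift`, `disFro`, and `disRow` are hypothetical and do not appear in the functions above.

```matlab
% Hypothetical centers before and after one update step
center  = [0 0; 10 10];
center2 = [3 4; 14 13];
shift   = center2 - center;                    % per-center displacement, k-by-dim

dis2   = norm (shift);                         % matrix 2-norm, as used in Kmeans2
disInf = norm (shift, inf);                    % largest absolute row sum (the commented-out variant)
disFro = norm (shift, 'fro');                  % Frobenius norm: root of the sum of squared entries
disRow = max (sqrt (sum (shift .^ 2, 2)));     % largest Euclidean move of any single center

fprintf ('2-norm %.4f, inf-norm %.4f, Frobenius %.4f, max row move %.4f\n', ...
         dis2, disInf, disFro, disRow);
```

Which measure to compare against `epso` is a design choice. The largest single-center move is probably the easiest to interpret, while the Frobenius norm accounts for the movement of all centers at once.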