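runkMeans simply calls findClosestCentroids and computeCentroids in a loop; together these two functions produce the cluster assignment of every point and the cluster centroids.
When plot_progress is set, each iteration draws two things on the figure: the data points, and the path traced by the centroids.
For the points, idx records which cluster each point belongs to, so plotDataPoints draws every point at its 2D coordinates with a color chosen by its idx value, and different clusters get different colors. Everything is drawn on the same figure, but each new scatter is painted on top of the previous one, so only the most recent colors are visible.
For the line segments, runkMeans keeps previous_centroids alongside the current centroids, and plotProgresskMeans calls drawLine between the two. Because hold on is active, the segments from earlier iterations stay on the figure.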
function [centroids, idx] = runkMeans(X, initial_centroids, ...
                                      max_iters, plot_progress)
%RUNKMEANS runs the K-Means algorithm on data matrix X, where each row of X
%is a single example
%   [centroids, idx] = RUNKMEANS(X, initial_centroids, max_iters, ...
%   plot_progress) runs the K-Means algorithm on data matrix X, where each
%   row of X is a single example. It uses initial_centroids as the
%   initial centroids. max_iters specifies the total number of iterations
%   of K-Means to execute. plot_progress is a true/false flag that
%   indicates if the function should also plot its progress as the
%   learning happens. This is set to false by default. runkMeans returns
%   centroids, a K x n matrix of the computed centroids, and idx, an m x 1
%   vector of centroid assignments (i.e. each entry in range [1..K])
%

% Set default value for plot progress
if ~exist('plot_progress', 'var') || isempty(plot_progress)
    plot_progress = false;
end

% Plot the data if we are plotting progress
if plot_progress
    figure;
    hold on;
end

% Initialize values
[m, n] = size(X);
K = size(initial_centroids, 1);
centroids = initial_centroids;
previous_centroids = centroids;
idx = zeros(m, 1);

% Run K-Means
for i=1:max_iters

    % Output progress
    fprintf('K-Means iteration %d/%d...\n', i, max_iters);
    if exist('OCTAVE_VERSION')
        fflush(stdout);
    end

    % For each example in X, assign it to the closest centroid
    idx = findClosestCentroids(X, centroids);

    % Optionally, plot progress here
    if plot_progress
        plotProgresskMeans(X, centroids, previous_centroids, idx, K, i);
        previous_centroids = centroids;
        fprintf('Press enter to continue.\n');
        pause;
    end

    % Given the memberships, compute new centroids
    centroids = computeCentroids(X, idx, K);
end

% Hold off if we are plotting progress
if plot_progress
    hold off;
end

end
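
The bodies of findClosestCentroids and computeCentroids are not shown in these notes. As a rough sketch of what one pass of the loop above computes, the script below performs a single assignment step and a single update step inline on a tiny made-up data set. The data, the initial centroids, and the decision to leave a centroid untouched when it owns no points are all assumptions for illustration, not the original helper code.

```matlab
% Toy 2D data with two obvious clusters (made up for illustration)
X = [1 1; 1.5 2; 1 0.5; 8 8; 9 9.5; 8.5 9];
centroids = [0 0; 10 10];            % hypothetical initial centroids
K = size(centroids, 1);
m = size(X, 1);

% Assignment step: index of the nearest centroid for every example
dists = zeros(m, K);
for k = 1:K
    diffs = X - repmat(centroids(k, :), m, 1);   % m x n differences
    dists(:, k) = sum(diffs .^ 2, 2);            % squared Euclidean distance
end
[~, idx] = min(dists, [], 2);                    % idx is m x 1, values in 1..K

% Update step: each centroid becomes the mean of its assigned examples
for k = 1:K
    members = (idx == k);
    if any(members)                  % assumption: skip centroids with no points
        centroids(k, :) = mean(X(members, :), 1);
    end
end

disp(idx');
disp(centroids);
```

With this data the first three points go to cluster 1 and the last three to cluster 2, and each centroid moves to the mean of its three points.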
function plotProgresskMeans(X, centroids, previous, idx, K, i)
%PLOTPROGRESSKMEANS is a helper function that displays the progress of
%k-Means as it is running. It is intended for use only with 2D data.
%   PLOTPROGRESSKMEANS(X, centroids, previous, idx, K, i) plots the data
%   points with colors assigned to each centroid. With the previous
%   centroids, it also plots a line between the previous locations and
%   current locations of the centroids.
%

% Plot the examples
plotDataPoints(X, idx, K);

% Plot the centroids as black x's
plot(centroids(:,1), centroids(:,2), 'x', ...
     'MarkerEdgeColor','k', ...
     'MarkerSize', 10, 'LineWidth', 3);

% Plot the history of the centroids with lines
for j=1:size(centroids,1)
    drawLine(centroids(j, :), previous(j, :));
end

% Title
title(sprintf('Iteration number %d', i))

end
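
drawLine itself is not listed here. Assuming it does nothing more than plot one straight segment between its two arguments, the following sketch shows why the centroid trail accumulates: with hold on, every segment drawn in an earlier iteration is still a live object on the figure when later ones are added. The trail coordinates are hypothetical, and the figure is kept invisible only so the script can run unattended.

```matlab
% Hypothetical positions of one centroid over three iterations
trail = [1 1; 2 1.5; 2.5 2.5];

fig = figure('visible', 'off');   % remove 'visible','off' to see the plot
hold on;                          % later plot calls add to the axes

segments = {};                    % handles of the segments drawn so far
for t = 2:size(trail, 1)
    p1 = trail(t, :);             % current position
    p2 = trail(t - 1, :);         % previous position
    % Assumed behaviour of drawLine(p1, p2): one segment from p1 to p2
    segments{end + 1} = plot([p1(1) p2(1)], [p1(2) p2(2)], 'k-');
end

hold off;

% Every earlier segment is still a valid graphics object
num_alive = sum(cellfun(@ishandle, segments));
fprintf('%d of %d segments are still on the figure\n', ...
        num_alive, numel(segments));
close(fig);
```

Without the hold on call, each plot would replace the contents of the axes, and only the most recent segment would survive.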
function plotDataPoints(X, idx, K)
%PLOTDATAPOINTS plots data points in X, coloring them so that those with the same
%index assignments in idx have the same color
%   PLOTDATAPOINTS(X, idx, K) plots data points in X, coloring them so that those
%   with the same index assignments in idx have the same color

% Create palette
palette = hsv(K + 1);
colors = palette(idx, :);

% Plot the data
scatter(X(:,1), X(:,2), 15, colors);

end
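
To make the coloring step concrete, the short script below reproduces just the palette lookup from plotDataPoints on a made-up idx vector. The assignments are hypothetical; the point is that indexing the palette with idx yields one RGB row per data point, so points with the same cluster index get identical colors.

```matlab
K = 3;
idx = [1; 3; 2; 3; 1];        % hypothetical assignments for five points

palette = hsv(K + 1);         % (K+1) x 3 matrix, one RGB row per palette entry
colors = palette(idx, :);     % 5 x 3 matrix: row i is the color of point i

disp(size(colors));
% Points 1 and 5 are both in cluster 1, so their color rows match
same_cluster_same_color = isequal(colors(1, :), colors(5, :));
% Points 1 and 2 are in different clusters, so their color rows differ
different_cluster_same_color = isequal(colors(1, :), colors(2, :));
```

scatter accepts such an m x 3 matrix as its color argument, which is how a single call can draw every cluster in its own color.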