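The k-means algorithm performs automatic clustering. It is an unsupervised machine learning algorithm and is very widely used. OpenCV 3.0 provides a function for it, so clustering takes a single call, which is very convenient.
Function prototype: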
C++: double kmeans(InputArray data, int K, InputOutputArray bestLabels, TermCriteria criteria, int attempts, int flags, OutputArray centers=noArray() )
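It takes 7 parameters:
data: the data to cluster, usually a Mat. It must be a floating-point matrix with one sample per row.
K: the number of clusters to split the data into. This is the key parameter.
bestLabels: the returned cluster labels, an integer array holding one cluster index per sample.
criteria: the termination criteria for the algorithm, i.e. the maximum number of iterations and/or the desired accuracy. With an accuracy set, iteration stops once the cluster centers move by less than criteria.epsilon.
attempts: the number of times the algorithm is run with different initial labelings. For example, with a value of 3 the clustering is run 3 times, and the labels from the run with the best compactness are returned. The function's double return value is that compactness.
flags: how the initial cluster centers are chosen. There are three values: KMEANS_RANDOM_CENTERS picks random initial centers. KMEANS_PP_CENTERS initializes the centers with the kmeans++ algorithm (I have not used it myself). KMEANS_USE_INITIAL_LABELS uses the user-supplied labels for the first attempt; for later attempts the centers are chosen automatically.
centers: the output matrix of cluster centers, one row per center. It is not used for initialization. It defaults to noArray(), so it can be omitted if the centers are not needed.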
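To make the roles of criteria, attempts and the return value more concrete, here is a minimal sketch of k-means in plain C++, without OpenCV. It is a simplified illustration only, not OpenCV's implementation: the names Pt, KMeansResult, runOnce and simpleKMeans are hypothetical, initial centers are simply random samples (no kmeans++), and an empty cluster just keeps its old center.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

struct Pt { float x, y; };  // hypothetical 2D sample type

struct KMeansResult {
    double compactness = 0.0;   // sum of squared distances to the assigned centers
    std::vector<int> labels;    // cluster index for each sample
    std::vector<Pt> centers;    // one center per cluster
};

static double sqDist(const Pt& a, const Pt& b) {
    double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Index of the center closest to p.
static int nearest(const std::vector<Pt>& centers, const Pt& p) {
    int best = 0;
    for (int c = 1; c < static_cast<int>(centers.size()); ++c)
        if (sqDist(p, centers[c]) < sqDist(p, centers[best])) best = c;
    return best;
}

// One run: random samples as initial centers, then assign/update until the
// centers move by at most eps or maxIter iterations have been done.
static KMeansResult runOnce(const std::vector<Pt>& pts, int k,
                            int maxIter, double eps, std::mt19937& gen) {
    std::uniform_int_distribution<std::size_t> pick(0, pts.size() - 1);
    KMeansResult r;
    r.labels.assign(pts.size(), 0);
    for (int c = 0; c < k; ++c) r.centers.push_back(pts[pick(gen)]);

    for (int iter = 0; iter < maxIter; ++iter) {
        std::vector<double> sx(k, 0.0), sy(k, 0.0);
        std::vector<int> count(k, 0);
        for (const Pt& p : pts) {           // assignment step
            int c = nearest(r.centers, p);
            sx[c] += p.x; sy[c] += p.y; ++count[c];
        }
        double maxShift = 0.0;
        for (int c = 0; c < k; ++c) {       // update step
            if (count[c] == 0) continue;    // simplification: keep the old center
            Pt moved{static_cast<float>(sx[c] / count[c]),
                     static_cast<float>(sy[c] / count[c])};
            maxShift = std::max(maxShift, std::sqrt(sqDist(moved, r.centers[c])));
            r.centers[c] = moved;
        }
        if (maxShift <= eps) break;         // accuracy criterion reached
    }
    for (std::size_t i = 0; i < pts.size(); ++i) {  // final labels and compactness
        r.labels[i] = nearest(r.centers, pts[i]);
        r.compactness += sqDist(pts[i], r.centers[r.labels[i]]);
    }
    return r;
}

// Runs the clustering `attempts` times and keeps the most compact result.
KMeansResult simpleKMeans(const std::vector<Pt>& pts, int k, int maxIter,
                          double eps, int attempts, unsigned seed) {
    assert(k > 0 && attempts > 0 && pts.size() >= static_cast<std::size_t>(k));
    std::mt19937 gen(seed);
    KMeansResult best;
    best.compactness = std::numeric_limits<double>::infinity();
    for (int a = 0; a < attempts; ++a) {
        KMeansResult r = runOnce(pts, k, maxIter, eps, gen);
        if (r.compactness < best.compactness) best = r;
    }
    return best;
}
```

Calling simpleKMeans(pts, 2, 10, 1.0, 3, 12345u) roughly mirrors the arguments used later in this post: at most 10 iterations, an accuracy of 1.0, and 3 attempts. The point to notice is that attempts repeats the whole clustering and keeps the best run; it does not vote on individual samples.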
Step 1: set the paths in the .pro file
INCLUDEPATH += /usr/local/include \
/usr/local/include/opencv \
/usr/local/include/opencv2
LIBS += /usr/local/lib/libopencv_highgui.so \
/usr/local/lib/libopencv_core.so \
/usr/local/lib/libopencv_imgproc.so \
/usr/local/lib/libopencv_imgcodecs.so \
/usr/local/lib/libopencv_ml.so
Step 2: create the cpp file
#include "opencv2/highgui.hpp"
#include "opencv2/core.hpp"
#include "opencv2/imgproc.hpp"
#include <iostream>
using namespace cv;
using namespace std;
// static void help()
// {
// cout << "\nThis program demonstrates kmeans clustering.\n"
// "It generates an image with random points, then assigns a random number of cluster\n"
// "centers and uses kmeans to move those cluster centers to their representative location\n"
// "Call\n"
// "./kmeans\n" << endl;
// }
int main( int /*argc*/, char** /*argv*/ )
{
    const int MAX_CLUSTERS = 5;
    // One drawing color per cluster (BGR order)
    Scalar colorTab[] =
    {
        Scalar(0, 0, 255),
        Scalar(0,255,0),
        Scalar(255,100,100),
        Scalar(255,0,255),
        Scalar(0,255,255)
    };
    Mat img(500, 500, CV_8UC3);
    RNG rng(12345);
    for(;;)
    {
        // Pick a random number of clusters and samples for this round
        int k, clusterCount = rng.uniform(2, MAX_CLUSTERS+1);
        int i, sampleCount = rng.uniform(1, 1001);
        Mat points(sampleCount, 1, CV_32FC2), labels;
        clusterCount = MIN(clusterCount, sampleCount);
        Mat centers;
        /* generate random sample from multigaussian distribution */
        for( k = 0; k < clusterCount; k++ )
        {
            Point center;
            center.x = rng.uniform(0, img.cols);
            center.y = rng.uniform(0, img.rows);
            // Rows of points that belong to the k-th generated cluster
            Mat pointChunk = points.rowRange(k*sampleCount/clusterCount,
                                             k == clusterCount - 1 ? sampleCount :
                                             (k+1)*sampleCount/clusterCount);
            rng.fill(pointChunk, RNG::NORMAL, Scalar(center.x, center.y), Scalar(img.cols*0.05, img.rows*0.05));
        }
        randShuffle(points, 1, &rng);
        // Stop after 10 iterations or once the accuracy reaches 1.0; run 3 attempts
        kmeans(points, clusterCount, labels,
               TermCriteria( TermCriteria::EPS+TermCriteria::COUNT, 10, 1.0),
               3, KMEANS_PP_CENTERS, centers);
        img = Scalar::all(0);
        // Draw each sample in the color of its assigned cluster
        for( i = 0; i < sampleCount; i++ )
        {
            int clusterIdx = labels.at<int>(i);
            Point ipt = points.at<Point2f>(i);
            circle( img, ipt, 2, colorTab[clusterIdx], FILLED, LINE_AA );
        }
        imshow("clusters", img);
        char key = (char)waitKey();
        if( key == 27 || key == 'q' || key == 'Q' ) // 'ESC'
            break;
    }
    return 0;
}
Step 3: run the program. The results:
First run
Second run