Unsupervised Learning: A Summary of the K-means Algorithm with a C++ Implementation




Figure 1: K-means algorithm. Training examples are shown as dots, and cluster centroids are shown as crosses. (a) Original dataset. (b) Random initial cluster centroids (in this instance, not chosen to be equal to two training examples). (c-f) Illustration of running two iterations of k-means. In each iteration, we assign each training example to the closest cluster centroid (shown by "painting" the training examples the same color as the cluster centroid to which it is assigned); then we move each cluster centroid to the mean of the points assigned to it. (Best viewed in color.) Images courtesy Michael Jordan.
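The two alternating steps described in the caption can be written compactly. The notation here is an assumption introduced for illustration: $x^{(i)}$ is training example $i$ of $m$, $c^{(i)}$ is the index of the cluster it is assigned to, $\mu_j$ is the centroid of cluster $j$ of $k$, and $J$ names the loss (the sum of squared distances to the assigned centroids).

$$
c^{(i)} := \arg\min_{j \in \{1,\dots,k\}} \lVert x^{(i)} - \mu_j \rVert^2,
\qquad
\mu_j := \frac{\sum_{i=1}^{m} \mathbf{1}\{c^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^{m} \mathbf{1}\{c^{(i)} = j\}}
$$

$$
J(c, \mu) = \sum_{i=1}^{m} \lVert x^{(i)} - \mu_{c^{(i)}} \rVert^2
$$

Neither step can increase $J$, which is why the iteration settles, although it may settle at a local rather than the global minimum.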

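Notes:

Issue 1: choosing K

K is usually dictated by the goal of the task, that is, by the number of classes we want to split the data into. It is therefore relatively well defined.

Issue 2: choosing the initial values

The choice of initial centroids has a large effect on the result of the iteration; a poor choice can even leave every point in a single cluster. A common remedy is to run the clustering from several randomly chosen sets of initial values and keep the result with the smallest loss function value.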
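The multiple-restart remedy can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the names `Point3`, `KMeansResult`, `runOnce` and `bestOfRestarts` are hypothetical, the loss is assumed to be the sum of squared distances from each point to its assigned centroid, and initial centroids are assumed to be drawn from the data points themselves.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

using Point3 = std::array<double, 3>;  // hypothetical 3-D point type

// Squared Euclidean distance between two 3-D points.
double sqDist(const Point3& a, const Point3& b) {
    double s = 0.0;
    for (std::size_t d = 0; d < 3; ++d) s += (a[d] - b[d]) * (a[d] - b[d]);
    return s;
}

struct KMeansResult {
    std::vector<Point3> centroids;
    std::vector<int> labels;  // labels[i] = cluster index of pts[i]
    double loss = std::numeric_limits<double>::infinity();
};

// One k-means run, started from k distinct data points chosen at random.
KMeansResult runOnce(const std::vector<Point3>& pts, int k, std::mt19937& gen,
                     int maxIter = 100) {
    assert(k > 0 && pts.size() >= static_cast<std::size_t>(k));
    KMeansResult r;
    std::vector<std::size_t> idx(pts.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::shuffle(idx.begin(), idx.end(), gen);
    for (int j = 0; j < k; ++j) r.centroids.push_back(pts[idx[j]]);
    r.labels.assign(pts.size(), -1);

    for (int it = 0; it < maxIter; ++it) {
        // Assignment step: attach every point to its nearest centroid.
        bool changed = false;
        for (std::size_t i = 0; i < pts.size(); ++i) {
            int best = 0;
            for (int j = 1; j < k; ++j)
                if (sqDist(pts[i], r.centroids[j]) < sqDist(pts[i], r.centroids[best]))
                    best = j;
            if (best != r.labels[i]) { r.labels[i] = best; changed = true; }
        }
        if (!changed) break;  // assignments are stable, so the centroids are too

        // Update step: move each non-empty cluster's centroid to the mean.
        std::vector<Point3> sum(k, Point3{0.0, 0.0, 0.0});
        std::vector<int> cnt(k, 0);
        for (std::size_t i = 0; i < pts.size(); ++i) {
            for (std::size_t d = 0; d < 3; ++d) sum[r.labels[i]][d] += pts[i][d];
            ++cnt[r.labels[i]];
        }
        for (int j = 0; j < k; ++j)
            if (cnt[j] > 0)
                for (std::size_t d = 0; d < 3; ++d)
                    r.centroids[j][d] = sum[j][d] / cnt[j];
    }

    // Loss: sum of squared distances from each point to its assigned centroid.
    r.loss = 0.0;
    for (std::size_t i = 0; i < pts.size(); ++i)
        r.loss += sqDist(pts[i], r.centroids[r.labels[i]]);
    return r;
}

// Run k-means `restarts` times and keep the run with the smallest loss.
KMeansResult bestOfRestarts(const std::vector<Point3>& pts, int k, int restarts,
                            unsigned seed) {
    assert(restarts > 0);
    std::mt19937 gen(seed);
    KMeansResult best;  // starts with infinite loss, so the first run always wins
    for (int r = 0; r < restarts; ++r) {
        KMeansResult cur = runOnce(pts, k, gen);
        if (cur.loss < best.loss) best = std::move(cur);
    }
    return best;
}
```

A call such as `bestOfRestarts(points, 3, 10, 42u)` would run ten differently initialized clusterings and return the one with the lowest loss. Sampling distinct data points as starting centroids is one simple choice among several; it reduces, but does not eliminate, the chance of an empty cluster, which is why the update step skips clusters with no members.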

Programming example:

Cluster the following points in three-dimensional space with k-means:

[input.txt]

1.0 , 5.7 , 2.8

4.5 , 5.2 , -0.3

-0.9 , 8.1 , 1.4
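Before clustering, points in this `x , y , z` layout have to be read into memory. The following is a minimal sketch of one way to do that, assuming one comma-separated point per line with blank lines in between; `Point3` and `parsePoints` are hypothetical names, and lines that do not match the layout are simply skipped.

```cpp
#include <array>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

using Point3 = std::array<double, 3>;  // hypothetical 3-D point type

// Read "x , y , z" lines from a stream; blank or malformed lines are skipped.
std::vector<Point3> parsePoints(std::istream& in) {
    std::vector<Point3> pts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        Point3 p{};
        char c1 = 0, c2 = 0;
        // operator>> skips whitespace, so "1.0 , 5.7 , 2.8" and "1.0,5.7,2.8" both parse.
        if (ss >> p[0] >> c1 >> p[1] >> c2 >> p[2] && c1 == ',' && c2 == ',')
            pts.push_back(p);
    }
    return pts;
}
```

To load the file itself, one could pass a `std::ifstream` (from `<fstream>`) opened on `input.txt` to `parsePoints`.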

Below is a simple C++ example that implements the k-means clustering algorithm. Note that it works on randomly generated 2-D points rather than on the 3-D input listed above.

```c++
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <random>
#include <vector>

// A 2-D point plus the index of the cluster it belongs to.
struct Point {
    double x = 0.0, y = 0.0;
    int cluster = -1;  // index of the owning cluster; -1 = not assigned yet
};

// Euclidean distance between two points.
double distance(const Point& a, const Point& b) {
    return std::sqrt(std::pow(a.x - b.x, 2.0) + std::pow(a.y - b.y, 2.0));
}

// Pick k random data points as the initial cluster centers.
void initClusterCenter(const std::vector<Point>& points, std::vector<Point>& clusters, int k) {
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<std::size_t> dis(0, points.size() - 1);
    for (int i = 0; i < k; ++i) {
        Point p = points[dis(gen)];
        p.cluster = i;
        clusters.push_back(p);
    }
}

// Assign every point to the cluster whose center is nearest.
void assignCluster(std::vector<Point>& points, const std::vector<Point>& clusters) {
    for (auto& p : points) {
        double minDistance = distance(p, clusters[0]);
        int clusterIndex = 0;
        for (std::size_t i = 1; i < clusters.size(); ++i) {
            double d = distance(p, clusters[i]);
            if (d < minDistance) {
                minDistance = d;
                clusterIndex = static_cast<int>(i);
            }
        }
        p.cluster = clusterIndex;
    }
}

// Recompute each cluster center as the mean of its member points.
void updateClusterCenter(const std::vector<Point>& points, std::vector<Point>& clusters) {
    for (auto& c : clusters) {
        double sumX = 0.0, sumY = 0.0;
        int count = 0;
        for (const auto& p : points) {
            if (p.cluster == c.cluster) {
                sumX += p.x;
                sumY += p.y;
                ++count;
            }
        }
        // An empty cluster keeps its old center; dividing by zero would yield NaN.
        if (count > 0) {
            c.x = sumX / count;
            c.y = sumY / count;
        }
    }
}

// Converged when no center moved by more than epsilon.
bool isConverged(const std::vector<Point>& oldClusters, const std::vector<Point>& newClusters,
                 double epsilon) {
    for (std::size_t i = 0; i < oldClusters.size(); ++i) {
        if (distance(oldClusters[i], newClusters[i]) > epsilon) {
            return false;
        }
    }
    return true;
}

// k-means clustering: returns the final cluster centers and labels `points` in place.
std::vector<Point> kMeans(std::vector<Point>& points, int k, double epsilon, int maxIterations) {
    std::vector<Point> clusters;
    initClusterCenter(points, clusters, k);
    int iter = 0;
    while (true) {
        assignCluster(points, clusters);
        std::vector<Point> newClusters = clusters;
        updateClusterCenter(points, newClusters);
        ++iter;
        if (isConverged(clusters, newClusters, epsilon) || iter >= maxIterations) {
            return newClusters;
        }
        clusters = newClusters;
    }
}

int main() {
    // Generate some random points with coordinates in [0, 99].
    std::vector<Point> points;
    for (int i = 0; i < 100; ++i) {
        Point p;
        p.x = std::rand() % 100;
        p.y = std::rand() % 100;
        points.push_back(p);
    }
    // Run k-means clustering.
    std::vector<Point> clusters = kMeans(points, 3, 0.01, 100);
    // Print the points in each cluster.
    for (const auto& c : clusters) {
        std::cout << "Cluster " << c.cluster << ":\n";
        for (const auto& p : points) {
            if (p.cluster == c.cluster) {
                std::cout << "(" << p.x << "," << p.y << ")\n";
            }
        }
    }
    return 0;
}
```

This is a simple example; more complex applications may need further optimization and tuning.
