Title: Efficient Clustering Based On A Unified View Of K-means And Ratio-cut
Summary
- The paper shows that k-means and spectral clustering can be expressed in a single unified formulation. The two emphasize different goals: k-means tries to make samples in the same cluster as close to each other as possible, while spectral clustering tries to make samples in the same cluster as similar to each other as possible. Replacing one term of this shared formulation yields a unified clustering method that minimizes the sum of pairwise distances among samples within each cluster. Combined with an accelerated optimization algorithm, it outperforms plain k-means and spectral clustering in accuracy, time complexity, and a range of clustering metrics.
Problem Statement
- The two dominant traditional clustering methods are k-means and spectral clustering, but both have notable shortcomings:
- The time complexity of spectral clustering grows with the square of the number of samples, so its cost explodes as the dataset grows. Moreover, spectral clustering splits feature extraction and clustering into two separate stages, which loses information and lets the final clustering deviate from the solution of the original problem.
- k-means cannot separate clusters that are not linearly separable in the input space. In addition, on larger datasets it is sensitive to the initialization of the cluster centers: different initializations lead to different clustering results.
- Overcoming these problems with a new clustering algorithm is therefore pressing.
Method
- Spectral clustering algorithm:
- Compute the n×n similarity matrix.
- Compute the graph Laplacian from the similarity matrix.
- Eigendecompose the Laplacian and take the eigenvectors corresponding to its c smallest eigenvalues.
- Run k-means on the rows of these eigenvectors.
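The four steps above can be sketched as follows. This is a minimal illustration, not the paper's method: the Gaussian kernel width `sigma`, the unnormalized Laplacian, and the deterministic farthest-point k-means initialization are all assumed choices for the sketch.

```python
import numpy as np

def spectral_clustering(X, c, sigma=1.0):
    """Sketch of the spectral clustering pipeline described above."""
    # 1. n x n similarity matrix (Gaussian kernel, an assumed choice)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # 2. unnormalized graph Laplacian L = D - W
    L = np.diag(W.sum(axis=1)) - W
    # 3. eigenvectors of the c smallest eigenvalues (eigh sorts ascending)
    _, vecs = np.linalg.eigh(L)
    F = vecs[:, :c]  # n x c spectral embedding
    # 4. k-means on the rows of F (deterministic farthest-point init)
    centers = F[[0]]
    for _ in range(c - 1):
        d = ((F[:, None] - centers[None]) ** 2).sum(-1).min(axis=1)
        centers = np.vstack([centers, F[np.argmax(d)]])
    for _ in range(100):
        labels = ((F[:, None] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        new = np.array([F[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(c)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels
```

Note how the quadratic cost criticized in the Problem Statement is visible here: both the similarity matrix and the eigendecomposition operate on n×n arrays.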
- k-means clustering algorithm:
- Randomly initialize the cluster centers.
- Assign each sample to the cluster of its nearest center.
- Recompute each center from the samples assigned to it; iterate until the centers stop moving.
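The three steps above are Lloyd's algorithm, which can be sketched as follows. The seeded RNG and the choice of initial centers from the data points are illustrative assumptions, not details fixed by the notes.

```python
import numpy as np

def kmeans(X, c, n_iter=100, seed=0):
    """Minimal Lloyd's algorithm following the three steps above."""
    rng = np.random.default_rng(seed)
    # 1. randomly initialize the centers from c distinct data points
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(n_iter):
        # 2. assign each sample to the cluster of its nearest center
        labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        # 3. recompute each center; stop once the centers no longer move
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(c)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers
```

The sensitivity to initialization mentioned in the Problem Statement shows up directly here: changing `seed` changes the starting centers and can change the final partition.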
- Unified formulation:
- Objective function to be optimized:
- Accelerated optimization algorithm:
Evaluation
- Comparison of time complexity.
- Comparison on a variety of clustering metrics.
Conclusion
- Combining the two clustering methods yields an efficient clustering method with linear time complexity.
Notes
- Spectral clustering and k-means, the two major traditional clustering methods, still attract a lot of attention, even though a variety of novel clustering algorithms have been proposed in recent years.
- Given a set of input patterns, the purpose of clustering is to group the data into a certain number of clusters so that the samples in the same cluster are similar to each other, and the samples in different clusters are not.
- A series of algorithms have been proposed for cluster analysis and applied successfully to various areas, such as document clustering, image segmentation, and social networks.
- To obtain the final solution, most SC algorithms follow a two-stage approach, which may result in a poor clustering structure and deviation from the solution of the original problem.
References
- k-means++: The advantages of careful seeding.
- Fast and provably good seedings for k-means.
- Fast spectral clustering with anchor graph for large hyperspectral images.
- Cross-Pose LFW: A database for studying cross-pose face recognition in unconstrained environments.
- A review of the k-means algorithm.
- Scalable kernel k-means clustering with Nyström approximation: relative-error bounds.