Research and Improvement of the K-Means Clustering Algorithm

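This article studies the K-Means algorithm in depth, analyzes its sensitivity to the initial cluster centers, and presents the K-Means++ algorithm as an improvement. K-Means++ selects initial centers that lie far apart from one another in order to improve efficiency and stability; experiments show that it converges faster and reduces the algorithm's overhead.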

Code: dengsiying/K-Means-improvement on GitHub (the K-Means clustering algorithm and its improvement): https://github.com/dengsiying/K-Means-improvement.git

Research and Improvement of the K-Means Clustering Algorithm*

Abstract: K-Means is a typical partition-based clustering algorithm. Its advantages are that it is simple to operate, uses the sum-of-squared-errors criterion function, and offers good scalability and compressibility when processing large data sets. However, its random choice of initial cluster centers makes the algorithm unstable. This paper studies the ideas, principles, advantages, and disadvantages of the traditional K-Means algorithm and, to address its dependence on initial values, presents and studies an improved algorithm, K-Means++, which improves the method of selecting the initial cluster centers. Experiments show that K-Means++ effectively improves the efficiency and stability of the algorithm and reduces its overhead.

Keywords: clustering algorithm, K-Means algorithm, data mining
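
As a rough illustration of the seeding idea summarized in the abstract, the sketch below may help. It is not taken from the linked repository. It assumes the commonly described form of K-Means++ seeding, in which each new center is drawn with probability proportional to its squared distance from the nearest center already chosen; the abstract itself only says that the selection of initial centers is improved. The function names `squared_distance`, `kmeans_pp_init`, and `kmeans` are hypothetical, points are assumed to be tuples of numbers, and the data is assumed to contain at least `k` distinct points.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_init(points, k, rng):
    """Pick k initial centers by distance-weighted sampling (assumed K-Means++ seeding)."""
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Weight each point by its squared distance to the nearest chosen center,
        # so far-away points are more likely to become the next center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, seed=0, max_iter=100):
    """Standard K-Means iterations started from the seeding above.

    Returns the final centers and the sum of squared errors (SSE).
    """
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    # Sum-of-squared-errors criterion: squared distance of each point
    # to its nearest center, summed over all points.
    sse = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return centers, sse
```

Calling `kmeans(points, k)` on a list of coordinate tuples returns the centers together with the SSE value, which can be compared across different seeds to get a feel for how stable the result is.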

The K-Means clustering algorithm is the most classic and, at the same time,
