python utilities loaddata: Python 3 Machine Learning Cookbook examples, Chapter 4, Clustering (15)

Latest recommended article published 2021-03-01 08:01:17

weixin_39812039

Tags: python utilities loaddata

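Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include the link to the original source and this notice.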

Article link: https://blog.csdn.net/weixin_39812039/article/details/111904665

Building a mean shift clustering model

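Mean shift is a powerful unsupervised learning algorithm for clustering data points. It treats the distribution of the data points as a probability density function and looks for the "modes" of that function in the feature space. These modes correspond to the local maxima of the density, that is, the regions where the points are most densely packed. The main advantage of mean shift is that the number of clusters does not have to be fixed in advance. Suppose we have a set of input points and want to find the clusters in it without knowing how many there are. Mean shift treats the points as samples drawn from some probability density function; if the data contains clusters, they correspond to the peaks of that function. The algorithm starts from random points and iteratively moves them until they converge on the peaks. More details are available at http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/TUZEL1/MeanShift.pdf.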
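To make the idea of "moving toward the peaks" concrete, here is a minimal NumPy sketch of mean shift with a flat kernel. It is not the scikit-learn implementation used in the rest of this article; the function names `mean_shift_flat` and `merge_modes`, the synthetic two-blob data, and the bandwidth of 1.5 are all assumptions chosen for illustration.

```python
import numpy as np

def mean_shift_flat(X, bandwidth, max_iter=100, tol=1e-4):
    """Shift every point toward the mean of its neighbors until it stops moving."""
    points = X.astype(float).copy()
    for _ in range(max_iter):
        max_shift = 0.0
        for i, p in enumerate(points):
            # Flat kernel: neighbors are the original samples within `bandwidth` of p
            neighbors = X[np.linalg.norm(X - p, axis=1) <= bandwidth]
            new_p = neighbors.mean(axis=0)
            max_shift = max(max_shift, np.linalg.norm(new_p - p))
            points[i] = new_p
        if max_shift < tol:
            break
    return points

def merge_modes(points, bandwidth):
    """Assign each converged point to the first mode closer than `bandwidth`."""
    modes = []
    labels = np.empty(len(points), dtype=int)
    for i, p in enumerate(points):
        for k, m in enumerate(modes):
            if np.linalg.norm(p - m) < bandwidth:
                labels[i] = k
                break
        else:
            # No existing mode is close enough, so this point starts a new one
            modes.append(p.copy())
            labels[i] = len(modes) - 1
    return np.array(modes), labels

# Synthetic data (an assumption for this sketch): two well-separated blobs
rng = np.random.default_rng(0)
X_demo = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2)),
])
shifted = mean_shift_flat(X_demo, bandwidth=1.5)
modes, demo_labels = merge_modes(shifted, bandwidth=1.5)
print("Modes found:", len(modes))
```

Each point is repeatedly replaced by the mean of the samples inside its window, so it climbs toward a region of higher density. The number of clusters is never passed in; it falls out of how many distinct modes the points converge to.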

Steps and code

Import the required packages:

import numpy as np

from sklearn.cluster import MeanShift, estimate_bandwidth

import utilities

Load the input data from the data_multivar.txt file:

# Load data from input file

X = utilities.load_data('data_multivar.txt')

Create a mean shift model by specifying the input parameters:

# Estimating the bandwidth

bandwidth = estimate_bandwidth(X, quantile=0.1, n_samples=len(X))

# Compute clustering with MeanShift

meanshift_estimator = MeanShift(bandwidth=bandwidth, bin_seeding=True)

meanshift_estimator.fit(X)

labels = meanshift_estimator.labels_

Extract the cluster centers from the model, then print the number of clusters:

centroids = meanshift_estimator.cluster_centers_

num_clusters = len(np.unique(labels))

print("Number of clusters in input data =", num_clusters)

Plot the dataset:

import matplotlib.pyplot as plt

plt.figure()

# specify marker shapes for different clusters

markers = '.*xv'

# zip stops at the shorter sequence, so at most four clusters are drawn
for i, marker in zip(range(num_clusters), markers):
    # plot the points that belong to the current cluster
    plt.scatter(X[labels == i, 0], X[labels == i, 1], marker=marker, color='mediumaquamarine')

    # plot the centroid of the current cluster
    centroid = centroids[i]
    plt.plot(centroid[0], centroid[1], marker='o', markerfacecolor='pink',
             markeredgecolor='k', markersize=15)

plt.title('Clusters and their centroids')

plt.show()

Output:

Number of clusters in input data = 4
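The `utilities` module and `data_multivar.txt` are not included in this article, so the code above cannot be run as-is. The following sketch is one way to try the same workflow without them. It assumes that `load_data` reads one comma-separated sample per line (the real helper may differ), and it replaces the data file with synthetic blobs written to a temporary file; the file name `synthetic_multivar.txt`, the blob centers, and the two `quantile` values are all made up for illustration.

```python
import os
import tempfile

import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

def load_data(input_file):
    """Hypothetical stand-in for utilities.load_data: comma-separated floats, one sample per line."""
    return np.loadtxt(input_file, delimiter=',')

# Synthetic stand-in for data_multivar.txt: three blobs of 100 points each
X_syn, _ = make_blobs(n_samples=300, centers=[(0, 0), (6, 6), (0, 6)],
                      cluster_std=0.5, random_state=0)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'synthetic_multivar.txt')
    np.savetxt(path, X_syn, delimiter=',')
    X_loaded = load_data(path)

# Compare two quantile values: a larger quantile gives a larger bandwidth
bandwidths = {}
cluster_counts = {}
for quantile in (0.1, 0.3):
    bw = estimate_bandwidth(X_loaded, quantile=quantile, n_samples=len(X_loaded))
    model = MeanShift(bandwidth=bw, bin_seeding=True).fit(X_loaded)
    bandwidths[quantile] = bw
    cluster_counts[quantile] = len(np.unique(model.labels_))
    print(f"quantile={quantile}: bandwidth={bw:.3f}, clusters={cluster_counts[quantile]}")
```

The number of clusters that mean shift reports depends on the bandwidth, so it is worth trying a few `quantile` values on your own data before trusting a single result.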
