Density-based clustering in Python: a demo of the OPTICS clustering algorithm with Python + sklearn

Latest recommended article published on 2024-08-27 22:52:23

weixin_39812577

Tags: Python density-based clustering, Python clustering, sklearn clustering

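Note: click here https://urlify.cn/YVfuAn to download the full example code, or run this example in your browser via Binder.

OPTICS finds core samples of high density and expands clusters from them. This example uses synthetically generated data in which each cluster has a different density. It first runs sklearn.cluster.OPTICS with its Xi cluster detection method, and then sets specific thresholds on the reachability, which corresponds to sklearn.cluster.DBSCAN. The results show that different clusters found by the OPTICS Xi method can be recovered by choosing different thresholds in DBSCAN.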
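
To make the idea of "setting a threshold on the reachability" more concrete, here is a minimal pure-Python sketch of how such a cut can turn OPTICS output into DBSCAN-like labels. It is a simplified illustration under stated assumptions, not sklearn's implementation: the function name `eps_cut_labels` and the toy values are hypothetical, and the three inputs are assumed to play the same roles as `reachability_`, `core_distances_` and `ordering_` on a fitted OPTICS object.

```python
import math


def eps_cut_labels(reachability, core_distances, ordering, eps):
    """Assign DBSCAN-like labels by cutting the reachability values at eps.

    All inputs are indexed by sample; `ordering` lists sample indices in
    the order OPTICS visited them. Noise is labelled -1.
    """
    labels = [-1] * len(ordering)
    current = -1  # no cluster has been opened yet
    for idx in ordering:
        if reachability[idx] > eps:
            if core_distances[idx] <= eps:
                # Not reachable from the previous points at this eps, but
                # dense enough to be a core point: open a new cluster.
                current += 1
                labels[idx] = current
            else:
                # Neither reachable nor a core point at this eps: noise.
                labels[idx] = -1
        else:
            # Reachable within eps: stays in the cluster currently open.
            labels[idx] = current
    return labels


# Hypothetical toy values: two dense groups followed by one isolated point.
reachability = [math.inf, 0.3, 0.4, 3.0, 0.2, 5.0]
core_distances = [0.3, 0.3, 0.4, 0.2, 0.2, 6.0]
ordering = [0, 1, 2, 3, 4, 5]

print(eps_cut_labels(reachability, core_distances, ordering, eps=0.5))
# → [0, 0, 0, 1, 1, -1]
print(eps_cut_labels(reachability, core_distances, ordering, eps=4.0))
# → [0, 0, 0, 0, 0, -1]
```

With the small threshold the two dense groups stay separate; with the larger one they merge into a single cluster, which is the same effect the 0.5 and 2.0 cuts are meant to show in the example.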

# Authors: Shane Grigsby
#          Adrin Jalali
# License: BSD 3 clause

from sklearn.cluster import OPTICS, cluster_optics_dbscan
import matplotlib.gridspec as gridspec
import matplotlib.pyplot as plt
import numpy as np

# Generate sample data
np.random.seed(0)
n_points_per_cluster = 250

C1 = [-5, -2] + .8 * np.random.randn(n_points_per_cluster, 2)
C2 = [4, -1] + .1 * np.random.randn(n_points_per_cluster, 2)
C3 = [1, -2] + .2 * np.random.randn(n_points_per_cluster, 2)
C4 = [-2, 3] + .3 * np.random.randn(n_points_per_cluster, 2)
C5 = [3, -2] + 1.6 * np.random.randn(n_points_per_cluster, 2)
C6 = [5, 6] + 2 * np.random.randn(n_points_per_cluster, 2)
X = np.vstack((C1, C2, C3, C4, C5, C6))

clust = OPTICS(min_samples=50, xi=.05, min_cluster_size=.05)

# Run the fit
clust.fit(X)

labels_050 = cluster_optics_dbscan(reachability=clust.reachability_,
                                   core_distances=clust.core_distances_,
                                   ordering=clust.ordering_, eps=0.5)
labels_200 = cluster_optics_dbscan(reachability=clust.reachability_,
                                   core_distances=clust.core_distances_,
                                   ordering=clust.ordering_, eps=2)

space = np.arange(len(X))
reachability = clust.reachability_[clust.ordering_]
labels = clust.labels_[clust.ordering_]

plt.figure(figsize=(10, 7))
G = gridspec.GridSpec(2, 3)
ax1 = plt.subplot(G[0, :])
ax2 = plt.subplot(G[1, 0])
ax3 = plt.subplot(G[1, 1])
ax4 = plt.subplot(G[1, 2])

# Reachability plot
colors = ['g.', 'r.', 'b.', 'y.', 'c.']
for klass, color in zip(range(0, 5), colors):
    Xk = space[labels == klass]
    Rk = reachability[labels == klass]
    ax1.plot(Xk, Rk, color, alpha=0.3)
ax1.plot(space[labels == -1], reachability[labels == -1], 'k.', alpha=0.3)
ax1.plot(space, np.full_like(space, 2., dtype=float), 'k-', alpha=0.5)
ax1.plot(space, np.full_like(space, 0.5, dtype=float), 'k-.', alpha=0.5)
ax1.set_ylabel('Reachability (epsilon distance)')
ax1.set_title('Reachability Plot')

# OPTICS
colors = ['g.', 'r.', 'b.', 'y.', 'c.']
for klass, color in zip(range(0, 5), colors):
    Xk = X[clust.labels_ == klass]
    ax2.plot(Xk[:, 0], Xk[:, 1], color, alpha=0.3)
ax2.plot(X[clust.labels_ == -1, 0], X[clust.labels_ == -1, 1], 'k+', alpha=0.1)
ax2.set_title('Automatic Clustering\nOPTICS')

# DBSCAN at 0.5
colors = ['g', 'greenyellow', 'olive', 'r', 'b', 'c']
for klass, color in zip(range(0, 6), colors):
    Xk = X[labels_050 == klass]
    ax3.plot(Xk[:, 0], Xk[:, 1], color, alpha=0.3, marker='.')
ax3.plot(X[labels_050 == -1, 0], X[labels_050 == -1, 1], 'k+', alpha=0.1)
ax3.set_title('Clustering at 0.5 epsilon cut\nDBSCAN')

# DBSCAN at 2.0
colors = ['g.', 'm.', 'y.', 'c.']
for klass, color in zip(range(0, 4), colors):
    Xk = X[labels_200 == klass]
    ax4.plot(Xk[:, 0], Xk[:, 1], color, alpha=0.3)
ax4.plot(X[labels_200 == -1, 0], X[labels_200 == -1, 1], 'k+', alpha=0.1)
ax4.set_title('Clustering at 2.0 epsilon cut\nDBSCAN')

plt.tight_layout()
plt.show()
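
The script above only plots the three labelings; it does not report how many clusters each one contains. A small helper along the following lines could be used to compare them numerically. This is a sketch, not part of sklearn: the name `summarize_labels` is hypothetical, and it relies only on the convention, also used in the plotting code, that noise points carry the label -1.

```python
from collections import Counter


def summarize_labels(labels):
    """Count clusters, noise points and cluster sizes in a label sequence.

    Works on any iterable of integer labels (a list or a NumPy array);
    the label -1 is treated as noise and is not counted as a cluster.
    """
    counts = Counter(int(label) for label in labels)
    n_noise = counts.pop(-1, 0)
    return {
        "n_clusters": len(counts),
        "n_noise": n_noise,
        "sizes": dict(sorted(counts.items())),
    }


# Hypothetical toy labels, used only to show the shape of the result.
toy_labels = [0, 0, 1, -1, 1, 1, -1, 2]
print(summarize_labels(toy_labels))
# → {'n_clusters': 3, 'n_noise': 2, 'sizes': {0: 2, 1: 3, 2: 1}}
```

In the example, the same function could be called on clust.labels_, labels_050 and labels_200 to see how the number of clusters and noise points changes between the Xi method and the two epsilon cuts.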

Total running time of the script: (0 minutes 1.282 seconds). Estimated memory usage: 8 MB
