Density-based clustering in Python: a demo of the OPTICS clustering algorithm with python + sklearn

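Note: click here https://urlify.cn/YVfuAn to download the full example code, or run this example in your browser via Binder.

OPTICS finds core samples of high density and expands clusters from them. This example uses synthetic data generated so that the clusters have different densities. sklearn.cluster.OPTICS is first run with its Xi cluster detection method, and then specific thresholds are set on the reachability, which corresponds to sklearn.cluster.DBSCAN. We can see that the different clusters found by the OPTICS Xi method can be recovered with different choices of threshold in DBSCAN.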
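[Figure sphx_glr_plot_optics_001: "Reachability Plot" (top), and below it "Automatic Clustering OPTICS", "Clustering at 0.5 epsilon cut DBSCAN", "Clustering at 2.0 epsilon cut DBSCAN"]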
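To make the notion of reachability used above more concrete, here is a minimal pure-Python sketch of the two distances OPTICS works with. It is not scikit-learn's implementation. The helper names core_distance and reachability_distance, the four toy points, and the convention that min_samples counts the point itself are all assumptions made for illustration.

```python
import math


def core_distance(points, i, min_samples):
    """Distance from points[i] to its min_samples-th nearest neighbor.

    Assumption: the point itself counts as its own first neighbor.
    """
    dists = sorted(math.dist(points[i], q) for q in points)
    return dists[min_samples - 1]


def reachability_distance(points, o, p, min_samples):
    """Reachability of points[p] from points[o].

    It is the larger of the core distance of o and the plain distance
    between o and p, so it can never be smaller than o's core distance.
    """
    return max(core_distance(points, o, min_samples),
               math.dist(points[o], points[p]))


# Hypothetical toy data: three points close together and one far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
min_samples = 3

print(core_distance(points, 0, min_samples))             # 1.0
print(reachability_distance(points, 0, 1, min_samples))  # 1.0
print(reachability_distance(points, 0, 3, min_samples))  # about 14.14
```

Points in a dense region have small reachability values, while a point far from everything else has a large one. This is why clusters show up as valleys in the reachability plot.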
# Authors: Shane Grigsby
#          Adrin Jalali
# License: BSD 3 clause

from sklearn.cluster import OPTICS, cluster_optics_dbscan
import matplotlib.gridspec as gridspec
import matplotlib.pyplot as plt
import numpy as np

# Generate sample data
np.random.seed(0)
n_points_per_cluster = 250

C1 = [-5, -2] + .8 * np.random.randn(n_points_per_cluster, 2)
C2 = [4, -1] + .1 * np.random.randn(n_points_per_cluster, 2)
C3 = [1, -2] + .2 * np.random.randn(n_points_per_cluster, 2)
C4 = [-2, 3] + .3 * np.random.randn(n_points_per_cluster, 2)
C5 = [3, -2] + 1.6 * np.random.randn(n_points_per_cluster, 2)
C6 = [5, 6] + 2 * np.random.randn(n_points_per_cluster, 2)
X = np.vstack((C1, C2, C3, C4, C5, C6))

clust = OPTICS(min_samples=50, xi=.05, min_cluster_size=.05)

# Run the fit
clust.fit(X)

labels_050 = cluster_optics_dbscan(reachability=clust.reachability_,
                                   core_distances=clust.core_distances_,
                                   ordering=clust.ordering_, eps=0.5)
labels_200 = cluster_optics_dbscan(reachability=clust.reachability_,
                                   core_distances=clust.core_distances_,
                                   ordering=clust.ordering_, eps=2)

space = np.arange(len(X))
reachability = clust.reachability_[clust.ordering_]
labels = clust.labels_[clust.ordering_]

plt.figure(figsize=(10, 7))
G = gridspec.GridSpec(2, 3)
ax1 = plt.subplot(G[0, :])
ax2 = plt.subplot(G[1, 0])
ax3 = plt.subplot(G[1, 1])
ax4 = plt.subplot(G[1, 2])

# Reachability plot
colors = ['g.', 'r.', 'b.', 'y.', 'c.']
for klass, color in zip(range(0, 5), colors):
    Xk = space[labels == klass]
    Rk = reachability[labels == klass]
    ax1.plot(Xk, Rk, color, alpha=0.3)
ax1.plot(space[labels == -1], reachability[labels == -1], 'k.', alpha=0.3)
ax1.plot(space, np.full_like(space, 2., dtype=float), 'k-', alpha=0.5)
ax1.plot(space, np.full_like(space, 0.5, dtype=float), 'k-.', alpha=0.5)
ax1.set_ylabel('Reachability (epsilon distance)')
ax1.set_title('Reachability Plot')

# OPTICS
colors = ['g.', 'r.', 'b.', 'y.', 'c.']
for klass, color in zip(range(0, 5), colors):
    Xk = X[clust.labels_ == klass]
    ax2.plot(Xk[:, 0], Xk[:, 1], color, alpha=0.3)
ax2.plot(X[clust.labels_ == -1, 0], X[clust.labels_ == -1, 1], 'k+', alpha=0.1)
ax2.set_title('Automatic Clustering\nOPTICS')

# DBSCAN at 0.5
colors = ['g', 'greenyellow', 'olive', 'r', 'b', 'c']
for klass, color in zip(range(0, 6), colors):
    Xk = X[labels_050 == klass]
    ax3.plot(Xk[:, 0], Xk[:, 1], color, alpha=0.3, marker='.')
ax3.plot(X[labels_050 == -1, 0], X[labels_050 == -1, 1], 'k+', alpha=0.1)
ax3.set_title('Clustering at 0.5 epsilon cut\nDBSCAN')

# DBSCAN at 2.0
colors = ['g.', 'm.', 'y.', 'c.']
for klass, color in zip(range(0, 4), colors):
    Xk = X[labels_200 == klass]
    ax4.plot(Xk[:, 0], Xk[:, 1], color, alpha=0.3)
ax4.plot(X[labels_200 == -1, 0], X[labels_200 == -1, 1], 'k+', alpha=0.1)
ax4.set_title('Clustering at 2.0 epsilon cut\nDBSCAN')

plt.tight_layout()
plt.show()
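The two horizontal lines drawn on the reachability plot (at 2.0 and 0.5) are the thresholds passed as eps to cluster_optics_dbscan. As a rough illustration of what such a cut does, the following pure-Python sketch walks through the points in cluster order and starts a new cluster whenever the reachability jumps above eps. The function name cut_reachability and the seven hand-written values are hypothetical. This is a simplified reading of the idea, not scikit-learn's code.

```python
import math


def cut_reachability(reachability, core_distances, ordering, eps):
    """Assign DBSCAN-like labels by cutting the reachability plot at eps."""
    labels = [-1] * len(ordering)
    current = -1  # index of the cluster currently being filled
    for idx in ordering:
        if reachability[idx] > eps:
            # Not reachable from the previous points at this threshold.
            if core_distances[idx] <= eps:
                # Dense enough to seed a new cluster.
                current += 1
                labels[idx] = current
            else:
                # Neither reachable nor dense: noise.
                labels[idx] = -1
        else:
            # Reachable: belongs to the cluster being filled.
            labels[idx] = current
    return labels


# Hypothetical values: two dense groups separated by a jump of 1.5,
# followed by one isolated point.
ordering = [0, 1, 2, 3, 4, 5, 6]
reachability = [math.inf, 0.2, 0.3, 1.5, 0.4, 0.3, 3.0]
core_distances = [0.2, 0.2, 0.3, 0.4, 0.3, 0.3, 2.5]

print(cut_reachability(reachability, core_distances, ordering, eps=0.5))
# [0, 0, 0, 1, 1, 1, -1]
print(cut_reachability(reachability, core_distances, ordering, eps=2.0))
# [0, 0, 0, 0, 0, 0, -1]
```

With the lower threshold the jump of 1.5 separates two clusters. With the higher threshold they merge into one, which mirrors the difference between the 0.5 and 2.0 panels of the figure.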
Total running time of the script: (0 minutes 1.282 seconds)
Estimated memory usage: 8 MB
Download Python source code: plot_optics.py
Download Jupyter notebook: plot_optics.ipynb
©2007-2019, scikit-learn developers (BSD License).

